Reasoning trace length can serve as a simple confidence estimator in LLMs to combat hallucinations. It performs comparably to verbalized confidence across models, datasets, and prompts, and post-training alters the relationship between trace length and confidence.
Key Points
- Trace length as a signal for uncertainty quantification (see the sketch after this list)
- Zero-shot estimator experiments across models, datasets, and prompts
- Complements verbalized confidence rather than replacing it
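A minimal sketch of the idea, using toy traces and correctness labels that are not from the paper: score each answer by the negated word count of its reasoning trace, then measure how well that score separates correct from incorrect answers via AUROC. The sign convention (shorter trace implies higher confidence) is one plausible reading; the actual relationship may vary by model and post-training.

```python
from sklearn.metrics import roc_auc_score

def trace_length_confidence(trace: str) -> float:
    # Negate the word count so that higher scores mean higher
    # confidence, under the assumption that longer traces accompany
    # harder, more error-prone questions.
    return -float(len(trace.split()))

# Toy illustrative pairs of (reasoning trace, answer-was-correct label);
# real use would take actual model outputs and graded answers.
examples = [
    ("The answer is 4.", 1),
    ("It follows directly that the answer is 7.", 1),
    ("Hmm, first consider... wait, actually no... retry... so maybe 12?", 0),
    ("This is tricky; re-deriving... still unsure after many steps... guess 3.", 0),
]
scores = [trace_length_confidence(trace) for trace, _ in examples]
labels = [label for _, label in examples]

# AUROC measures how well the score separates correct from incorrect answers.
print(f"Trace-length AUROC: {roc_auc_score(labels, scores):.3f}")
```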
Impact Analysis
Enhances LLM reliability for deployment by reducing errors in reasoning tasks, supporting safer AI integration in Apple products.
Technical Details
Evaluated on multiple reasoning models, revealing how training reshapes trace dynamics. Addresses hallucination via a simple, readily available metric: the length of the model's reasoning trace.
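For comparison, verbalized confidence is typically elicited by asking the model to state its own confidence. The prompt wording and the naive parsing heuristic below are illustrative assumptions, not the paper's protocol.

```python
import re

def verbalized_confidence(reply: str) -> float:
    """Parse a self-reported confidence in [0, 1] from a model reply.

    Naive heuristic: take the last number in [0, 1] mentioned in the
    reply, falling back to 0.5 when none is found. Robust parsing
    would need to handle percentages, words, and refusals.
    """
    matches = re.findall(r"\b(?:0(?:\.\d+)?|1(?:\.0+)?)\b", reply)
    return float(matches[-1]) if matches else 0.5

# Hypothetical reply to a prompt such as "... then state your
# confidence that the answer is correct as a number from 0 to 1."
reply = "The answer is 42. Confidence: 0.85"
print(verbalized_confidence(reply))  # -> 0.85
```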
