Locomo-Plus benchmarks cognitive memory in LLM agents under cue-trigger disconnects, focusing on latent constraints that arise in conversation. It proposes evaluating constraint consistency rather than string matching, and it reveals gaps in existing memory systems.
Key Points
- 1. Beyond-factual recall evaluation
- 2. Long-context latent constraints
- 3. Public code and framework
Impact Analysis
The benchmark exposes limitations in LLM dialogue memory and can guide improvements for agents in realistic settings.
Technical Details
The benchmark provides a unified evaluation framework for semantic disconnect scenarios that can be applied across models.
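The contrast between string matching and constraint consistency can be illustrated with a small sketch. This is not the benchmark's actual scoring code: the `Constraint` schema, the `token_f1` and `is_consistent` helpers, and the dinner example are all hypothetical, and the keyword check is only a toy stand-in for whatever judge a real evaluation would use. The scenario assumes the user stated a dietary constraint earlier in the dialogue and later asks for a dinner idea without mentioning it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Constraint:
    """A latent constraint stated earlier in the dialogue (hypothetical schema)."""
    description: str
    forbidden_terms: Tuple[str, ...]  # toy proxy for a real consistency judge


def token_f1(response: str, reference: str) -> float:
    """Surface-overlap score between a response and a reference answer."""
    resp = response.lower().split()
    ref = reference.lower().split()
    # Count tokens shared by both strings, respecting multiplicity.
    common = sum(min(resp.count(t), ref.count(t)) for t in set(resp))
    if common == 0:
        return 0.0
    precision = common / len(resp)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)


def is_consistent(response: str, constraint: Constraint) -> bool:
    """True if the response avoids every term the constraint rules out."""
    text = response.lower()
    return not any(term in text for term in constraint.forbidden_terms)


# Assumed scenario: the user said they are vegetarian many turns ago.
constraint = Constraint("User is vegetarian", ("chicken", "beef", "pork"))
reference = "Try the mushroom risotto tonight"

good = "How about a lentil curry with rice"  # respects the constraint
bad = "Try the chicken risotto tonight"      # violates the constraint

for name, response in (("good", good), ("bad", bad)):
    print(name,
          "f1:", round(token_f1(response, reference), 2),
          "consistent:", is_consistent(response, constraint))
```

In this toy case the response that violates the constraint gets the higher overlap score, because it copies most of the reference wording, while the response that respects the constraint shares no tokens with the reference. A consistency check ranks the two the other way round, which is the kind of gap a string-matching metric cannot surface.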