PAHF is a framework for continual personalization of AI agents, learning online from live human interactions via explicit per-user memory. It uses a three-step loop: pre-action clarification, preference-grounded actions, and post-action feedback for memory updates. Evaluated on new benchmarks for manipulation and shopping, it outperforms baselines in initial learning and adaptation to preference shifts.
Key Points
- Introduces PAHF, a three-step loop for online personalization
- Develops benchmarks for embodied manipulation and online shopping
- Integrates explicit per-user memory with dual feedback channels
- Outperforms no-memory and single-channel baselines empirically
- Provides a theoretical analysis that supports faster learning and adaptation
Impact Analysis
PAHF advances user-aligned AI agents by enabling rapid adaptation to evolving preferences without relying on static datasets. This could benefit personalized applications such as assistants and robotics by reducing misalignment errors in real-world deployments.
Technical Details
The framework operationalizes three components: pre-action clarification, memory-retrieved preference grounding, and feedback-driven memory updates. The evaluation uses a four-phase protocol that quantifies initial learning and adaptation to persona shifts. The results show that explicit memory and dual feedback channels both play a critical role.
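The three-step loop can be illustrated with a minimal sketch. This is not the PAHF implementation: the names `UserMemory`, `SimulatedUser`, and `run_episode`, the key-value memory layout, and the exact-match feedback rule are all assumptions made for illustration. It only shows how pre-action clarification, a memory-grounded action, and post-action feedback could interact around an explicit per-user memory.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class UserMemory:
    """Explicit per-user preference store (hypothetical key-value layout)."""
    prefs: Dict[str, str] = field(default_factory=dict)

    def retrieve(self, slot: str) -> Optional[str]:
        return self.prefs.get(slot)

    def update(self, slot: str, value: str) -> None:
        self.prefs[slot] = value


class SimulatedUser:
    """Stand-in for a live human (assumption: preferences are exact strings)."""

    def __init__(self, true_prefs: Dict[str, str]) -> None:
        self.true_prefs = dict(true_prefs)

    def answer(self, slot: str) -> str:
        # Reply to a pre-action clarification question.
        return self.true_prefs[slot]

    def feedback(self, slot: str, action: str) -> Optional[str]:
        # Return a correction if the action was wrong, otherwise None.
        truth = self.true_prefs[slot]
        return None if action == truth else truth


def run_episode(memory: UserMemory, user: SimulatedUser, slot: str,
                ask_when_unknown: bool = True,
                default: str = "unknown") -> Tuple[str, bool]:
    # Step 1: pre-action clarification when memory has no entry for the slot.
    known = memory.retrieve(slot)
    if known is None and ask_when_unknown:
        known = user.answer(slot)
        memory.update(slot, known)

    # Step 2: act on the remembered preference, or fall back to a default.
    action = known if known is not None else default

    # Step 3: post-action feedback overwrites a stale or missing preference.
    correction = user.feedback(slot, action)
    if correction is not None:
        memory.update(slot, correction)
    return action, correction is None


memory = UserMemory()
user = SimulatedUser({"drink": "tea"})
print(run_episode(memory, user, "drink"))  # ('tea', True): learned by asking
user.true_prefs["drink"] = "coffee"        # simulated preference shift
print(run_episode(memory, user, "drink"))  # ('tea', False): stale memory, then corrected
print(run_episode(memory, user, "drink"))  # ('coffee', True): adapted
```

In this toy setting, the clarification channel covers the case where memory is empty, while the feedback channel covers the case where memory is confidently wrong after a preference shift. That mirrors, in simplified form, why the summary reports that both channels matter.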