A self-evolving system uses Google's Gemini LLMs to autonomously generate, train, and deploy improvements to recommendation models. It pairs an Offline Agent, which generates hypotheses, with an Online Agent, which validates them in production. The system has been deployed at YouTube, where it surpassed manual workflows.
Key Points
1. Autonomous end-to-end optimization driven by LLM agents
2. An offline loop that optimizes proxy metrics and an online loop that validates business metrics
3. Novel discoveries in optimizers, model architectures, and reward functions
Impact Analysis
Accelerates development velocity and improves model performance at YouTube scale, while reducing manual engineering effort.
Technical Details
Gemini LLMs act as machine learning engineer (MLE) agents. The inner (offline) loop iterates on proxy metrics; the outer (online) loop validates the resulting candidates against live metrics.
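The two-loop control flow can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the deployed system: `Candidate`, `inner_loop`, `outer_loop`, the baseline thresholds, and the lookup tables standing in for metrics are all hypothetical, and the LLM call that would generate each hypothesis is replaced by a stub.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    """A proposed model change (hypothetical container for this sketch)."""
    name: str
    proxy_score: float = 0.0


def inner_loop(
    propose: Callable[[int], Candidate],
    proxy_metric: Callable[[Candidate], float],
    baseline_proxy: float,
    rounds: int,
) -> List[Candidate]:
    """Offline loop: keep candidates whose proxy metric beats the baseline."""
    survivors = []
    for i in range(rounds):
        cand = propose(i)  # in a real system, an LLM agent would propose this
        cand.proxy_score = proxy_metric(cand)
        if cand.proxy_score > baseline_proxy:
            survivors.append(cand)
    return survivors


def outer_loop(
    survivors: List[Candidate],
    live_metric: Callable[[Candidate], float],
    baseline_live: float,
) -> List[Candidate]:
    """Online loop: promote only candidates that also win on the live metric."""
    return [c for c in survivors if live_metric(c) > baseline_live]


# Stubbed metrics (made-up numbers, for illustration only).
PROXY = {"hypothesis-0": 0.70, "hypothesis-1": 0.82, "hypothesis-2": 0.85}
LIVE = {"hypothesis-1": 0.49, "hypothesis-2": 0.53}

survivors = inner_loop(
    propose=lambda i: Candidate(name=f"hypothesis-{i}"),
    proxy_metric=lambda c: PROXY[c.name],
    baseline_proxy=0.80,
    rounds=3,
)
promoted = outer_loop(survivors, lambda c: LIVE[c.name], baseline_live=0.50)
print([c.name for c in promoted])  # prints ['hypothesis-2']
```

In this toy run, `hypothesis-1` passes the offline proxy check but fails online, which is the case the outer loop exists to catch: a proxy metric can improve without the live metric following.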