Scaling Agent Learning via Experience Synthesis (Nov 2025)
- Link: http://arxiv.org/abs/2511.03773v1
- Date: November 2025
Abstract:
DREAMGYM is a unified framework for LLM agent reinforcement learning that addresses the cost and scarcity of real-environment experience by synthesizing diverse experiences at scale. It replaces costly real-environment rollouts with a reasoning-based experience model that derives consistent state transitions and feedback signals through step-by-step reasoning. The framework maintains an experience replay buffer, initialized with offline data and continually enriched with fresh synthetic interactions, and adaptively generates challenging task variations for curriculum learning. Experiments show that DREAMGYM substantially improves RL training in both fully synthetic and sim-to-real settings: it outperforms baselines on tasks that are not RL-ready, such as WebArena, matches baseline performance on RL-ready but costly tasks using synthetic data alone, and provides a scalable warm-start for general-purpose RL.
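To make the moving parts of the abstract concrete, here is a minimal Python sketch of the training loop it describes: a reasoning-based experience model synthesizing transitions, a replay buffer seeded with offline data and enriched with fresh synthetic interactions, and an adaptive curriculum over tasks. All names (ReasoningExperienceModel, ReplayBuffer, CurriculumTaskGenerator, train) are hypothetical illustrations, not the paper's implementation; the LLM reasoning step and the policy update are stubbed out.

```python
"""Minimal sketch of a DreamGym-style synthetic-experience RL loop.

All class and function names are illustrative assumptions, not the paper's
API; the LLM-based reasoning step and the policy update are placeholders.
"""
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Transition:
    task: str
    state: str
    action: str
    next_state: str
    reward: float


class ReasoningExperienceModel:
    """Stands in for the reasoning-based experience model: given a state and
    action, it would reason step by step to a consistent next state and
    feedback. Here it is a random stub instead of an LLM call."""

    def step(self, task: str, state: str, action: str) -> Transition:
        next_state = f"{state}->{action}"          # placeholder transition
        reward = 1.0 if random.random() < 0.3 else 0.0  # placeholder feedback
        return Transition(task, state, action, next_state, reward)


class ReplayBuffer:
    """Replay buffer initialized with offline data and enriched with freshly
    synthesized interactions."""

    def __init__(self, offline_data, capacity: int = 10_000):
        self.buffer = deque(offline_data, maxlen=capacity)

    def __len__(self) -> int:
        return len(self.buffer)

    def add(self, transition: Transition) -> None:
        self.buffer.append(transition)

    def sample(self, batch_size: int):
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))


class CurriculumTaskGenerator:
    """Adaptively proposes harder variants of tasks the agent fails on."""

    def __init__(self, seed_tasks):
        self.tasks = list(seed_tasks)

    def propose(self, failures):
        # Assumption: each failed task spawns a slightly harder variant.
        for t in failures:
            self.tasks.append(f"{t}+harder")
        return random.choice(self.tasks)


def train(num_iterations: int = 5):
    buffer = ReplayBuffer([Transition("seed", "s0", "noop", "s0", 0.0)])
    model = ReasoningExperienceModel()
    curriculum = CurriculumTaskGenerator(["browse-site", "fill-form"])
    failures = []

    for it in range(num_iterations):
        task = curriculum.propose(failures)
        state, failures = "s0", []
        for _ in range(3):  # short synthetic rollout
            action = random.choice(["click", "type", "submit"])  # policy stub
            tr = model.step(task, state, action)
            buffer.add(tr)
            state = tr.next_state
            if tr.reward == 0.0:
                failures.append(task)
        batch = buffer.sample(batch_size=4)
        # A real agent would run a policy-gradient update on `batch` here.
        print(f"iter {it}: task={task!r}, buffer={len(buffer)}, batch={len(batch)}")


if __name__ == "__main__":
    train()
```

The sketch only illustrates the data flow the abstract names (synthesize, store, replay, adapt the curriculum); the paper's actual experience model, reward derivation, and policy optimizer are not reproduced here.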
Key Topics:
- Reinforcement Learning (RL)
- Large Language Models (LLMs)
- Experience Synthesis
- Reasoning-based Experience Model
- Curriculum Learning
- Sim-to-Real Transfer
- WebArena
- ALFWorld
- Sample Efficiency