Scaling Agent Learning via Experience Synthesis (Nov 2025)
- Link: http://arxiv.org/abs/2511.03773v1
- Date: November 2025
Abstract:
DREAMGYM is a unified framework for LLM agent reinforcement learning that addresses the cost and scarcity of real-environment experience by synthesizing diverse experiences at scale. It replaces costly real-environment rollouts with a reasoning-based experience model that derives consistent state transitions and feedback signals through step-by-step reasoning. The framework maintains an experience replay buffer, initialized with offline data and continually enriched with fresh synthetic interactions, and adaptively generates challenging task variations for curriculum learning. Experiments show that DREAMGYM substantially improves RL training in both fully synthetic and sim-to-real settings: it outperforms baselines on tasks that are not RL-ready, such as WebArena, matches baseline performance on RL-ready but costly tasks using synthetic data alone, and provides a scalable warm-start for general-purpose RL.
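To make the moving parts of the abstract concrete, here is a minimal Python sketch of the training loop it describes: a reasoning-based experience model synthesizing transitions, a replay buffer seeded with offline data and enriched with fresh synthetic interactions, and an adaptive curriculum over tasks. All names (ReasoningExperienceModel, ReplayBuffer, CurriculumTaskGenerator, train) are hypothetical illustrations, not the paper's implementation; the LLM reasoning step and the policy update are stubbed out.

```python
"""Minimal sketch of a DreamGym-style synthetic-experience RL loop.

All class and function names are illustrative assumptions, not the paper's
API; the LLM-based reasoning step and the policy update are placeholders.
"""
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Transition:
    task: str
    state: str
    action: str
    next_state: str
    reward: float


class ReasoningExperienceModel:
    """Stands in for the reasoning-based experience model: given a state and
    action, it would reason step by step to a consistent next state and
    feedback. Here it is a random stub instead of an LLM call."""

    def step(self, task: str, state: str, action: str) -> Transition:
        next_state = f"{state}->{action}"          # placeholder transition
        reward = 1.0 if random.random() < 0.3 else 0.0  # placeholder feedback
        return Transition(task, state, action, next_state, reward)


class ReplayBuffer:
    """Replay buffer initialized with offline data and enriched with freshly
    synthesized interactions."""

    def __init__(self, offline_data, capacity: int = 10_000):
        self.buffer = deque(offline_data, maxlen=capacity)

    def __len__(self) -> int:
        return len(self.buffer)

    def add(self, transition: Transition) -> None:
        self.buffer.append(transition)

    def sample(self, batch_size: int):
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))


class CurriculumTaskGenerator:
    """Adaptively proposes harder variants of tasks the agent fails on."""

    def __init__(self, seed_tasks):
        self.tasks = list(seed_tasks)

    def propose(self, failures):
        # Assumption: each failed task spawns a slightly harder variant.
        for t in failures:
            self.tasks.append(f"{t}+harder")
        return random.choice(self.tasks)


def train(num_iterations: int = 5):
    buffer = ReplayBuffer([Transition("seed", "s0", "noop", "s0", 0.0)])
    model = ReasoningExperienceModel()
    curriculum = CurriculumTaskGenerator(["browse-site", "fill-form"])
    failures = []

    for it in range(num_iterations):
        task = curriculum.propose(failures)
        state, failures = "s0", []
        for _ in range(3):  # short synthetic rollout
            action = random.choice(["click", "type", "submit"])  # policy stub
            tr = model.step(task, state, action)
            buffer.add(tr)
            state = tr.next_state
            if tr.reward == 0.0:
                failures.append(task)
        batch = buffer.sample(batch_size=4)
        # A real agent would run a policy-gradient update on `batch` here.
        print(f"iter {it}: task={task!r}, buffer={len(buffer)}, batch={len(batch)}")


if __name__ == "__main__":
    train()
```

The sketch only illustrates the data flow the abstract names (synthesize, store, replay, adapt the curriculum); the paper's actual experience model, reward derivation, and policy optimizer are not reproduced here.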
Key Topics:
- Reinforcement Learning (RL)
- Large Language Models (LLMs)
- Experience Synthesis
- Reasoning-based Experience Model
- Curriculum Learning
- Sim-to-Real Transfer
- WebArena
- ALFWorld
- Sample Efficiency