Measuring Agents in Production (Dec 2025)
- Title: Measuring Agents in Production
- Link: http://arxiv.org/abs/2512.04123v1
- Date: December 2025
Abstract
This paper presents “Measuring Agents in Production” (MAP), a large-scale systematic study of AI agents deployed in production across industries such as finance, healthcare, and software development. Drawing on a survey of 306 practitioners and 20 in-depth case studies, the authors analyze the motivations, architectures, and challenges of real-world agent deployments. The findings show that production agents prioritize simplicity and controllability: 70% rely on off-the-shelf models without fine-tuning, and most execute fewer than 10 steps before requiring human intervention. Productivity gains drive adoption, but reliability remains the primary technical bottleneck, which leads teams to rely heavily on human-in-the-loop evaluation.
Key Topics:
- AI Agents in Production
- Software Engineering for AI
- Large Language Models (LLMs)
- Human-in-the-loop Evaluation
- Agent Reliability
- System Architecture
- Deployment Challenges