Stop Thinking, Just Do!

Sungsoo Kim's Blog

Measuring Agents in Production


8 December 2025



Abstract

This paper presents “Measuring Agents in Production” (MAP), a large-scale systematic study of AI agents actively deployed in production across industries such as finance, healthcare, and software development. Through a survey of 306 practitioners and 20 in-depth case studies, the authors analyze the motivations, architectures, and challenges of real-world agent deployments. The findings show that production agents prioritize simplicity and controllability: 70% rely on off-the-shelf models without fine-tuning, and most execute fewer than 10 steps before requiring human intervention. Productivity gains drive adoption, but reliability remains the primary technical bottleneck, which leads teams to rely heavily on human-in-the-loop evaluation.
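
To make the "few steps, then a human" pattern concrete, here is a minimal sketch of a step-bounded agent loop. It is not taken from the paper: the names `run_agent`, `RunResult`, and `MAX_STEPS`, and the budget of 10, are assumptions chosen for illustration, and the step function stands in for whatever model call a real deployment would make.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical step budget, loosely echoing the "fewer than 10 steps" finding.
MAX_STEPS = 10


@dataclass
class RunResult:
    status: str                                   # "done" or "needs_human"
    steps: List[str] = field(default_factory=list)
    answer: Optional[str] = None


def run_agent(step_fn: Callable[[List[str]], Optional[str]],
              max_steps: int = MAX_STEPS) -> RunResult:
    """Call step_fn until it returns an answer or the step budget runs out."""
    history: List[str] = []
    for i in range(max_steps):
        answer = step_fn(history)      # stand-in for one model/tool step
        history.append(f"step {i + 1}")
        if answer is not None:
            return RunResult("done", history, answer)
    # Budget exhausted: hand control back to a person instead of continuing.
    return RunResult("needs_human", history)


def finishes_on_third_step(history: List[str]) -> Optional[str]:
    """Toy step function that succeeds once two steps have been recorded."""
    return "ok" if len(history) == 2 else None


def never_finishes(history: List[str]) -> Optional[str]:
    """Toy step function that never produces an answer."""
    return None
```

Calling `run_agent(never_finishes)` returns a result with status `"needs_human"` once the budget is spent, which is the point where a real system would route the task to a reviewer. A hard cap like this trades autonomy for controllability, the same trade-off the abstract describes.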

Key Topics:

  • AI Agents in Production
  • Software Engineering for AI
  • Large Language Models (LLMs)
  • Human-in-the-loop Evaluation
  • Agent Reliability
  • System Architecture
  • Deployment Challenges