Stop Thinking, Just Do!

Sungsoo Kim's Blog

Unlocking Iterative Reasoning for Any Image Editor

9 December 2025


EditThinker: Unlocking Iterative Reasoning for Any Image Editor (Dec 2025)

Abstract

This paper introduces EditThinker, a framework that enhances instruction-based image editing through a “Think-while-Edit” paradigm. To address the limitations of single-turn editing models, which often act as reactive executors, EditThinker employs a Multimodal Large Language Model (MLLM) to mimic human deliberation through an iterative cycle of critiquing results and refining instructions. The authors construct a large-scale dataset, THINKEDIT-140k, and use reinforcement learning to align the model’s reasoning with practical editing outcomes. Extensive experiments demonstrate that this approach significantly improves the instruction-following capabilities of various state-of-the-art image editors without fine-tuning the editors themselves.
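To make the critique-and-refine cycle concrete, here is a minimal Python sketch of what such a loop might look like. It is not the authors' implementation: the names `think_while_edit`, `Critique`, `toy_editor`, and `toy_thinker` are hypothetical, images are represented as plain strings, and the stopping rule (a score threshold plus a turn limit) and the choice to re-edit the source image on every turn are assumptions made for illustration. The one property it does take from the abstract is that the editor is treated as a black box and is never fine-tuned.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Critique:
    """Hypothetical thinker output: a score in [0, 1] plus a rewritten instruction."""
    score: float
    refined_instruction: str


# Assumed interfaces (not from the paper):
#   editor(source_image, instruction) -> edited_image
#   thinker(source_image, edited_image, original_instruction, last_instruction) -> Critique
Editor = Callable[[str, str], str]
Thinker = Callable[[str, str, str, str], Critique]


def think_while_edit(
    source: str,
    instruction: str,
    editor: Editor,
    thinker: Thinker,
    max_turns: int = 4,
    threshold: float = 0.9,
) -> Tuple[str, List[Tuple[str, float]]]:
    """Edit, critique, refine the instruction, and repeat until the score is good enough."""
    current = instruction
    history: List[Tuple[str, float]] = []
    best_image, best_score = source, -1.0

    for _ in range(max_turns):
        edited = editor(source, current)            # the editor is a frozen black box
        critique = thinker(source, edited, instruction, current)
        history.append((current, critique.score))

        if critique.score > best_score:             # keep the best result seen so far
            best_image, best_score = edited, critique.score
        if critique.score >= threshold:             # assumed stopping rule
            break
        current = critique.refined_instruction      # retry with the refined instruction

    return best_image, history


# Toy stand-ins so the loop can run without any real model.
def toy_editor(source: str, instruction: str) -> str:
    # Pretend editor that only reacts when the colour is spelled out.
    if "red" in instruction:
        return source + " wearing a red hat"
    return source


def toy_thinker(source: str, edited: str, original: str, last: str) -> Critique:
    # Pretend critic: satisfied once a red hat appears, otherwise rewrites the prompt.
    if "red hat" in edited:
        return Critique(1.0, last)
    return Critique(0.0, "add a red hat on the person's head")


if __name__ == "__main__":
    image, history = think_while_edit(
        "a person", "give the person a festive hat", toy_editor, toy_thinker
    )
    print(image)
    for turn, (prompt, score) in enumerate(history, start=1):
        print(turn, score, prompt)
```

In the toy run, the vague first instruction leaves the image unchanged, the thinker rewrites it into something the editor can act on, and the second turn succeeds. In a real system the thinker would be the MLLM and the editor any existing instruction-based image editing model.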

Key Topics:

  • Image Editing
  • Iterative Reasoning
  • Multimodal Large Language Models (MLLM)
  • Reinforcement Learning
  • Instruction Following
  • Prompt Refinement