Engineering a 30B Mixture-of-Experts model that matches human Gold Medalists in the USAMO 2026.
SU-01 addresses the fundamental question: Can AI replicate the persistence and logical depth of human mathematicians? Beyond mere memorization, SU-01 represents a breakthrough in specialized scaling laws. By applying a 4-step growth guide, we evolved a generally capable AI into an elite mathematical reasoning expert.
The foundation wasn't built on data volume, but on rigor. We used 338,000 highly refined trajectories from elite sources like Evan Chen's Olympiad materials.
Reverse-perplexity ordering prioritizes the patterns the AI finds most challenging, correcting superficial habits early.
The trajectories contain not just answers, but complete thinking processes: exploration, self-checking, and error correction.
| Feature | General AI | SU-01 SFT |
|---|---|---|
| Learning Goal | Broad Info | Rigorous Proof |
| Difficulty Order | Random/Low | Reverse-Perplexity |
| Data Scale | Billions (General) | 338k (Elite Logical) |
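The "Reverse-Perplexity" ordering in the table can be illustrated with a minimal sketch. It assumes that "reverse" means the highest-perplexity trajectories (the ones the model finds hardest) are presented first, and that per-token log-probabilities from the base model are already available. The names `perplexity`, `reverse_perplexity_order`, and the `token_logprobs` field are hypothetical and not taken from the SU-01 pipeline.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def reverse_perplexity_order(trajectories):
    """Sort trajectories so the highest-perplexity (hardest) ones come first.

    Assumption: each trajectory is a dict with a 'token_logprobs' list
    scored by the model being trained.
    """
    return sorted(
        trajectories,
        key=lambda t: perplexity(t["token_logprobs"]),
        reverse=True,
    )

# Toy data: log-probabilities are invented for illustration only.
samples = [
    {"id": "easy", "token_logprobs": [-0.1, -0.2, -0.1]},
    {"id": "hard", "token_logprobs": [-2.0, -1.5, -2.5]},
    {"id": "medium", "token_logprobs": [-0.8, -1.0, -0.9]},
]
ordered_ids = [t["id"] for t in reverse_perplexity_order(samples)]
print(ordered_ids)
```

In this toy case the "hard" trajectory is scheduled first because the model assigns it the lowest average log-probability.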
Transitioning from "getting it right" to "doing it elegantly" through two-stage Reinforcement Learning.
The first stage simulates tens of thousands of sessions to build "Mathematical Intuition." The reward is binary: 1 for right, 0 for wrong.
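A binary reward of this kind can be sketched in a few lines. The text does not say how correctness is judged, so the example assumes a whitespace-insensitive, case-insensitive exact match against a reference answer. `normalize_answer`, `binary_reward`, and `mean_reward` are hypothetical helper names.

```python
def normalize_answer(text):
    """Strip all whitespace and lowercase so trivial formatting differences are ignored."""
    return "".join(text.split()).lower()

def binary_reward(model_answer, reference_answer):
    """Return 1 if the answer matches the reference, 0 otherwise.

    Assumption: exact match after normalization stands in for whatever
    checker the real pipeline uses.
    """
    return 1 if normalize_answer(model_answer) == normalize_answer(reference_answer) else 0

def mean_reward(rollouts, reference_answer):
    """Average binary reward over several sampled answers to one problem."""
    rewards = [binary_reward(r, reference_answer) for r in rollouts]
    return sum(rewards) / len(rewards)

# Four sampled answers to a problem whose reference answer is "42".
print(mean_reward(["42", "41", "4 2", "7"], "42"))
```

Because the signal carries no partial credit, averaging over many rollouts per problem is one common way to get a usable estimate of how often the model is right.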
The second stage marks the transition from student to scholar: it evaluates proof quality and elegance through self-critique and experience replay.
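Experience replay can be pictured as a bounded buffer of past attempts that training revisits. The sketch below is a generic version of that idea, not SU-01's implementation. The `ReplayBuffer` class, the `critique_score` field, and uniform sampling are all assumptions made for illustration.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of (proof, critique_score) pairs; oldest entries are evicted first."""

    def __init__(self, capacity, seed=0):
        self.items = deque(maxlen=capacity)
        self.rng = random.Random(seed)  # seeded for reproducible sampling

    def add(self, proof, critique_score):
        """Record a proof attempt with the score its self-critique assigned (hypothetical scale)."""
        self.items.append((proof, critique_score))

    def sample(self, k):
        """Draw up to k stored attempts uniformly at random, without replacement."""
        pool = list(self.items)
        return self.rng.sample(pool, min(k, len(pool)))

    def __len__(self):
        return len(self.items)

buffer = ReplayBuffer(capacity=2)
buffer.add("proof A", 0.4)
buffer.add("proof B", 0.9)
buffer.add("proof C", 0.7)  # evicts "proof A"
batch = buffer.sample(5)    # only two items are stored, so two are returned
```

A real system might weight sampling by critique score instead of drawing uniformly; the uniform choice here only keeps the example short.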
The ultimate demonstration of persistence. SU-01 doesn't put down its pen until the final bell rings, scaling its thinking process to massive lengths.
"This signifies not just verbosity, but a high degree of concentration involving countless hypothesis formations, verifications, and backtracking for corrections."
The growth of SU-01 delivers a powerful message: true genius stems not from high intelligence alone, but from the attitude of questioning one's own thought processes. Even with a compact 30B architecture, we can surpass giants through rigorous logic and unwavering persistence.