Research · Blogs & Independent Thinkers
Research sweep · shallow · 2025 – present
The Karpathy Loop - AI Agents Running Autonomous Training Experiments
The "Karpathy loop" - autonomous AI agent research cycles that run and evaluate ML training experiments to discover improvements, April 2025–April 19 2026, including Karpathy's own explanations, independent commentary, and real-world implementations
- Claude Opus 4.5
- frontier
- blogs
- tech
Synthesised 2026-04-19
Narrative
The Karpathy loop emerges from March 2026 coverage as a watershed in the automation of ML research itself. Karpathy released autoresearch, a 630-line Python script that embodies the research loop: an AI agent reads its own training code, proposes hypotheses for improvement, modifies the script, runs time-boxed experiments (5 minutes per GPU), evaluates the results against a locked metric, and iterates unsupervised. Over 48 hours it ran ~700 experiments and found ~20 genuine optimizations that, stacked, cut GPT-2 training time by about 11% (2.02 → 1.80 hours), on code that had already been heavily optimized by one of the field's top researchers. Shopify's Tobias Lütke reported 19% gains on proprietary data from 37 overnight experiments, which suggests the pattern generalizes.

The surrounding commentary places the loop in a wider context. AI Scientist (Substack) positions it within a comparator ecosystem of autonomous research systems (Sakana AI Scientist v2, CycleResearcher, data-to-paper) and presents this as evidence that autonomous research is operational, not experimental. Shared Sapience reports OpenAI's roadmap of autonomous research interns by September 2026 and full multi-agent research systems by 2028, which suggests the loop is foundational to industry strategy rather than an isolated innovation. Practitioners (Balu Kosuri, Medium) extend the framework beyond ML training to general-purpose optimization, and NextBigFuture connects it to Karpathy's long-standing vision of 'the self-improvement loopy era.'

Emerging commentary surfaces the tension between autonomy and guardrails: how to ensure that autonomous cycles do not drift from their intended objectives as they scale to scientific discovery, chip design, and pharmaceutical research.
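The propose, run, evaluate, keep-or-revert cycle described above can be sketched as a small hill-climbing loop. This is a toy illustration under stated assumptions, not autoresearch itself: `run_experiment`, `propose_change`, `research_loop`, and `relative_gain` are hypothetical names, a synthetic loss stands in for a time-boxed training run, and a random perturbation stands in for the agent's code edit.

```python
import random
from dataclasses import dataclass


@dataclass
class Trial:
    change: str    # description of the proposed change
    metric: float  # locked metric for this trial (lower is better)
    kept: bool     # whether the change was accepted


def run_experiment(config: dict) -> float:
    """Hypothetical stand-in for one time-boxed training run.

    A real loop would train for a fixed budget and report a locked
    metric; here a synthetic loss is computed from two made-up knobs.
    """
    return (config["lr"] - 0.3) ** 2 + (config["warmup"] - 0.1) ** 2


def propose_change(config: dict, rng: random.Random) -> tuple[str, dict]:
    """Hypothetical stand-in for the agent proposing an edit.

    A real agent would rewrite training code; this perturbs one knob.
    """
    key = rng.choice(sorted(config))
    candidate = dict(config)
    candidate[key] = config[key] + rng.uniform(-0.1, 0.1)
    return f"perturb {key}", candidate


def research_loop(config: dict, n_experiments: int, seed: int = 0):
    """Keep a change only if it improves the locked metric."""
    rng = random.Random(seed)
    best = dict(config)
    best_metric = run_experiment(best)
    log: list[Trial] = []
    for _ in range(n_experiments):
        change, candidate = propose_change(best, rng)
        metric = run_experiment(candidate)
        kept = metric < best_metric
        if kept:
            # Accepted changes stack on top of earlier accepted ones.
            best, best_metric = candidate, metric
        log.append(Trial(change, metric, kept))
    return best, best_metric, log


def relative_gain(before: float, after: float) -> float:
    """Fractional reduction, e.g. in training time."""
    return (before - after) / before


if __name__ == "__main__":
    start = {"lr": 1.0, "warmup": 0.5}
    best, metric, log = research_loop(start, n_experiments=200)
    print(f"kept {sum(t.kept for t in log)} of {len(log)} proposals")
    print(f"metric: {run_experiment(start):.4f} -> {metric:.4f}")
    print(f"reported speedup: {relative_gain(2.02, 1.80):.1%}")
```

Most proposals are discarded and only strict improvements are stacked, which mirrors the reported ratio of roughly 20 kept optimizations out of ~700 experiments. The `relative_gain` helper applies the same arithmetic to the reported training times: (2.02 − 1.80) / 2.02 ≈ 0.109, consistent with the quoted 11%.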
Sources
| ID | Title | Outlet | Date | Significance |
|---|---|---|---|---|
| b1 | An early experiment in autonomous science | AI Scientist (Substack) | 2026 | Comparative benchmarking of autonomous research systems including Sakana AI Scientist v1/v2, CycleResearcher, and data-to-paper, positioning Karpathy's work within a broader ecosystem of autonomous researchers. |
| b2 | OpenAI targets an autonomous researcher by September | Shared Sapience (Substack) | 2026-03 | Reveals OpenAI's autonomous researcher roadmap (Sept 2026 timeline for research interns, 2028 for full multi-agent system), situating Karpathy's loop within industry-wide agentic engineering strategy. |
| b3 | I Turned Andrej Karpathy's Autoresearch Into a Universal Skill | Medium | 2026-03 | Practitioner implementation extending autoresearch beyond ML training to business optimization, advancing the thesis that the loop is a universal pattern for autonomous optimization. |
| b4 | Karpathy Just Turned One GPU Into a Research Lab | Garry's List (independent tech blog) | 2026 | Independent technical commentary on autoresearch's capabilities and implications for the future of ML research methodology. |
| b5 | Andrej Karpathy on Code Agents, AutoResearch and the Self Improvement Loopy Era of AI | NextBigFuture | 2026-03 | Frames autoresearch within Karpathy's long-standing vision of the 'self-improvement loopy era' and agentic engineering, connecting technical innovation to broader AI development philosophy. |