Mertens et al. (2026)
AI is not crashing over jobs in waves; it is rising as a tide across nearly all of them
Across 17,000 worker evaluations of more than 3,000 real labor-market tasks, frontier models improve broadly across task lengths, not in sudden bursts. By 2029 most text-based tasks could hit 80–95% success rates.
- 3.8 months: doubling time for the task length frontier models can complete at a 50% success rate
- 60%: average rate at which model outputs are accepted by domain-expert evaluators without edits
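
To make the doubling-time figure concrete, the short sketch below converts a 3.8-month doubling time into a growth factor over a given horizon. It assumes the doubling time stays constant (simple exponential growth), which is an illustrative assumption rather than a claim from the paper; the function name `horizon_growth_factor` is likewise hypothetical.

```python
# Illustrative arithmetic only: assumes the 3.8-month doubling time
# holds constant over the whole period (an assumption, not a paper result).
DOUBLING_TIME_MONTHS = 3.8


def horizon_growth_factor(months_elapsed: float,
                          doubling_time: float = DOUBLING_TIME_MONTHS) -> float:
    """Return how many times longer the 50%-success task length becomes
    after `months_elapsed` months, under constant exponential growth."""
    if doubling_time <= 0:
        raise ValueError("doubling_time must be positive")
    return 2.0 ** (months_elapsed / doubling_time)


if __name__ == "__main__":
    for months in (3.8, 12, 24, 36):
        factor = horizon_growth_factor(months)
        print(f"after {months:>4} months: x{factor:,.1f}")
```

Under this assumption, one doubling period gives a factor of exactly 2, and a full year corresponds to a bit more than three doublings. If the true growth rate changes over time, the extrapolation does not apply.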