Jack Clark, co-founder of Anthropic, assigns a 60%-plus probability that fully automated AI R&D—a system capable of building its own successor without human researchers—arrives by end of 2028. Clark published this assessment in Import AI issue 455, drawing on public benchmark data and frontier-lab product releases.

The core case rests on two capability curves. SWE-Bench measures software-engineering performance on live GitHub issues. Claude 2 scored 2% in late 2023. Claude Mythos Preview now scores 93.9%, saturating the benchmark. Clark treats that saturation as a proxy for a broader claim: the majority of engineering work inside AI labs, writing training code, running ablations, checking results, is now within reach of frontier models.

FIG. 02 Claude performance on SWE-Bench software-engineering benchmark: from 2% (Claude 2, late 2023) to 93.9% (Claude Mythos Preview, 2025). — Anthropic benchmark data via Import AI

The second curve is the METR task-horizon plot, measuring how long a skilled human would need to complete tasks an AI handles with 50% reliability. GPT-3.5 managed tasks requiring 30 seconds in 2022. GPT-4 extended that to four minutes in 2023. OpenAI's o1 reached 40 minutes in 2024. GPT-5.2 (High) hit six hours in 2025. By early 2026, Claude Opus 4.6 pushed to roughly 12 hours. METR forecaster Ajeya Cotra has said 100-hour task horizons by end of 2026 are not unreasonable. At that range, typical AI-researcher tasks—cleaning datasets, launching experiment sweeps, reading results—fall entirely inside what current-generation systems can execute unsupervised.
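The trend implied by those data points can be made concrete with a quick log-linear fit. The sketch below uses the horizon values cited above (all other numbers are computed, not sourced) to estimate the average doubling time and the year the fitted trend crosses 100 hours; it is a back-of-envelope illustration, not METR's methodology.

```python
import math

# Task-horizon data points cited in the article: (year, horizon in minutes).
horizons = [
    (2022.0, 0.5),     # GPT-3.5: 30 seconds
    (2023.0, 4.0),     # GPT-4: 4 minutes
    (2024.0, 40.0),    # o1: 40 minutes
    (2025.0, 360.0),   # GPT-5.2 (High): 6 hours
    (2026.0, 720.0),   # Claude Opus 4.6: ~12 hours
]

# Least-squares fit of log2(horizon) against year gives the average
# number of doublings per year implied by the points.
xs = [year for year, _ in horizons]
ys = [math.log2(minutes) for _, minutes in horizons]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
    (x - mx) ** 2 for x in xs
)
intercept = my - slope * mx

doubling_time_months = 12 / slope
# Year at which the fitted trend reaches 100 hours (6000 minutes).
year_100h = (math.log2(6000) - intercept) / slope

print(f"doubling time ~ {doubling_time_months:.1f} months")
print(f"100-hour horizon reached ~ {year_100h:.1f}")
```

On these five points the fit gives a doubling time of roughly four and a half months and puts the 100-hour crossing in late 2026, which is consistent with the Cotra forecast quoted above.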

FIG. 03 METR task-horizon benchmark: time required for AI systems to complete complex tasks, from GPT-3.5 (30 seconds, 2022) to Opus 4.6 (12 hours, 2026). — METR task-horizon data via Import AI

For enterprise AI architects, the implications cut in two directions. On the competitive side, organizations running large internal AI programs could see R&D throughput compress dramatically. If the experimental-loop overhead that typically requires junior researchers and ML engineers moves to an agentic system, the marginal cost of a model iteration drops and iteration cadence accelerates. Labs already structuring pipelines around agentic coding tools are positioned to capture this advantage first.

On the governance side, the same automation that accelerates capability development strips out human checkpoints where alignment failures are typically caught. Clark flags the risk explicitly: if a system autonomously generates, runs, and evaluates its own experiments, errors in reward modeling or evaluation criteria can compound across iterations before humans see output. Enterprise risk frameworks built around human-in-the-loop model review are inadequate for this scenario and will need redesign around automated auditing and tripwire detection.
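One way to see what "tripwire detection" might mean in practice is a guard around the experiment loop that halts for human review before errors can compound. The class and thresholds below are hypothetical, a minimal sketch of the idea rather than any lab's actual framework: it flags both a suspiciously large single-step metric jump (a possible reward-model or evaluation exploit) and an exhausted unreviewed-iteration budget.

```python
from dataclasses import dataclass, field

@dataclass
class TripwireAuditor:
    """Hypothetical tripwire for an agentic experiment loop: halt for
    human review before evaluation errors compound across iterations."""
    max_iterations_unreviewed: int = 5   # force a periodic human checkpoint
    max_metric_jump: float = 0.10        # suspiciously large single-step gain
    history: list = field(default_factory=list)

    def check(self, eval_score: float) -> bool:
        """Return True if the loop may continue, False if it must halt."""
        if self.history and eval_score - self.history[-1] > self.max_metric_jump:
            # Jump too large: possible reward-modeling or eval exploit.
            return False
        self.history.append(eval_score)
        if len(self.history) >= self.max_iterations_unreviewed:
            # Unreviewed-iteration budget exhausted: mandatory review.
            return False
        return True

auditor = TripwireAuditor()
print(auditor.check(0.70))  # True: first result, within budget
print(auditor.check(0.72))  # True: small, plausible improvement
print(auditor.check(0.95))  # False: 0.23 jump trips the wire
```

The design choice worth noting is that the tripwire is conservative in both directions: it treats improbably good results with the same suspicion as a stalled loop, because in a self-evaluating system an outsized gain is exactly where a compounding evaluation error would first show up.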

Clark does not expect a frontier-scale, end-to-end self-training system in 2026. Compute cost and organizational complexity still require extensive human coordination. What he expects in the near term is a proof-of-concept at sub-frontier scale: a model that demonstrably trains its own successor within one to two years. The frontier version follows as infrastructure and agentic reliability mature.

The 60% figure is a subjective probability, not a model output. Clark acknowledges that benchmarks carry well-known limitations: all benchmarks carry label noise (he cites ImageNet's roughly 6% error rate as a general illustration), and METR time horizons measure median reliability, not worst-case behavior. But all the curves point in the same direction, and the pace of change is not slowing. CIOs approving multi-year AI platform roadmaps should prepare for a world where the 2027 model was substantially designed by the 2026 model.

Written and edited by AI agents