Anthropic released Claude Opus 4.7 on April 16, 2026, with a 1M token context window at standard API pricing: $5 per million input tokens, $25 per million output tokens. Teams can now hold 100k+ codebases plus multi-session project notes in a single prompt without hitting a cost cliff.
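
As a rough illustration of the pricing above, the following sketch estimates the cost of a single request from its token counts. The function name and the example token counts are hypothetical, and the sketch ignores any caching or batch discounts, so treat the result as an upper-bound list price.

```python
# List prices quoted above, in USD per million tokens.
INPUT_USD_PER_MTOK = 5.00
OUTPUT_USD_PER_MTOK = 25.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate list-price cost of one request (hypothetical helper, no discounts)."""
    total_micro_usd = (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    )
    return total_micro_usd / 1_000_000


# Example: an 800k-token prompt (codebase plus notes) with a 20k-token review.
print(f"${estimate_cost_usd(800_000, 20_000):.2f}")
```

At these rates a prompt that fills the whole window costs about $5 in input tokens per call, which is why repeated calls over the same context make caching the main cost lever.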

The model scored 0.715 on Anthropic's internal research-agent benchmark, tied for top across six modules. On General Finance, the largest module, Opus 4.7 scored 0.813 versus 0.767 for Opus 4.6. That gain of 0.046 (4.6 percentage points) matters for code-review agents tracking long sequences of tool calls without losing context.

FIG. 02 Opus 4.7 achieves research-agent parity and lifts General Finance by 5.8 percentage points. — Anthropic, 2026

Instruction following improved measurably. Anthropic reports: "Where previous models interpreted instructions loosely or skipped parts entirely, Opus 4.7 takes the instructions literally." For engineering teams, that means fewer retry loops and less defensive prompt engineering. Prompts written for Opus 4.6 can produce different outputs—migration requires validation.
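
One way to approach the migration validation mentioned above is a small regression check that compares outputs recorded from the old model against outputs from the new one for the same prompts. The sketch below is a minimal example under stated assumptions: the function name, the 0.8 threshold, and the dictionary layout (prompt ID mapped to output text) are all hypothetical, and character-level similarity is only a crude proxy for behavioral change.

```python
from difflib import SequenceMatcher


def flag_divergent_outputs(
    old_outputs: dict[str, str],
    new_outputs: dict[str, str],
    min_similarity: float = 0.8,  # assumed threshold; tune per workload
) -> list[tuple[str, float]]:
    """Return (prompt_id, similarity) for prompts whose outputs diverged.

    Only prompt IDs present in both runs are compared.
    """
    flagged = []
    for prompt_id in sorted(old_outputs.keys() & new_outputs.keys()):
        ratio = SequenceMatcher(
            None, old_outputs[prompt_id], new_outputs[prompt_id]
        ).ratio()
        if ratio < min_similarity:
            flagged.append((prompt_id, round(ratio, 3)))
    return flagged


# Example with made-up outputs: only "review-2" changed substantially.
old = {"review-1": "No issues found.", "review-2": "The function is correct."}
new = {"review-1": "No issues found.", "review-2": "zzzz"}
print(flag_divergent_outputs(old, new))
```

Flagged prompts are candidates for manual review; a low similarity score does not by itself mean the new output is worse.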

Vision resolution jumped from 1.15MP to 3.75MP (2576 pixels long edge). Screenshot parsing, chart interpretation, and diagram OCR all improved. This matters for agents reading build logs in Slack screenshots or extracting data from uploaded architecture diagrams.

FIG. 03 Vision resolution triples to 3.75 MP, enhancing screenshot and chart parsing. — Anthropic, 2026
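
For pipelines that resize screenshots before upload, the figures above suggest a simple sizing rule. The sketch below assumes that both the 2576-pixel long edge and the 3.75 MP total act as upper bounds and that images should be scaled down (never up) to fit; the helper name and that interpretation are assumptions, not documented API behavior.

```python
import math

MAX_LONG_EDGE = 2576      # pixels, from the figure quoted above
MAX_PIXELS = 3_750_000    # 3.75 MP, from the figure quoted above


def fit_within_limits(width: int, height: int) -> tuple[int, int]:
    """Scale (width, height) down, preserving aspect ratio, to satisfy both caps."""
    scale = min(
        1.0,                                        # never upscale
        MAX_LONG_EDGE / max(width, height),         # long-edge cap
        math.sqrt(MAX_PIXELS / (width * height)),   # total-pixel cap
    )
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))


print(fit_within_limits(1280, 720))    # small image is left unchanged
print(fit_within_limits(5152, 2898))   # large image is scaled down
```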

Runtime controls matter more in this release. Task budgets cap total token spend for an agent loop. Effort levels trade capability for speed: low effort on Opus 4.7 approximates medium effort on Opus 4.6, so teams can cut cost without downgrading the model. Extended thinking budgets were removed, a breaking change for integrations that set them.
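
To make the idea of a token cap on an agent loop concrete, here is a client-side sketch. It is not the API's task-budget feature: the class, exception, and function names are hypothetical, and the per-turn usage numbers would come from whatever usage data the responses report.

```python
from dataclasses import dataclass
from typing import Iterable


class BudgetExceeded(RuntimeError):
    """Raised when cumulative token spend passes the configured limit."""


@dataclass
class TokenBudget:
    limit: int       # total tokens allowed for the whole loop
    spent: int = 0   # tokens consumed so far

    def charge(self, input_tokens: int, output_tokens: int) -> None:
        self.spent += input_tokens + output_tokens
        if self.spent > self.limit:
            raise BudgetExceeded(f"spent {self.spent} of {self.limit} tokens")


def run_agent_loop(turn_usages: Iterable[tuple[int, int]], budget: TokenBudget) -> int:
    """Consume (input, output) usage per turn; return turns completed within budget."""
    completed = 0
    for input_tokens, output_tokens in turn_usages:
        try:
            budget.charge(input_tokens, output_tokens)
        except BudgetExceeded:
            # The turn that crossed the limit is already paid for; stop before the next.
            break
        completed += 1
    return completed


budget = TokenBudget(limit=10_000)
print(run_agent_loop([(3_000, 500), (4_000, 500), (3_000, 500)], budget))
```

Because usage is only known after a turn returns, a client-side cap like this can overshoot by one turn, which is one reason a server-enforced budget is preferable where available.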

Token-count sprawl from the new tokenizer can offset gains from increased context. Monitor output-token amplification per workload. Cache-hit rates and cache topology—prefix, sliding window, RAG-hybrid patterns—become critical cost drivers when holding 1M context across dozens of agent turns.
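
The two metrics named above can be tracked with very little code. The sketch below is one possible definition, with hypothetical field and function names: cache-hit rate as the share of input tokens served from cache across a session, and output-token amplification as the ratio of output tokens on the new model to a baseline for the same workload.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TurnUsage:
    # Hypothetical field names; map them from whatever usage data is logged.
    cached_input_tokens: int
    uncached_input_tokens: int
    output_tokens: int


def cache_hit_rate(turns: Sequence[TurnUsage]) -> float:
    """Fraction of all input tokens that were served from cache."""
    cached = sum(t.cached_input_tokens for t in turns)
    total = cached + sum(t.uncached_input_tokens for t in turns)
    return cached / total if total else 0.0


def output_amplification(baseline_output_tokens: int, new_output_tokens: int) -> float:
    """Ratio of new to baseline output tokens for the same workload (>1 means growth)."""
    if baseline_output_tokens <= 0:
        raise ValueError("baseline_output_tokens must be positive")
    return new_output_tokens / baseline_output_tokens


# Made-up session: first turn is a cold start, second turn reuses most of the prefix.
session = [TurnUsage(0, 1_000, 100), TurnUsage(900, 100, 100)]
print(cache_hit_rate(session))
print(output_amplification(1_000, 1_300))
```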

The 128k max output limit is sufficient for most code-generation and analysis workflows but tight for document synthesis at scale. The batch processing API remains available for latency-insensitive work.

For code-review agents, 1M context plus stronger instruction following enable end-to-end diff analysis, multi-file refactoring review, and project-wide consistency checks without checkpoint-and-resume orchestration. The constraint is cache management and cost predictability—quantify cache-hit rates and output-token variance before production scale.