Claude Sonnet 5 Is Now Default in Claude Code: 85.2% SWE-bench at $2/$10
Claude Sonnet 5 launched June 30 as Anthropic's most agentic Sonnet model, now the default in Claude Code with $2/$10 introductory pricing. SWE-bench Verified: 85.2%, near Opus 4.8 performance. But a new tokenizer means long agent coding sessions may not see the full cost savings.
July 1, 2026 · about 4 min read
TL;DR
Claude Sonnet 5 launched June 30 as Anthropic's most agentic Sonnet model yet. It is now the default in Claude Code, scores 85.2% on SWE-bench Verified, and carries introductory pricing of $2/$10 per million input/output tokens. The catch: a new tokenizer means your Claude Code bill may not drop as much as the sticker price suggests, especially on long agent runs.
What Changed
On June 30, Anthropic released Claude Sonnet 5 and immediately made it the default model for Free and Pro users on claude.ai and in Claude Code. This is the first Sonnet-class model to seriously challenge Opus on agentic coding benchmarks.
The key numbers:
- SWE-bench Verified: 85.2% (significant improvement over Sonnet 4.6)
- SWE-bench Pro: 63.2% (Opus 4.8: 69.2%)
- Terminal-Bench: 80.4%
- OSWorld: 81.2%
- BrowseComp: 84.7% single-agent — best among non-Opus models
- Knowledge work (GDPval-AA v2): 1,618, slightly edging Opus 4.8 at 1,615
- Cursor production benchmark: Sonnet 5 Max 61.2% vs Opus 4.8 Max 63.8%
In plain terms: on the two coding benchmarks above that list both scores (SWE-bench Pro and the Cursor production benchmark), Sonnet 5 reaches roughly 91-96% of Opus 4.8's result at Sonnet-class pricing. On the BenchLM provisional leaderboard, it even leads Opus 4.8, 94 to 92.
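To see where a "percent of Opus" figure comes from, here is a minimal sketch that divides the Sonnet 5 score by the Opus 4.8 score for every benchmark in this article that lists both. The helper name `relative_score` is an illustration, and a ratio of benchmark scores is only a crude proxy for capability.

```python
# (Sonnet 5, Opus 4.8) score pairs quoted in this article.
scores = {
    "SWE-bench Pro (%)": (63.2, 69.2),
    "Cursor production benchmark, Max (%)": (61.2, 63.8),
    "GDPval-AA v2": (1618, 1615),
    "BenchLM provisional": (94, 92),
}


def relative_score(sonnet, opus):
    """Sonnet 5's score expressed as a percentage of Opus 4.8's score."""
    return 100 * sonnet / opus


ratios = {name: relative_score(*pair) for name, pair in scores.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}% of Opus 4.8")
```

The two coding benchmarks land a little below parity, while the knowledge-work and BenchLM figures come out slightly above 100%.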
The model shares Opus 4.8's 1M-token context window and 128K output cap, and was trained on data through January 2026.
The Tokenizer Catch
Here's what the benchmark tables don't show. Sonnet 5 introduces a new tokenizer that counts tokens differently from Sonnet 4.6. Combined with adaptive thinking, where the model spends more compute on harder problems, a long agentic coding session can consume enough tokens to cost as much as or more than the same session on Opus 4.8, despite the lower per-token rate.
The $2/$10 promotional pricing (through August 31) is real — but it's most beneficial for bounded-output tasks like code review, single-file generation, and structured completions. For open-ended Claude Code sessions where the agent iterates through multiple debugging cycles, the savings shrink or vanish.
Developers should benchmark their specific workflows rather than assuming a 60% cost reduction.
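The arithmetic behind that warning can be sketched as follows. The $2/$10 rates are the introductory prices quoted above. Everything else is an assumption for illustration: the baseline rates are hypothetical values chosen only so that the sticker discount equals 60%, the session size is invented, and `growth` is a hypothetical multiplier standing in for the combined effect of the new tokenizer and extra thinking output.

```python
def session_cost(input_tokens, output_tokens, input_price, output_price):
    """Session cost in dollars, with prices quoted per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def effective_savings(baseline_cost, new_cost):
    """Fraction saved relative to the baseline (negative means more expensive)."""
    return 1 - new_cost / baseline_cost


# Introductory Sonnet 5 rates from the article, per million tokens.
SONNET5_IN, SONNET5_OUT = 2.0, 10.0

# HYPOTHETICAL baseline rates, picked so the sticker discount is exactly 60%.
# These are not actual prices for any model.
BASE_IN, BASE_OUT = 5.0, 25.0

# HYPOTHETICAL session size, as measured on the baseline model.
base_input, base_output = 400_000, 60_000

baseline = session_cost(base_input, base_output, BASE_IN, BASE_OUT)

# Sweep the hypothetical token-growth multiplier and report the real savings.
for growth in (1.0, 1.5, 2.0, 2.5, 3.0):
    new = session_cost(base_input * growth, base_output * growth,
                       SONNET5_IN, SONNET5_OUT)
    saved = effective_savings(baseline, new)
    print(f"growth x{growth}: ${new:.2f} vs ${baseline:.2f} (savings {saved:+.0%})")
```

Under these assumptions the full 60% saving appears only when token counts do not grow at all. It reaches zero once the session uses 2.5 times as many tokens (1 / 0.4), and the bill rises beyond that point.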
Why It Matters for AI Coding
This release matters for three reasons:
1. Sonnet reclaims the agentic coding crown. The Sonnet line — starting with Sonnet 3.5 — pioneered practical AI coding agents. But recent gains were concentrated in Opus-class models. Sonnet 5 brings that capability back to the mid-tier, which is what most developers actually use daily.
2. Claude Code just got upgraded silently. If you opened Claude Code on July 1, you're already using Sonnet 5. It defaults to high effort and ships with the 1M context window. No migration needed — but also no opt-out if you preferred Sonnet 4.6's behavior for certain tasks.
3. Free-tier users get Opus-adjacent coding. Sonnet 5 as the default for Free plans means anyone can access near-Opus coding quality without a Pro subscription. This raises the floor for AI-assisted development across the board.
What You Should Do
- If you're a Claude Code user: Run a few representative coding sessions this week and compare your token consumption. If your workflow is output-heavy, monitor whether the new tokenizer inflates your effective cost.
- If you're on Pro/Max: Sonnet 5 is your new default. Opus 4.8 is still better at complex multi-step reasoning — use it for architecture design and debugging intricate systems.
- If you're on the free tier: This is the biggest free-tier upgrade since Sonnet 3.5. The 1M context window alone is a game-changer for working with large codebases.
- If you're evaluating models: The August 31 pricing deadline is a narrow window. Test now while costs are low, and plan your budget assuming standard pricing afterward.
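One way to run the comparison suggested for Claude Code users is to record token counts for the same tasks on both models and look at the per-task ratio. The sketch below assumes you already have those counts from your own logs or billing data. The `SessionLog` record, the model labels, the task names, and all the numbers are hypothetical placeholders, not real API identifiers or measurements.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class SessionLog:
    model: str          # hypothetical label, not an API model ID
    task: str
    input_tokens: int
    output_tokens: int


def token_growth(logs, old_model, new_model):
    """Per-task ratio of total tokens (new / old) for tasks run on both models."""
    totals = {}
    for log in logs:
        key = (log.task, log.model)
        totals[key] = totals.get(key, 0) + log.input_tokens + log.output_tokens

    ratios = {}
    for (task, model), new_total in totals.items():
        if model == new_model and (task, old_model) in totals:
            ratios[task] = new_total / totals[(task, old_model)]
    return ratios


# HYPOTHETICAL measurements for two representative tasks.
logs = [
    SessionLog("sonnet-4.6", "fix-flaky-test", 120_000, 18_000),
    SessionLog("sonnet-5", "fix-flaky-test", 150_000, 26_400),
    SessionLog("sonnet-4.6", "review-pr", 40_000, 4_000),
    SessionLog("sonnet-5", "review-pr", 46_000, 4_600),
]

growth = token_growth(logs, "sonnet-4.6", "sonnet-5")
for task, ratio in growth.items():
    print(f"{task}: x{ratio:.2f} tokens")
print(f"median growth: x{median(growth.values()):.2f}")
```

The median ratio is the number to weigh against the per-token price difference. A ratio well above 1 on your output-heavy tasks is the signal that the sticker discount will not carry through to your bill.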
The Bigger Picture
Sonnet 5 launches the same week Anthropic lifted export controls on Fable 5 (available globally starting July 1) and released Claude Science for life sciences. The company is executing on multiple fronts simultaneously — consumer models, frontier research models, and domain-specific tools. For developers, the takeaway is straightforward: the model you use in Claude Code just got meaningfully better at no additional cost (for now), but don't assume your bill stays the same.