Anthropic released Claude Opus 4 and Sonnet 4 today, claiming the #1 spot for coding performance. There are going to be a lot of articles floating around with exaggerations and marketing talk, but here is an executive summary of everything you need to know.
Performance Numbers
Claude Opus 4:
- SWE-bench: 72.5% (world's best)
- Terminal-bench: 43.2%
- Sustained performance for hours on complex tasks
- $15/$75 per million tokens
Claude Sonnet 4:
- SWE-bench: 72.7% (matches Opus 4)
- 3x faster than Opus 4 for most tasks
- $3/$15 per million tokens
Two key slides from the announcement:
Key Technical Features
- Hybrid Architecture: Instant responses + extended thinking mode (up to 64K tokens)
- Extended Thinking with Tools: Can use web search, code execution during reasoning
- Parallel Tool Execution: Multiple tools simultaneously
- Memory Files: Creates persistent memory when given file access
- 65% Reduction: Less shortcut/loopholes behavior vs Sonnet 3.7
Industry Adoption
- GitHub: Integrating Sonnet 4 into GitHub Copilot
- Cursor: "State-of-the-art for coding"
- Rakuten: Validated 7-hour autonomous refactor
- Sourcegraph: "Substantial leap in software development"
New API Capabilities
- Code execution tool
- MCP connector
- Files API
- Prompt caching (1 hour)
Claude Code Generally Available
- VS Code and JetBrains extensions (beta)
- GitHub Actions integration (demo)
- Claude Code SDK for custom agents
- GitHub PR integration via
/install-github-app
Access
Already available via Anthropic API.
If you want to skip the new model restrictions, you can try it via Glama Gateway and OpenRouter.
So, is it hype?
Claude 4 models lead coding benchmarks and offer sustained performance for complex agent workflows. Opus 4 for maximum capability, Sonnet 4 for speed/cost balance. Both already available to test.
Source: Official Announcement
Will update this article to add interesting insights and facts as the day progresses.
Top comments (1)
Super helpful roundup, thank you! Has anyone tried Sonnet 4 in VS Code yet for real projects?