Claude Sonnet 5 is out and the gap with Opus 4.8 is smaller than I expected
Anthropic just released Claude Sonnet 5.
https://preview.redd.it/po9v9yxt4hah1.jpg?width=2600&format=pjpg&auto=webp&s=acc28da937d36724795190b133aff95424e6edb9
On the benchmark table, Sonnet 5's scores are very close to Opus 4.8's. The gap is small enough that I am questioning whether I should keep routing agentic tasks to Opus. For messy, multi-step coding work, finish rate matters more than peak accuracy, and early testers are saying Sonnet 5 finally completes tasks where older Sonnets would stall halfway.
Then there is the price. Introductory pricing is $2/$10 per million tokens through August 31, then $3/$15. That is far below Opus. If the real-world gap on my workflows is only a few points, the cost savings could be significant.
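To put rough numbers on that, here is a minimal sketch of the arithmetic. The two Sonnet 5 price pairs are the ones quoted above; the monthly token volumes are made up for illustration, so swap in your own usage. I left Opus out on purpose, since you should plug in whatever rate you actually pay.

```python
# Prices in USD per million tokens, as quoted in the post: (input, output).
SONNET5_INTRO = (2.00, 10.00)     # through August 31
SONNET5_STANDARD = (3.00, 15.00)  # after the introductory period


def monthly_cost(input_mtok, output_mtok, prices):
    """Return the cost in USD for a volume given in millions of tokens."""
    in_price, out_price = prices
    return input_mtok * in_price + output_mtok * out_price


# Hypothetical workload (an assumption, not a measurement):
# 50M input tokens and 10M output tokens per month.
INPUT_MTOK = 50
OUTPUT_MTOK = 10

intro = monthly_cost(INPUT_MTOK, OUTPUT_MTOK, SONNET5_INTRO)
standard = monthly_cost(INPUT_MTOK, OUTPUT_MTOK, SONNET5_STANDARD)
print(f"intro: ${intro:.2f}/month, standard: ${standard:.2f}/month")
```

To compare against Opus, call `monthly_cost` again with the same volumes and a third `(input, output)` tuple holding your Opus rates.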
One thing to keep in mind: benchmarks are not your repo. A 5-6 point gap on a leaderboard often shrinks on real work. And at this price, Sonnet 5 stays interesting even if a small real gap remains.
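One way to weigh a remaining gap against the price is cost per completed task rather than cost per attempt. The sketch below is a simplification under a stated assumption: a failed attempt costs as much as a successful one and is simply retried, so the expected spend is the per-attempt cost divided by the finish rate. The function names and every number in the usage note are hypothetical; measure finish rates on your own repo before trusting any of this.

```python
def cost_per_completed_task(cost_per_attempt, finish_rate):
    """Expected USD spent per finished task, assuming failed attempts are retried.

    finish_rate is the fraction of attempts that complete, in (0, 1].
    """
    if not 0 < finish_rate <= 1:
        raise ValueError("finish_rate must be in (0, 1]")
    return cost_per_attempt / finish_rate


def cheaper_model_wins(cheap_cost, cheap_rate, pricey_cost, pricey_rate):
    """True if the cheaper model also has the lower cost per completed task."""
    cheap = cost_per_completed_task(cheap_cost, cheap_rate)
    pricey = cost_per_completed_task(pricey_cost, pricey_rate)
    return cheap < pricey
```

For example, with made-up figures, `cheaper_model_wins(0.30, 0.75, 1.00, 0.80)` compares a $0.30 attempt that finishes 75% of the time against a $1.00 attempt that finishes 80% of the time. The cheaper model only loses when its finish rate drops by more than its price advantage.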