
You'll find many AI model comparison articles that mainly test the models on a few selected questions, but not many articles show how the models really perform on real-world tasks.
I recently read somewhere that Anthropic now claims they don't really know how their AI works these days. Imagine a company that big making that claim.
But even so, when it comes to AI in general, I prefer Anthropic's models over the cheap ones from Google and Sam.
Overall, based on your observations, which one would you pick for coding? Pricing aside.
Claude Sonnet 4 would be it.
Gemini 2.5 is not something you can simply ignore. But I agree, Anthropic is just too goated when it comes to AI models. 💥
Folks, let me know your thoughts on this comparison. Do you prefer real-world coding tests or smaller ones?
I prefer this method. This is how it should be done to test how they perform in the real world. It's a good read. 🤍
Have you had a chance to check out my recent post on starting with DevOps?
Start with DevOps in 3 simple steps 🐳
Lara Stewart - DevOps Cloud Engineer ・ Jun 7
Great read, @larastewart_engdev ✌️
Anthropic's Claude Sonnet 4 is a refined continuation of the Sonnet lineage, tuned for strong general capabilities, consistent coding, and nuanced reasoning, while remaining cost-effective and available on free tiers (theverge.com). It features the same "thinking summaries" and hybrid "extended thinking" mode introduced in Claude Opus 4, and Anthropic claims it is about 65% less prone to shortcutting and better at retaining long-term context (theverge.com). Meanwhile, Gemini 2.5 Pro represents Google DeepMind's latest major leap, offering a 1 million-token context window, a new "Deep Think" reasoning mode, and standout benchmark performance, especially in multi-step reasoning, math, science, and coding (tomsguide.com). Side-by-side user reports echo this: many note that Gemini outperforms Claude on big coding tasks thanks to its deep context and precision, while some still prefer Claude for cleaner reasoning trails and narrative flexibility (reddit.com).
In summary, Claude Sonnet 4 is a smart, reliable, and relatively lightweight companion, great for generalist use and precise reasoning, while Gemini 2.5 Pro pushes the envelope on context capacity, reasoning depth, and technical tasks, though occasionally at the cost of verbosity or over-extension. Choosing between them depends on whether you prioritize nimble, instruction-following consistency (Claude) or heavyweight reasoning and tool-capable prowess (Gemini).
Thanks!
Nice one, friend! ❤️
Thanks!
Good comparison. Always a nice read when you do model comparisons, Shrijal. 💯
Thanks! 🙌
Let's be clear about what this article actually is. This isn't a "Claude Sonnet 4 vs. Gemini 2.5 Pro" comparison. It's a poorly structured and biased comparison of two completely different development tools: Anthropic's local command-line tool, "Claude Code", and Google's web-based agent, "Jules".
Because the author tests the wrapper tools instead of the models themselves in a controlled environment, the entire premise is flawed and the conclusions about which model is better are meaningless.
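For what it's worth, here's a minimal sketch of the kind of controlled test I mean: send the exact same prompt directly to both models through their official Python SDKs (`anthropic` and `google-generativeai`), with no CLI wrapper or web agent in between. The model IDs, prompt, temperature-free defaults, and API-key handling below are my own assumptions for illustration, not anything taken from the article.

```python
# Controlled, wrapper-free comparison: one identical prompt, two raw model APIs.
# Assumed model IDs ("claude-sonnet-4-20250514", "gemini-2.5-pro") may need updating.
import anthropic
import google.generativeai as genai

PROMPT = "Write a Python function that parses an ISO 8601 date string."

# Claude Sonnet 4 via the Anthropic SDK (reads ANTHROPIC_API_KEY from the environment).
claude = anthropic.Anthropic()
claude_reply = claude.messages.create(
    model="claude-sonnet-4-20250514",  # assumed model ID
    max_tokens=1024,
    messages=[{"role": "user", "content": PROMPT}],
)

# Gemini 2.5 Pro via the google-generativeai SDK.
genai.configure(api_key="YOUR_GOOGLE_API_KEY")  # placeholder, use your own key
gemini = genai.GenerativeModel("gemini-2.5-pro")  # assumed model ID
gemini_reply = gemini.generate_content(PROMPT)

print("Claude:", claude_reply.content[0].text)
print("Gemini:", gemini_reply.text)
```

Run something like this across a fixed prompt set and you are at least comparing the models, not whichever agent harness happens to sit on top of them.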
The comparison's credibility collapses further from there: