New version of Gemini beats other AIs at math, science, and reasoning


Gemini 2.5 Pro beats other AIs at math, science, and reasoning

Don't try to beat it at chess.

By Stan Schroeder


Google Gemini

Google's new LLM is the king of the hill when it comes to nearly everything. Credit: Sopa Images / Getty Images

Google's new Gemini Pro is smarter than other AIs at reasoning, science, and coding.

This is according to a series of benchmark results posted by Google on Thursday. In short, Gemini 2.5 Pro beats chief competitors at nearly everything — though we're sure the companies behind those competitors would disagree.


According to Google's data, Gemini 2.5 Pro has a healthy lead over OpenAI o3, Claude Opus 4, Grok 3 Beta, and DeepSeek R1 in the Humanity's Last Exam benchmark, which evaluates a model's math, science, knowledge, and reasoning. It's also better at code editing (per the Aider Polyglot benchmark), and it beats all competitors in several factuality benchmarks, including FACTS Grounding, meaning it's less likely to provide factually inaccurate text.


The only benchmark in which Gemini 2.5 Pro isn't a clear winner is the mathematics-focused AIME 2025, and even there the differences between results are pretty small.

As a result of all the improvements in Gemini 2.5 Pro, this model is now on top of the LMArena leaderboard with a score of 1470.

There's a catch, though: The final version of Gemini 2.5 Pro isn't widely available yet. Google calls this latest version an "upgraded preview," with a stable version coming "in a couple of weeks." In the meantime, the preview should now be available in the Gemini app.

Stan Schroeder

Stan is a Senior Editor at Mashable, where he has worked since 2007. He's got more battery-powered gadgets and band t-shirts than you. He writes about the next groundbreaking thing. Typically, this is a phone, a coin, or a car. His ultimate goal is to know something about everything.


