Anthropic releases Claude Opus 4.7: How to try it, benchmarks, safety

Anthropic has been shipping products and making news at a blistering pace in 2026, and on Thursday, the AI company announced the launch of Claude Opus 4.7.

Claude Opus 4.7 is Anthropic's most intelligent model available to the general public. Notably, Anthropic said in a press release that Opus 4.7 is not as powerful as Claude Mythos, which Anthropic deemed too dangerous for public release.

Claude Opus is a family of hybrid reasoning models capable of multi-step reasoning and advanced coding. Until the announcement of Claude Mythos on April 7, Claude Opus was considered Anthropic's most advanced series of AI models.

How to try Claude Opus 4.7

Claude Opus 4.7 is available now via Claude AI, the Claude API, and Anthropic partners such as Microsoft Foundry. The new model is priced the same as Claude Opus 4.6.

However, Anthropic noted that because "Opus 4.7 thinks more at higher effort levels," it uses more output tokens than its predecessor. Users can read more about how to optimize token usage in the Opus 4.7 migration guide.

How Claude Opus 4.7 improves over 4.6

As expected, Claude Opus 4.7 offers improved capabilities across the board.

In particular, Anthropic says Claude Opus 4.7 is better at advanced coding tasks, visual intelligence, and document analysis. Anthropic also says Opus 4.7 is "more tasteful and creative when completing professional tasks, producing higher-quality interfaces, slides, and docs."

"Users report being able to hand off their hardest coding work — the kind that previously needed close supervision — to Opus 4.7 with confidence. Opus 4.7 handles complex, long-running tasks with rigor and consistency, pays precise attention to instructions, and devises ways to verify its own outputs before reporting back," reads an Anthropic blog post.

Claude Opus 4.7: Benchmark performance

Anthropic released a detailed model card outlining how Claude Opus 4.7 compares to other Anthropic models and frontier models from OpenAI, Google, and xAI.

Opus 4.7 lags behind the unreleased Claude Mythos, which Anthropic reports scores significantly higher on common benchmarks such as Humanity's Last Exam. "Claude Opus 4.7 is less capable than Claude Mythos Preview on every relevant axis we measured and does not advance our capability frontier," the model card states. That means Claude Opus 4.7 is not evidence that AI development has accelerated beyond existing trend lines.

On Humanity's Last Exam (without tools), Anthropic reports that Claude Opus 4.7 outperforms all other frontier models except Claude Mythos.

  • Claude Mythos scored 56.8 percent on HLE

  • Claude Opus 4.7 scored 46.9 percent

  • Gemini 3.1 Pro scored 44.4 percent

  • GPT-5-4 Pro scored 42.7 percent

  • Claude Opus 4.6 scored 40.0 percent

With tools, GPT-5-4 Pro scored 58.7 percent compared to Opus 4.7’s 54.7 percent. Mythos beat them both with 64.7 percent.
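
For readers who want the margins rather than the raw scores, the short Python sketch below works out how far each model sits from Opus 4.7 in percentage points. It uses only the figures quoted above, which are Anthropic's self-reported numbers; the function name `gaps_vs` and the dictionary layout are illustrative choices, not anything from Anthropic.

```python
# Scores (percent) as reported by Anthropic and quoted in this article.
# They have not been independently verified.
hle_no_tools = {
    "Claude Mythos": 56.8,
    "Claude Opus 4.7": 46.9,
    "Gemini 3.1 Pro": 44.4,
    "GPT-5-4 Pro": 42.7,
    "Claude Opus 4.6": 40.0,
}
hle_with_tools = {
    "Claude Mythos": 64.7,
    "GPT-5-4 Pro": 58.7,
    "Claude Opus 4.7": 54.7,
}


def gaps_vs(baseline, scores):
    """Return each other model's score minus the baseline's, in percentage points."""
    base = scores[baseline]
    return {
        name: round(score - base, 1)
        for name, score in scores.items()
        if name != baseline
    }


no_tools_gaps = gaps_vs("Claude Opus 4.7", hle_no_tools)
with_tools_gaps = gaps_vs("Claude Opus 4.7", hle_with_tools)

for label, gaps in (("without tools", no_tools_gaps), ("with tools", with_tools_gaps)):
    for name, gap in gaps.items():
        # A positive gap means the model scored above Opus 4.7.
        print(f"{name}: {gap:+.1f} points vs. Opus 4.7 ({label})")
```

On these numbers, Opus 4.7 leads its predecessor by 6.9 points without tools and trails Mythos by 9.9, while with tools it trails GPT-5-4 Pro by 4.0 points and Mythos by 10.0.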

Mashable has not independently verified these benchmark results. Full results are available in the Opus 4.7 model card.

[Image: Table comparing Claude Opus 4.7 to other frontier models on benchmark tests. Credit: Anthropic]

Overall, Anthropic scored Opus 4.7 above other leading models in some benchmarks, though Gemini 3.1 Pro and GPT-5-4 score higher in some areas.

Claude Opus 4.7: Safety and hallucinations

Anthropic also reports that Opus 4.7 shows a low risk of misaligned behaviors, with a risk profile similar to that of Opus 4.6.

For example, Anthropic says Opus 4.7 is less likely to hallucinate and shows lower rates of reward hacking.

"Claude Opus 4.7 is more reliably honest than Opus 4.6 or Sonnet 4.6, with large reductions in the rate of important omissions, and moderate improvements in factuality and rates of hallucinated input," the model card states.
