Anthropic releases Claude Opus 4.7: How to try it, benchmarks, safety

Anthropic has been shipping products and making news at a blistering pace in 2026, and on Thursday, the AI company announced the launch of Claude Opus 4.7.

Claude Opus 4.7 is Anthropic's most intelligent model available to the general public. Notably, Anthropic said in a press release that Opus 4.7 is not as powerful as Claude Mythos, which Anthropic deemed too dangerous for public release.

Claude Opus is a family of hybrid reasoning models capable of multi-step reasoning and advanced coding. Until the announcement of Claude Mythos on April 7, Claude Opus was considered Anthropic's most advanced series of AI models.


How to try Claude Opus 4.7

Claude Opus 4.7 is available now via Claude AI, the Claude API, and Anthropic partners such as Microsoft Foundry. The new model is priced the same as Claude Opus 4.6.

However, Anthropic noted that because "Opus 4.7 thinks more at higher effort levels," it uses more output tokens than its predecessor. Users can read more about how to optimize token usage in the Opus 4.7 migration guide.
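To see why identical pricing can still mean a higher bill, here is a minimal Python sketch of the arithmetic. The price and token counts are hypothetical placeholders chosen for illustration, not figures from Anthropic, and the helper name `output_cost_usd` is our own.

```python
def output_cost_usd(output_tokens: int, usd_per_million_output_tokens: float) -> float:
    """Cost of the output tokens for one request at a flat per-million-token price."""
    return output_tokens / 1_000_000 * usd_per_million_output_tokens


# Hypothetical values for illustration only; check Anthropic's pricing page for real rates.
PRICE_PER_MILLION = 10.0   # assumed to be the same for both models
TOKENS_OLD_MODEL = 2_000   # assumed output length on the older model
TOKENS_NEW_MODEL = 3_000   # assumed longer output on the newer model

cost_old = output_cost_usd(TOKENS_OLD_MODEL, PRICE_PER_MILLION)
cost_new = output_cost_usd(TOKENS_NEW_MODEL, PRICE_PER_MILLION)
relative_increase = (cost_new - cost_old) / cost_old

print(f"Older model: ${cost_old:.4f} per request")
print(f"Newer model: ${cost_new:.4f} per request")
print(f"Increase: {relative_increase:.0%}")
```

With the per-token price held constant, the cost per request scales directly with the number of output tokens, which is why the effort level a user selects matters for budgeting.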

How Claude Opus 4.7 improves over 4.6

As expected, Claude Opus 4.7 offers improved capabilities across the board.

In particular, Anthropic says Claude Opus 4.7 is better at advanced coding tasks, visual intelligence, and document analysis. Anthropic also says Opus 4.7 is "more tasteful and creative when completing professional tasks, producing higher-quality interfaces, slides, and docs."


"Users report being able to hand off their hardest coding work — the kind that previously needed close supervision — to Opus 4.7 with confidence. Opus 4.7 handles complex, long-running tasks with rigor and consistency, pays precise attention to instructions, and devises ways to verify its own outputs before reporting back," reads an Anthropic blog post.

Claude Opus 4.7: Benchmark performance

Anthropic released a detailed model card outlining how Claude Opus 4.7 compares to other Anthropic models and frontier models from OpenAI, Google, and xAI.

Opus 4.7 lags behind the unreleased Claude Mythos, which Anthropic reports scores significantly higher on common benchmarks such as Humanity's Last Exam. "Claude Opus 4.7 is less capable than Claude Mythos Preview on every relevant axis we measured and does not advance our capability frontier," the model card states. That means Claude Opus 4.7 is not evidence that AI development has accelerated beyond existing trend lines.

On Humanity's Last Exam (without tools), Anthropic reports that Claude Opus 4.7 outperforms all other frontier models except Claude Mythos.

Claude Mythos scored 56.8 percent on HLE
Claude Opus 4.7 scored 46.9 percent
Gemini 3.1 Pro scored 44.4 percent
GPT-5-4 Pro scored 42.7 percent
Claude Opus 4.6 scored 40.0 percent

With tools, GPT-5-4 Pro scored 58.7 percent compared to Opus 4.7’s 54.7 percent. Mythos beat them both with 64.7 percent.
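For readers who want the gaps in percentage points, the short Python sketch below computes each model's lead or deficit relative to Opus 4.7, using only the scores quoted above. The function name `gaps_vs` and the dictionary layout are our own illustrative choices.

```python
# Humanity's Last Exam scores as reported by Anthropic and quoted in this article (percent).
HLE_NO_TOOLS = {
    "Claude Mythos": 56.8,
    "Claude Opus 4.7": 46.9,
    "Gemini 3.1 Pro": 44.4,
    "GPT-5-4 Pro": 42.7,
    "Claude Opus 4.6": 40.0,
}
HLE_WITH_TOOLS = {
    "Claude Mythos": 64.7,
    "GPT-5-4 Pro": 58.7,
    "Claude Opus 4.7": 54.7,
}


def gaps_vs(baseline: str, scores: dict[str, float]) -> dict[str, float]:
    """Return each other model's score minus the baseline's, in percentage points."""
    base = scores[baseline]
    return {
        name: round(score - base, 1)
        for name, score in scores.items()
        if name != baseline
    }


no_tools_gaps = gaps_vs("Claude Opus 4.7", HLE_NO_TOOLS)
with_tools_gaps = gaps_vs("Claude Opus 4.7", HLE_WITH_TOOLS)

for label, gaps in (("Without tools", no_tools_gaps), ("With tools", with_tools_gaps)):
    print(label)
    for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
        print(f"  {name}: {gap:+.1f} points vs. Claude Opus 4.7")
```

A positive number means the model scored above Opus 4.7, and a negative number means it scored below.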

Mashable has not independently verified these benchmark results. Full results are available in the Opus 4.7 model card.

Figure: Table comparing Claude Opus 4.7 to other frontier models on benchmark tests. Credit: Anthropic

Overall, Anthropic's results place Opus 4.7 above other leading models on some benchmarks, though Gemini 3.1 Pro and GPT-5-4 score higher on others.

Claude Opus 4.7: Safety and hallucinations

Anthropic also reports that Opus 4.7 shows a low risk of misaligned behaviors, with a risk profile similar to that of Opus 4.6.

For example, Anthropic says Opus 4.7 is less likely to hallucinate and shows lower rates of reward hacking.

"Claude Opus 4.7 is more reliably honest than Opus 4.6 or Sonnet 4.6, with large reductions in the rate of important omissions, and moderate improvements in factuality and rates of hallucinated input," the model card states.
