Anthropic's Claude Opus 4.8 is four times more truthful, with Mythos coming next.

      **TL;DR** Anthropic has launched Claude Opus 4.8, an enhanced version of its primary AI model, which is reported to be four times less likely to overlook errors in code. The company also mentioned the upcoming Mythos-class models, which have already identified over 10,000 critical software vulnerabilities through Project Glasswing, and revealed a $65 billion Series H funding round at a $965 billion post-money valuation.

      Anthropic has introduced Claude Opus 4.8, an upgrade to its main AI model, claiming it is more trustworthy, better at performing autonomous tasks, and improved in recognizing its mistakes. This new model is available now at the same pricing as its predecessor: $5 per million input tokens and $25 per million output tokens, and will be integrated into all Anthropic products, including claude.ai, Claude Code, and the API.

      The key enhancement of this version is its honesty. Anthropic states Opus 4.8 is about four times less likely than Opus 4.7 to miss code flaws it generates. Initial testers report that the model is more likely to express uncertainties about its outputs and is less inclined to make unsubstantiated assertions, a common issue with AI models that usually exude confidence, irrespective of its validity.

      **Benchmark improvements across the board**

      Opus 4.8 surpasses its predecessor in all of Anthropic's published benchmarks. In agentic coding (Terminal-Bench 2.1), the score has increased from 64.3% to 69.2%. Multidisciplinary reasoning with tools improved from 54.7% to 57.9%. Agentic computer usage rose from 82.8% to 83.4%, while knowledge work scores climbed from 1,753 to 1,890.

      Anthropic's alignment evaluation found that Opus 4.8 achieves unprecedented heights in prosocial traits, including enhancing user autonomy and acting in the user’s best interest. Instances of misaligned behavior, such as deception or aiding misuse, are significantly lower than in Opus 4.7, aligning closely with Claude Mythos Preview, Anthropic’s most well-aligned model.

      **Early testers report practical benefits**

      The release comes with endorsements from companies already utilizing the model. Cognition, the company behind the AI coding agent Devin, indicated that Opus 4.8 effectively uses tools and resolves issues with comment verbosity and tool-calling that were present in Opus 4.7. Cursor, the AI-driven code editor, noted improvements across all engagement levels on its CursorBench assessment.

      Harvey, which develops AI for legal applications, stated Opus 4.8 achieved the highest score on its Legal Agent Benchmark, becoming the first model to exceed 10% overall on the all-pass criterion. Databricks mentioned that Opus 4.8 addresses more complex multistep queries faster, at a cost of 61% less on tokens than Opus 4.7.

      Thomson Reuters observed that CoCounsel Legal showed significant improvements in consistency and reasoning quality. Hebbia, which specializes in AI for financial document analysis, reported enhanced citation accuracy and better token efficiency in retrieval tasks.

      **New features accompanying the model**

      Anthropic is launching various features with Opus 4.8. A new effort control in claude.ai and Cowork allows users to determine how much processing Claude allocates to a response, balancing speed against quality. Claude Code has introduced a dynamic workflows feature, enabling it to organize tasks and operate numerous parallel subagents within a single session for extensive codebase migrations.

      For developers, the Messages API now supports system entries in the messages array, enabling updates of instructions during tasks without disrupting the prompt cache. The fast mode for Opus 4.8 operates at 2.5 times faster than previous versions and is now three times more cost-effective.

      **Mythos is the larger narrative**

      The more noteworthy announcement pertains to future plans. Anthropic mentioned its intention to launch a new class of models with greater intelligence than Opus, based on the Claude Mythos architecture. A select group of organizations is currently using Claude Mythos Preview via Project Glasswing, an initiative focused on applying the model for cybersecurity. Anthropic, along with around 50 partners such as Apple, Google, Microsoft, and Amazon Web Services, has identified over 10,000 critical vulnerabilities in significant software infrastructure.

      Although Mythos-class models require enhanced cybersecurity measures before their general release, Anthropic anticipates making them available to all customers shortly. This model is a full capability tier above Opus 4.7 and can autonomously discover zero-day vulnerabilities and develop exploits, thus generating both excitement and caution regarding its deployment.

      **A company nearing $1 trillion**

      The rollout of Opus 4.8 coincides with Anthropic's growing valuation. On the same day, the company announced a $65 billion Series H funding round,

Other articles

You ought to experience Marathon before the internet makes the decision for you. Marathon's Open Play Week is coming at the perfect moment, making it an excellent opportunity for players to experience Bungie's sleek extraction shooter firsthand.

The Oura Ring 5 is now 40% more compact and features its most scratch-resistant design to date. Oura has reduced the size of its new smart ring by 40%, while incorporating more powerful sensors, a more durable exterior, and enhanced battery longevity.

Oura Ring 5 is introduced as the smallest smart ring in the world, priced at $399. Oura's Ring 5 has shrunk by 40% to a width of 6.09mm, includes blood pressure pattern detection, and features AI health monitoring. It will be available starting June 4, with prices beginning at $399. An IPO filing is on the way.

The Oura Ring 5 is 40% smaller and features its most scratch-resistant design to date. Oura has reduced the size of its new smart ring by 40%, while also incorporating more robust sensors, a more durable finish, and improved battery life.

Oura Ring 5 is 40% more compact and features its most scratch-resistant design to date. Oura has reduced the size of its latest smart ring by 40%, while incorporating more robust sensors, a sturdier finish, and improved battery life.

You should experience Marathon before the internet makes that choice for you. Marathon's Open Play Week is just around the corner, making it the perfect moment for players to experience Bungie's sleek extraction shooter firsthand.

Anthropic's Claude Opus 4.8 is four times more truthful, with Mythos coming next.

Anthropic has launched Claude Opus 4.8, featuring improved judgement and reduced uncaught code errors. Mythos-class models are set to be released in the coming weeks. Series H has successfully raised $65 billion with a valuation of $965 billion.