Google, Microsoft, and xAI have consented to provide evaluations of AI models to the government before their release, as the Mythos crisis necessitates an increase in oversight.

Google, Microsoft, and xAI have consented to provide evaluations of AI models to the government before their release, as the Mythos crisis necessitates an increase in oversight.

      **TL;DR** Google, Microsoft, and xAI have joined OpenAI and Anthropic in granting the US Commerce Department pre-release access to assess their AI models. This initiative establishes voluntary oversight over these five major AI labs through an office lacking formal authority and staffed with fewer than 200 employees. The expansion follows the Mythos crisis and discussions of an executive order that would formalize the review process.

      The Mythos crisis prompted the U.S. government to address a previously avoided issue: what occurs when an AI model becomes powerful enough to endanger national security, and the government lacks a formal evaluation mechanism before public release? On Tuesday, the Commerce Department disclosed that Google, Microsoft, and xAI will provide the U.S. government with pre-release access to their AI models for evaluation, joining OpenAI and Anthropic, which have been providing models for assessment since 2024. Collectively, these five companies dominate the global frontier AI landscape and have consented to allow a single government office to evaluate their systems prior to deployment. This arrangement is voluntary and not backed by any statutory authority, meaning the government cannot prohibit any release. It represents the closest approach to an AI oversight system in the U.S., established in under two years by a small office.

      **The Office**

      The Center for AI Standards and Innovation operates within the Commerce Department's National Institute of Standards and Technology. Initiated under President Biden in 2023 as the AI Safety Institute and restructured under Trump with an emphasis on standards and national security, the center has conducted over 40 evaluations of AI models, including advanced systems yet to be made public. Developers often present versions with reduced safety measures, allowing assessors to investigate relevant capabilities related to national security, such as biological weapon synthesis, cyberattack automation, and uncontrollable autonomous behaviors at scale.

      Chris Fall now oversees the center after the sudden exit of Collin Burns, a former AI researcher at Anthropic chosen for the role but removed by the White House after just four days. Burns had left his position and relocated for this job, but his ousting, linked to his ties with a company the administration opposed, illustrates the political intricacies of creating an oversight system in an industry where evaluators and those being evaluated come from the same talent pool. Trump's broader regulatory approach for AI has favored federal preemption over state regulations and a lenient stance toward the industry, yet the model evaluation program shows a tougher side; the government wants to understand these systems' capabilities before they are made available to the public.

      **The Agreements**

      The new collaborations with Google, Microsoft, and xAI broaden what was previously a two-company agreement to nearly comprehensive coverage of frontier AI. OpenAI and Anthropic have adjusted their agreements to conform with Trump's AI Action Plan, which tasks the center with leading national security-related assessments and positions it within a wider “evaluations ecosystem.” These agreements are not legally binding; they are voluntary commitments that companies can withdraw from at will. There is no legal requirement for pre-release evaluations, nor does any regulation authorize the center to delay or block deployment. The entire system hinges on the AI companies deciding that providing the government with early access aligns with their strategic interests.

      From the companies' viewpoint, the alternative to voluntary cooperation could be legislation. Various proposed bills would grant the center permanent statutory authority, impose mandatory evaluation requirements, and enable it to set conditions on deployment. The Pentagon has shown readiness to blacklist AI companies that do not comply with government expectations, categorizing Anthropic as a supply-chain risk after it refused to allow its models to be used for autonomous weapons or mass domestic surveillance. The voluntary evaluation agreements serve partly as a way for the other companies to show their willingness to work with the government before compliance becomes obligatory.

      **The Catalyst**

      The expansion of the evaluation program occurs amid the Mythos crisis. Anthropic's significant model, revealed in April, can autonomously identify and exploit zero-day vulnerabilities in major operating systems and web browsers, having uncovered thousands of serious bugs, including long-standing vulnerabilities. The White House has opposed Anthropic’s intent to widen Mythos’s availability beyond its initial launch partners. The NSA is utilizing it despite the Pentagon's sanctions against Anthropic. The EU is demanding access to Mythos for European cybersecurity efforts, arguing that this crucial tool should not be solely controlled by an American company that the U.S. government has partially blacklisted.

      Mythos highlights the need for the evaluation program: a model with capabilities that pose immediate national security risks that cannot be assessed post-deployment. The center's evaluations since 2024 have presumably uncovered capabilities in unreleased models influencing policy decisions, although these assessments only involved two companies until now. Google's Gemini, Microsoft's models, and xAI's Grok were not previously subjected to pre-release government evaluation, but the new agreements address this gap, ensuring any future models with similar levels of capability are assessed by evaluators before public release.

      **The Limits**

      The program's key vulnerability is

Other articles

Founders of IronSource secure $60 million at a valuation of $500 million for Zyg, an agentic AI platform designed to automate advertising in e-commerce. Founders of IronSource secure $60 million at a valuation of $500 million for Zyg, an agentic AI platform designed to automate advertising in e-commerce. Zyg secured $60 million in funding, led by Accel, at a valuation of $500 million just two months following their stealth phase. The IronSource team is developing AI agents designed to take the place of human ad buyers for direct-to-consumer brands. ServiceNow estimates that by 2030, it will reach $30 billion, with one-third of its annual contract value coming from AI. ServiceNow estimates that by 2030, it will reach $30 billion, with one-third of its annual contract value coming from AI. ServiceNow anticipates $30 billion in subscription revenue by 2030, with 30% of that annual contract value (ACV) coming from Now Assist, the company's premier AI product. The presentation during the investor day addresses concerns regarding potential displacement by AI-SaaS. Intel has appointed Qualcomm veteran Alex Katouzian to head a new Client Computing and Physical AI division. Intel has appointed Qualcomm veteran Alex Katouzian to head a new Client Computing and Physical AI division. Intel has brought on Alex Katouzian, who spent 25 years at Qualcomm, to head a newly formed Client Computing and Physical AI division. This marks the second high-profile recruitment from Qualcomm during CEO Lip-Bu Tan's leadership. Workers at Google DeepMind have voted in favor of unionizing following a Pentagon AI agreement that supersedes eight years of commitments to ethical practices. Workers at Google DeepMind have voted in favor of unionizing following a Pentagon AI agreement that supersedes eight years of commitments to ethical practices. Staff at DeepMind UK voted 98% in favor of joining CWU and Unite following Google's agreement on a confidential Pentagon contract for "any lawful purpose." They are advocating for the cessation of military AI applications and the reinstatement of ethical standards. Inside QuantWare's €152 million funding round aimed at developing KiloFab. Inside QuantWare's €152 million funding round aimed at developing KiloFab. QuantWare has completed a €152 million Series B funding round, spearheaded by Intel Capital, In-Q-Tel, and ETF Partners, marking the largest deep tech funding round in the Netherlands to date. Intel has recruited Qualcomm veteran Alex Katouzian to head a newly formed group focused on Client Computing and Physical AI. Intel has recruited Qualcomm veteran Alex Katouzian to head a newly formed group focused on Client Computing and Physical AI. Intel has brought on Alex Katouzian, who spent 25 years at Qualcomm, to head a newly formed group that combines Client Computing and Physical AI. This marks the second senior recruitment from Qualcomm during CEO Lip-Bu Tan's time in office.

Google, Microsoft, and xAI have consented to provide evaluations of AI models to the government before their release, as the Mythos crisis necessitates an increase in oversight.

Five leading AI laboratories are now presenting their models for assessment by the US government. This voluntary program lacks legal authority but includes all major AI developers following the Mythos crisis.