Microsoft introduces three proprietary AI models as a direct competitor to OpenAI.
Six months after renegotiating the contract that previously restricted it from pursuing frontier AI independently, Microsoft has unveiled three in-house models that directly compete with the partner it invested $13 billion in developing. The models—MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2—are now accessible in Microsoft Foundry and do not feature OpenAI’s branding.
These models are the first publicly available products from the MAI Superintelligence team, which Mustafa Suleyman, CEO of Microsoft AI, established in November 2025 with the explicit goal of achieving what the company terms "humanist superintelligence." In a March internal memo, later reported by Business Insider, Suleyman expressed his intention to devote all of his efforts to the area of superintelligence and provide top-tier models for Microsoft over the upcoming five years. This vision has now begun to take tangible shape.
MAI-Transcribe-1 appears to be the most immediately impactful of the trio. This speech-to-text model boasts the lowest word error rate across 25 languages on the FLEURS benchmark, averaging 3.8 percent. Microsoft claims it outperforms OpenAI’s Whisper-large-v3 in all 25 languages, Google’s Gemini 3.1 Flash in 22 of 25, and ElevenLabs’ Scribe v2 in 15 of 25. It operates 2.5 times faster than Microsoft’s previous Azure Fast transcription service and is priced at $0.36 per hour of audio. Notably, the team responsible for its development comprises only 10 individuals.
MAI-Voice-1 completes the audio circle. This text-to-speech model can generate 60 seconds of natural-sounding audio in under one second on a single GPU and is capable of creating custom voices from just a few seconds of sample audio. Together with MAI-Transcribe-1 and a large language model of the client’s choice, it establishes a comprehensive voice pipeline that operates entirely on Microsoft infrastructure without reliance on OpenAI’s technology.
MAI-Image-2, the most established of the three, had previously debuted in March ranked third on the Arena.ai text-to-image leaderboard, behind only Google’s Gemini 3.1 Flash and OpenAI’s GPT Image 1.5. This model was developed in collaboration with photographers, designers, and visual storytellers, and WPP, one of the globe's largest marketing organizations, is among the early enterprise partners utilizing it at scale.
The strategic context surrounding these developments is significant. Prior to the September 2025 renegotiation, Microsoft’s original partnership with OpenAI legally prohibited it from independently pursuing general AI development. The revised memorandum of understanding fundamentally altered this situation. Microsoft retained licensing rights to all of OpenAI's creations through 2032, secured $250 billion in new Azure cloud business commitments, and, crucially, gained the autonomy to create competing models. Suleyman directly acknowledged this shift, stating that the contract renegotiation empowered Microsoft to independently develop its own superintelligence.
The timing of these releases was intentional. Jacob Andreou, who previously served as a senior vice president at Snap, became the executive vice president of Copilot on March 17, allowing Suleyman to step back from daily product management. The MAI models were introduced shortly after, just over two weeks later. Additionally, Microsoft recruited Ali Farhadi, the former chief executive of the Allen Institute for AI, for Suleyman’s superintelligence team in March, signaling that the ambitions extend far beyond just transcription and image generation.
For OpenAI, this development introduces a complicated dynamic. Microsoft remains its largest investor and main cloud infrastructure provider, and both companies continue to share the Foundry platform, which hosts models from both OpenAI and Microsoft. However, OpenAI is simultaneously ramping up its own commercial monetization efforts, leading the relationship to resemble two entities navigating the same market with overlapping offerings, rather than a partnership with a distinct division of responsibilities. OpenAI’s $110 billion raise in February, supported by SoftBank, Nvidia, and Amazon, positioned the company independently of Microsoft at a valuation that renders the original partnership framework increasingly outdated.
The broader AI model market is also fragmenting in similar ways. Anthropic’s $30 billion raise at a $380 billion valuation has positioned it as a credible third player in enterprise AI, with a revenue run-rate of $14 billion. Google continues to rapidly iterate on Gemini. The period when OpenAI was the sole contender for frontier AI capabilities while Microsoft was merely its exclusive distribution partner has definitively come to an end.
Microsoft Foundry, which was previously known as Azure AI Foundry and before that Azure AI Studio (the second rebranding in twelve months), now serves developers at over 80,000 enterprises, including 80 percent of Fortune 500 companies. This distribution advantage is what makes the MAI family of models strategically important: Microsoft does not need
Other articles
Microsoft introduces three proprietary AI models as a direct competitor to OpenAI.
Microsoft launched MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 through Foundry, developed by Mustafa Suleyman's superintelligence team. These models directly rival those of OpenAI.
