Microsoft introduces three proprietary AI models in direct competition with OpenAI
Six months after renegotiating a contract that previously restricted it from independently pursuing frontier AI, Microsoft has unveiled three in-house models that challenge the partnership it invested $13 billion in. MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 are now available through Microsoft Foundry, and none of them feature OpenAI’s branding.
These models are the first public releases from the MAI Superintelligence team, which Mustafa Suleyman, CEO of Microsoft AI, established in November 2025 with the aim of achieving what the company refers to as “humanist superintelligence.” In a March internal memo reported by Business Insider, Suleyman said he intended to focus fully on superintelligence and to develop top-tier models for Microsoft over the next five years. The three releases are that ambition’s first concrete outcome.
Among the three, MAI-Transcribe-1 appears to be the most immediately disruptive. This speech-to-text model claims the lowest word error rate across 25 languages on the FLEURS benchmark, averaging 3.8 percent, and Microsoft asserts it outperforms OpenAI’s Whisper-large-v3 in all 25 languages, Google’s Gemini 3.1 Flash in 22 of 25, and ElevenLabs’ Scribe v2 in 15 of 25. It operates 2.5 times faster than Microsoft’s former Azure Fast transcription service and costs $0.36 per hour of audio. Notably, it was developed by a team of just 10 individuals.
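For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below illustrates that standard definition together with the per-hour pricing arithmetic. It is not Microsoft's or FLEURS's evaluation code: the function names are hypothetical, and real benchmark scoring typically normalizes text (casing, punctuation) first, which this sketch skips.

```python
# Illustrative only: function names are hypothetical, and benchmark
# scoring usually applies text normalization that is omitted here.

PRICE_PER_AUDIO_HOUR_USD = 0.36  # per-hour price quoted in the article


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def transcription_cost_usd(audio_hours: float) -> float:
    """Cost of transcribing the given hours of audio at the quoted price."""
    return audio_hours * PRICE_PER_AUDIO_HOUR_USD


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    print(transcription_cost_usd(1000))
```

On this definition, an average of 3.8 percent corresponds to roughly four word-level errors per hundred reference words.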
MAI-Voice-1 completes the audio processing cycle. This text-to-speech model generates 60 seconds of realistic audio in under one second using a single GPU and allows for the creation of custom voices from just a few seconds of sample audio. When paired with MAI-Transcribe-1 and a large language model of the customer’s choice, it creates a comprehensive voice pipeline that functions entirely on Microsoft’s infrastructure with no reliance on OpenAI’s technology.
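The pipeline described above has three stages: speech to text, a language model reply, and text to speech. The following is a minimal sketch of that shape only. It makes no calls to Microsoft Foundry or any real service, and every name in it (`voice_turn`, the stub functions, the type aliases) is a hypothetical placeholder; a real integration would replace the stubs with whatever client the chosen provider documents.

```python
# Structural sketch only: all names are hypothetical and the stages are
# stand-in stubs, not real MAI-Transcribe-1, LLM, or MAI-Voice-1 clients.
from typing import Callable

Transcriber = Callable[[bytes], str]   # audio in, text out
ChatModel = Callable[[str], str]       # user text in, reply text out
Synthesizer = Callable[[str], bytes]   # reply text in, audio out


def voice_turn(
    audio_in: bytes,
    transcribe: Transcriber,
    respond: ChatModel,
    synthesize: Synthesizer,
) -> bytes:
    """Run one conversational turn through the three stages in order."""
    user_text = transcribe(audio_in)
    reply_text = respond(user_text)
    return synthesize(reply_text)


# Stub stages so the sketch runs without any external service.
def stub_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio bytes are the words


def stub_respond(text: str) -> str:
    return f"You said: {text}"


def stub_synthesize(text: str) -> bytes:
    return text.encode("utf-8")  # pretend the encoded text is audio


if __name__ == "__main__":
    audio_out = voice_turn(b"hello", stub_transcribe, stub_respond, stub_synthesize)
    print(audio_out)
```

Passing the stages in as plain callables mirrors the article's point that the language model in the middle is the customer's choice: any of the three stages can be swapped without touching the others.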
MAI-Image-2, the oldest of the trio, had already ranked third on the Arena.ai text-to-image leaderboard in March, trailing only Google’s Gemini 3.1 Flash and OpenAI’s GPT Image 1.5. The model was developed alongside photographers, designers, and visual storytellers, and WPP, one of the largest marketing groups globally, is among the first enterprise partners implementing it at scale.
The strategic context is more significant than the benchmarks. Until the September 2025 renegotiation, Microsoft's original agreement with OpenAI legally barred the company from independently pursuing general AI development. The revised memorandum of understanding fundamentally changed this dynamic. Microsoft retained licensing rights to all projects OpenAI develops until 2032, secured $250 billion in new Azure cloud business commitments, and, importantly, gained the ability to create competing models. Suleyman acknowledged this shift, stating that the renegotiated contract allowed Microsoft to independently engage in its own superintelligence development.
The timing of these developments is intentional. On March 17, shortly before the MAI models were released, Jacob Andreou, formerly a senior vice-president at Snap, became executive vice-president of Copilot, freeing Suleyman from day-to-day product responsibilities. That same month, Microsoft hired Ali Farhadi, the former CEO of the Allen Institute for AI, for Suleyman’s superintelligence team, indicating that its ambitions extend beyond transcription and image generation.
For OpenAI, these developments create a complicated situation. Microsoft remains its largest investor and main cloud infrastructure provider, and the two companies continue to share a platform in Foundry, which hosts both OpenAI and Microsoft models. However, OpenAI is also accelerating its push for commercial monetization, and the relationship appears to be evolving into that of two companies operating in the same market with overlapping products rather than a partnership with distinct roles. OpenAI’s $110 billion fundraising in February, supported by SoftBank, Nvidia, and Amazon, valued the company independently of Microsoft at a level that renders the original partnership framework increasingly outdated.
The broader AI model market is similarly fragmenting. Anthropic’s $30 billion fundraising at a $380 billion valuation has positioned it as a significant player in enterprise AI, with an annual revenue run rate of $14 billion. Google continues rapid development on Gemini. The time when OpenAI was the sole leader in frontier AI capabilities, with Microsoft as its exclusive distributor, is clearly behind us.
Microsoft Foundry, previously known as Azure AI Foundry and Azure AI Studio (the second rebranding within twelve months), now serves developers at over 80,000 enterprises, including 80 percent of Fortune 500 companies. This distribution advantage makes the MAI model family strategically important, as Microsoft does not need to outperform OpenAI on every metric to redirect enterprise spending toward its in-house models. It only needs to be competitive enough that customers opt for the integrated solution over the third-party alternative, a scenario
