Microsoft's subtle Claude Code retreat and the true expense of enterprise AI
Last December, Microsoft told a large number of its engineers, product managers, and designers that they could use Claude Code, Anthropic's command-line coding agent, at the company's expense. By spring, usage had spread well beyond engineering and into non-technical roles that traditionally would have taken years to adopt new software. Internally, Microsoft positioned the rollout as a learning initiative. Externally, the message was simpler: the largest software company in the world, with its own foundational models and its own coding assistant, was paying a competitor to supply its employees with a rival product.
Six months later, the trial is ending. Windows Central and other outlets, following The Verge’s initial report, say that Microsoft is canceling most direct licenses for Claude Code within its Experiences and Devices group, which is responsible for Windows, Microsoft 365, Outlook, Teams, and Surface. Affected engineers have been told to move to GitHub Copilot CLI by June 30, the last day of Microsoft's fiscal year. The stated reason is toolchain unification; the unstated one appears to lie in the timing.
The pullback from Claude is the most convincing indication yet that the economics of enterprise AI coding are not sustainable at current token prices. The problem is not the quality of the tools, which are effective enough that engineers rely on them constantly. It is that this constant usage breaks the budget math.
Uber is a clear example, and it lacks Microsoft's financial cushion. Praveen Neppalli Naga, Uber’s CTO, told The Information in April that the company had exhausted its entire AI coding budget for 2026 within just four months. By March, Claude Code usage among his roughly 5,000 engineers had surged from 32% to 84%, with individual engineers spending between $500 and $2,000 a month on tokens. Approximately 70% of the code written at Uber now originates from AI, and around one in ten live backend updates is executed by an agent without any human oversight.
"I’m back to the drawing board," Naga said, because the budget he had planned was already spent. That sums up the problem: the forecasts were wrong because token consumption does not behave like the per-user licenses and seats that finance teams typically model. Traditional enterprise software agreements are priced per user, while token-priced arrangements scale with how much computation the model performs. Agent-driven coding is compute-heavy and generates far more context than the simple autocomplete interactions that shaped early pricing.
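The gap between the two pricing models can be made concrete with some back-of-envelope arithmetic. The sketch below uses the figures reported above (about 5,000 engineers, adoption rising from 32% to 84%, and $500 to $2,000 per engineer per month). The $39 seat price is a purely hypothetical placeholder, and treating every active user as spending within the reported range is a simplifying assumption, not something Uber has disclosed.

```python
# Back-of-envelope comparison of seat-based and token-based pricing.
# Headcount, adoption rates, and per-engineer spend come from the article;
# SEAT_PRICE is a hypothetical figure chosen only for illustration.

ENGINEERS = 5_000
SEAT_PRICE = 39.0              # hypothetical flat monthly price per seat
TOKEN_SPEND = (500, 2_000)     # reported monthly token spend per engineer


def seat_cost(engineers: int, price_per_seat: float) -> float:
    """Seat licensing: cost depends on headcount, not on usage."""
    return engineers * price_per_seat


def token_cost_range(engineers: int, adoption: float,
                     low: int, high: int) -> tuple[int, int]:
    """Token pricing: cost scales with active users and their consumption.

    Assumes every active engineer spends somewhere between low and high.
    """
    active = round(engineers * adoption)
    return active * low, active * high


if __name__ == "__main__":
    print(f"Seat model: ${seat_cost(ENGINEERS, SEAT_PRICE):,.0f} per month")
    for adoption in (0.32, 0.84):
        lo, hi = token_cost_range(ENGINEERS, adoption, *TOKEN_SPEND)
        print(f"Token model at {adoption:.0%} adoption: "
              f"${lo:,} to ${hi:,} per month")
```

Under these assumptions the seat bill stays flat whatever the adoption rate, while the token bill grows from roughly $0.8 million to $3.2 million a month at 32% adoption to roughly $2.1 million to $8.4 million a month at 84%. A forecast built on headcount alone would miss that swing entirely.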
This disconnect has been observed for several months. In November, GitHub paused new sign-ups for Copilot Pro and Pro+ because the workloads from paying customers were incurring costs higher than their monthly plan fees. The company admitted that cost structures designed for minimal assistance no longer apply.
The problem is not confined to Uber or Microsoft; it is an industry-wide one. Bryan Catanzaro, Nvidia's VP of applied deep learning, stated in April that compute costs now far exceed the cost of the people using that compute. Fortune echoed this in May, noting that intensive use of token-based AI tools can cost more than the human engineers those tools were intended to supplement. A 2024 MIT analysis, widely discussed in financial circles, suggests that at current pricing AI automation is cheaper than human labor for only about 25% of the jobs it was expected to replace.
This context contrasts sharply with spending forecasts. Gartner predicts that global AI expenditure will reach $2.5 trillion in 2025, marking a 69% increase from the previous year. However, Gartner also places generative AI in what it calls the "trough of disillusionment," foreseeing that 25% of the planned AI budget for 2026 will be delayed until 2027 as proof-of-concept projects stall in the procurement process. An additional Gartner report from April found that only 28% of AI infrastructure initiatives deliver fully on their business cases, highlighting a revaluation of the market rather than merely growing pains for the technology.
Microsoft's withdrawal aligns with this market reassessment, and it can be interpreted in two ways. The first perspective is the one provided by Microsoft: that Copilot CLI is the strategic goal, that engineers will still have access to Claude models through Copilot, and that the company prefers to develop a product it can influence directly with GitHub. This narrative is accurate but could have been presented at any point in the last six months; what has changed is the financial implications.
The second interpretation is more difficult to overlook. Microsoft is particularly positioned to understand the actual costs associated with enterprise-level use of Claude because its engineers were the primary users outside of Anthropic's customer base. Within the Experiences and Devices group, Claude Code was reportedly the preferred tool. If scalability had led to improved
