AI token prices dropped by 98%, while expenses for enterprises tripled.

      TL;DR: Enterprise AI expenses have tripled despite a 98% decrease in per-token costs, as autonomous tools drive an 18.6-fold rise in consumption per developer. The Linux Foundation is establishing the Tokenomics Foundation to enforce cost discipline in AI spending.

      Uber exhausted its entire AI coding budget for 2026 by April. Microsoft rescinded Claude Code licenses from its developers six months after initially granting them. One company allegedly incurred a $500 million bill for Claude in just one month after neglecting to set usage limits. A Priceline employee informed TechCrunch that a standard Cursor contract renewal turned out to be four to five times more costly.

      The pattern is the same across companies. Per-token prices have plummeted, but autonomous AI agents consume far more tokens per task, so total consumption has surged. Companies that signed up for unlimited subscriptions in early 2025 are now trying to work out what they spent and whether it generated any return.

      The numerical paradox

      Achieving performance equivalent to GPT-4 now costs about $0.40 per million tokens, a significant drop from $20 per million in late 2022, representing a 98% decrease. Nevertheless, enterprise AI costs have reportedly risen by around 320%, according to various industry reports. The average enterprise AI budget has increased from $1.2 million annually in 2024 to $7 million in 2026.

      The primary reason is volume. Since November 2025, agentic AI tools such as Anthropic’s Claude Opus 4.5, OpenAI’s GPT-5.1, and Google’s Gemini 3 Pro have drastically increased token use per task. A straightforward linear workflow in 2023 cost about $0.04 per interaction, while an orchestrated agentic system in 2026 costs approximately $1.20, 30 times as much. Individual Microsoft engineers reportedly spent between $500 and $2,000 monthly on tokens before their licenses were revoked.
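
The arithmetic behind the paradox can be sketched in a few lines of Python. The constants are the figures quoted above; the function names and the 150-fold volume growth in the last line are illustrative assumptions, not numbers from any report.

```python
# Figures quoted in the article (USD).
PRICE_LATE_2022 = 20.00    # per million tokens, GPT-4-level performance
PRICE_NOW = 0.40           # per million tokens, GPT-4-level performance
COST_LINEAR_2023 = 0.04    # per interaction, linear workflow
COST_AGENTIC_2026 = 1.20   # per interaction, orchestrated agentic system


def price_drop(old_price: float, new_price: float) -> float:
    """Fractional price decrease, e.g. 0.98 for a 98% drop."""
    return 1 - new_price / old_price


def break_even_volume(old_price: float, new_price: float) -> float:
    """Token-volume multiplier at which total spend matches the old spend."""
    return old_price / new_price


def spend_multiplier(old_price: float, new_price: float,
                     volume_multiplier: float) -> float:
    """Change in total spend when price and token volume both change."""
    return (new_price / old_price) * volume_multiplier


if __name__ == "__main__":
    print(f"price drop: {price_drop(PRICE_LATE_2022, PRICE_NOW):.0%}")
    print(f"break-even volume growth: "
          f"{break_even_volume(PRICE_LATE_2022, PRICE_NOW):.0f}x")
    print(f"cost per interaction: "
          f"{COST_AGENTIC_2026 / COST_LINEAR_2023:.0f}x")
    # Hypothetical volume growth, chosen only to illustrate a tripled bill.
    print(f"spend at 150x volume: "
          f"{spend_multiplier(PRICE_LATE_2022, PRICE_NOW, 150):.1f}x")
```

At the quoted prices, spending stays flat only while token volume grows less than 50-fold; anything beyond that raises the bill even though each token is 98% cheaper. The 18.6-fold per-developer figure covers a nine-month window rather than the period since late 2022, so it should not be plugged into this calculation directly.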

      Nicholas Arcolano, research head at engineering management platform Jellyfish, told TechCrunch that per-developer consumption has risen by around 18.6 times in just nine months. Engineers who used the most tokens were about twice as productive as those who used fewer, but at 10 times the cost. “Whether extreme spending pays off depends on the ultimate business value of the delivered code, which many companies still struggle to quantify,” Arcolano remarked.
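
Arcolano's point can be made concrete with a small break-even calculation. This is a sketch under loud assumptions: it treats "productivity" as a linear count of output units, and the baseline cost and output values in the example are hypothetical. Only the 2x and 10x multipliers come from the article.

```python
def break_even_value_per_unit(base_cost: float, base_output: float,
                              cost_multiplier: float = 10.0,
                              productivity_multiplier: float = 2.0) -> float:
    """Value each extra unit of output must carry to cover the extra spend.

    base_cost and base_output describe a lighter token user (hypothetical
    inputs); the multipliers describe the heaviest users relative to them.
    """
    extra_cost = base_cost * (cost_multiplier - 1)
    extra_output = base_output * (productivity_multiplier - 1)
    if extra_output <= 0:
        raise ValueError("heavy usage must add output for a break-even to exist")
    return extra_cost / extra_output


def extra_spend_pays_off(value_per_unit: float, base_cost: float,
                         base_output: float) -> bool:
    """True if the added output is worth at least the added token cost."""
    return value_per_unit >= break_even_value_per_unit(base_cost, base_output)


if __name__ == "__main__":
    # Hypothetical baseline: $200 a month in tokens, 10 units of output.
    print(break_even_value_per_unit(base_cost=200.0, base_output=10.0))
```

The result depends entirely on the value assigned to a unit of delivered code, which is the quantity Arcolano says companies struggle to measure.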

      From tokenmaxxing to guardrails

      “Six months ago, discussions with customers focused on ‘What can it do? Is it effective enough?’” said Alexander Embiricos, OpenAI’s head of enterprise. “Now, conversations revolve around concerns of excessive spending and inquiries about visibility and token controls.”

      J.R. Storment, executive director of the FinOps Foundation, noted the change in tone, stating, “In April and May, I started receiving feedback from companies shocked at being three times over their entire 2026 token budget by April.” The discussion transitioned from maximizing token use and rapid deployment to emphasizing the need for controls and guidelines.

      Chris Reed, Priceline’s senior director of IT finance, likened the situation to the telecom billing era, commenting, “It’s akin to the crack-cocaine epidemic. They initially offer you a taste to get you hooked, and now you’re somewhat dependent on it.” The company has begun imposing token limits on specific groups, and Reed is already noticing inconsistencies between vendor-reported usage and Priceline’s internal records.
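
The two controls Reed describes, per-group token limits and a check of vendor-reported usage against internal records, might look roughly like the following. This is a minimal sketch, not Priceline's system: the class, the function, and the 2% tolerance are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """A hard token cap for one group (hypothetical structure)."""
    limit: int
    used: int = 0

    def try_spend(self, tokens: int) -> bool:
        """Record the spend and return True only if it fits under the cap."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.limit:
            return False
        self.used += tokens
        return True


def reconcile(vendor: dict[str, int], internal: dict[str, int],
              tolerance: float = 0.02) -> dict[str, int]:
    """Return vendor-minus-internal token gaps larger than the tolerance.

    The gap is measured relative to the larger of the two counts; groups
    missing from one side are treated as having zero tokens there.
    """
    gaps = {}
    for group in sorted(vendor.keys() | internal.keys()):
        v = vendor.get(group, 0)
        i = internal.get(group, 0)
        baseline = max(v, i)
        if baseline and abs(v - i) / baseline > tolerance:
            gaps[group] = v - i
    return gaps


if __name__ == "__main__":
    budget = TokenBudget(limit=1_000_000)
    print(budget.try_spend(600_000), budget.try_spend(500_000))
    print(reconcile({"search": 1_000, "ads": 500}, {"search": 1_000, "ads": 450}))
```

A positive gap means the vendor billed more tokens than the internal logs recorded, which is the kind of inconsistency worth raising before a renewal.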

      The Tokenomics Foundation

      In light of this situation, the Linux Foundation recently announced the formation of the Tokenomics Foundation, a new standards organization aimed at instilling the same cost accountability in AI tokens that FinOps introduced to cloud expenditure.

      The Foundation intends to create a standardized definition of “tokenomics,” establish open standards for AI token usage and billing, and develop new metrics, such as cost-per-intelligence and tokens-per-watt. A formal launch is scheduled for July. Nishant Gupta, Salesforce’s chief availability officer, commented that “token economics is inherently more abstract and opaque than anything we’ve managed at this scale before.”

      The undertaking is monumental. “Tracking cloud costs involves handling hundreds of millions of rows of data each month,” Storment stated. “Tracking token costs presents a data challenge in the trillions of rows each month.”

      A market emerges to address the issue

      Startups and existing companies are racing to address this gap. Pay-i tracks and optimizes AI expenditures. Paid allows developers to charge based on actual value rather than subscription fees. Jellyfish, Waydev, and Faros AI offer agent monitoring to demonstrate the ROI of developer tools, while Ramp has entered the AI spend management sector. Datadog and New Relic have incorporated token-level observability.

      Model routing is becoming a key cost management strategy. Factory, an enterprise AI coding startup, recently introduced a model router that automatically selects the most cost-effective model for each task.
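
As a rough illustration of cost-based routing, the sketch below picks the cheapest model that meets a task's required capability tier. It is not Factory's router, whose internals are not described here; the model names, tiers, and prices are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    capability: int          # hypothetical tier; higher means stronger
    usd_per_million: float   # hypothetical price per million tokens


# Hypothetical catalog, not real vendor pricing.
CATALOG = (
    Model("small", capability=1, usd_per_million=0.50),
    Model("medium", capability=2, usd_per_million=3.00),
    Model("large", capability=3, usd_per_million=15.00),
)


def route(required_capability: int, catalog=CATALOG) -> Model:
    """Return the cheapest model whose tier meets the task's requirement."""
    eligible = [m for m in catalog if m.capability >= required_capability]
    if not eligible:
        raise ValueError(f"no model meets capability {required_capability}")
    return min(eligible, key=lambda m: m.usd_per_million)


def estimate_cost(model: Model, tokens: int) -> float:
    """Estimated USD cost of sending `tokens` tokens to `model`."""
    return model.usd_per_million * tokens / 1_000_000


if __name__ == "__main__":
    for tier in (1, 2, 3):
        chosen = route(tier)
        print(tier, chosen.name, estimate_cost(chosen, 2_000_000))
```

The hard part in practice is deciding the required tier for each task; here it is simply passed in, which is the main simplification in this sketch.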
