AI token prices have dropped by 98%, while expenses for enterprises have surged threefold.
TL;DR: Per-token costs have fallen 98%, yet enterprise AI spending has tripled, with agentic tools driving an 18.6-fold increase in consumption per developer. The Linux Foundation is setting up the Tokenomics Foundation to bring spending discipline to AI.
Uber exhausted its entire 2026 AI coding budget by April. Microsoft cancelled its developers’ Claude Code licenses six months after granting them. One organization reportedly ran up a $500 million bill in a single month after failing to set usage limits. A Priceline employee told TechCrunch that a standard Cursor contract renewal ended up costing four to five times more than expected.
The pattern shows up across the industry. Per-token prices have plummeted, but the shift to autonomous AI agents has pushed consumption up far faster. Companies that signed up freely for subscription services in early 2025 are now struggling to trace their spending and measure any return on investment.
The numerical paradox
GPT-4-equivalent performance now costs about $0.40 per million tokens, down from $20 per million in late 2022, a 98% reduction. Even so, enterprise AI spending has reportedly risen by around 320%, according to various industry assessments. The average annual enterprise AI budget has grown from $1.2 million in 2024 to $7 million in 2026.
The problem is volume. Agentic AI tools launched since November 2025, such as Anthropic’s Claude Opus 4.5, OpenAI’s GPT-5.1, and Google’s Gemini 3 Pro, have drastically increased token usage per task. A straightforward workflow in 2023 cost around $0.04 per interaction, while a coordinated agentic system in 2026 costs roughly $1.20, or 30 times as much. Individual engineers at Microsoft were reportedly spending between $500 and $2,000 a month on tokens before the licenses were revoked.
Nicholas Arcolano, head of research at Jellyfish, told TechCrunch that per-developer consumption has risen about 18.6-fold in nine months. The engineers who used the most tokens were roughly twice as productive as lighter users, but they spent ten times as much to get there. “The effectiveness of extreme spending is contingent on the ultimate business value of the deployed code, which most companies still struggle to quantify,” Arcolano said.
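The arithmetic behind the figures above can be checked with a short Python sketch. The prices and per-interaction costs are the ones quoted in this article; the implied token counts rest on an assumption made here purely for illustration, namely that the 2023 workflow was billed at the late-2022 rate and the 2026 agentic system at today’s GPT-4-equivalent rate. Real workloads mix models and price tiers, so treat the token counts as an order-of-magnitude illustration only.

```python
# Figures quoted in the article. The flat-rate billing assumption used for
# the implied token counts is illustrative and not from the source.
PRICE_2022_PER_M = 20.00        # USD per million tokens, late 2022
PRICE_NOW_PER_M = 0.40          # USD per million tokens, GPT-4-equivalent now
COST_2023_INTERACTION = 0.04    # USD, straightforward workflow in 2023
COST_2026_INTERACTION = 1.20    # USD, coordinated agentic system in 2026


def percent_change(old: float, new: float) -> float:
    """Signed percentage change from old to new."""
    return (new - old) / old * 100


def implied_tokens(cost_usd: float, price_per_million: float) -> float:
    """Tokens that cost_usd buys at a flat price_per_million (assumption)."""
    return cost_usd / price_per_million * 1_000_000


price_change = percent_change(PRICE_2022_PER_M, PRICE_NOW_PER_M)
cost_multiple = COST_2026_INTERACTION / COST_2023_INTERACTION
tokens_2023 = implied_tokens(COST_2023_INTERACTION, PRICE_2022_PER_M)
tokens_2026 = implied_tokens(COST_2026_INTERACTION, PRICE_NOW_PER_M)

print(f"price per token: {price_change:.0f}%")
print(f"cost per interaction: {cost_multiple:.0f}x")
print(f"implied tokens per interaction: {tokens_2023:,.0f} -> {tokens_2026:,.0f}")
```

Under that assumption, a $0.04 interaction corresponds to about 2,000 tokens and a $1.20 interaction to about 3 million, which is how a 98% price cut and a 30-fold rise in cost per interaction can both be true.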
From tokenmaxxing to guardrails
“Six months ago, customer discussions revolved around ‘What can it do? Is it effective?’” Alexander Embiricos, OpenAI’s enterprise lead, told TechCrunch. “Now, the focus has shifted to ‘Our spending is excessive. What oversight do you have? What token controls can you provide?’”
J.R. Storment, executive director of the FinOps Foundation, plainly described the transition: “In April and May, I began receiving feedback from companies saying, ‘We’re already three times over our total 2026 token budget, and it’s only April.’ The dialogue shifted from maximizing token usage and rapid deployment to establishing guardrails and determining how to manage this.”
Chris Reed, Priceline’s senior director of IT finance, likened the situation to the telecom billing crisis. “It’s similar to a crack-cocaine addiction. They give you a taste to reel you in, and now you feel trapped by it.” The company has started imposing token limits on specific teams after finding discrepancies between vendor-reported usage and its own internal data.
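As a rough illustration of what a per-team token limit and a usage reconciliation check might look like, here is a minimal Python sketch. It is not Priceline’s implementation, which the article does not describe; the class, function names, and cap values are all hypothetical.

```python
from dataclasses import dataclass


class TokenBudgetExceeded(Exception):
    """Raised when a request would push a team past its monthly cap."""


@dataclass
class TeamTokenBudget:
    """Hypothetical per-team monthly token cap (illustrative only)."""
    team: str
    monthly_limit: int  # cap in tokens; the value is chosen by the caller
    used: int = 0

    def charge(self, tokens: int) -> int:
        """Record usage and return the tokens remaining; reject overruns."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.monthly_limit:
            raise TokenBudgetExceeded(
                f"{self.team}: {self.used + tokens:,} would exceed "
                f"the {self.monthly_limit:,}-token cap"
            )
        self.used += tokens
        return self.monthly_limit - self.used


def usage_gap(vendor_reported: int, internal: int) -> float:
    """Relative gap between vendor-reported and internally logged tokens."""
    if internal == 0:
        raise ValueError("internal usage must be non-zero to compare")
    return (vendor_reported - internal) / internal


budget = TeamTokenBudget(team="search", monthly_limit=1_000_000)
remaining = budget.charge(600_000)
try:
    budget.charge(500_000)  # would exceed the cap, so it is rejected
    rejected = False
except TokenBudgetExceeded:
    rejected = True
gap = usage_gap(vendor_reported=1_100_000, internal=1_000_000)
```

Rejecting a request before it is sent, rather than flagging the overrun afterwards, is one possible design choice; a softer variant would only warn once a threshold is crossed.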
The Tokenomics Foundation
In response, the Linux Foundation recently announced plans for the Tokenomics Foundation, a new organization that aims to bring to AI tokens the same cost discipline that FinOps brought to cloud spending.
The Foundation intends to create a standard definition of “tokenomics,” establish open standards for AI token usage and billing, and develop new metrics such as cost-per-intelligence and tokens-per-watt. A formal launch is scheduled for July. Nishant Gupta, chief availability officer at Salesforce, noted that “token economics is inherently more abstract and obscure than anything we have managed at this scale before.”
The task is colossal. “Monitoring cloud costs is a data issue involving hundreds of millions of rows each month,” Storment explained. “Tracking token costs is a data challenge that encompasses trillions of rows monthly.”
A market emerges to address the problem
Both startups and established companies are racing to address this need. Pay-i focuses on tracking and optimizing AI expenditures. Paid allows developers to charge based on actual value instead of subscription fees. Jellyfish, Waydev, and Faros AI offer monitoring solutions to demonstrate the ROI of developer tools. Ramp has begun specializing in AI spending management, while Datadog and New Relic have incorporated token-level monitoring.
Model routing is becoming a crucial cost management feature. Factory, an enterprise AI coding startup, recently introduced a model router that automatically selects the most cost-effective model for individual tasks. Vitaly Gordon
