AI token prices have dropped by 98%, while expenses for enterprises have surged threefold.


      TL;DR: Despite a 98% decrease in per-token costs, enterprise AI expenses have tripled, with agentic tools driving an 18.6-fold increase in consumption per developer. The Linux Foundation is establishing the Tokenomics Foundation to bring spending discipline to AI.

      Uber exhausted its entire 2026 AI coding budget by April. Microsoft cancelled its developers’ Claude Code licenses six months after granting them. One organization allegedly faced a $500 million bill in a single month after failing to set usage limits. A Priceline employee told TechCrunch that a standard Cursor contract renewal ended up costing four to five times more than expected.

      The pattern is visible across the industry. Per-token prices have plummeted, but the shift to autonomous AI agents has driven consumption sharply higher. Companies that loaded up on subscription services in early 2025 are now struggling to trace their spending and assess any return on investment.

      The numerical paradox

      Achieving GPT-4-equivalent performance now costs about $0.40 per million tokens, a stark decrease from $20 per million in late 2022, representing a 98% reduction. Nevertheless, enterprise AI expenditures have reportedly surged by around 320%, based on various industry assessments. The average annual AI budget for enterprises has escalated from $1.2 million in 2024 to $7 million in 2026.

      The issue lies in the volume. Agentic AI tools launched since November 2025, like Anthropic’s Claude Opus 4.5, OpenAI’s GPT-5.1, and Google’s Gemini 3 Pro, have drastically increased token usage per task. A straightforward workflow in 2023 cost around $0.04 per interaction, while a coordinated agentic system in 2026 costs roughly $1.20, about 30 times as much. Individual engineers at Microsoft were reportedly spending between $500 and $2,000 monthly on tokens before the licenses were revoked.
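The arithmetic behind the paradox can be sketched in a few lines of Python. The prices and per-interaction costs are the figures quoted above; the token counts are not reported in the article and are only implied here under a loud assumption: that the 2023 workflow was billed at the late-2022 rate of $20 per million tokens and the 2026 agentic system at $0.40 per million. The function and variable names are illustrative.

```python
# Figures quoted in the article (USD).
PRICE_2022_PER_M = 20.00   # per million tokens, late 2022
PRICE_2026_PER_M = 0.40    # per million tokens, GPT-4-equivalent today
COST_2023 = 0.04           # per interaction, simple workflow
COST_2026 = 1.20           # per interaction, coordinated agentic system


def implied_tokens(cost_usd: float, price_per_million_usd: float) -> float:
    """Tokens that a given spend buys at a given per-million price."""
    return cost_usd / price_per_million_usd * 1_000_000


# Unit price fell by 98%...
price_drop = 1 - PRICE_2026_PER_M / PRICE_2022_PER_M

# ...yet the cost of one interaction rose about 30-fold.
cost_multiple = COST_2026 / COST_2023

# ASSUMPTION: each interaction was billed at the price listed for its era.
tokens_2023 = implied_tokens(COST_2023, PRICE_2022_PER_M)
tokens_2026 = implied_tokens(COST_2026, PRICE_2026_PER_M)
volume_multiple = tokens_2026 / tokens_2023

print(f"price drop:       {price_drop:.0%}")
print(f"cost multiple:    {cost_multiple:.0f}x")
print(f"implied tokens:   {tokens_2023:,.0f} -> {tokens_2026:,.0f}")
print(f"volume multiple:  {volume_multiple:,.0f}x")
```

Under that assumption, spend per interaction is simply price times volume: a unit price that falls to one fiftieth is overwhelmed by token volume that grows by a far larger factor.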

      Nicholas Arcolano, head of research at Jellyfish, told TechCrunch that per-developer consumption has increased about 18.6-fold in nine months. Engineers who used the most tokens were approximately twice as productive as those who used fewer, but they spent ten times more to get there, which works out to roughly five times the cost per unit of output. “The effectiveness of extreme spending is contingent on the ultimate business value of the deployed code, which most companies still struggle to quantify,” Arcolano commented.

      From tokenmaxxing to guardrails

      “Six months ago, customer discussions revolved around ‘What can it do? Is it effective?’” remarked Alexander Embiricos, OpenAI’s enterprise lead, to TechCrunch. “Now, the focus has shifted to ‘Our spending is excessive. What oversight do you have? What token controls can you provide?’”

      J.R. Storment, executive director of the FinOps Foundation, plainly described the transition: “In April and May, I began receiving feedback from companies saying, ‘We’re already three times over our total 2026 token budget, and it’s only April.’ The dialogue shifted from maximizing token usage and rapid deployment to establishing guardrails and determining how to manage this.”

      Chris Reed, Priceline’s senior director of IT finance, likened the situation to the telecom billing crisis. “It’s similar to a crack-cocaine addiction. They give you a taste to reel you in, and now you feel trapped by it.” The company has started imposing token limits on specific teams, noting discrepancies between reported vendor usage and internal data.
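A per-team token limit of the kind described above could look something like the following minimal Python sketch. It is not Priceline's implementation; the class name, method names, and the discrepancy check are all hypothetical, and a real system would also need persistence, monthly resets, and vendor billing exports.

```python
from dataclasses import dataclass


@dataclass
class TeamTokenBudget:
    """Hypothetical monthly token cap for one team."""
    team: str
    monthly_limit: int
    used: int = 0

    def remaining(self) -> int:
        """Tokens still available this month (never negative)."""
        return max(self.monthly_limit - self.used, 0)

    def try_spend(self, tokens: int) -> bool:
        """Record usage if it fits under the cap; otherwise refuse it."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.monthly_limit:
            return False
        self.used += tokens
        return True


def usage_discrepancy(vendor_tokens: int, internal_tokens: int) -> float:
    """Relative gap between vendor-reported and internally metered usage."""
    if internal_tokens <= 0:
        raise ValueError("internal_tokens must be positive")
    return (vendor_tokens - internal_tokens) / internal_tokens


budget = TeamTokenBudget(team="search", monthly_limit=1_000_000)
print(budget.try_spend(600_000))   # fits under the cap
print(budget.try_spend(500_000))   # would exceed the cap, so it is refused
print(budget.remaining())
print(f"{usage_discrepancy(1_100_000, 1_000_000):.1%}")
```

Refusing the request outright is the bluntest possible policy; a softer variant might alert a manager or fall back to a cheaper model once a team nears its limit.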

      The Tokenomics Foundation

      In response to these challenges, the Linux Foundation recently announced plans for the Tokenomics Foundation, a new organization aimed at introducing the same cost management to AI tokens that FinOps has brought to cloud expenditures.

      The Foundation intends to create a standard definition of “tokenomics,” establish open standards for AI token usage and billing, and develop new metrics such as cost-per-intelligence and tokens-per-watt. A formal launch is scheduled for July. Nishant Gupta, chief availability officer at Salesforce, noted that “token economics is inherently more abstract and obscure than anything we have managed at this scale before.”

      The task is colossal. “Monitoring cloud costs is a data issue involving hundreds of millions of rows each month,” Storment explained. “Tracking token costs is a data challenge that encompasses trillions of rows monthly.”

      A market emerges to address the problem

      Both startups and established companies are racing to address this need. Pay-i focuses on tracking and optimizing AI expenditures. Paid allows developers to charge based on actual value instead of subscription fees. Jellyfish, Waydev, and Faros AI offer monitoring solutions to demonstrate the ROI of developer tools. Ramp has begun specializing in AI spending management, while Datadog and New Relic have incorporated token-level monitoring.

      Model routing is becoming a crucial cost management feature. Factory, an enterprise AI coding startup, recently introduced a model router that automatically selects the most cost-effective model for individual tasks. Vitaly Gordon
