Uber joins Amazon's Trainium roster in an expanded AWS agreement.
Uber has expanded its AWS contract to run real-time ride-matching infrastructure on Amazon's Graviton4 processor and is piloting AI model training on Trainium3. The move puts Uber in the company of Anthropic, OpenAI, and Apple, a notable endorsement of Amazon's custom silicon strategy.
Uber's infrastructure operates on a millisecond basis. Whenever a rider opens the app, a system known as Trip Serving Zones decides which drivers to consider, how to rank them, and how to make a match, all before the loading animation finishes. At Uber's scale, more than 40 million daily trips across 72 countries as of 2025, the compute bill is large and the latency budget is close to zero. On April 7, 2026, the company announced plans to shift more of this workload to AWS, running Trip Serving Zones on Graviton4 and launching a pilot for AI model training on Trainium3. It is the latest instance of a major tech firm choosing Amazon's custom silicon over conventional options, and Uber is perhaps the most operationally significant customer the chip program has landed to date.
What Uber is moving, and why
The announcement covers two distinct workloads. Trip Serving Zones, the system that matches riders to drivers in real time, will run on Graviton4, an ARM-based processor built for high-throughput, low-latency computing. This is not generative AI; it is infrastructure work closer to telecommunications switching, where the system must stay responsive and scale smoothly through demand spikes when ride requests surge.
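To make the latency framing concrete, here is a minimal, hypothetical sketch of latency-budgeted driver matching. The names (Driver, match_rider), the 50 ms budget, and the scoring formula are illustrative assumptions, not Uber's actual Trip Serving Zones design.

```python
# Hypothetical sketch of latency-budgeted driver matching; not Uber's actual system.
from dataclasses import dataclass
import time


@dataclass
class Driver:
    driver_id: str
    eta_seconds: float       # estimated time to reach the rider
    acceptance_rate: float   # historical likelihood of accepting the trip


def match_rider(candidates: list[Driver], budget_ms: float = 50.0) -> Driver | None:
    """Return the best candidate found within a strict latency budget.

    Lower score is better: short ETA, weighted by how likely the driver is
    to accept. Ranking stops once the budget is spent so the response stays
    fast even when a demand spike inflates the candidate list.
    """
    deadline = time.monotonic() + budget_ms / 1000.0
    best_score, best_driver = float("inf"), None
    for d in candidates:
        if time.monotonic() > deadline:
            break
        score = d.eta_seconds / max(d.acceptance_rate, 0.05)
        if score < best_score:
            best_score, best_driver = score, d
    return best_driver
```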
In a separate pilot, Uber will begin training AI models on Trainium3 using its extensive trip history. The company has recorded 13.567 billion trips over its lifetime and serves more than 200 million monthly active users, producing a constant stream of behavioral data on driver assignments, estimated arrival times, demand patterns, and route optimization. Training models on that dataset is a long-term project, but Trainium3's economics made the pilot worth running even before its performance is proven.
Kamran Zargahi, Uber's vice president of engineering, clearly articulated the operational reasoning. “Uber operates at a scale where milliseconds are crucial. Transitioning more Trip Serving workloads to AWS enables us to quickly match riders to drivers and effectively manage delivery demand spikes without interruptions.” Regarding the AI aspect, Zargahi noted the company is “establishing a technology foundation that will enhance every Uber experience, keeping our focus on the users who rely on Uber daily.” Rich Geraffo, vice president and managing director for North America at AWS, emphasized the partnership by highlighting Uber's real-time requirements: “Uber is one of the most challenging real-time applications globally, and we take pride in being a key component of the infrastructure that supports their worldwide operations.”
Uber's multicloud strategy
The AWS agreement is the third significant cloud partnership Uber has formed in the past three years. In 2023, the company signed two seven-year contracts, one with Oracle Cloud Infrastructure and another with Google Cloud, as part of its strategy to exit its own data centers. The multicloud approach is intended to guard against vendor lock-in and to match specific workloads to the provider best suited to them. With AWS on board, Uber effectively becomes a major customer of all three leading hyperscalers simultaneously.
In practice, the strategy gives Uber unusual leverage in negotiations with each provider and the freedom to route workloads to whichever platform offers the best performance-to-cost ratio for a given task. Moving Trip Serving Zones to Graviton4 signals where AWS currently stands for latency-sensitive, high-frequency infrastructure. The Trainium3 pilot is a more tentative test of whether Amazon's AI training economics can compete with the GPU-based infrastructure Uber already uses through its other cloud relationships.
The chip at the center of the deal
Trainium3 is Amazon's third-generation AI training accelerator, and its pitch is primarily about cost. Each chip delivers 2.517 petaflops at MXFP8 precision, with 144 GB of HBM3e memory and 4.9 terabytes per second of memory bandwidth. At scale, Trainium3 is reported to cost roughly 30 to 50 percent as much as comparable Nvidia H100 or H200 hardware. The UltraServer configuration networks up to 144 accelerators, yielding roughly 362 MXFP8 petaflops, a cluster capable of training state-of-the-art models.
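As a quick sanity check on those figures, the arithmetic below reproduces the cluster throughput and restates the reported cost range as a compute-per-dollar multiple; the inputs are the numbers cited above, not independent measurements.

```python
# Back-of-the-envelope check of the Trainium3 figures cited above.
per_chip_pflops = 2.517        # MXFP8 petaflops per Trainium3 chip, as reported
chips_per_ultraserver = 144    # accelerators networked in one UltraServer

cluster_pflops = per_chip_pflops * chips_per_ultraserver
print(f"UltraServer aggregate: {cluster_pflops:.0f} MXFP8 petaflops")  # ~362

# Reported cost range relative to comparable Nvidia H100/H200 capacity.
for cost_fraction in (0.30, 0.50):
    print(f"At {cost_fraction:.0%} of GPU cost, the same budget buys "
          f"{1 / cost_fraction:.1f}x the accelerator-hours")
```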
The cost difference is the headline, but the underlying argument is about workload fit. Training large models on proprietary trip data does not require the same interoperability as inference in production, where software ecosystems, CUDA toolchains, and integration dependencies have historically kept Nvidia hardware dominant. In controlled training environments, where workflows are more contained and training costs compound across many experiments, the case for a lower-cost accelerator is easier to make.
