Uber joins Amazon's Trainium lineup with an expanded AWS agreement
In summary: Uber has expanded its contract with AWS to run its real-time ride-matching system on Amazon's Graviton4 processors and is launching a pilot for AI model training on Trainium3. The deal puts Uber alongside Anthropic, OpenAI, and Apple as customers of Amazon's custom silicon, further evidence that the strategy is paying off.
Uber's infrastructure runs on milliseconds. Whenever a rider opens the app, a system called Trip Serving Zones decides which drivers to consider, how to rank them, and how quickly to deliver a match, all while the user is still watching the loading animation. At Uber's scale, which surpassed 40 million trips a day across 72 countries in 2025, the computational cost of that process is substantial and the tolerance for latency is close to zero. On April 7, 2026, the company announced plans to move more of this workload to AWS, running Trip Serving Zones on Graviton4 and starting a pilot to train AI models on Trainium3. It is another strong endorsement of Amazon's custom silicon, and Uber is one of the most consequential customers to sign on so far.
What Uber is moving and why
The announcement covers two distinct workloads. Trip Serving Zones, Uber's real-time system for matching riders with drivers, will run on Graviton4, Amazon's ARM-based processor built for high-throughput, low-latency work. This is infrastructure plumbing rather than generative AI; its demands look more like telecommunications switching than model inference, and it has to stay responsive when ride requests surge during demand spikes.
Uber is also launching a pilot to train AI models on Trainium3, drawing on its extensive trip history. With 13.567 billion trips logged over its lifetime and more than 200 million monthly active users, Uber generates a constant stream of behavioral data on driver allocation, estimated arrival times, demand patterns, and route optimization. Training models on that dataset is a long-term project, but Trainium3's cost profile makes the pilot worth running even before any performance advantage has been established.
According to Kamran Zargahi, Uber’s vice-president of engineering, the operational logic is straightforward: “Uber operates at a scale where milliseconds are crucial. Shifting more Trip Serving workloads to AWS provides us with the agility necessary to match riders and drivers more swiftly and manage delivery demand spikes without interruptions.” He also framed the AI effort as building a technology foundation that improves every experience on Uber, keeping the company focused on the people who use it every day. Rich Geraffo, AWS’s vice-president and managing director for North America, underscored Uber’s real-time requirements: “Uber exemplifies one of the most demanding real-time applications globally, and we are pleased to play a vital role in the infrastructure supporting their worldwide operations.”
Uber's complex cloud strategy
The AWS agreement marks the third major cloud partnership for Uber within a span of three years. In 2023, the company signed two separate seven-year contracts, one with Oracle Cloud Infrastructure and another with Google Cloud, while transitioning away from its own data centers. This multicloud strategy is designed to avoid vendor lock-in and allows Uber to allocate specific workloads to the clouds best suited for them. With the addition of AWS, Uber effectively becomes a significant customer of all three major hyperscalers simultaneously.
That structure gives Uber leverage in negotiations with each provider and the freedom to place workloads wherever the performance-cost balance is best. Moving Trip Serving Zones to Graviton4 is a signal of where AWS stands today on high-frequency, latency-sensitive infrastructure. The Trainium3 pilot, by contrast, is a more tentative test of whether Amazon’s AI training economics can compete with the GPU-based systems available through Uber’s other cloud partnerships.
The chip at the center of the agreement
Trainium3, Amazon’s third-generation AI training accelerator, makes a straightforward economic case on paper. Each chip delivers 2.517 petaflops at MXFP8 precision, with 144 GB of HBM3e memory and 4.9 terabytes per second of memory bandwidth. At scale, Trainium3 runs at roughly 30 to 50 percent of the cost of comparable Nvidia H100 or H200 hardware. An UltraServer configuration can network up to 144 accelerators for nearly 362 MXFP8 petaflops, a potent cluster for training advanced models.
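For readers who want to sanity-check the arithmetic, here is a minimal back-of-the-envelope sketch using only the figures above; the per-chip throughput and the 30-to-50-percent cost range come from the announcement, while the $10M GPU budget is a hypothetical number used purely for illustration.

```python
# Back-of-the-envelope check of the Trainium3 figures cited above.
# Throughput and cost-ratio numbers come from the announcement; the
# $10M GPU budget below is a hypothetical used only for illustration.

PER_CHIP_PFLOPS_MXFP8 = 2.517    # petaflops per Trainium3 chip (MXFP8)
CHIPS_PER_ULTRASERVER = 144      # max accelerators networked in one UltraServer

ultraserver_pflops = PER_CHIP_PFLOPS_MXFP8 * CHIPS_PER_ULTRASERVER
print(f"UltraServer aggregate: {ultraserver_pflops:.1f} MXFP8 petaflops")
# -> 362.4, matching the "nearly 362 petaflops" figure

# If Trainium3 capacity costs 30-50% of comparable H100/H200 capacity,
# a hypothetical $10M GPU training budget maps to roughly $3M-$5M.
gpu_budget = 10_000_000
for ratio in (0.30, 0.50):
    print(f"At {ratio:.0%} of GPU cost: ${gpu_budget * ratio:,.0f}")
```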
The cost gap is striking, but the deeper argument is about workload fit. Training large models on proprietary trip data does not demand the same interoperability as production inference, where software ecosystems, CUDA toolchains, and integration dependencies have historically made Nvidia hardware the default. In training, where the pipeline is more controlled and per-run costs compound across many experiments, the case for custom silicon is easier to make. The AI chip acceleration that characterized 2025 generated the volume
