Nebius buys 20-person Eigen AI for $643 million as inference optimization becomes the most crucial layer of AI infrastructure
Nebius, the Dutch neocloud that separated from Yandex in 2024, has reached an agreement to acquire Eigen AI for $643 million, valuing the 20-member MIT-alumni startup at approximately $32 million per employee. Eigen’s technology for optimizing inference maximizes the number of tokens generated per Nvidia GPU, which is a crucial capability in AI infrastructure. This acquisition enhances Nebius's Token Factory inference platform as the neocloud market grows rapidly, with companies like CoreWeave and FluidStack raising billions.
Nebius Group, the Dutch cloud computing firm that split from the Russian internet company Yandex in 2024, has announced it will buy Eigen AI for around $643 million in cash and stock. The deal, revealed on May 1, involves a 20-person startup founded by graduates of MIT’s HAN Lab. In a landscape where major AI firms are valued in the hundreds of billions and significant acquisitions involve thousands of engineers, paying $643 million for just 20 people calls for an explanation. The explanation is their expertise in inference. Eigen AI’s technology maximizes the number of tokens, the basic units of text that large language models process and generate, produced by each Nvidia chip when running AI models. Nebius co-founder and chief business officer Roman Chernin stated, “This is akin to the Olympic competition of the current market: who can extract more tokens for the same price?” He likened the Eigen team to “Olympic runners in this field,” a comparison that helps explain a price of roughly $32 million per person.
The economics explain the price: the most costly part of the AI industry is not training models but running them. Training a cutting-edge model is a one-time capital expenditure, in the hundreds of millions of dollars, that produces a set of weights. Inference, on the other hand, is a recurring operational cost that scales with each user query, API call, and generated token. For AI-as-a-service companies, inference represents the primary expense. Every improvement in inference efficiency, meaning each additional token produced from the same Nvidia GPU, leads to reduced costs or increased margins. Eigen AI focuses on optimizing the performance of open-source models from OpenAI, Alibaba, Meta, and Nvidia, so that each chip produces more output for the same electrical and silicon input.
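To make that arithmetic concrete, here is a minimal Python sketch of how throughput translates into cost per token. The hourly GPU price and the tokens-per-second figures are hypothetical, chosen only for illustration; they are not Nebius or Eigen AI numbers.

```python
def cost_per_million_tokens(gpu_hourly_usd: float, tokens_per_second: float) -> float:
    """Dollar cost of generating one million tokens on a single GPU."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


# Hypothetical inputs: a GPU rented at $2.00 per hour.
GPU_HOURLY_USD = 2.00

# Same chip, same price; only the throughput changes.
baseline = cost_per_million_tokens(GPU_HOURLY_USD, tokens_per_second=1000)
optimized = cost_per_million_tokens(GPU_HOURLY_USD, tokens_per_second=2000)

print(f"baseline:  ${baseline:.3f} per million tokens")
print(f"optimized: ${optimized:.3f} per million tokens")
```

Under these assumed numbers, doubling throughput on the same hardware halves the cost of every token served, which a provider can keep as margin or pass on as a lower price.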
The best-known technique from Eigen AI’s founders is activation-aware weight quantization (AWQ), which compresses AI models from higher-precision to lower-precision number formats with minimal loss in output quality. Co-founder Wei-Chen Wang received the MLSys 2024 Best Paper Award for this work. In practical terms, quantization can let a model that normally needs four GPUs run on just two, or let a model on a single GPU generate tokens at up to double the speed. For Nebius, which has raised $700 million from investors including Nvidia and Accel to expand its GPU resources, deriving greater value from each chip alters the unit economics of the entire operation.
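As a rough illustration of why lower precision shrinks hardware needs, the Python sketch below uses plain symmetric round-to-nearest quantization. This is a simpler stand-in, not the activation-aware method itself, which additionally protects the weights that matter most to the model's activations. The 140-billion-parameter model and the 80 GB GPU are hypothetical, and the estimate counts weight memory only, ignoring the KV cache and activations.

```python
import math


def quantize_rtn(weights, bits=4):
    """Symmetric round-to-nearest quantization of a list of floats.

    Returns the integer codes and the scale needed to map them back.
    """
    qmax = 2 ** (bits - 1) - 1                 # 7 for signed 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def weight_memory_gb(num_params, bits):
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits / 8 / 1e9


def gpus_needed(num_params, bits, gpu_memory_gb):
    """Minimum GPU count to hold the weights (ignores KV cache and activations)."""
    return math.ceil(weight_memory_gb(num_params, bits) / gpu_memory_gb)


# Toy weights: each value is stored as a small integer plus one shared scale.
weights = [0.7, -0.32, 0.11, 0.0]
codes, scale = quantize_rtn(weights, bits=4)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Hypothetical model and GPU sizes, for illustration only.
NUM_PARAMS = 140e9
GPU_MEMORY_GB = 80
gpus_16bit = gpus_needed(NUM_PARAMS, 16, GPU_MEMORY_GB)  # 280 GB of weights -> 4 GPUs
gpus_8bit = gpus_needed(NUM_PARAMS, 8, GPU_MEMORY_GB)    # 140 GB of weights -> 2 GPUs

print(codes, round(max_error, 4))
print(gpus_16bit, gpus_8bit)
```

Halving the bits per weight halves the memory the weights occupy, which is where the four-GPUs-to-two arithmetic comes from. The hard part, and the part the founders' research addresses, is keeping output quality intact while doing so.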
Nebius occupies a distinct position in the AI infrastructure market as one of the "neoclouds," cloud providers that rent AI computing capacity to businesses instead of developing consumer products. While established hyperscalers like AWS, Microsoft Azure, and Google Cloud dominate the overall market, neoclouds like Nebius have found a niche by offering AI-optimized infrastructure with reduced overhead and faster deployment times. Nebius has been significantly increasing its Nvidia GPU capacity in its Finnish data center and has launched a data center in Paris as part of a $1 billion investment strategy in Europe. In November, it introduced Token Factory, a managed inference service that competes with startups like Fireworks and Baseten, as well as with offerings from the hyperscalers.
By acquiring Eigen AI, Nebius aims to make Token Factory the most efficient inference platform available. With Eigen’s optimization technology integrated, Nebius can offer clients lower per-token costs or higher throughput from the same hardware, a competitive advantage in a market characterized by price transparency and low switching costs. The neocloud sector is growing rapidly, as evidenced by companies like CoreWeave securing infrastructure deals valued in the tens of billions. FluidStack, another neocloud, is reportedly in discussions to raise $1 billion at an $18 billion valuation. The competitive logic is clear: the provider that can deliver the most tokens per dollar from each GPU will prevail.
The acquisition of Eigen is Nebius’s second within three months, following its February purchase of Tavily, an AI agent search company, for $275 million. Chernin mentioned that the company is exploring additional acquisition opportunities. The pattern indicates a strategy of acquiring small teams whose specialized technical skills would take significant time to develop internally. Eigen AI contributes 20 specialists and a production-grade optimization stack, while Tavily provided search infrastructure for AI agents. Both deals move Nebius beyond merely renting GPU capacity and toward higher-value services that engage directly with clients.
Chernin articulated the neocloud dilemma succinctly: “We don’t want to be the infrastructure
