Spirit AI surpasses Nvidia in the RoboArena robotics benchmark.

      TL;DR: The Chinese startup Spirit AI has taken the top position on the RoboArena leaderboard, scoring 1,924 compared to Nvidia’s 1,881, signaling a shift in the tech battleground of physical AI.

      Nvidia's latest robotics model held the top spot on the RoboArena leaderboard for just two days before being overtaken by the Hangzhou-based startup Spirit AI. On Wednesday, Spirit AI announced that its foundation model for embodied intelligence, Spirit v1.6, had scored 1,924, beating the 1,881 posted by Nvidia's Cosmos3-Nano-Policy. Another Nvidia project, DreamZero, ranked third with 1,763. It is the first time a Chinese model has led the RoboArena leaderboard, which Nvidia co-developed with Stanford University and the University of California, Berkeley.

      The timing is awkward for Nvidia, which introduced its Cosmos 3 omnimodel at Computex in Taipei on June 1, branding it the “open frontier foundation model for physical AI.” Trained on 20 trillion tokens of multimodal data, Cosmos 3 was meant to showcase Nvidia’s leadership in a field it effectively created, but Spirit AI had other plans.

      Understanding what physical AI benchmarks measure

      RoboArena does not assess chatbot fluency or image-generation quality. Instead, it tests how well a generalist robot policy translates into physical action: object manipulation, navigation, tool use, perception, planning, and adaptation to new environments. In essence, it gauges whether a machine can work out what to do and then do it.

      Physical AI depends on two primary capabilities. Policy capability is a model’s capacity to act on what it observes, which is precisely what RoboArena evaluates. World capability is a model’s ability to simulate and predict the outcomes of particular actions.
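      To make the distinction concrete, here is a minimal toy sketch in Python. It is not how Spirit v1.6, Cosmos 3, or RoboArena work; the one-dimensional gripper, the `Observation` fields, and the function names are all hypothetical and chosen only for illustration. The policy maps an observation to an action, while the world model predicts the next observation that a given action would produce.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    # Hypothetical 1-D toy state, not taken from any real benchmark.
    gripper_x: float  # current gripper position
    target_x: float   # position of the object to reach


def policy(obs: Observation, max_step: float = 0.1) -> float:
    """Policy capability: pick an action (a bounded move) from an observation."""
    error = obs.target_x - obs.gripper_x
    return max(-max_step, min(max_step, error))


def world_model(obs: Observation, action: float) -> Observation:
    """World capability: predict the next observation if `action` is taken."""
    return Observation(gripper_x=obs.gripper_x + action, target_x=obs.target_x)


def rollout(obs: Observation, steps: int) -> Observation:
    """Imagine `steps` moves ahead by feeding policy actions to the world model."""
    for _ in range(steps):
        obs = world_model(obs, policy(obs))
    return obs


if __name__ == "__main__":
    start = Observation(gripper_x=0.0, target_x=0.25)
    print(rollout(start, steps=3))
```

      In this toy setup a benchmark in the style of RoboArena would score only what `policy` does on a real robot, whereas a world-model benchmark would score how closely `world_model` predictions match what actually happens.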

      The industry is increasingly working to integrate the two. Last September, Chinese researchers presented a unified “Policy World Model” architecture that combines world modeling and trajectory planning in a single framework, and that convergence is now accelerating.

      China’s advancements on multiple fronts

      Spirit AI’s success in RoboArena is not an isolated instance. In the wider landscape of physical AI benchmarks, Chinese companies dominate nearly every category.

      In the WorldArena benchmark for assessing embodied world models, Manifold AI's WorldScape-0.2 holds the top position, surpassing Nvidia’s Cosmos-Predict 2.5 in the policy evaluator segment. The perception track is led by China's AgiBot with its recently unveiled GenieEnvisioner-Sim2.0-2B model. Meanwhile, the data engine track is topped by another Chinese startup, DexForce.

      In the WorldScore benchmark, which measures a model’s ability to generate worlds from text inputs, Manifold AI's WorldScape-0.2 again outperforms WonderJourney, a collaboration between Stanford and Google.

      Investment influx

      These technical results are backed by a remarkable influx of funding. On Wednesday, Spirit AI announced a financing round of 1.5 billion yuan ($222 million), its fourth round in just three months. The pace is regarded as the most aggressive fundraising seen in the embodied AI sector. Prior rounds had already lifted the company’s valuation above 10 billion yuan ($1.4 billion).

      On the same day, XYZ Embodied AI, developed by the Beijing Academy of Artificial Intelligence, announced it had completed its pre-A round, raising 1 billion yuan within just 10 months to create “embodied brains” and world models. Manifold AI has secured funding through five rounds in 10 months, with its latest round in April reportedly raising hundreds of millions of yuan.

      The broader Chinese robotics sector attracted $3.4 billion in venture capital in 2025, outpacing the United States by 42 percent. This gap appears to be widening in 2026.

      Nvidia’s response strategy

      Nvidia is also taking action. At Computex, CEO Jensen Huang announced a partnership with Chinese robotics company Unitree, which is planning a $7 billion IPO, along with Singaporean robotic hand manufacturer Sharpa to create a humanoid robot reference design. This platform will integrate Unitree’s H2 Plus humanoid body, Sharpa’s Wave tactile hands, and Nvidia’s Jetson AGX Thor T5000 processor.

      Huang also launched the Cosmos Coalition, bringing together AI labs such as Agile Robots, Black Forest Labs, Runway, and Skild AI to enhance open-world models. The intent is clear: Nvidia aims to serve as the foundational infrastructure for the entire physical AI ecosystem, even if individual models face challenges in benchmarks.

      However, Huang acknowledged the fundamental bottleneck in the sector: “For robotic systems and physical AI, data is the hardest problem,” he stated at Computex. This admission highlights why China may have a structural advantage.

      The data dilemma

      Alexandr Wang, the founder of Scale AI who joined
