A startup claims it has solved the obstacle that was hindering AI progress.

      A Miami-based startup claims to have solved a mathematical problem that has kept AI models slow and power-hungry for nearly a decade. The assertion was bold enough to draw comparisons to Theranos. However, the company now has independent test results that substantiate much of its claim.

      The startup, named Subquadratic, emerged from stealth mode in May with $29 million in seed funding and introduced a new language model called SubQ. The company asserts that SubQ is faster, cheaper to run, and significantly less energy-hungry than the current leading models. It also says the model can process up to 12 times more text at once.

      The decade-old bottleneck

      To understand the significance of this development, it's essential to know how most large language models operate. At their foundation is a “transformer,” a concept introduced by Google researchers in 2017. This transformer uses a method known as dense attention.

      While dense attention is comprehensive, it is also resource-intensive. It checks every word in a text against every other word, meaning that doubling the text length increases the workload by approximately four times. This “quadratic” scaling is the primary reason LLMs consume so much computational power and energy.
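      The quadratic growth described above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the text is split into tokens and that every token is compared with every token; the function name is hypothetical and chosen only for illustration.

```python
def dense_attention_comparisons(num_tokens: int) -> int:
    """Count pairwise comparisons when every token is checked against every token."""
    return num_tokens * num_tokens


baseline = dense_attention_comparisons(1_000)

# Doubling the length quadruples the work; quadrupling it costs 16 times as much.
for n in (1_000, 2_000, 4_000):
    count = dense_attention_comparisons(n)
    print(f"{n:>5} tokens -> {count:>10,} comparisons ({count // baseline}x)")
```

      The loop shows why long inputs are so expensive under dense attention: the cost grows with the square of the length, not with the length itself.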

      Subquadratic’s solution

      Subquadratic’s approach replaces dense attention with “sparse attention.” Instead of comparing every word with every other word, sparse attention looks only at the pairs that matter. The idea is not new, and many teams have attempted it, but none had matched the quality of dense attention.

      The company asserts that its version finally matches that quality. Importantly, it dynamically selects which words to attend to based on the context rather than following a predetermined pattern. “That’s where the secret sauce lies,” explains co-founder and CTO Alex Whedon.
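      To make the idea of context-dependent selection concrete, here is a generic top-k sparse attention sketch. It is not Subquadratic's mechanism, which the company has not described in this level of detail; the function names, the top-k rule, and the toy data are all assumptions made for illustration. Each query keeps only the k best-matching positions, so different queries end up attending to different words.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dense_attention(query, keys, values):
    """Weight every value by how well its key matches the query."""
    weights = softmax([dot(query, key) for key in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def top_k_sparse_attention(query, keys, values, k):
    """Attend only to the k keys that best match this particular query."""
    # This toy version still scores every key before selecting; a practical
    # system would need a cheaper way to pick candidates (assumption, not shown).
    scores = [dot(query, key) for key in keys]
    kept = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in kept])
    dim = len(values[0])
    output = [sum(w * values[i][d] for w, i in zip(weights, kept)) for d in range(dim)]
    return output, sorted(kept)


# Toy data: four positions, two-dimensional keys, one-dimensional values.
keys = [[1, 0], [0, 1], [1, 1], [-1, 0]]
values = [[1.0], [2.0], [3.0], [4.0]]

# Two different queries keep two different sets of positions.
print(top_k_sparse_attention([1, 0], keys, values, k=2))
print(top_k_sparse_attention([0, 1], keys, values, k=2))
```

      With k equal to the number of positions, the sparse function reduces to dense attention, which is one way to see the trade-off: a smaller k means less work per query, at the risk of dropping a position that mattered.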

      The evidence

      Initially, the claims were supported only by a few self-reported scores, which invited skepticism. An AI engineer summarized the situation on X, stating that SubQ is “either the biggest breakthrough since the Transformer … or it’s AI Theranos.”

      In response, the company enlisted a third party, Appen, which evaluates models from other firms, to conduct tests. The findings were impressive. In a raw speed test, SubQ operated 56 times faster than FlashAttention, a prominent existing method. On a challenging coding benchmark, it achieved a score of 89.7 percent, close to the best models available.

      The cost differential appears just as significant. According to the startup, running one long-context test on Anthropic’s top model costs approximately $2,600, while the equivalent test on SubQ costs just $8, a roughly 325-fold difference.

      Skepticism remains

      Despite these results, caution is warranted. Benchmark scores do not guarantee real-world performance, and SubQ is not yet widely accessible. While tens of thousands have signed up for the waitlist, only a small number currently have access.

      One detail of the development process is also worth noting. Instead of building SubQ from the ground up, Subquadratic took an existing open-weight model and swapped in its new attention mechanism. This is common practice, yet it sits uneasily with the claim of entirely reinventing how LLMs work.

      Independent researcher Will Depue, who previously worked at OpenAI, commented, “They may have created something real and valuable, but the public evidence does not yet support the stronger assertion that they have resolved the quadratic attention bottleneck.”

      The importance of these developments

      Should the results be verified, the implications are significant. More affordable and efficient long-context models could analyze entire codebases, sets of contracts, or large volumes of documents in a single pass. They would also reduce the cost and energy associated with running AI.

      Efficiency is a key objective for the entire industry. As AI grapples with rising operational costs, other startups, like Thomas Reardon’s Flourish, are seeking to improve efficiency from different angles. Subquadratic, however, believes the entire field will pivot in its direction. “We don’t think anyone will continue building on transformers in a few years,” declared CEO Justin Dangel.
