A startup claims it has solved the obstacle that was hindering AI progress.
A Miami-based startup claims to have solved a mathematical issue that has hindered AI models, making them slow and power-intensive for nearly a decade. This assertion was bold enough to draw parallels with Theranos. However, the company now has independent testing results that substantiate much of its claim.
The startup, named Subquadratic, emerged from stealth mode in May with $29 million in seed funding and introduced a new language model called SubQ. The company asserts that SubQ is faster, more cost-effective, and significantly less energy-intensive than the current leading models, and that it can process up to 12 times more text at once.
The decade-old bottleneck
To understand the significance of this development, it helps to know how most large language models operate. At their foundation is the “transformer,” an architecture introduced by Google researchers in 2017. The transformer relies on a method known as dense attention.
While dense attention is comprehensive, it is also resource-intensive. It checks every word in a text against every other word, meaning that doubling the text length increases the workload by approximately four times. This “quadratic” scaling is the primary reason LLMs consume so much computational power and energy.
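The arithmetic behind that scaling can be sketched in a few lines of Python. This is a simplified illustration that counts one comparison per ordered pair of tokens; `dense_comparisons` is a hypothetical helper written for this example, not part of any library.

```python
def dense_comparisons(num_tokens: int) -> int:
    """Count comparisons when every token is checked against every token."""
    return num_tokens * num_tokens


# Each doubling of the input multiplies the comparison count by four.
for n in (1_000, 2_000, 4_000):
    print(f"{n:>5} tokens -> {dense_comparisons(n):>12,} comparisons")
```

Going from 1,000 to 2,000 tokens raises the count from one million to four million comparisons, which is the fourfold jump described above.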
Subquadratic’s solution
Subquadratic’s approach involves replacing dense attention with “sparse attention.” Instead of comparing every word with every other word, sparse attention focuses only on the relevant pairs. The idea is not new, and many teams have attempted it, yet none had matched the quality of dense attention.
The company asserts that its version finally matches that quality. Importantly, it dynamically selects which words to emphasize based on the context rather than a predetermined approach. “That’s where the secret sauce lies,” explains co-founder and CTO Alex Whedon.
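The difference between a predetermined pattern and a context-dependent one can be illustrated with a toy Python sketch. It is not Subquadratic’s method, which the company has not described in detail here; the function names, the budget `k`, and the relevance scores are all assumptions made up for the example.

```python
import heapq


def fixed_window(position: int, k: int) -> list[int]:
    """Predetermined pattern: always attend to the k most recent tokens."""
    start = max(0, position - k + 1)
    return list(range(start, position + 1))


def content_based(scores: list[float], k: int) -> list[int]:
    """Context-dependent pattern: attend to the k highest-scoring tokens."""
    top = heapq.nlargest(k, range(len(scores)), key=lambda i: scores[i])
    return sorted(top)


def sparse_comparisons(num_tokens: int, k: int) -> int:
    """Comparisons when each token attends to at most k others."""
    return num_tokens * min(k, num_tokens)


# Hypothetical relevance scores of the last token against all 8 tokens.
scores = [0.9, 0.1, 0.05, 0.7, 0.02, 0.03, 0.1, 0.2]
print(fixed_window(position=7, k=3))  # [5, 6, 7]
print(content_based(scores, k=3))     # [0, 3, 7]
print(sparse_comparisons(2_000, k=64), "vs", 2_000 * 2_000)
```

The fixed window misses the two most relevant tokens at positions 0 and 3, while the content-based choice finds them. Note that this toy scores every token before picking the top `k`, so it saves no work by itself; a practical system would need a cheaper way to make the selection.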
The evidence
Initially, the claims were supported by a few self-reported scores, leading to skepticism. An AI engineer summarized the situation on X, stating that SubQ is “either the biggest breakthrough since the Transformer … or it’s AI Theranos.”
In response, the company enlisted a third party, Appen, which evaluates models from other firms, to conduct tests. The findings were impressive. In a raw speed test, SubQ operated 56 times faster than FlashAttention, a prominent existing method. On a challenging coding benchmark, it achieved a score of 89.7 percent, close to the best models available.
The cost differential appears just as significant. According to the startup, running one long-context test on Anthropic’s top model costs approximately $2,600, while the equivalent test on SubQ costs just $8, a gap of roughly 325 to 1.
Skepticism remains
Despite these results, caution is warranted. Benchmark scores do not guarantee real-world performance, and SubQ is not yet widely accessible. While tens of thousands have signed up for the waitlist, only a small number currently have access.
One detail of the development process is also worth noting. Instead of building SubQ from the ground up, Subquadratic took an existing open-weight model and integrated its new attention mechanism into it. This is a common practice, yet it sits uneasily with the claim of entirely reinventing how LLMs work.
Independent researcher Will Depue, who previously worked at OpenAI, commented, “They may have created something real and valuable, but the public evidence does not yet support the stronger assertion that they have resolved the quadratic attention bottleneck.”
The importance of these developments
Should the results be verified, the implications are significant. More affordable and efficient long-context models could analyze entire codebases, sets of contracts, or large volumes of documents in a single pass. They would also decrease the cost and energy associated with running AI.
Efficiency is a key objective for the entire industry. As AI grapples with rising operational costs, other startups, like Thomas Reardon's Flourish, are seeking to improve it from different angles. Subquadratic, however, believes the entire field will pivot in its direction. “We don’t think anyone will continue building on transformers in a few years,” said CEO Justin Dangel.