A startup claims it has solved the issue that was hindering AI progress.

      A startup in Miami claims to have solved a mathematical problem that has made AI models inefficient and power-hungry for nearly a decade. The claim has drawn pointed comparisons to Theranos, but the company now has independent test results that largely support it.

      The startup is named Subquadratic. It emerged from stealth mode in May with $29 million in seed funding and a new language model called SubQ. According to the company, SubQ is faster, cheaper to run, and significantly less energy-intensive than the leading models in use today. The company also says it can process up to 12 times as much text at once.

      To grasp why this matters, it helps to understand the bottleneck itself. Most large language models are built around a “transformer” architecture, which was developed by Google researchers in 2017. At the heart of the transformer is a process known as dense attention.

      While dense attention is comprehensive, it is costly, as it compares each word in a text to every other word. Therefore, when the text length is doubled, the required computational work increases roughly fourfold. This “quadratic” scaling is a primary reason large language models require substantial computational resources and energy.
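
      The fourfold jump can be checked with simple arithmetic. The sketch below is an illustration only: it counts one comparison per pair of words and ignores everything else a real model does, and the function name is made up for this example.

```python
def dense_attention_comparisons(num_words: int) -> int:
    """Count word-to-word comparisons when every word is compared to every word."""
    return num_words * num_words


# Doubling the text length each time quadruples the number of comparisons.
for num_words in (1_000, 2_000, 4_000):
    print(num_words, dense_attention_comparisons(num_words))
```

      Going from 1,000 to 2,000 words raises the count from one million to four million comparisons, which is the “quadratic” growth described above.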

      Subquadratic’s solution involves replacing dense attention with “sparse attention.” Instead of comparing every word to all others, sparse attention retains only the relevant word pairs. Although this concept is not new and various teams have attempted it, none have previously achieved the same level of quality as dense attention.

      The company asserts that its version finally meets that benchmark. Importantly, it selects which words to concentrate on dynamically, depending on the content, rather than adhering to a predetermined pattern. “That’s kind of where the secret sauce is,” explains co-founder and CTO Alex Whedon.
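
      Subquadratic has not published how its selection works, so the following is not its method. It is a minimal sketch of one generic form of content-dependent sparse attention, often called top-k attention, in which each word keeps only the k other words it scores highest against. All names here are hypothetical. Note that this toy version still scores every pair before discarding most of them, so it shows the selection idea, not the speedup; a genuinely subquadratic method would have to avoid that full scoring step.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def topk_sparse_attention(queries, keys, values, k):
    """For each query, attend only to the k keys with the highest scores."""
    outputs = []
    dim = len(values[0])
    for q in queries:
        # Scaled dot-product score against every key (still quadratic here).
        scores = [dot(q, key) / math.sqrt(len(q)) for key in keys]
        # Content-dependent selection: indices of the k best-scoring keys.
        kept = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
        # Softmax over the kept scores only; subtract the max for stability.
        top = max(scores[i] for i in kept)
        weights = [math.exp(scores[i] - top) for i in kept]
        total = sum(weights)
        # Weighted average of the values belonging to the kept keys.
        outputs.append([
            sum(w / total * values[i][d] for w, i in zip(weights, kept))
            for d in range(dim)
        ])
    return outputs
```

      Because the kept set is chosen from the scores rather than from a fixed pattern such as “the nearest 128 words,” different inputs end up attending to different positions.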

      Initially, the claims were based on a few self-reported scores, leading to skepticism in the community. One AI engineer summarized the sentiment on X, stating that SubQ is “either the biggest breakthrough since the Transformer… or it’s AI Theranos.”

      To address the skepticism, the company enlisted a third party, Appen, a firm that evaluates other companies’ models, to conduct the tests. The results were impressive. In a raw speed test, SubQ ran 56 times faster than FlashAttention, a leading method. On a challenging coding benchmark, it scored 89.7%, close to the performance of the top models.

      The cost difference is similarly significant. According to Subquadratic, running one long-context test on Anthropic’s top model costs about $2,600, while the same test on SubQ costs just eight dollars.
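
      The size of that gap follows directly from the two figures. The snippet below is a quick check using only the company's reported numbers, which have not been independently confirmed here; the function name is illustrative.

```python
def cost_ratio(baseline_usd: float, alternative_usd: float) -> float:
    """How many times more expensive the baseline is than the alternative."""
    return baseline_usd / alternative_usd


# Figures as reported by Subquadratic for one long-context test.
print(cost_ratio(2600, 8))  # 325.0
```

      By the company's own numbers, the same test costs about 325 times more on the competing model.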

      Nonetheless, there are reasons for caution. Benchmarks do not necessarily translate to real-world applications. Moreover, SubQ is not yet widely accessible; although tens of thousands are on the waitlist, only a select few have access to it.

      Additionally, there is a nuance in the company’s origin story. Instead of developing SubQ from the ground up, Subquadratic built upon an existing open-weight model and integrated its new attention method. While this is a common practice, it contrasts with the claim of completely reinventing LLM functionality.

      “They may have built something real and useful,” comments Will Depue, an independent researcher and former OpenAI employee. “But the public evidence does not yet substantiate the stronger assertion that they have resolved the quadratic attention bottleneck.”

      If the claims prove accurate, the implications could be substantial. More affordable, faster long-context models could analyze entire codebases, sets of contracts, or large document collections in a single pass, thereby reducing the cost and energy demands of running AI.

      The entire industry shares this ambition as it grapples with the escalating cost of running AI. Other startups, like Thomas Reardon’s Flourish, are approaching efficiency from different angles. Subquadratic, however, is positioning itself as a leader in the field's future. “We don’t think anybody will be building on transformers in a few years,” asserts CEO Justin Dangel.
