Mistral OCR 4: an affordable, self-hosted AI for document processing.
Mistral OCR 4 interprets a document like a structured map rather than a dense block of text. It is affordable, supports 170 languages, and can operate entirely on your own servers. The AI leader from Europe is targeting the enterprise back office.
On June 23, Mistral announced its latest model, OCR 4, in a blog post. The model transforms documents into structured data and is designed to stay compact and narrowly focused, with the primary goal of tackling the world's vast volume of paperwork.
Optical character recognition has existed for many years. The selling point of this new model lies in its output. Traditional systems convert pages into plain text, while OCR 4 returns a mapped representation of the page, labeling and locating each section. According to Mistral, independent reviewers preferred it over all competing systems tested, with an impressive average success rate of 72%.
From a page to a structured map
OCR 4 introduces three new features at once. It draws bounding boxes around every element so that software can pinpoint exactly where each line is located. It classifies each section by type, identifying titles, tables, equations, and even signatures. It also attaches a confidence score to each page and word, so that human reviewers know which areas need verification.
Mistral reports that customers requested bounding boxes more than any other feature, as they help an application specify the precise source of an answer. With the combination of block types and confidence scores, users can effectively manage citations, redactions, and human reviews. The output is also formatted as clean markdown.
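To make the idea concrete, here is a minimal Python sketch of how an application might consume this kind of output. The `Block` record, its field names, the `(x0, y0, x1, y1)` box convention, and the 0.9 review threshold are all assumptions made for illustration; they are not Mistral's published response schema.

```python
from dataclasses import dataclass


# Hypothetical record for one labeled region of a page.
# Field names and the bbox convention are illustrative assumptions,
# not the actual OCR 4 response format.
@dataclass(frozen=True)
class Block:
    page: int
    block_type: str  # e.g. "title", "table", "equation", "signature"
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1)
    text: str
    confidence: float  # 0.0 to 1.0


def needs_review(blocks, threshold=0.9):
    """Return the blocks whose confidence is below the (assumed) threshold."""
    return [b for b in blocks if b.confidence < threshold]


def redaction_boxes(blocks, types=frozenset({"signature"})):
    """Return (page, bbox) pairs for every block of a type to be masked."""
    return [(b.page, b.bbox) for b in blocks if b.block_type in types]


# Made-up sample data for a two-page contract.
sample = [
    Block(1, "title", (72, 40, 540, 70), "Service Agreement", 0.99),
    Block(1, "table", (72, 200, 540, 380), "Subtotal | 1,200.00", 0.84),
    Block(2, "signature", (360, 650, 540, 700), "", 0.95),
]

review_queue = needs_review(sample)   # only the low-confidence table
to_redact = redaction_boxes(sample)   # only the signature region
```

Because every block carries its type, location, and score, the review queue and the redaction list are simple filters over one list rather than separate parsing passes over raw text.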
The shift matters because of what happens downstream. While a chatbot can summarize a contract, an agent needs to file it. For this, software must distinguish a signature from a subtotal and know where each one sits on the page. OCR 4 provides that foundational structure, which older tools, offering only flat blocks of text, could not supply.
This represents a significant departure from the previous version. OCR 3 emphasized converting pages into clean text and organized tables, whereas OCR 4 provides the entire structure. Each block includes its location, type, and score. Downstream systems can thus understand not only the content of a document but also its architecture.
Designed for the back office
OCR 4 focuses on alleviating tedious enterprise tasks. It supports retrieval systems and the “RAG” pipelines that enable chatbots to respond based on a company’s internal documents. It also equips AI agents with the necessary structure to perform tasks, such as filling out forms, processing invoices, and conducting compliance checks.
Its capabilities are extensive. The model processes PDF, Word, PowerPoint, and OpenDocument files and supports 170 languages across 10 language families. Mistral claims it performs well with low-resource languages where competitors struggle. Initial users are digitizing archives, converting invoices into structured fields, and extracting clear text from scientific reports.
Furthermore, OCR 4 integrates with Mistral’s newly introduced Search Toolkit, an open-source framework revealed at its AI Now Summit. The model’s structured output can feed directly into this pipeline, giving developers citation-ready inputs for their responses.
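As a rough illustration of what "citation-ready" could mean in a retrieval pipeline, the Python sketch below turns labeled blocks into chunks that keep their page and bounding box, so an answer can point back to its source. The dictionary keys, the set of indexable types, and the keyword search are assumptions for the example; it does not use the Search Toolkit or any Mistral API.

```python
# Hypothetical block dictionaries; the keys are assumptions for illustration.
INDEXABLE_TYPES = {"title", "paragraph", "table"}


def to_chunks(source, blocks, min_confidence=0.5):
    """Turn indexable blocks into retrieval chunks that keep their location."""
    return [
        {"text": b["text"], "source": source, "page": b["page"], "bbox": b["bbox"]}
        for b in blocks
        if b["type"] in INDEXABLE_TYPES and b["confidence"] >= min_confidence
    ]


def cite(chunk):
    """Format a human-readable citation from a chunk's source and location."""
    x0, y0, x1, y1 = chunk["bbox"]
    return f'{chunk["source"]}, p. {chunk["page"]}, box ({x0:g}, {y0:g}, {x1:g}, {y1:g})'


def search(chunks, keyword):
    """Naive keyword match standing in for a real retriever."""
    needle = keyword.lower()
    return [(c["text"], cite(c)) for c in chunks if needle in c["text"].lower()]


# Made-up sample blocks from a hypothetical contract.
blocks = [
    {"type": "paragraph", "page": 3, "bbox": (72, 120, 540, 180),
     "text": "Payment is due within 30 days.", "confidence": 0.97},
    {"type": "signature", "page": 5, "bbox": (360, 650, 540, 700),
     "text": "", "confidence": 0.95},
]

chunks = to_chunks("contract.pdf", blocks)
hits = search(chunks, "payment")
```

In a real system the keyword match would be replaced by an embedding or full-text index; the point of the sketch is only that the location travels with the text all the way to the answer.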
Speed is another selling point. Anaqua, which handles intellectual property filings, reported that OCR 4 operates approximately four times faster per page than its prior tool. For high-volume document processing, particularly where deadlines are strict, a speedup of that size can determine how far a workflow scales.
The launch fits Mistral’s move beyond chatbots. The company already supplies industrial AI to clients such as Airbus, BMW, and EDF, and its document work is another expression of its enterprise focus.
The sovereignty angle
A key feature for European customers is the model's hosting capability. OCR 4 is compact enough to fit into a single container, allowing companies to run it on their own infrastructure and keep sensitive documents secure.
This aligns with Mistral’s core message. The company positions itself as a sovereign European alternative to American AI solutions, with self-hosting addressing the data-residency concerns arising from increasingly strict sovereignty regulations in Europe. For banks, hospitals, and government entities, maintaining documents on domestic soil is crucial.
Affordable and widely accessible
The pricing appears competitive. The API charges $4 per 1,000 pages, dropping to $2 in batch mode. A more advanced Document AI product, which adapts output into customized fields, costs $5 per 1,000 pages. One client, the financial research firm Rogo, reported similar accuracy to its previous provider at roughly one-eighth the cost.
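The per-page arithmetic is easy to check. The short Python sketch below uses the three prices quoted above; the tier labels and the 250,000-page volume are hypothetical, chosen only to show the calculation.

```python
# Prices in USD per 1,000 pages, as quoted in the article.
# The tier labels are our own shorthand, not official product names.
PRICE_PER_1000 = {"standard": 4.0, "batch": 2.0, "document_ai": 5.0}


def estimate_cost(pages, tier="standard"):
    """Estimate the cost in USD of processing `pages` pages on a given tier."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages / 1000 * PRICE_PER_1000[tier]


# Hypothetical archive of 250,000 pages.
for tier in PRICE_PER_1000:
    print(f"{tier}: ${estimate_cost(250_000, tier):,.2f}")
# standard: $1,000.00
# batch: $500.00
# document_ai: $1,250.00
```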
Distribution is also broad. OCR 4 is available via Mistral’s studio, Amazon SageMaker, and Microsoft’s Foundry, with Snowflake support on the horizon. With Mistral's valuation nearing €20 billion in current funding discussions, the company is ensuring its tools are integrated into the cloud environments already used by its customers.
Microsoft has described the launch as a significant milestone in its partnership with Mistral. This endorsement is valuable,
