Zhipu AI released GLM-OCR in February. It has 0.9 billion parameters. It scored 94.62 on OmniDocBench V1.5, which puts it at the top of the leaderboard, ahead of Qwen3-VL-235B, a model with over 260 times more parameters. It reached 3 million downloads on Hugging Face in its first month. And it costs $0.03 per million tokens.
That last part is the one worth paying attention to.
What the pricing actually means
Processing 1,000 A4 scanned pages through GLM-OCR's API costs about $0.07 USD. Traditional OCR services charge roughly 10x that for equivalent work. Enterprise platforms with annual contracts and per-page fees are even worse.
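Working backwards from those two figures, a page of dense A4 output comes to roughly 2,300 tokens. That per-page density is inferred from the numbers above, not an official spec, but it lets you sketch a budget:

```python
# Back-of-envelope cost model for GLM-OCR API pricing.
# Assumption (inferred, not official): ~2,300 output tokens per A4 page,
# which is what makes 1,000 pages land near $0.07 at $0.03 per million tokens.
PRICE_PER_MILLION_TOKENS = 0.03
TOKENS_PER_PAGE = 2_300  # assumed average for a dense A4 scan

def cost_usd(pages: int, tokens_per_page: int = TOKENS_PER_PAGE) -> float:
    return pages * tokens_per_page * PRICE_PER_MILLION_TOKENS / 1_000_000

print(f"1,000 pages     ~ ${cost_usd(1_000):.2f}")      # ~ $0.07
print(f"1,000,000 pages ~ ${cost_usd(1_000_000):.2f}")  # ~ $69.00
```

A million pages for the price of a dinner is the kind of number that changes which projects get approved.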
The model is MIT-licensed. You can self-host the entire thing at zero API cost if you want to. A 0.9B parameter model runs on basically any modern server or a decent workstation in 2026. Ollama supports it, so local setup takes maybe ten minutes.
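For the self-hosted path, a minimal sketch against Ollama's REST API looks like this. The model tag "glm-ocr" is an assumption here; check the Ollama library for the published name before pulling:

```python
# Local-inference sketch against Ollama's /api/generate endpoint,
# assuming the model has already been pulled. The tag "glm-ocr" is an
# assumption -- verify the published name in the Ollama library.
import base64
import json
from urllib import request

def build_request(image_b64: str, model: str = "glm-ocr") -> dict:
    # Multimodal models take base64-encoded images in the "images"
    # list alongside the text prompt.
    return {
        "model": model,
        "prompt": "Transcribe this page to Markdown.",
        "images": [image_b64],
        "stream": False,
    }

def ocr_page(image_path: str, host: str = "http://localhost:11434") -> str:
    with open(image_path, "rb") as f:
        payload = build_request(base64.b64encode(f.read()).decode("ascii"))
    req = request.Request(f"{host}/api/generate",
                          data=json.dumps(payload).encode("utf-8"),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

No SDK dependency at all on this path: it's stdlib Python talking to a local server.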
Speed is reasonable too: 1.86 pages per second for PDFs, 0.67 per second for images. Not the fastest thing ever built, but at this price point it doesn't need to be.
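To put those rates in wall-clock terms (straight arithmetic from the figures above, assuming a single sustained instance):

```python
# What the quoted throughput means for a real backlog,
# assuming one instance running at the quoted sustained rates.
PDF_PAGES_PER_SEC = 1.86
IMAGE_PAGES_PER_SEC = 0.67

def hours_for(pages: int, pages_per_sec: float) -> float:
    return pages / pages_per_sec / 3600

print(f"100k PDF pages:   {hours_for(100_000, PDF_PAGES_PER_SEC):.1f} h")
print(f"100k image pages: {hours_for(100_000, IMAGE_PAGES_PER_SEC):.1f} h")
```

A hundred-thousand-page PDF backlog clears in well under a day on a single instance.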
It's not cheap because it's bad
I want to be clear about this because the price makes people suspicious. GLM-OCR handles the hard stuff: handwritten text, printed text in multiple languages, mathematical formulas with LaTeX output, nested tables with merged cells, structured extraction from invoices and receipts and forms.
The architecture pairs a CogViT visual encoder with a GLM-0.5B language decoder connected by a lightweight cross-modal connector that does token downsampling. They used multi-token prediction loss and reinforcement learning during training to keep the model small without tanking accuracy. It worked. The benchmark numbers speak for themselves.
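The token-downsampling idea is what keeps the decoder's workload small: pool groups of visual tokens before projecting them into the decoder's space. A toy illustration of the mechanism, with every shape and the 4x pooling factor invented for the example (this is not GLM-OCR's actual connector code):

```python
# Toy sketch of a cross-modal connector with token downsampling:
# average-pool groups of visual tokens, then project into the
# language decoder's hidden size. All dimensions and the 4x pool
# factor are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
vis_tokens = rng.standard_normal((1024, 1152))  # 1024 visual tokens, encoder dim 1152
W = rng.standard_normal((1152, 896)) * 0.02     # learned projection to decoder dim 896

def connect(tokens: np.ndarray, pool: int = 4) -> np.ndarray:
    n, d = tokens.shape
    # Average each group of `pool` adjacent tokens: 4x fewer tokens out.
    pooled = tokens[: n - n % pool].reshape(-1, pool, d).mean(axis=1)
    return pooled @ W  # project into the decoder's embedding space

out = connect(vis_tokens)
print(out.shape)  # 4x fewer tokens reach the language decoder
```

Fewer tokens into the decoder means less compute per page, which is a big part of how a 0.9B model stays fast and cheap.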
You can deploy through vLLM, SGLang, or Ollama. Two years ago you would have needed serious hardware for this kind of performance. Now you don't.
The timing matters
Large OCR vendors are still charging what they charged last year. Their customers mostly don't know GLM-OCR exists, and even the ones who do are locked into contracts with long sales cycles. That gap between what's available and what most organizations are paying for is real, and it won't last forever.
If you're a 10-to-50-person operation processing invoices, permits, intake forms, compliance paperwork, or scanned records, you can cut document processing costs by 90% or more starting this week. The model is released. The SDK installs with pip install glmocr. Set your API key and you're running from the CLI or Python. No GPU needed for the cloud path.
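The setup really is that short. The environment variable name below is an assumption; check the glmocr docs for the exact name your version expects:

```shell
# Install the SDK and point it at your API key.
# NOTE: the variable name is an assumption -- consult the glmocr
# documentation for the exact key name it reads.
pip install glmocr
export ZHIPUAI_API_KEY="sk-..."  # assumed variable name
```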
Government offices sitting on rooms full of paper forms can actually afford to digitize them now. Nonprofits that gave up on OCR because the per-page pricing didn't fit their budget should look again. Consultancies quoting document digitization projects based on last year's tooling are leaving money on the table or overcharging clients, maybe both.
Why a 90% cost drop changes more than the budget line
When OCR was expensive, organizations made decisions about which documents were worth digitizing. Plenty of projects got shelved because the math didn't work at old prices. A 90% reduction doesn't just make existing workflows cheaper. It makes workflows viable that weren't before.
Big enterprises will get here eventually. They always do. But they'll spend a year on vendor evaluations and security reviews and phased rollouts. Smaller organizations that move now get to operate at a cost structure their larger competitors won't match for a while.
Getting started
The practical path: start with the cloud API to test accuracy on your specific documents. GLM-OCR handles most document types well but every org has its own weird edge cases, and you want to find yours before you commit to anything. Once you've confirmed it works, move to self-hosting if your volume justifies it. The layout analysis pipeline uses PP-DocLayoutV3 under Apache 2.0, so the whole stack is open.
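Finding your edge cases means measuring, not eyeballing. A minimal, model-agnostic accuracy check is character error rate against a small hand-transcribed sample, stdlib only:

```python
# Character error rate (CER) for OCR output against hand-made ground
# truth: edit distance normalized by reference length. Pure stdlib and
# model-agnostic -- run it over a sample of your own documents to find
# edge cases before committing to any OCR stack.
def edit_distance(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    return edit_distance(reference, hypothesis) / max(len(reference), 1)

print(cer("Invoice No. 4471", "Invoice No. 4411"))  # one wrong char out of 16
```

Twenty representative pages with ground truth will tell you more than any leaderboard score.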
Total migration time for most organizations is days, not months.
We've been running GLM-OCR on real document workflows for the past few weeks: government forms, multilingual records, messy scans with coffee stains and bad handwriting. We know where it performs well and where it struggles. If your org processes documents at any real volume and you want to know what a 90% cost reduction looks like for your specific situation, get in touch.