Dots MOCR
rednote-hilab/dots.mocr
published Mar 2026 · updated Jul 2026
Dots MOCR is a vlm model that jointly parses text and graphics from documents into structured outputs, including SVG code for charts, diagrams, and UI layouts.
specs
| Task | Multimodal OCR / Document Parsing |
| Architecture | DotsOCRForCausalLM (custom VLM based on Qwen3-VL-4B) |
| Parameters | 3B |
| License | MIT |
about this model
dots.mocr is a 3B-parameter multimodal OCR model that jointly parses text and structured graphics from documents into unified textual representations, including SVG code for visual elements. It is hosted on Gigarouter as an OpenAI-compatible API.
Document Parsing Performance
On the OCR Arena Elo leaderboard, dots.mocr achieves an average Elo score of 1124.7 across olmOCR-Bench, OmniDocBench (v1.5), and XDocParse, ranking second only to Gemini 3 Pro (1210.7). It sets a new state of the art on olmOCR Bench with an overall score of 83.9±0.9, outperforming all open-source document parsing systems. On OmniDocBench (v1.5), it achieves the lowest TextEdit error rate (0.031) and Read Order Edit error rate (0.029) among specialized VLMs.
Structured Graphics Parsing
dots.mocr converts structured graphics such as charts, UI layouts, scientific figures, and chemical diagrams directly into SVG code. On the Unisvg benchmark, it achieves a score of 0.894 (0.850 low-level, 0.923 high-level), surpassing Gemini 3 Pro (0.735). It also scores 0.801 on Design2Code and 0.772 on Chartmimic.
Benchmark Summary
| Benchmark | dots.mocr Score | Context |
|---|---|---|
| OCR Arena Elo (Average) | 1124.7 | Second only to Gemini 3 Pro (1210.7) |
| olmOCR Bench (Overall) | 83.9±0.9 | State-of-the-art among open-source systems |
| OmniDocBench v1.5 (TextEdit) | 0.031 | Lowest error rate among specialized VLMs |
| Unisvg (Score) | 0.894 | Outperforms Gemini 3 Pro (0.735) |
dots.mocr is released under the MIT license. It is a 3B-parameter model trained via staged pretraining and supervised fine-tuning on a data engine built from PDFs, rendered webpages, and native SVG assets. A variant optimized for image-to-SVG tasks, dots.mocr-svg, is also available.
best for
- ·Multilingual document parsing (text, tables, headers)
- ·Converting charts, UI layouts, and scientific figures to SVG
- ·Interactive dialogue and semantic understanding of document content
FAQ
It excels at multimodal document parsing, converting structured graphics (charts, diagrams) to SVG, and interactive dialogue about documents.
It has 3B parameters, making it compact and efficient for inference.
It is released under the MIT license.
Input is an image plus a text prompt; output is text that can include structured SVG code.
Use the gigarouter OpenAI-compatible endpoint with your API key.
We're benchmarking and onboarding Dots MOCR as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.