LightOnOCR 1B 1025
lightonai/LightOnOCR-1B-1025
published Oct 2025 · updated Jun 2026
LightOnOCR 1B 1025 is a compact end-to-end vision-language model for OCR and document understanding that extracts text from document images.
specs
| Task | Optical Character Recognition (OCR) and Document Understanding |
| Architecture | Vision Transformer (Pixtral-based) encoder + Qwen3-based text decoder |
| Parameters | 1B |
| License | Apache 2.0 |
about this model
LightOnOCR-1B-1025 is an end-to-end vision-language model for Optical Character Recognition (OCR) and document understanding that converts high-resolution document images into layout-aware, naturally ordered text. It combines a Pixtral-based Vision Transformer encoder with a Qwen3-based lightweight text decoder, distilled from high-quality open VLMs, making it fully differentiable with no external OCR pipeline.
The model is optimized for speed and efficiency: it is 5× faster than dots.ocr, 2× faster than PaddleOCR-VL-0.9B, and 1.73× faster than DeepSeekOCR. On a single H100, it processes 5.71 pages per second (approximately 493,000 pages per day) at a cost below $0.01 per 1,000 pages. It handles tables, receipts, forms, multi-column layouts, and mathematical notation. All benchmarks are evaluated using vLLM on Olmo-Bench.
best for
- ·Extracting text from scanned PDFs and documents
- ·Parsing tables, receipts, and forms
- ·Processing multi-column layouts and math notation
- ·High-throughput document digitization at low cost
FAQ
PNG and JPEG images, typically rendered from PDF at 1540px longest dimension.
Plain text extracted from the document image.
It is 9x smaller and several times faster than prior best-performing models on OlmOCR-Bench, while achieving state-of-the-art accuracy in its weight class.
Yes, under Apache 2.0 license, model weights are available on Hugging Face.
Use the OpenAI-compatible endpoint with your API key to send image URLs or base64-encoded images.
We're benchmarking and onboarding LightOnOCR 1B 1025 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.