PP-OCRv6 Medium Rec
PaddlePaddle/PP-OCRv6_medium_rec
published Jun 2026 · updated Jun 2026
PP-OCRv6 Medium Rec is a lightweight image-to-text model for optical character recognition (OCR) that achieves 83.2% recognition accuracy with only 19M parameters.
specs
| Task | image-to-text (OCR recognition) |
| Architecture | MetaFormer-style backbone with structural reparameterization, decoupled spatial and channel mixing |
| Parameters | 19M |
| License | Apache 2.0 |
about this model
PP-OCRv6_medium_rec is a lightweight image-to-text recognition model (19M parameters) that converts printed or handwritten text from images into machine-readable strings, designed for server-side OCR deployment.
Built on a unified MetaFormer-style backbone with structural reparameterization, it decouples spatial token mixing from channel mixing. The model achieves 83.2% weighted-average recognition accuracy on in-house benchmarks, outperforming the previous PP-OCRv5_server by +5.1% and surpassing billion-scale vision-language models such as Qwen3-VL-235B, GPT-5.5, and Gemini-3.1-Pro on dedicated OCR tasks—despite orders of magnitude fewer parameters.
Per-category recognition accuracy (PP-OCRv6_medium) on in-house test sets:
| Category | Accuracy (%) |
|---|---|
| Handwritten CN | 62.1 |
| Handwritten EN | 67.8 |
| Printed CN | 91.5 |
| Printed EN | 94.1 |
| TC | 78.6 |
| Ancient | 72.4 |
| JP | 90.5 |
| Confusable | 64.9 |
| Special | 61.7 |
| General | 87.5 |
| Pinyin | 78.1 |
| Artistic | 71.2 |
| Industrial | 77.4 |
| Screen | 82.5 |
| Card | 88.1 |
Hosted on gigarouter as a managed API, the model is available via an OpenAI-compatible endpoint—no infrastructure setup required.
best for
- ·Extracting text from scanned documents
- ·Recognizing handwritten text in forms
- ·Real-time OCR on mobile or edge devices
FAQ
It is best for high-accuracy OCR on printed and handwritten text in images, especially when low latency or small model size is required.
With 19M parameters, it is much smaller than billion-scale VLMs yet surpasses them on OCR benchmarks. Its tiny tier runs 3.9x faster than PP-OCRv5_mobile on CPU.
It is released under the Apache 2.0 license, allowing free use, modification, and distribution.
Input: an image containing a single text line (e.g., cropped from a detector). Output: the recognized text string.
Use the gigarouter OpenAI-compatible endpoint with your API key. Send an image URL or base64-encoded image in the request.
We're benchmarking and onboarding PP-OCRv6 Medium Rec as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.