skip to content
gigarouter gigarouter
models / image-to-text · coming soon

PP-OCRv6 Medium Rec

PaddlePaddle/PP-OCRv6_medium_rec

published Jun 2026 · updated Jun 2026

PP-OCRv6 Medium Rec is a lightweight image-to-text model for optical character recognition (OCR) that achieves 83.2% recognition accuracy with only 19M parameters.

status
coming soon
API providers
0
downloads / mo
79.9K
license
apache-2.0

specs

Taskimage-to-text (OCR recognition)
ArchitectureMetaFormer-style backbone with structural reparameterization, decoupled spatial and channel mixing
Parameters19M
LicenseApache 2.0

about this model

PP-OCRv6_medium_rec is a lightweight image-to-text recognition model (19M parameters) that converts printed or handwritten text from images into machine-readable strings, designed for server-side OCR deployment.

Built on a unified MetaFormer-style backbone with structural reparameterization, it decouples spatial token mixing from channel mixing. The model achieves 83.2% weighted-average recognition accuracy on in-house benchmarks, outperforming the previous PP-OCRv5_server by +5.1% and surpassing billion-scale vision-language models such as Qwen3-VL-235B, GPT-5.5, and Gemini-3.1-Pro on dedicated OCR tasks—despite orders of magnitude fewer parameters.

Per-category recognition accuracy (PP-OCRv6_medium) on in-house test sets:

CategoryAccuracy (%)
Handwritten CN62.1
Handwritten EN67.8
Printed CN91.5
Printed EN94.1
TC78.6
Ancient72.4
JP90.5
Confusable64.9
Special61.7
General87.5
Pinyin78.1
Artistic71.2
Industrial77.4
Screen82.5
Card88.1

Hosted on gigarouter as a managed API, the model is available via an OpenAI-compatible endpoint—no infrastructure setup required.

best for

FAQ

What is PP-OCRv6 Medium Rec best used for?

It is best for high-accuracy OCR on printed and handwritten text in images, especially when low latency or small model size is required.

How does it compare in size and speed to other models?

With 19M parameters, it is much smaller than billion-scale VLMs yet surpasses them on OCR benchmarks. Its tiny tier runs 3.9x faster than PP-OCRv5_mobile on CPU.

What are the license terms?

It is released under the Apache 2.0 license, allowing free use, modification, and distribution.

What input and output format does it expect?

Input: an image containing a single text line (e.g., cropped from a detector). Output: the recognized text string.

How can I call it via the gigarouter API?

Use the gigarouter OpenAI-compatible endpoint with your API key. Send an image URL or base64-encoded image in the request.

not yet live

We're benchmarking and onboarding PP-OCRv6 Medium Rec as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related image-to-text models

compare all →