PP-OCRv5 Mobile Rec

PaddlePaddle/PP-OCRv5_mobile_rec

published Jun 2025 · updated Jul 2025

PP-OCRv5 Mobile Rec is an image-to-text model that recognizes text lines in multiple languages including Simplified Chinese, Traditional Chinese, English, and Japanese, as well as handwriting, vertical text, pinyin, and rare characters.

status

coming soon

API providers

downloads / mo

74.5K

license

apache-2.0

specs

Task	Image-to-text (text line recognition)
Framework	PaddlePaddle
Language Support	Simplified Chinese, Traditional Chinese, English, Japanese, plus handwriting, vertical text, pinyin, rare characters
Average Accuracy	0.8015 (line-level accuracy)

about this model

PP-OCRv5_mobile_rec is an image-to-text model that performs text line recognition, designed to extract and transcribe text from images across multiple languages and complex scenarios. Developed by the PaddleOCR team, it is a single-model solution supporting Simplified Chinese, Traditional Chinese, English, and Japanese, as well as handwriting, vertical text, pinyin, rare characters, and distorted or artistic text.

Key Strengths

The model delivers strong accuracy across a diverse set of challenging conditions. Its performance is measured using a strict line-level metric where a single character error marks the entire line as incorrect, ensuring high practical reliability. Key accuracy results include:

Scenario	Accuracy
Printed Chinese	0.8605
Printed English	0.8753
Traditional Chinese	0.7199
Japanese	0.7577
Handwritten Chinese	0.4166
Handwritten English	0.4944
Ancient Text	0.5786
Artistic Text	0.5398
General Scenario	0.5570
Pinyin	0.7703
Rotation	0.7248
Distortion	0.8089
Average	0.8015

This model is part of the PP-OCRv5 series, which builds on earlier generations (PP-OCRv2, PP-OCRv3) that demonstrated improvements of 5% on Chinese scene text and 11% on English scene text over prior versions. The mobile variant is optimized for efficient deployment while maintaining high recognition quality.

Example output from the model demonstrates reliable transcription, such as recognizing the text "day as a reminder of the" with a confidence score of 0.979.

Example of text recognition output showing the model's transcription of an image containing the phrase 'day as a reminder of the'.

When used as part of the full PP-OCRv5 pipeline, which includes optional document orientation classification, text unwarping, and text detection modules, the model can handle complete OCR workflows on complex documents, as shown in the example below.

Visualization of OCR pipeline output on a scientific paper page, with detected and recognized text regions highlighted.

best for

·Recognizing printed or handwritten text in images for document digitization
·Multilingual OCR in applications supporting Chinese, English, and Japanese text
·Extracting text from complex scenarios like vertical text, artistic text, or distorted text

FAQ

What languages does PP-OCRv5 Mobile Rec support?

It supports Simplified Chinese, Traditional Chinese, English, and Japanese, as well as handwriting, vertical text, pinyin, and rare characters.

What is the average accuracy of the model?

The line-level accuracy average is 0.8015 across 12 scenarios including handwritten, printed, traditional, and artistic text.

How do I use this model via the gigarouter API?

Send an image to the gigarouter OpenAI-compatible endpoint with your API key. The model returns the recognized text in JSON format.

Can this model be used in a full OCR pipeline?

Yes, the model is designed to work within the PP-OCR pipeline, combining with text detection and orientation modules for end-to-end OCR.

not yet live

We're benchmarking and onboarding PP-OCRv5 Mobile Rec as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related image-to-text models

compare all →

blip-image-captioning-base

1.9M dl/mo

blip-image-captioning-large

trocr-small-handwritten

448.6K dl/mo