PP-OCRv5 Server Rec

PaddlePaddle/PP-OCRv5_server_rec

published Jun 2025 · updated Jul 2025

PP-OCRv5 Server Rec is an image-to-text model that recognizes text from images, supporting Simplified Chinese, Traditional Chinese, English, Japanese, and complex scenarios such as handwriting, vertical text, pinyin, and rare characters using a single model.

status

coming soon

API providers

downloads / mo

189.4K

license

apache-2.0

specs

Task	Image-to-Text (Text Recognition)
Architecture	PP-OCRv5 server recognition model
Average Accuracy	0.8401 (line-level over 12 scenarios)

about this model

PP-OCRv5_server_rec is an image-to-text recognition model that efficiently supports simplified Chinese, traditional Chinese, English, and Japanese, along with handwriting, vertical text, pinyin, and rare characters using a single model. It achieves strong accuracy across twelve diverse scenarios, evaluated with a strict line-level metric where any character or punctuation error marks the entire line as incorrect. The overall average accuracy is 0.8401.

Scenario	Accuracy
Handwritten Chinese	0.5807
Handwritten English	0.5806
Printed Chinese	0.9013
Printed English	0.8679
Traditional Chinese	0.7472
Ancient Text	0.6039
Japanese	0.7372
General Scenario	0.5946
Pinyin	0.8384
Rotation	0.7435
Distortion	0.9314
Artistic Text	0.6397

The visualized recognition result on a sample image shows the model outputting the text “day as a reminder of the” with a confidence score of 0.9535. Sample image with recognized text 'day as a reminder of the' overlayed

Sample image with recognized text 'day as a reminder of the' overlayed

This model is hosted on gigarouter as a managed, OpenAI-compatible API, providing reliable image-to-text inference without local installation.

best for

·Text recognition in printed Chinese and English documents
·Handwriting recognition in Chinese and English
·Multilingual OCR covering Japanese and Traditional Chinese
·Complex scenarios like distorted, artistic, or rotated text

FAQ

What languages does PP-OCRv5 Server Rec support?

It supports Simplified Chinese, Traditional Chinese, English, Japanese, as well as handwriting, vertical text, pinyin, and rare characters.

What is the average accuracy of this model?

It achieves an average line-level accuracy of 0.8401 across 12 scenarios, including printed, handwritten, and distorted text.

How can I call this model via the gigarouter API?

Use the gigarouter OpenAI-compatible endpoint with an API key. Input an image and receive recognized text as the response.

What input format does the model accept?

The model accepts image URLs or local image files. In the pipeline, it processes cropped text-line images.

Is this model part of a larger OCR pipeline?

Yes, it can be used standalone for text recognition or as part of the PP-OCRv5 pipeline that includes detection and orientation modules.

not yet live

We're benchmarking and onboarding PP-OCRv5 Server Rec as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related image-to-text models

compare all →

blip-image-captioning-base

1.9M dl/mo

blip-image-captioning-large

trocr-small-handwritten

448.6K dl/mo