skip to content
gigarouter gigarouter
models / image-to-text · coming soon

PP-OCRv4 Mobile Det

PaddlePaddle/PP-OCRv4_mobile_det

published Jun 2025 · updated Jul 2025

PP-OCRv4 Mobile Det is an image-to-text model that detects text regions in images, outputting bounding polygons with confidence scores, optimized for mobile and edge devices.

status
coming soon
API providers
0
downloads / mo
20.1K
license
apache-2.0

specs

TaskText Detection
ArchitectureMobile-optimized convolutional neural network
LicenseApache 2.0
Accuracy (Average)0.624
Output FormatPolygons and confidence scores

about this model

PP-OCRv4_mobile_det is a text detection model that localizes text regions in images, optimized for mobile and edge deployment. Developed by the PaddleOCR team as part of the PP-OCRv4_det series, it balances high accuracy with efficient inference, making it suitable for resource-constrained environments. The model outputs polygon coordinates and confidence scores for detected text boxes.

Benchmark Performance

The model's accuracy (measured as F-score) across diverse text scenarios is as follows:

ScenarioH-ChineseH-EnglishP-ChineseP-EnglishT-ChineseAncientJapaneseGeneralPinyinRotationDistortionArtisticAverage
F-score0.5830.3690.8720.7730.6630.2310.6340.7100.4300.2990.7150.5490.624

H = Handwritten, P = Printed, T = Traditional. Average F-score across all scenarios is 0.624.

Output Example

The following image shows a typical detection result with detected polygons overlaid on the input:

Example of text detection output with bounding polygons

PP-OCRv4_mobile_det is licensed under Apache 2.0 and is available as a hosted API on gigarouter, requiring no local installation. It integrates seamlessly with the PP-OCRv4 recognition pipeline for end-to-end OCR workflows.

best for

FAQ

What is the primary use of PP-OCRv4 Mobile Det?

It detects text regions in images, typically used as part of an OCR pipeline to extract text.

What license does this model use?

Apache 2.0.

What are the input and output formats?

Input: image (URL or file). Output: polygon coordinates and confidence scores for each detected text region.

How can I call this model via the API?

Use the gigarouter OpenAI-compatible endpoint with your API key, specifying the model name "PP-OCRv4 Mobile Det".

Is this model suitable for real-time applications?

Yes, it is mobile-optimized for efficient inference on edge devices.

not yet live

We're benchmarking and onboarding PP-OCRv4 Mobile Det as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related image-to-text models

compare all →