PP-LCNet x1.0 Doc Ori

PaddlePaddle/PP-LCNet_x1_0_doc_ori

published Jun 2025 · updated Jul 2025

PP-LCNet x1.0 Doc Ori is an image-to-text model that classifies document image orientation into 0°, 90°, 180°, or 270° so that rotated images can be corrected during OCR preprocessing.

status

coming soon

downloads / mo

445.3K

license

apache-2.0

specs

Task	Document Orientation Classification
Architecture	PP-LCNet x1.0
Accuracy	99.06%
Model Size	7 MB
Categories	4 (0°, 90°, 180°, 270°)

about this model

PP-LCNet_x1_0_doc_ori is a document image orientation classification model that identifies the rotation of a document or ID image among four categories: 0°, 90°, 180°, and 270°. It is designed to correct misoriented captures before OCR processing, improving overall OCR accuracy in document scanning and ID photo workflows.

Key Capabilities

The model classifies document orientation with an average recognition accuracy of 99.06%, as measured on internal test sets. It is based on the PP-LCNet_x1_0 architecture and occupies only 7 MB of storage, which keeps it lightweight enough to slot into preprocessing pipelines. Within a document preprocessing pipeline, it can be combined with geometric distortion correction (text image unwarping) to fully normalize document images before text extraction.

Model	Recognition Avg Accuracy (%)	Model Storage Size (MB)	Introduction
PP-LCNet_x1_0_doc_ori	99.06	7	A document image classification model based on PP-LCNet_x1_0, with four categories: 0°, 90°, 180°, and 270°.
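
As a rough illustration of how a predicted category can be used to correct a misoriented capture, the sketch below undoes the detected rotation in quarter turns. It is a minimal sketch under stated assumptions, not the model's own post-processing: the mapping `CLASS_ID_TO_ANGLE`, the helper names, and the `label_is_clockwise` flag are all hypothetical, and the image is represented as a plain row-major list of lists so the example has no dependencies. Check the label list shipped with the model for the actual class order and rotation direction.

```python
# Assumed mapping from predicted class ID to the detected rotation in degrees.
# The index order is an assumption, not something the model card states.
CLASS_ID_TO_ANGLE = {0: 0, 1: 90, 2: 180, 3: 270}


def rotate_90_ccw(image):
    """Rotate a row-major 2D grid by 90 degrees counter-clockwise."""
    # Transpose the grid, then reverse the row order.
    return [list(row) for row in zip(*image)][::-1]


def correct_orientation(image, class_id, label_is_clockwise=True):
    """Return a copy of `image` with the detected rotation undone.

    `label_is_clockwise` is a hypothetical switch: if the label describes a
    clockwise rotation of the capture, it is undone with the same number of
    counter-clockwise quarter turns; otherwise with the complementary number.
    """
    angle = CLASS_ID_TO_ANGLE[class_id]
    quarter_turns = (angle // 90) % 4
    if not label_is_clockwise:
        quarter_turns = (4 - quarter_turns) % 4
    result = [list(row) for row in image]
    for _ in range(quarter_turns):
        result = rotate_90_ccw(result)
    return result
```

In a real pipeline the same quarter-turn logic would be applied to the pixel array before unwarping and text extraction; the list-of-lists form is used here only to keep the sketch self-contained.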

Performance Highlights

·High accuracy (99.06%) on orientation classification, reducing downstream OCR errors caused by rotated input.
·Minimal model footprint (7 MB) enables fast inference and low-latency deployment.
·Outputs class ID and confidence score, making integration into automated workflows straightforward.
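
One way an automated workflow might consume the class ID and confidence score is to rotate only when the prediction is confident, and to flag the page for review otherwise. The sketch below is illustrative: the class order in `ANGLES`, the function names, and the `min_confidence` threshold of 0.9 are assumptions, and the `softmax` helper is only needed if raw logits are available instead of probabilities.

```python
import math

# Assumed class order; verify against the label list shipped with the model.
ANGLES = (0, 90, 180, 270)


def softmax(logits):
    """Convert raw logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide_rotation(scores, min_confidence=0.9):
    """Pick the top class and decide whether to trust it.

    `scores` holds one probability per orientation class. The threshold is a
    hypothetical default and should be tuned on your own documents.
    """
    if len(scores) != len(ANGLES):
        raise ValueError("expected one score per orientation class")
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    confidence = scores[class_id]
    return {
        "class_id": class_id,
        "angle": ANGLES[class_id],
        "confidence": confidence,
        "accepted": confidence >= min_confidence,
    }
```

A page whose decision has `accepted` set to `False` could be passed through unrotated or routed to manual review, depending on how costly a wrong rotation is downstream.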

Example inference result (visualized after classification):

Example document orientation classification output showing a 180-degree rotated document before correction.

Example pipeline output after orientation correction and geometric unwarping.

best for

·Preprocessing document images for OCR pipelines to improve recognition accuracy
·Correcting orientation of scanned IDs, passports, and forms

FAQ

What is the model's accuracy?

The model achieves an average recognition accuracy of 99.06% on document orientation classification, as measured on internal test sets.

How many orientation classes does the model predict?

It predicts four classes: 0°, 90°, 180°, and 270°.

What is the model size?

The model storage size is 7 MB.

How can I use this model via the gigarouter API?

The hosted endpoint is not yet live. Once it launches, use the gigarouter OpenAI-compatible endpoint with your API key; refer to the documentation for the input format.

What task is this model used for?

It is used for document image orientation classification, often as a preprocessing step for OCR.

not yet live

We're benchmarking and onboarding PP-LCNet x1.0 Doc Ori as a hosted, OpenAI-compatible API.

related image-to-text models

blip-image-captioning-base

1.9M dl/mo

blip-image-captioning-large

trocr-small-handwritten

448.6K dl/mo