Question 1

What is PP-DocLayoutV3 best for?

Accepted Answer

It is designed for robust document layout analysis on real-world images, including non-planar, skewed, and curved documents. It outputs multi-point bounding boxes and reading order, reducing cascading errors.

Question 2

How does PP-DocLayoutV3 handle non-planar documents?

Accepted Answer

It directly predicts multi-point bounding boxes (instead of standard two-point boxes) and logical reading order in a single forward pass, handling skew, warping, and curved surfaces without additional correction steps.

Question 3

What license is PP-DocLayoutV3 released under?

Accepted Answer

Apache 2.0 license, as indicated in the model card.

Question 4

How can I call PP-DocLayoutV3 via the gigarouter API?

Accepted Answer

Use the gigarouter OpenAI-compatible endpoint with your API key. Input an image URL or base64-encoded image; the model returns detection results with bounding boxes, polygon points, and labels.

Question 5

What is the reported accuracy of PP-DocLayoutV3?

Accepted Answer

It achieves 94.5% state-of-the-art accuracy on the OmniDocBench v1.5 benchmark for document parsing, and also SOTA on the Real5-OmniDocBench robustness benchmark.

Task	Document Layout Analysis (Object Detection)
Architecture	Transformer-based object detection (AutoModelForObjectDetection)
License	Apache 2.0

PP-DocLayoutV3

specs

about this model

Capabilities and Architecture

Benchmark Performance

Qualitative Examples

Ecosystem and Evolution

best for

FAQ

related object detection models