Question 1

What is TrOCR Large Printed best used for?

Accepted Answer

It is best for optical character recognition (OCR) on single text-line images of printed text, particularly fine-tuned on the SROIE receipt dataset.

Question 2

How does TrOCR Large compare to other TrOCR variants?

Accepted Answer

TrOCR Large has 558M parameters and achieves higher accuracy than TrOCR Small and TrOCR Base, with 96.60% F1 on SROIE.

Question 3

What input format does the model expect?

Accepted Answer

It expects a single text-line image, preprocessed into 16x16 patches. The Hugging Face processor handles resizing and normalization.

Question 4

How can I call this model via API?

Accepted Answer

Use the gigarouter OpenAI-compatible endpoint with an API key to send image URLs or base64-encoded images and receive the recognized text.

Question 5

What is the license for this model?

Accepted Answer

The model card and additional sources do not specify a license; please check the official repository at github.com/microsoft/unilm/tree/master/trocr for terms.

Task	Image-to-Text (Optical Character Recognition)
Architecture	Encoder-decoder Transformer with BEiT image encoder and RoBERTa text decoder
Parameters	558M

Model	Parameters	SROIE F1	IAM Cased CER
TrOCR-Small	–	95.86%	4.22
TrOCR-Base	–	96.34%	3.42
TrOCR-L

TrOCR Large Printed

specs

about this model

Architecture

Benchmark Performance

best for

FAQ

related image-to-text models