Infinity Parser2 Flash
infly/Infinity-Parser2-Flash
published Feb 2026 · updated May 2026
A popular open vision-language model, with 16.6K downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.
about this model
Infinity-Parser2-Flash is a vision-language model (VLM) specialized for high-speed document parsing. It extracts layout elements, text, tables, formulas, charts, and chemical structures from images, and also supports document visual question answering (VQA) and general multimodal understanding. The model is built on an upgraded synthetic data engine covering nearly 5 million diverse document samples and a multi-task reinforcement learning framework with joint verification rewards, enabling robust zero-shot performance across real-world business scenarios.
Key Strengths
- Inference speed: Delivers 1,624 tokens/sec throughput — a 3.68× speedup over Infinity-Parser-7B — reducing latency and deployment cost.
- Document parsing accuracy: Scores 86.0% on olmOCR-Bench and 72.2% on ParseBench, outperforming PaddleOCR-VL-1.5, DeepSeek-OCR-2, and MinerU-2.5.
- Element parsing: Achieves 96.5% on UniMERNet and 92.41% on PubTabNet. On OmniDocBench-v1.6, the model scores 91.98%.
- Document VQA: Reaches 93.16% on DocVQA and 75.94% on InfoVQA.
- General understanding: Scores 81.60% on OCRBench and 77.92% on MMBench-EN.
Full benchmark comparisons with leading models are shown below.
| Benchmark | Infinity-Parser2-Flash |
|---|---|
| olmOCR-Bench | 86.0 |
| ParseBench | 72.2 |
| OmniDocBench-v1.6 | 91.98 |
| PubTabNet (val) | 92.41 |
| UniMERNet | 96.5 |
| DocVQA (val) | 93.16 |
| OCRBench | 81.60 |
Infinity-Parser2-Flash is hosted on Gigarouter as a managed, OpenAI-compatible API — no local installation required.
We're benchmarking and onboarding Infinity Parser2 Flash as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.