Gemma 4 26B A4B

cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit

published Apr 2026 · updated Jul 2026

Gemma 4 26B A4B is a Vision-Language Model (VLM) with a Mixture-of-Experts architecture that processes text and image inputs and generates text output, supporting a 256K token context window and built-in reasoning.

est. price

~$1.341

/ 1k images · estimated, set at launch

API providers

downloads / mo

5.1M

license

apache-2.0

specs

Task	Vision-Language Model (Text & Image Input, Text Output)
Architecture	Mixture-of-Experts (MoE) – 8 active experts out of 128 total plus 1 shared expert
Parameters	25.2B total, 3.8B active
Context Length	256K tokens
License	Apache 2.0

about this model

cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit is a vision-language model that processes text and image input to generate text output, based on Google DeepMind's Gemma 4 26B A4B Mixture-of-Experts architecture and quantized to 4-bit AWQ for efficient inference through gigarouter's API.

The model employs a MoE design with 25.2B total parameters but only 3.8B active per token, routing through 8 active experts out of 128 total plus one shared expert. This yields inference speeds comparable to a 4B-parameter model while retaining the capacity of a much larger network. It supports a 256K-token context window with hybrid attention (local sliding window interleaved with global attention) for long-context tasks.

Key capabilities

Multimodal input: Text and images at variable aspect ratios and resolutions; interleaved multimodal prompts.
Reasoning: Configurable thinking mode for step-by-step reasoning before answering.
Function calling: Native structured tool use for agentic workflows.
Coding: Code generation, completion, and correction.
Multilingual: Supports 35+ languages out of the box, pre-trained on 140+ languages.

Benchmark results

Benchmark	Score
MMLU Pro	82.6%
AIME 2026 (no tools)	88.3%
LiveCodeBench v6	77.1%
Codeforces ELO	1718
GPQA Diamond	82.3%
MMMU Pro (vision)	73.8%
MATH-Vision	82.4%
OmniDocBench 1.5 (edit distance, lower is better)	0.149

Gemma 4 model diagram

best for

·Document & PDF parsing with OCR and handwriting recognition
·Chart, diagram, and UI understanding
·Step-by-step reasoning and math problem solving
·Code generation and agentic function-calling workflows

FAQ

What input modalities does Gemma 4 26B A4B support?

It supports text and image input (including variable aspect ratios and resolutions).

What is the context window size of this model?

256K tokens.

What license is the model released under?

Apache 2.0.

How can I call this model via the gigarouter API?

Use the gigarouter OpenAI-compatible endpoint with a valid API key.

How large is the model in terms of total and active parameters?

25.2 billion total parameters, with 3.8 billion active parameters per inference.

not yet live

We're benchmarking and onboarding Gemma 4 26B A4B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related vision-language models

compare all →

Qwen2.5-VL-7B-Instruct

9.8M dl/mo

Qwen3.6-35B-A3B-FP8

6.2M dl/mo

Qwen2.5-VL-3B-Instruct