Qwen3.6 35B A3B

cyankiwi/Qwen3.6-35B-A3B-AWQ-4bit

published Apr 2026 · updated Jul 2026

Qwen3.6 35B A3B is a vision-language model that excels at agentic coding, repository-level reasoning, and frontend workflows with a 35B total parameter MoE architecture activating 3B per token.

est. price

~$1.341

/ 1k images · estimated, set at launch

API providers

downloads / mo

1.8M

license

apache-2.0

specs

Task	Vision-Language (Causal LM with Vision Encoder)
Architecture	Mixture of Experts (MoE) with Gated DeltaNet and Gated Attention
Parameters	35B total, 3B activated per token
License	Not specified in card

about this model

Qwen3.6-35B-A3B is a vision-language model (VLM) with a sparse mixture-of-experts architecture (35B total parameters, 3B activated) that excels at agentic coding, repository-level reasoning, and multimodal tasks. It is hosted on gigarouter as a managed, OpenAI-compatible API.

The model combines a vision encoder with a causal language model, supporting a native context length of 262,144 tokens, extensible to over 1 million. It is calibrated on STEM and agentic data and supports ten languages. Key upgrades include improved frontend workflow handling and a new option to retain reasoning context from historical messages.

Bar chart comparing benchmark scores across five models on SWE-bench Verified, Multilingual, Pro, Terminal-Bench, Claw-Eval, SkillsBench, NL2Repo, and QwenWebBench.

In coding agent benchmarks, Qwen3.6-35B-A3B achieves 73.4% on SWE-bench Verified (vs. 70.0% for Qwen3.5-35BA3B, 52.0% for Gemma4-31B), 67.2% on SWE-bench Multilingual, 49.5% on SWE-bench Pro, and 51.5% on Terminal-Bench 2.0. It scores 68.7% (Avg) and 50.0% (Pass) on Claw-Eval, 28.7% on SkillsBench Avg5, 29.4% on NL2Repo, and 1397 on QwenWebBench. These results demonstrate improvements in agentic coding, repository-level reasoning, and web interaction tasks. The AWQ 4-bit quantized variant enables efficient deployment with minimal quality loss.

best for

·Agentic coding and repository-level reasoning (e.g., SWE-bench tasks)
·Frontend workflow automation and web-based agent tasks
·Multilingual code generation and debugging across 10 languages

FAQ

What is the context length of Qwen3.6 35B A3B?

It supports 262,144 tokens natively, extensible up to 1,010,000 tokens.

How many parameters are activated per token?

3B parameters are activated per token out of 35B total, using a Mixture of Experts architecture with 256 experts (8 routed + 1 shared).

What languages does this model support?

It supports EN, ZH, HI, AR, RU, JA, KO, NL, FR, and ES.

How do I call this model via the gigarouter API?

Use the gigarouter OpenAI-compatible endpoint with your API key, sending requests in the standard chat completions format.

What is the model size on disk?

The AWQ 4-bit quantized version is 24.97 GB.

not yet live

We're benchmarking and onboarding Qwen3.6 35B A3B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related vision-language models

compare all →

Qwen2.5-VL-7B-Instruct

9.8M dl/mo

Qwen3.6-35B-A3B-FP8

6.2M dl/mo

Qwen2.5-VL-3B-Instruct

5.3M dl/mo

gemma-4-26B-A4B-it-AWQ-4bit