Qwen AgentWorld 35B A3B
Qwen/Qwen-AgentWorld-35B-A3B
published Jun 2026 · updated Jun 2026
Qwen AgentWorld 35B A3B is a text-generation model that simulates agentic environments across seven domains using long chain-of-thought reasoning.
specs
| Task | text-generation |
| Architecture | Mixture of Experts (MoE) with Gated DeltaNet and Gated Attention |
| Parameters | 35B total, 3B activated |
| License | Apache 2.0 |
| Context Length | 262,144 tokens |
about this model
Qwen-AgentWorld-35B-A3B is a native language world model for text-generation that simulates agentic environments across seven unified interaction domains via long chain-of-thought reasoning: MCP (tool calling), Search, Terminal, SWE (software engineering), Android, Web, and OS — spanning both text and GUI environments.
The model is trained through a three-stage pipeline: continual pre-training (CPT) injects environment knowledge from more than 10 million real-world interaction trajectories; supervised fine-tuning (SFT) activates next-state-prediction reasoning; and reinforcement learning (RL with GSPO using hybrid rubric-and-rule rewards) sharpens simulation fidelity. Environment modeling is the training objective from the CPT stage onward, making this a native world model rather than a post-hoc adaptation of a general-purpose LLM.
The architecture is a Mixture-of-Experts causal language model with 35B total parameters and 3B activated parameters, supporting a context length of 262,144 tokens. The model is released under the Apache 2.0 license and is associated with the AgentWorldBench dataset.
Benchmark Performance
On AgentWorldBench — an open-ended evaluation constructed from real-world interactions of five frontier models on nine established benchmarks, scored on a five-dimensional rubric (format, factuality, consistency, realism, quality) normalized to a 0–100 scale — Qwen-AgentWorld-35B-A3B achieves the following domain scores:
| Domain | Score |
|---|---|
| MCP | 64.79 |
| Search | 36.69 |
| Terminal | 53.96 |
| SWE | 65.63 |
| Android | 58.17 |
| Web | 49.55 |
| OS | 65.92 |
| Overall | 56.39 |
This overall score surpasses Gemini 3.1 Pro (54.57), DeepSeek-V4-Pro (52.97), GLM-5.1 (51.31), Kimi K2.6 (53.42), MiniMax-M2.7 (46.12), and the base model Qwen3.5-35B-A3B (47.73), while remaining competitive with Claude Opus 4.8 (56.59) and GPT-5.4 (58.25) — the latter two being substantially larger or proprietary models.
Key Strengths
- Seven unified domains in a single model — no domain-specific checkpoints required.
- Zero-shot generalization to out-of-distribution environments (e.g., OpenClaw) supported by controllable perturbations and fictional-world construction.
- Scalable and controllable simulation — the model can serve as a decoupled environment simulator for agentic reinforcement learning, yielding gains that surpass real-environment training alone.
- Long context (262K tokens) enables multi-turn environment trajectory simulation.
For further details, see the technical report and the blog post.
best for
- ·Simulating agent environments for reinforcement learning training
- ·Zero-shot environment prediction for novel tools and interfaces
- ·Multi-turn agent trajectory generation and evaluation
FAQ
It is best used for simulating agentic environments across seven domains (MCP, Search, Terminal, SWE, Android, Web, OS) via long chain-of-thought reasoning, enabling scalable and controllable environment simulation for agentic RL and evaluation.
It has 35B total parameters with 3B activated per token, using a Mixture of Experts architecture. It supports a context length of up to 262,144 tokens, allowing detailed multi-turn simulation.
The model is released under the Apache 2.0 license.
Input is a system prompt specifying the domain (e.g., a Linux terminal) followed by the agent's action. Output is the predicted next environment observation, with optional reasoning before the response.
Use the gigarouter OpenAI-compatible endpoint with your API key. Send a chat completion request with the model name "Qwen/Qwen-AgentWorld-35B-A3B" and the appropriate system prompt and user message.
We're benchmarking and onboarding Qwen AgentWorld 35B A3B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.