skip to content
gigarouter gigarouter
models / text generation · coming soon

Ornith 1.0 35B

deepreinforce-ai/Ornith-1.0-35B

published Jun 2026 · updated Jun 2026

Ornith 1.0 35B is a text-generation model for agentic coding that uses a self-improving reinforcement learning framework to jointly optimize scaffold and solution rollouts.

status
coming soon
API providers
0
downloads / mo
218.7K
license
mit

specs

TaskText Generation (Agentic Coding)
ArchitectureMixture of Experts (MoE)
Parameters35B
LicenseMIT

about this model

Ornith-1.0-35B is a text-generation model optimized for agentic coding tasks, built on top of pretrained Gemma 4 and Qwen 3.5 architectures using a self-improving reinforcement learning framework.

Self-Improving Training Framework

The model employs a two-stage RL process: it first proposes a refined scaffold conditioned on the task and previously used scaffold, then generates a solution rollout conditioned on that scaffold and task description. Reward from the rollout is propagated to both stages, enabling the model to discover better search trajectories and generate higher-quality solutions. A three-layer reward hacking defense protects training integrity through an immutable outer trust boundary, deterministic monitoring of forbidden actions, and a frozen LLM judge for intent-level gaming detection.

Benchmark Performance

Ornith-1.0-35B achieves state-of-the-art results among open-source models of comparable size on agentic coding benchmarks:

BenchmarkOrnith-1.0-35BQwen3.5-35BQwen3.6-35BGemma4-31BQwen3.5-397B
Terminal-Bench 2.1 (Terminus-2)64.241.452.542.153.5
Terminal-Bench 2.1 (Claude Code)62.838.949.2-48.6
SWE-bench Verified75.67073.45276.4
SWE-bench Pro50.444.649.535.751.6
SWE-bench Multilingual69.360.367.251.769.3
NL2Repo34.620.529.415.536.8
Claw-eval Avg69.865.468.748.570.7
SWE Atlas - QnA37.113.215.5-20.4
SWE Atlas - RF29.710.211.4-18.4
SWE Atlas - TW27.89.813.3-18.5

The 35B model surpasses Qwen 3.5-397B (53.5) on Terminal-Bench 2.1 despite having roughly one-tenth the parameters. The larger Ornith-1.0-397B variant achieves 77.5 on Terminal-Bench 2.1 and 82.4 on SWE-Bench Verified.

Ornith-1.0 benchmark comparison chart Ornith-1.0 performance visualization

The model uses a structured tool-calling format with <tool_call> XML tags and nested <function=...> blocks. It is a reasoning model that opens assistant turns with a <think>...</think> block. Released under the MIT license.

best for

FAQ

What is Ornith 1.0 35B best used for?

It is designed for agentic coding tasks such as automated bug fixing, feature implementation, and repository-level code generation, achieving state-of-the-art results on benchmarks like SWE-Bench and Terminal-Bench.

What is the license for Ornith 1.0 35B?

It is MIT licensed, globally accessible, and free from regional limitations.

How does Ornith 1.0 35B compare to larger models like Qwen 3.5-397B?

Despite having only 35B parameters, Ornith 1.0 35B surpasses Qwen 3.5-397B on Terminal-Bench 2.1 (64.2 vs 53.5) and achieves competitive results on SWE-Bench Verified (75.6 vs 76.4).

How do I call Ornith 1.0 35B via the API?

Use the gigarouter OpenAI-compatible endpoint with your API key. The model is a reasoning model that outputs a <think> block before the final answer.

not yet live

We're benchmarking and onboarding Ornith 1.0 35B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related text generation models

compare all →