Ornith 1.0 35B

deepreinforce-ai/Ornith-1.0-35B

published Jun 2026 · updated Jun 2026

Ornith 1.0 35B is a text-generation model for agentic coding that uses a self-improving reinforcement learning framework to jointly optimize scaffold and solution rollouts.

status

coming soon

API providers

downloads / mo

218.7K

license

mit

specs

Task	Text Generation (Agentic Coding)
Architecture	Mixture of Experts (MoE)
Parameters	35B
License	MIT

about this model

Ornith-1.0-35B is a text-generation model optimized for agentic coding tasks, built on top of pretrained Gemma 4 and Qwen 3.5 architectures using a self-improving reinforcement learning framework.

Self-Improving Training Framework

The model employs a two-stage RL process: it first proposes a refined scaffold conditioned on the task and previously used scaffold, then generates a solution rollout conditioned on that scaffold and task description. Reward from the rollout is propagated to both stages, enabling the model to discover better search trajectories and generate higher-quality solutions. A three-layer reward hacking defense protects training integrity through an immutable outer trust boundary, deterministic monitoring of forbidden actions, and a frozen LLM judge for intent-level gaming detection.

Benchmark Performance

Ornith-1.0-35B achieves state-of-the-art results among open-source models of comparable size on agentic coding benchmarks:

Benchmark	Ornith-1.0-35B	Qwen3.5-35B	Qwen3.6-35B	Gemma4-31B	Qwen3.5-397B
Terminal-Bench 2.1 (Terminus-2)	64.2	41.4	52.5	42.1	53.5
Terminal-Bench 2.1 (Claude Code)	62.8	38.9	49.2	-	48.6
SWE-bench Verified	75.6	70	73.4	52	76.4
SWE-bench Pro	50.4	44.6	49.5	35.7	51.6
SWE-bench Multilingual	69.3	60.3	67.2	51.7	69.3
NL2Repo	34.6	20.5	29.4	15.5	36.8
Claw-eval Avg	69.8	65.4	68.7	48.5	70.7
SWE Atlas - QnA	37.1	13.2	15.5	-	20.4
SWE Atlas - RF	29.7	10.2	11.4	-	18.4
SWE Atlas - TW	27.8	9.8	13.3	-	18.5

The 35B model surpasses Qwen 3.5-397B (53.5) on Terminal-Bench 2.1 despite having roughly one-tenth the parameters. The larger Ornith-1.0-397B variant achieves 77.5 on Terminal-Bench 2.1 and 82.4 on SWE-Bench Verified.

The model uses a structured tool-calling format with <tool_call> XML tags and nested <function=...> blocks. It is a reasoning model that opens assistant turns with a <think>...</think> block. Released under the MIT license.

best for

·Automated software engineering tasks like fixing bugs and implementing features in code repositories
·Terminal-based coding agents that need to navigate and edit codebases autonomously

FAQ

What is Ornith 1.0 35B best used for?

It is designed for agentic coding tasks such as automated bug fixing, feature implementation, and repository-level code generation, achieving state-of-the-art results on benchmarks like SWE-Bench and Terminal-Bench.

What is the license for Ornith 1.0 35B?

It is MIT licensed, globally accessible, and free from regional limitations.

How does Ornith 1.0 35B compare to larger models like Qwen 3.5-397B?

Despite having only 35B parameters, Ornith 1.0 35B surpasses Qwen 3.5-397B on Terminal-Bench 2.1 (64.2 vs 53.5) and achieves competitive results on SWE-Bench Verified (75.6 vs 76.4).

How do I call Ornith 1.0 35B via the API?

Use the gigarouter OpenAI-compatible endpoint with your API key. The model is a reasoning model that outputs a <think> block before the final answer.

not yet live

We're benchmarking and onboarding Ornith 1.0 35B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related text generation models

tiny-Qwen2ForCausalLM-2.5

9.2M dl/mo

deepseek-v4-gguf

6.4M dl/mo

Qwen3.6-35B-A3B-NVFP4

6.2M dl/mo

gemma-3-270m

5.1M dl/mo