Gemma 4 31B IT – Route B per-head+k37

This repository contains the Route B release built on top of google/gemma-4-31B-it.

What it is

Route B is a packed ternary weight representation for large language models. In this release, the strongest current long-context checkpoint uses per-head prosody in attention and a one-layer keep override for layers.37.k_proj.
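Route B's exact on-disk layout is not documented in this card, but the general idea of a packed ternary representation can be sketched: each weight takes one of three values {-1, 0, +1}, so four weights fit in a single byte at 2 bits each. The code and bit layout below are illustrative assumptions, not the actual Route B format.

```python
def pack_ternary(weights):
    """Pack a list of ternary weights (-1, 0, +1) into bytes, 4 per byte.

    Illustrative layout: 2-bit codes, little-endian within each byte.
    """
    codes = {-1: 0b00, 0: 0b01, 1: 0b10}  # assumed 2-bit code per trit
    packed = bytearray()
    for i in range(0, len(weights), 4):
        byte = 0
        for j, w in enumerate(weights[i:i + 4]):
            byte |= codes[w] << (2 * j)
        packed.append(byte)
    return bytes(packed)


def unpack_ternary(packed, n):
    """Inverse of pack_ternary; n is the original number of weights."""
    decode = {0b00: -1, 0b01: 0, 0b10: 1}
    out = []
    for byte in packed:
        for j in range(4):
            if len(out) == n:
                break
            out.append(decode[(byte >> (2 * j)) & 0b11])
    return out


# Round trip: packing then unpacking recovers the original weights.
ws = [-1, 0, 1, 1, 0, -1, 0]
assert unpack_ternary(pack_ternary(ws), len(ws)) == ws
```

At 2 bits per weight this is a 8x size reduction versus fp16 storage before any scale metadata; real packed formats also carry per-tensor or per-head scale factors alongside the trits.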

Contents

  • packed Route B checkpoint or a materialized HF export;
  • minimal runtime under runtime/;
  • minimal conversion/publishing entrypoints under kernel/;
  • short architecture note in kernel/ARCHITECTURE.md.

How to load

If this repo contains the packed .pt checkpoint, use the Route B runtime in runtime/load_model.py.
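A minimal sketch of the packed-checkpoint path. The function name `load_model` and the checkpoint filename are assumptions; check `runtime/load_model.py` for the actual entrypoint and arguments.

```python
def load_packed_checkpoint(path="route_b_packed.pt"):  # filename is an assumption
    # Deferred import so this sketch stays importable without the repo's
    # runtime/ package on sys.path.
    from runtime.load_model import load_model  # hypothetical API name
    return load_model(path)
```

Run this from the repository root so that `runtime/` resolves as a package.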

If this repo contains a materialized Hugging Face export, load it as a standard transformers model.
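For the materialized export, the standard `transformers` loading path applies. The repo id is taken from this card; the dtype choice is illustrative.

```python
def load_hf_export(repo_id="armanibadboy/gemma-4-31b-it-route-b-perhead-k37"):
    # Deferred imports keep this sketch importable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
    return tokenizer, model
```

Calling `load_hf_export()` downloads the export from the Hub on first use; a 31B model needs substantial RAM or VRAM, so consider `device_map="auto"` with `accelerate` installed.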

Important notes

  • This is a derived release based on google/gemma-4-31B-it.
  • Redistribution requires compliance with the Gemma license.
  • The definitive quality numbers are reported in the accompanying paper and in archiv.org/benchmarks/results.md.

Reference

Arman Aubakirov. A Discrete Weight Language for Large Language Models: Compressing Gemma 4 31B with a 5-Step Ternary Route. Technical report, April 2026.

Model: armanibadboy/gemma-4-31b-it-route-b-perhead-k37