gemma-4-31B-it-The-DECKARD-HERETIC-UNCENSORED-Thinking — MLX 4.6 BPW

A mixed-precision MLX quantization of DavidAU/gemma-4-31B-it-The-DECKARD-HERETIC-UNCENSORED-Thinking, produced with MLX Smart Quantize (MSQ), my own sensitivity-based mixed-precision quantization method for Apple Silicon. MSQ measures per-layer NMSE (normalized mean squared error) and assigns bit widths automatically, combining architecture knowledge with the measured sensitivity data.

Details

  • Type: Vision (VLM)
  • Average: 4.6 bits per weight
  • Method: MLX Smart Quantize (MSQ)
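The core idea, measuring each layer's quantization error and giving sensitive layers more bits, can be sketched as follows. This is a minimal illustration, not the MSQ implementation: the `fake_quantize` round-trip, the candidate bit widths, and the `tol` threshold are all assumptions for demonstration.

```python
# Illustrative sketch of sensitivity-based mixed-precision assignment.
# The actual MSQ policy (error metric details, bit-width candidates,
# thresholds) is not public; everything below is an assumption.
import numpy as np

def nmse(original: np.ndarray, quantized: np.ndarray) -> float:
    """Normalized MSE between a layer's weights and their quantized round-trip."""
    return float(np.mean((original - quantized) ** 2) / np.mean(original ** 2))

def fake_quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Uniform symmetric fake-quantization at `bits` (quantize + dequantize)."""
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

def assign_bits(layers: dict, candidate_bits=(3, 4, 6, 8), tol=1e-3) -> dict:
    """Pick the smallest bit width whose NMSE stays under `tol` per layer."""
    plan = {}
    for name, w in layers.items():
        for b in candidate_bits:
            if nmse(w, fake_quantize(w, b)) <= tol:
                plan[name] = b
                break
        else:
            # No candidate met the tolerance: fall back to the widest width.
            plan[name] = candidate_bits[-1]
    return plan

# Toy example with random "layer weights".
rng = np.random.default_rng(0)
layers = {"attn.q_proj": rng.normal(size=(64, 64)),
          "mlp.gate_proj": rng.normal(size=(64, 64))}
print(assign_bits(layers))
```

The per-layer plan averages out to the reported 4.6 bits per weight across the whole model; sensitive layers get wider widths, robust layers narrower ones.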

Model tree for mlx-community/gemma-4-31B-it-The-DECKARD-HERETIC-UNCENSORED-Thinking-4.6bit-msq
