cohere-transcribe-03-2026-mlx-8bit

Quantized MLX weights for beshkenadze/cohere-transcribe-03-2026-mlx-fp16.

Variant

Sample: Tests/media/conversational_a.wav

Generation TPS: 352.9
Peak memory: 2.87 GB
Output: Coffee's story likely begins in Ethiopia, where legend tells of a goat herder named Kaldi, who noticed his goats became energetic after eating red berries from a particular bush; curious, he tried them himself and felt invigorated.

This checkpoint has been re-validated against the current Swift and Python MLX runtimes.

Verified semantic parity on an English fixture:

This is a test recording in English. I am speaking clearly at a normal speed. Please transcribe this sentence exactly as I said.

Matched across:

Matches fp16 on the repo sample while reducing memory substantially.

Generated from the Swift-compatible fp16 checkpoint beshkenadze/cohere-transcribe-03-2026-mlx-fp16.
This repository contains inference artifacts only. Refer to the upstream Cohere model card and license for original model details.

MLX

Hardware compatibility

8-bit

Base model

Quantized

(24)

this model