Hi, welcome!

  1. The audio tokenizer GGUF file is also provided in iq4_xs, though it is only slightly smaller and faster than q8_0.
  2. This repo's qwen3-tts-0.6b-q8_0.gguf is smaller. The iq4_xs quant hurts voice-cloning ability, but the small quality loss may not matter in practice.
  3. Updated: a new iq4_xs-quantized tokenizer, which downcasts some fp16 tensors to q8_0 (211 MB).
  4. Updated: an iq3_s tokenizer, which works well and is faster.

Now you only need 708 MB of RAM in total to run this model!
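To see where numbers like these come from, a GGUF file's size can be roughly estimated as parameter count times average bits per weight. The bits-per-weight figures below are approximate llama.cpp averages (an assumption for illustration, not values stated in this repo):

```python
# Rough GGUF file-size estimate: n_params * bits_per_weight / 8 bytes.
# BPW values are approximate llama.cpp averages (assumption, not from this repo).
BPW = {"q8_0": 8.5, "iq4_xs": 4.25, "iq3_s": 3.44}

def est_size_mb(n_params: float, quant: str) -> float:
    """Estimated file size in MB for n_params weights at the given quant type."""
    return n_params * BPW[quant] / 8 / 1e6

# A 0.6B-parameter model at iq4_xs comes out to roughly 320 MB,
# which is why the q8_0 variant is nearly twice that size.
print(round(est_size_mb(0.6e9, "iq4_xs")))
print(round(est_size_mb(0.6e9, "q8_0")))
```

This also shows why iq4_xs saves little on the tokenizer once some of its tensors are kept at q8_0: the blended average bits per weight stays well above 4.25.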
