Hi, welcome!

  1. The audio tokenizer GGUF file is also provided in iq4_xs, though it is only slightly smaller and faster than q8_0.
  2. This repo's qwen3-tts-0.6b-q8_0.gguf is smaller. The iq4_xs quant hurts voice-cloning ability, but the small quality loss may not matter in practice.
  3. Updated: a new iq4_xs-quantized tokenizer, which downcasts some fp16 tensors to q8_0 (211 MB).
  4. Updated: an iq3_s tokenizer, which works well and is faster.

Now you only need 708 MB of RAM in total to run this model!
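To see where numbers like these come from, a GGUF file's size can be roughly estimated as parameter count times average bits per weight. The bits-per-weight figures below are approximate llama.cpp averages (an assumption for illustration, not values stated in this repo):

```python
# Rough GGUF file-size estimate: n_params * bits_per_weight / 8 bytes.
# BPW values are approximate llama.cpp averages (assumption, not from this repo).
BPW = {"q8_0": 8.5, "iq4_xs": 4.25, "iq3_s": 3.44}

def est_size_mb(n_params: float, quant: str) -> float:
    """Estimated file size in MB for n_params weights at the given quant type."""
    return n_params * BPW[quant] / 8 / 1e6

# A 0.6B-parameter model at iq4_xs comes out to roughly 320 MB,
# which is why the q8_0 variant is nearly twice that size.
print(round(est_size_mb(0.6e9, "iq4_xs")))
print(round(est_size_mb(0.6e9, "q8_0")))
```

This also shows why iq4_xs saves little on the tokenizer once some of its tensors are kept at q8_0: the blended average bits per weight stays well above 4.25.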
