LayaCodec: Rapid, High-Fidelity Audio Compression: Reaching the Pareto Frontier in Neural Audio Codecs
This is a neural audio codec/tokenizer that encodes 16 kHz audio at rates from 12.5 tokens/s (0.16 kbps) to 50 tokens/s (0.65 kbps) using a single codebook of size 8192, and decodes it into 44.1 kHz audio. This allows for much faster and more scalable TTS models compared to other modern codecs for several reasons.
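The bitrates quoted above follow from the token rate and the codebook size: with a single codebook of 8192 entries, each token carries log2(8192) = 13 bits. The snippet below is a minimal sketch of that arithmetic only; the function name `bitrate_kbps` is hypothetical and is not part of the LayaCodec code.

```python
import math


def bitrate_kbps(tokens_per_second: float, codebook_size: int) -> float:
    """Bitrate of a single-codebook tokenizer, in kilobits per second.

    Hypothetical helper for illustration; not taken from the LayaCodec repo.
    """
    # Each token indexes one codebook entry, so it carries log2(size) bits.
    bits_per_token = math.log2(codebook_size)
    return tokens_per_second * bits_per_token / 1000.0


if __name__ == "__main__":
    # The two endpoints of the token-rate range stated above.
    for rate in (12.5, 50.0):
        print(f"{rate} tokens/s -> {bitrate_kbps(rate, 8192):.4f} kbps")
```

This gives 0.1625 kbps at 12.5 tokens/s and 0.65 kbps at 50 tokens/s, which matches the rounded figures above.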
Repo: https://github.com/ysharma3501/LayaCodec
This is still a work in progress: the model has only seen a few hundred hours of training data, yet the quality is surprisingly good. It still needs more training.
The model is released under the permissive CC-BY-4.0 license, and the code is released under the Apache-2.0 license.
Many thanks to the authors of FocalCodec and Anime-XCodec2.