Comprehensive Benchmarking of Long-Form Speech Generation in Diverse Scenarios Paper • 2605.28618 • Published 7 days ago • 27
Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer Paper • 2605.30940 • Published 5 days ago • 31
SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue Paper • 2605.30993 • Published 5 days ago • 39
WavAlign: Enhancing Intelligence and Expressiveness in Spoken Dialogue Models via Adaptive Hybrid Post-Training Paper • 2604.14932 • Published Apr 16 • 11