DiffRhythm is a song generation model that transforms lyrics into complete songs, generating professional-quality music with vocals in less than 15 seconds. It places no limit on the number of songs you can create, and it supports various styles and languages.
Generates full-length songs of up to 4m45s in about ten seconds, much faster than music generation systems that produce audio sequentially.
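To put the speed claim in concrete terms, the short calculation below converts the two figures quoted above (4m45s of audio, ten seconds of generation) into a speed-up over real time. It is only arithmetic on the stated numbers, not a benchmark; the function name `speedup_over_realtime` is made up for this illustration.

```python
# Figures taken from the description above; nothing here is measured.
SONG_SECONDS = 4 * 60 + 45   # 4m45s of audio
GENERATION_SECONDS = 10      # "ten seconds" of generation time


def speedup_over_realtime(audio_seconds: float, generation_seconds: float) -> float:
    """Return seconds of audio produced per second of generation time."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return audio_seconds / generation_seconds


speedup = speedup_over_realtime(SONG_SECONDS, GENERATION_SECONDS)
print(f"{SONG_SECONDS} s of audio in {GENERATION_SECONDS} s -> {speedup:.1f}x real time")
```

With these figures the song is 285 seconds long, so generation runs at roughly 28.5 times real-time playback speed.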
Produces both vocal and accompaniment tracks simultaneously in a single process, maintaining high musicality and intelligibility.
Creates songs in multiple languages, including English and Chinese, with natural pronunciation.
Straightforward model structure eliminates the need for complex data preparation or multi-stage cascading architectures.
Requires only lyrics and a style prompt during inference to generate diverse musical outputs across different genres.
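As a rough sketch of what "only lyrics and a style prompt" means in practice, the snippet below models the two inference inputs as a small validated container. `SongRequest` and its field names are hypothetical and chosen for illustration; they are not part of DiffRhythm's actual interface, and the expected lyrics format is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SongRequest:
    """Hypothetical container for the two inference inputs named above."""

    lyrics: str        # the words to be sung
    style_prompt: str  # free-text description of the desired genre or mood

    def __post_init__(self) -> None:
        # Both inputs are required, so reject empty or whitespace-only values.
        if not self.lyrics.strip():
            raise ValueError("lyrics must not be empty")
        if not self.style_prompt.strip():
            raise ValueError("style_prompt must not be empty")


# Example values are invented for illustration.
request = SongRequest(
    lyrics="City lights are fading out tonight",
    style_prompt="mellow indie pop, female vocal",
)
print(request.style_prompt)
```

Keeping the request to two plain strings mirrors the point made above: no extra conditioning data has to be prepared before generation.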
Presented as the first latent diffusion-based song generation model, moving beyond slower language model-based approaches.
Non-autoregressive architecture delivers faster inference than autoregressive models, which generate content one step at a time.
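The toy comparison below illustrates why sequential generation tends to be slower. It counts model calls for a frame-by-frame autoregressive loop versus a denoising loop that updates every frame on each pass. The "model" in both cases is a trivial stand-in, and the frame and step counts are assumed values for illustration, not DiffRhythm's real settings.

```python
import random


def generate_autoregressive(num_frames: int, rng: random.Random):
    """Produce frames one at a time; each frame depends on the previous one."""
    frames, calls = [], 0
    for _ in range(num_frames):
        previous = frames[-1] if frames else 0.0
        frames.append(0.9 * previous + rng.gauss(0.0, 1.0))  # stand-in model call
        calls += 1
    return frames, calls


def generate_by_denoising(num_frames: int, num_steps: int, rng: random.Random):
    """Start from noise and refine all frames together on every step."""
    frames = [rng.gauss(0.0, 1.0) for _ in range(num_frames)]
    calls = 0
    for _ in range(num_steps):
        frames = [0.5 * value for value in frames]  # stand-in denoising pass
        calls += 1
    return frames, calls


NUM_FRAMES = 2048  # assumed sequence length, for illustration only
NUM_STEPS = 32     # assumed number of denoising steps, for illustration only

rng = random.Random(0)
ar_frames, ar_calls = generate_autoregressive(NUM_FRAMES, rng)
dn_frames, dn_calls = generate_by_denoising(NUM_FRAMES, NUM_STEPS, rng)
print(f"autoregressive: {ar_calls} sequential calls")
print(f"denoising:      {dn_calls} sequential calls")
```

In this sketch the autoregressive loop needs one sequential call per frame, so its cost grows with song length, while the denoising loop needs a fixed number of passes regardless of length. Each denoising pass does more work, but that work can run in parallel across frames.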
Simple yet powerful design supports scalability for future development and broader applications.
Available on GitHub and Hugging Face with demo examples, making it accessible for researchers and developers.