DiffRhythm is software that turns lyrics into complete songs, generating professional-quality music with vocals in less than 15 seconds. It lets you create unlimited songs in various styles and languages.

Features

Blazing Fast Generation

Generate full-length songs up to 4m45s in just ten seconds, dramatically faster than other music generation systems.
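
To put that figure in perspective, the short calculation below converts the stated maximum length and generation time into a faster-than-real-time ratio. It uses only the two numbers quoted above; the function name is illustrative and not part of DiffRhythm, and actual timings will depend on hardware.

```python
def speedup_over_realtime(audio_seconds: float, generation_seconds: float) -> float:
    """Return how many seconds of audio are produced per second of generation."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return audio_seconds / generation_seconds


# Figures taken from the text above: up to 4m45s of audio in ten seconds.
max_song_seconds = 4 * 60 + 45  # 285 seconds
ratio = speedup_over_realtime(max_song_seconds, 10.0)

print(f"{max_song_seconds} s of audio in 10 s -> {ratio:.1f}x faster than real time")
```

In other words, a maximum-length song is produced roughly 28.5 times faster than it takes to play back.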

Complete Song Generation

Produces vocal and accompaniment tracks together in a single pass, maintaining high musicality and lyric intelligibility.

Multi-Language Support

Create songs in multiple languages including English and Chinese with natural pronunciation.

Embarrassingly Simple Design

A straightforward model structure eliminates the need for complex data preparation or multi-stage cascading architectures.

Style Prompt Control

Requires only lyrics and a style prompt during inference to generate diverse musical outputs across different genres.

Latent Diffusion Model

Described by its authors as the first latent diffusion-based song generation model, moving beyond slower language model-based approaches.

Non-Autoregressive Structure

The non-autoregressive architecture delivers faster inference than autoregressive models, which generate content sequentially.
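
The difference can be sketched by counting sequential model calls. The toy comparison below is not DiffRhythm's code: the frame rate and step count are hypothetical values chosen for illustration, and it ignores the fact that a single call may cost more or less in each design. It only shows that an autoregressive model's call count grows with song length, while a non-autoregressive model that refines every frame at once uses a fixed number of steps.

```python
def autoregressive_calls(num_frames: int) -> int:
    """One model call per generated frame; each call waits on the previous one."""
    return num_frames


def non_autoregressive_calls(num_frames: int, num_steps: int) -> int:
    """Each step updates all frames together, so the count ignores num_frames."""
    return num_steps


FRAMES_PER_SECOND = 20  # hypothetical frame rate, for illustration only
NUM_STEPS = 32          # hypothetical number of refinement steps

for seconds in (30, 285):
    frames = seconds * FRAMES_PER_SECOND
    ar = autoregressive_calls(frames)
    nar = non_autoregressive_calls(frames, NUM_STEPS)
    print(f"{seconds:>3} s song: autoregressive={ar} calls, non-autoregressive={nar} calls")
```

Under these assumed numbers, lengthening the song from 30 seconds to 4m45s multiplies the autoregressive call count by 9.5 and leaves the non-autoregressive count unchanged.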

Highly Scalable

The simple yet powerful design leaves room to scale for future development and broader applications.

Open Source

Available on GitHub and Hugging Face with demo examples, making it accessible for researchers and developers.