FireRedASR is an open-source automatic speech recognition (ASR) system. It supports Mandarin, Chinese dialects, and English, reports state-of-the-art results on public Mandarin ASR benchmarks, and also handles singing lyrics recognition. Pre-trained models are available, so you can transcribe speech without training a model yourself.

Features

FireRedASR-LLM

Designed to achieve state-of-the-art performance and to enable seamless end-to-end speech interaction. It adopts an Encoder-Adapter-LLM framework that builds on a Large Language Model (LLM).
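
To make the data flow of an encoder, adapter, and LLM pipeline more concrete, here is a minimal sketch in plain Python. It is not FireRedASR's code: the frame-stacking factor, the single linear projection, and the names stack_frames, project, and build_llm_input are all assumptions chosen for illustration. The idea is that the adapter shortens the encoder output and maps it into the LLM's embedding space, so that speech embeddings can be placed next to ordinary prompt token embeddings.

```python
# Illustrative sketch only: names, shapes, and the adapter design are assumptions,
# not FireRedASR's actual implementation.

def stack_frames(frames, factor):
    """Concatenate every `factor` consecutive frames to shorten the sequence.

    Trailing frames that do not fill a complete group are dropped.
    """
    usable = len(frames) - len(frames) % factor
    stacked = []
    for start in range(0, usable, factor):
        merged = []
        for frame in frames[start:start + factor]:
            merged.extend(frame)
        stacked.append(merged)
    return stacked


def project(vectors, weight):
    """Apply a linear map; `weight` is a d_in x d_out matrix given as a list of rows."""
    columns = list(zip(*weight))
    return [[sum(x * w for x, w in zip(vec, col)) for col in columns]
            for vec in vectors]


def build_llm_input(encoder_out, prompt_embeds, weight, factor=2):
    """Adapter step: downsample and project speech features, then append the prompt."""
    speech_embeds = project(stack_frames(encoder_out, factor), weight)
    return speech_embeds + prompt_embeds


# Toy example: 5 encoder frames of size 2, hypothetical LLM embedding size 3.
encoder_out = [[1.0, 2.0] for _ in range(5)]
weight = [[1.0, 1.0, 1.0] for _ in range(4)]       # (2 * factor) x 3
prompt_embeds = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # two prompt tokens
llm_input = build_llm_input(encoder_out, prompt_embeds, weight)
```

In this toy run the five encoder frames become two speech embeddings, followed by the two prompt embeddings, all with the same width. A real adapter would typically use learned weights and may add nonlinear layers.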

FireRedASR-AED

Balances high performance with computational efficiency, and can serve as an effective speech representation module in LLM-based speech models. It uses an Attention-based Encoder-Decoder (AED) architecture.
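
The core operation in an attention-based encoder-decoder is the decoder attending over encoder states at each output step. The sketch below shows one such step as scaled dot-product attention in plain Python. It is a generic illustration, not FireRedASR's decoder: the function names attend and softmax, the single attention head, and the toy vectors are assumptions.

```python
import math

# Generic single-head attention step; an illustration, not FireRedASR's decoder.

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Scaled dot-product attention of one decoder query over all encoder states.

    Returns the attention weights and the weighted sum of encoder states.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, state)) / scale
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[d] for w, state in zip(weights, encoder_states))
               for d in range(dim)]
    return weights, context


# Toy example: the query points along the first axis, so the first
# encoder state should receive most of the attention weight.
weights, context = attend([1.0, 0.0], [[10.0, 0.0], [0.0, 10.0]])
```

The decoder would combine the resulting context vector with its own state to predict the next output token, then repeat for the following step.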

Multilingual Support

Supports multiple languages including Mandarin, Chinese dialects, and English.

Benchmark Performance

Achieves low error rates on public Mandarin ASR benchmarks, measured by Character Error Rate (CER) and Word Error Rate (WER); lower values indicate better recognition.
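
Both metrics are the edit distance between the reference and the recognized text, divided by the reference length. CER counts characters and WER counts words. The following sketch shows one straightforward way to compute them. The helper names are illustrative, and the text normalization (splitting on whitespace, dropping spaces for CER) is a simplifying assumption rather than the scoring procedure of any particular benchmark.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalized by the number of reference tokens."""
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def cer(reference, hypothesis):
    """Character Error Rate, ignoring spaces."""
    return error_rate(list(reference.replace(" ", "")),
                      list(hypothesis.replace(" ", "")))


def wer(reference, hypothesis):
    """Word Error Rate, splitting on whitespace."""
    return error_rate(reference.split(), hypothesis.split())
```

For example, wer("the cat sat", "the cat sit") is one substitution over three reference words, and cer("今天天气很好", "今天天汽很好") is one substitution over six reference characters.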