FireRedASR is an open-source automatic speech recognition (ASR) system. It supports Mandarin, Chinese dialects, and English, reports state-of-the-art results on public ASR benchmarks, and also handles singing lyrics recognition. Pre-trained models are available for speech transcription.
FireRedASR-LLM is designed to achieve state-of-the-art performance and to enable seamless end-to-end speech interaction. It adopts an Encoder-Adapter-LLM framework that leverages the capabilities of a large language model (LLM).
FireRedASR-AED is designed to balance high performance with computational efficiency and to serve as an effective speech representation module in LLM-based speech models. It uses an attention-based encoder-decoder (AED) architecture.
FireRedASR supports multiple languages, including Mandarin, Chinese dialects, and English.
It achieves low error rates on public Mandarin ASR benchmarks, with results measured by Character Error Rate (CER) and Word Error Rate (WER); for both metrics, lower is better.
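To make the two metrics concrete, here is a minimal sketch of how CER and WER are commonly computed: the edit distance (substitutions, deletions, insertions) between reference and hypothesis, divided by the reference length. The function names are illustrative and are not part of FireRedASR. Published benchmark scores usually apply text normalization (punctuation, casing, number formatting) before scoring, which this sketch omits.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (ref token missing from hyp)
                curr[j - 1] + 1,     # insertion (extra token in hyp)
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_tokens: Sequence[str], hyp_tokens: Sequence[str]) -> float:
    """Edit distance normalized by the number of reference tokens."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: tokens are characters, spaces ignored."""
    return error_rate(list(ref.replace(" ", "")), list(hyp.replace(" ", "")))


def wer(ref: str, hyp: str) -> float:
    """Word Error Rate: tokens are whitespace-separated words."""
    return error_rate(ref.split(), hyp.split())


if __name__ == "__main__":
    # One substituted character out of six.
    print(f"CER: {cer('今天天气很好', '今天天汽很好'):.4f}")
    # One deleted word out of six.
    print(f"WER: {wer('the cat sat on the mat', 'the cat sat on mat'):.4f}")
```

CER is the usual choice for Mandarin because written Chinese has no whitespace word boundaries, while WER is the usual choice for English. Since both are error rates, they can exceed 1.0 when the hypothesis contains many insertions.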