Xiaozhi ESP32 is an open-source project that enables users to build their own AI friends using various AI and IoT components, integrating language models and hardware like ESP32 to facilitate AI conversation capabilities.
This feature allows the device to be activated and interacted with through voice commands without needing an internet connection, using the ESP-SR framework.
Supports voice recognition for languages like Chinese, Cantonese, English, Japanese, and Korean using the SenseVoice technology, enabling more inclusive communication.
Enables real-time voice dialogue through WebSocket or UDP protocols, ensuring seamless and efficient voice interactions.
Utilizes 3D Speaker technology to identify who is calling the AI by recognizing unique voice prints.
Incorporates large language models like Qwen, DeepSeek, and Doubao to enhance the conversational abilities and responses of the AI.
Allows users to set up specific prompts and voice tones to create customized interactive experiences that can mimic different characters or personalities.