GMF-AI-Audio Component

Note

This document was automatically translated using AI and may contain errors in the details. The official English version is still in progress.

Overview

GMF-AI-Audio is a voice interaction component built on the GMF framework. It wraps ESP-SR to provide the complete interaction flow from voice wake-up to command recognition. The component integrates wake word detection, Voice Activity Detection (VAD), voice command recognition, and Acoustic Echo Cancellation (AEC), and is intended for voice interaction in smart speakers, smart home devices, and similar products.

Supported Scenarios

Method	Typical Scenario
Start uploading voice data immediately after wake-up; stop at the Wakeup End stage	RTC scenarios, or VAD implemented in the cloud
After wake-up, wait for VAD to trigger before uploading; stop when VAD ends	Traditional interaction model for smart hardware
No wake-up; wait for VAD to trigger before uploading; stop when VAD ends	New cloud processing logic
Start uploading voice data when the button is pressed; stop when it is released	Devices with limited computing power that implement voice functions through the cloud
After the button is pressed, wait for VAD to trigger before uploading; stop when VAD ends	Avoids the excessive data volume caused by relying solely on VAD
Detect command words after wake-up	Default usage
No wake-up; wait for VAD to trigger before detecting command words	Some in-vehicle systems
Detect command words after the button is pressed	Toys
Continuous command word recognition	Home control
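
The five upload methods in the table differ only in which event opens a session (wake word, button press, or nothing) and whether VAD gates the upload. The following is a minimal sketch of that gating logic as a small state machine. All type, enum, and function names here (`upload_gate_t`, `audio_evt_t`, `upload_gate_feed`, and so on) are hypothetical and chosen for illustration; they are not the GMF-AI-Audio or ESP-SR API, and command word detection is not modeled.

```c
#include <stdbool.h>

/* Hypothetical event and mode names, not part of GMF-AI-Audio. */
typedef enum {
    EVT_WAKEUP_START,
    EVT_WAKEUP_END,
    EVT_VAD_START,
    EVT_VAD_END,
    EVT_BUTTON_DOWN,
    EVT_BUTTON_UP,
} audio_evt_t;

typedef enum {
    MODE_WAKEUP_STREAM,    /* upload from wake-up until Wakeup End      */
    MODE_WAKEUP_THEN_VAD,  /* after wake-up, upload only while VAD is on */
    MODE_VAD_ONLY,         /* no wake-up, upload only while VAD is on    */
    MODE_BUTTON_STREAM,    /* upload while the button is held            */
    MODE_BUTTON_THEN_VAD,  /* button held, upload only while VAD is on   */
} upload_mode_t;

typedef struct {
    upload_mode_t mode;
    bool armed;      /* a wake word or button press has opened a session */
    bool uploading;  /* audio frames should currently be sent upstream   */
} upload_gate_t;

static void upload_gate_init(upload_gate_t *g, upload_mode_t mode)
{
    g->mode = mode;
    g->armed = (mode == MODE_VAD_ONLY);  /* VAD-only needs no trigger */
    g->uploading = false;
}

/* Feed one event and return whether audio should be uploaded afterwards. */
static bool upload_gate_feed(upload_gate_t *g, audio_evt_t evt)
{
    bool wake_mode = (g->mode == MODE_WAKEUP_STREAM ||
                      g->mode == MODE_WAKEUP_THEN_VAD);
    bool button_mode = (g->mode == MODE_BUTTON_STREAM ||
                        g->mode == MODE_BUTTON_THEN_VAD);
    bool stream_mode = (g->mode == MODE_WAKEUP_STREAM ||
                        g->mode == MODE_BUTTON_STREAM);

    switch (evt) {
    case EVT_WAKEUP_START:
    case EVT_BUTTON_DOWN:
        /* Only the trigger that matches the mode opens a session. */
        if ((evt == EVT_WAKEUP_START) ? wake_mode : button_mode) {
            g->armed = true;
            if (stream_mode) {
                g->uploading = true;
            }
        }
        break;
    case EVT_WAKEUP_END:
    case EVT_BUTTON_UP:
        /* The matching end event closes the session and stops any upload. */
        if ((evt == EVT_WAKEUP_END) ? wake_mode : button_mode) {
            g->armed = false;
            g->uploading = false;
        }
        break;
    case EVT_VAD_START:
        /* VAD starts an upload only in VAD-gated modes with an open session. */
        if (!stream_mode && g->armed) {
            g->uploading = true;
        }
        break;
    case EVT_VAD_END:
        if (!stream_mode) {
            g->uploading = false;
        }
        break;
    }
    return g->uploading;
}
```

In this sketch an application would call `upload_gate_feed` from its event callback and forward audio frames only while the function returns true. Treating the streaming modes and the VAD-gated modes in one function keeps the difference between the table rows down to a single mode value.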
