GMF-AI-Audio Component
Note
This document was automatically translated using AI and may contain minor errors. The official English version is still in progress.
Overview
GMF-AI-Audio is a voice interaction component built on the GMF framework. It wraps ESP-SR to provide the complete interaction flow from voice wake-up to command recognition. The component integrates wake word detection, Voice Activity Detection (VAD), voice command recognition, and Acoustic Echo Cancellation (AEC), enabling efficient, natural voice interaction in smart speakers, smart home devices, and similar products.
Supported Scenarios
| Method | Corresponding Scenario |
|---|---|
| Immediately upload voice data after wake-up, stop uploading at the Wakeup End stage | Implement VAD function in the cloud, RTC scenarios |
| Wait for VAD to trigger after wake-up before starting to upload, stop uploading after VAD ends | Traditional interaction method of smart hardware |
| No wake-up, wait for VAD to trigger before starting to upload, stop uploading after VAD ends | New cloud processing logic |
| Immediately upload voice data after pressing the button, stop after releasing | Devices with limited computing power implement voice functions through interaction with the cloud |
| Wait for VAD to trigger after pressing the button before starting to upload, stop uploading after VAD ends | Solve the problem of excessive data volume caused by relying solely on VAD |
| Detect command words after wake-up | Default usage logic |
| No wake-up, wait for VAD to trigger before detecting command words | Can be applied to some vehicle systems |
| Detect command words after pressing the button | Toys |
| Continuous command word recognition | Home control |
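
The five upload-oriented methods in the table differ only in which conditions open and close the upload window. The following is a minimal sketch of that gating logic in plain C, not the component's actual API: the type names, event names, and functions (`upload_mode_t`, `gate_event_t`, `upload_gate_init`, `upload_gate_on_event`) are hypothetical and chosen for illustration only. In a real application the events would presumably come from the component's wake-up and VAD callbacks and from a button driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names: one mode per upload method in the table above. */
typedef enum {
    MODE_WAKEUP_STREAM,  /* upload from wake-up until Wakeup End */
    MODE_WAKEUP_VAD,     /* after wake-up, upload only while VAD is active */
    MODE_VAD_ONLY,       /* no wake-up, upload only while VAD is active */
    MODE_BUTTON_STREAM,  /* upload while the button is held */
    MODE_BUTTON_VAD      /* button held, upload only while VAD is active */
} upload_mode_t;

/* Hypothetical events, assumed to be reported by the audio front end and a button. */
typedef enum {
    EV_WAKEUP_START,
    EV_WAKEUP_END,
    EV_VAD_START,
    EV_VAD_END,
    EV_BUTTON_DOWN,
    EV_BUTTON_UP
} gate_event_t;

typedef struct {
    upload_mode_t mode;
    bool awake;    /* between Wakeup Start and Wakeup End */
    bool pressed;  /* button currently held */
    bool speech;   /* between VAD start and VAD end */
} upload_gate_t;

void upload_gate_init(upload_gate_t *g, upload_mode_t mode)
{
    assert(g != NULL);
    g->mode = mode;
    g->awake = false;
    g->pressed = false;
    g->speech = false;
}

/* Record the event, then report whether audio should be uploaded right now. */
bool upload_gate_on_event(upload_gate_t *g, gate_event_t ev)
{
    assert(g != NULL);
    switch (ev) {
    case EV_WAKEUP_START: g->awake = true;    break;
    case EV_WAKEUP_END:   g->awake = false;   break;
    case EV_VAD_START:    g->speech = true;   break;
    case EV_VAD_END:      g->speech = false;  break;
    case EV_BUTTON_DOWN:  g->pressed = true;  break;
    case EV_BUTTON_UP:    g->pressed = false; break;
    }
    switch (g->mode) {
    case MODE_WAKEUP_STREAM: return g->awake;
    case MODE_WAKEUP_VAD:    return g->awake && g->speech;
    case MODE_VAD_ONLY:      return g->speech;
    case MODE_BUTTON_STREAM: return g->pressed;
    case MODE_BUTTON_VAD:    return g->pressed && g->speech;
    }
    return false;
}
```

Keeping the three flags independent of the mode means each method reduces to a single boolean expression, which makes the differences between the rows of the table easy to see. The command word methods follow the same pattern, with command detection taking the place of uploading.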