WakeNet Interface

A speech recognition application that detects a wake word can be built from a series of Audio Elements linked into a pipeline, as shown below.

Sample Speech Recognition Pipeline

The configuration and use of the individual elements are demonstrated in several examples linked elsewhere in this documentation. Two elements may need clarification: the Filter and the RAW stream. The Filter adjusts the sample rate of the I2S stream to match the sample rate expected by the speech recognition model. The RAW stream is the element through which the application reads the audio data and feeds it to the model.
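To make the role of the Filter more concrete, the following is a minimal, host-runnable sketch of sample rate and channel conversion. It is not the ESP-ADF resample filter: the function name `downsample_48k_stereo_to_16k_mono` is hypothetical, and the 48 kHz stereo input and 16 kHz mono output formats are assumptions chosen for illustration. It averages each group of three stereo frames into one mono sample, which acts only as a crude low-pass step; a production resampler uses a proper anti-aliasing filter.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an ESP-ADF API).
 * Converts interleaved 16-bit stereo PCM at an assumed 48 kHz to mono PCM
 * at an assumed 16 kHz by averaging each group of 3 stereo frames.
 *
 * in           interleaved samples: L0, R0, L1, R1, ...
 * in_frames    number of stereo frames in `in`
 * out          destination buffer for mono samples
 * out_capacity maximum number of samples `out` can hold
 *
 * Returns the number of mono samples written. Trailing input frames that do
 * not fill a complete group of 3 are ignored.
 */
size_t downsample_48k_stereo_to_16k_mono(const int16_t *in, size_t in_frames,
                                         int16_t *out, size_t out_capacity)
{
    const size_t ratio = 3; /* 48000 / 16000 */
    size_t out_count = 0;

    for (size_t f = 0; f + ratio <= in_frames && out_count < out_capacity; f += ratio) {
        int32_t acc = 0; /* 6 samples of 16 bits each cannot overflow 32 bits */
        for (size_t k = 0; k < ratio; k++) {
            acc += in[2 * (f + k)];     /* left channel */
            acc += in[2 * (f + k) + 1]; /* right channel */
        }
        out[out_count++] = (int16_t)(acc / (int32_t)(2 * ratio));
    }
    return out_count;
}
```

In an actual pipeline, the Filter element performs this conversion, so the application only reads the already converted samples from the RAW stream and passes them to the model.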

The pipeline described above is the basic approach. ESP-ADF also provides a more flexible and convenient module, the audio recorder, which is strongly recommended.

API Reference

For the latest speech recognition API reference, please refer to ESP-SR Speech Recognition Framework.