Flashing Models
In the AI industry, a model refers to a mathematical representation of a system or process. It is used to make predictions or decisions based on input data. There are many types of models, such as decision trees, neural networks, and support vector machines, each with their own strengths and weaknesses. Esprssif also provides our trained models such as WakeNet and MultiNet (see the model data used in model)
To use our models in your project, you need to flash these models. Currently, ESP-SR supports the following methods to flash models:
ESP32-S3:
Load directly from SIP Flash File System (flash)
Load from external SD card
So that on ESP32-S3 you can:
Greatly reduce the size of the user application APP BIN
Supports the selection of up to two wake words
Support online switching of Chinese and English Speech Command Recognition
Convenient for users to perform OTA
Supports reading and changing models from SD card, which is more convenient and can reduce the size of module Flash used in the project
When the user is developing the code, when the modification does not involve the model, it can avoid flashing the model data every time, greatly reducing the flashing time and improving the development efficiency
Configuration
Run idf.py menuconfig
to navigate to ESP Speech Recognition
:
Model Data Path
This option indicates the storage location of the model data: Read model data from flash
or Read model data from SD card
.
Read model data from flash
means that the model data is stored in the flash, and the model data will be loaded from the flash partitionRead model data from SD card
means that the model data is stored in the SD card, and the model data will be loaded from the SD card
Use AFE
This option is enabled by default. Users do not need to modify it. Please keep the default configuration.
Use WakeNet
This option is enabled by default. When the user only uses AEC
or BSS
, etc., and does not need WakeNet
or MultiNet
, please disable this option, which reduces the size of the project firmware.
Select wake words by via menuconfig
by navigating to ESP Speech Recognition
> Select wake words
. The model name of wake word in parentheses must be used to initialize WakeNet handle.
If you want to select multiple wake words, please select Load Multiple Wake Words
Then you can select multiple wake words at the same time:
Note
ESP32-S3 does support multiple wake words. Users can select more than one wake words according to the hardware flash size.
For more details, please refer to WakeNet .
Use Multinet
This option is enabled by default. When users only use WakeNet or other algorithm modules, please disable this option, which reduces the size of the project firmware in some cases.
Chinese Speech Commands Model
ESP32-S3 supports command words in both Chinese and English:
None
Chinese single recognition (MultiNet4.5)
Chinese single recognition (MultiNet4.5 quantized with 8-bit)
English Speech Commands Model
The user needs to add Chinese Speech Command words to this item when Chinese Speech Commands Model
is not None
.
For more details, please refer to Section MultiNet .
English Speech Commands Model
ESP32-S3 supports command words in both Chinese and English, and allows users to switch between these two languages.
None
English recognition (MultiNet5 quantized with 8-bit, depends on WakeNet8)
Add Chinese speech commands
The user needs to add English Speech Command words to this item when English Speech Commands Model
is not None
.
How To Use
After the above-mentioned configuration, users can initialize and start using the models following the examples described in the ESP-Skainet repo.
Here, we only introduce the code implementation, which can also be found in src/model_path.c.
ESP32-S3 can load model data from flash or SD card.
Load Model Data from flash
Write a partition table:
model, data, spiffs, , SIZE,
Among them,
SIZE
can refer to the recommended size when the user usesidf.py build
to compile, for example:Recommended model partition size: 500K
Initialize the flash partition: User can use
esp_srmodel_init(partition_label)
API to initialize flash and return all loaded models.base_path: The model storage
base_path
issrmodel
and cannot be changedpartition_label: The partition label of the model is
model
, which needs to be consistent with theName
in the above partition table
After completing the above configuration, the project will automatically generate model.bin
after the project is compiled, and flash it to the flash partition.
Load Model Data from SD Card
When configured to load model data from Read model data from SD card
, users need to:
- Manually load model data from SD card
After the above-mentioned configuration, users can compile the code, and copy the files in
model/target
to the root directory of the SD card.
- Initialize SD card
Users must initialize SD card so the chip can load SD card. Users of ESP-Skainet can call
esp_sdcard_init("/sdcard", num);
to initialize any board supported SD cards. Otherwise, users need to write the initialization code themselves. After the above-mentioned steps, users can flash the project.
- Read models
User use
esp_srmodel_init(model_path)
to read models inmodel_path
of SD card.
Model initialization and Usage
//
// step1: return models in flash or in sdcard
//
char *model_path = your_model_path: // partition_label or model_path in sdcard;
models = esp_srmodel_init(model_path);
//
// step2: select the specific model by keywords
//
char *wn_name = esp_srmodel_filter(models, ESP_WN_PREFIX, NULL); // select WakeNet model
char *nm_name = esp_srmodel_filter(models, ESP_MN_PREFIX, NULL); // select MultiNet model
char *alexa_wn_name = esp_srmodel_filter(models, ESP_WN_PREFIX, "alexa"); // select WakeNet with "alexa" wake word.
char *en_mn_name = esp_srmodel_filter(models, ESP_MN_PREFIX, ESP_MN_ENGLISH); // select english MultiNet model
char *cn_mn_name = esp_srmodel_filter(models, ESP_MN_PREFIX, ESP_MN_CHINESE); // select english MultiNet model
// It also works if you use the model name directly in your code.
char *my_wn_name = "wn9_hilexin"
// we recommend you to check that it is loaded correctly
if (!esp_srmodel_exists(models, my_wn_name))
printf("%s can not be loaded correctly\n")
//
// step3: initialize model
//
esp_wn_iface_t *wakenet = esp_wn_handle_from_name(wn_name);
model_iface_data_t *wn_model_data = wakenet->create(wn_name, DET_MODE_2CH_90);
esp_mn_iface_t *multinet = esp_mn_handle_from_name(mn_name);
model_iface_data_t *mn_model_data = multinet->create(mn_name, 6000);