Performance Test Results
AFE
Resource Consumption
Config | Pipeline |
---|---|
MR, SR, LOW_COST | |
MR, SR, HIGH_PERF | |
MR, VC, LOW_COST | |
MR, VC, HIGH_PERF | |
MMNR, SR, LOW_COST | |
MMNR, SR, HIGH_PERF | |
Note
MR: one microphone channel and one playback channel
MMNR: two microphone channels and one playback channel
Models: nsnet2, vadnet1_medium, wn9_hilexin
Config | Internal RAM (KB) | PSRAM (KB) | Feed CPU usage (1 core, %) | Fetch CPU usage (1 core, %) |
---|---|---|---|---|
MR, SR, LOW_COST | 72.3 | 732.7 | 8.4 | 15.0 |
MR, SR, HIGH_PERF | 78.0 | 734.7 | 9.4 | 14.9 |
MR, VC, LOW_COST | 50.3 | 821.4 | 60.0 | 8.2 |
MR, VC, HIGH_PERF | 93.7 | 824.0 | 64.0 | 8.2 |
MMNR, SR, LOW_COST | 76.6 | 1173.9 | 36.6 | 30.0 |
MMNR, SR, HIGH_PERF | 99.0 | 1173.7 | 38.8 | 30.0 |
WakeNet
Resource Consumption
Model Type | RAM | PSRAM | Average Running Time per Frame | Frame Length |
---|---|---|---|---|
Quantised WakeNet8 @ 2 channel | 50 KB | 1640 KB | 10.0 ms | 32 ms |
Quantised WakeNet9 @ 2 channel | 16 KB | 324 KB | 3.0 ms | 32 ms |
Quantised WakeNet9 @ 3 channel | 20 KB | 347 KB | 4.3 ms | 32 ms |
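The running time and frame length columns can be combined into a rough per-core load estimate. The sketch below is only an illustration: it assumes the average running time was measured on a single core and that one frame has to be processed per frame length, neither of which is stated in the table. The names `core_share` and `avg_runtime_ms` are made up for this example.

```python
# Figures copied from the WakeNet resource consumption table.
FRAME_MS = 32.0
avg_runtime_ms = {
    "Quantised WakeNet8 @ 2 channel": 10.0,
    "Quantised WakeNet9 @ 2 channel": 3.0,
    "Quantised WakeNet9 @ 3 channel": 4.3,
}


def core_share(runtime_ms: float, frame_ms: float = FRAME_MS) -> float:
    """Fraction of one core's time spent on each frame.

    Assumption: one frame must be processed every `frame_ms` milliseconds.
    """
    return runtime_ms / frame_ms


for model, runtime in avg_runtime_ms.items():
    print(f"{model}: {core_share(runtime):.1%} of one core")
```

Under these assumptions the three models would occupy roughly 31%, 9% and 13% of one core respectively.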
Performance Test
Distance | Quiet | Stationary Noise (SNR = 4 dB) | Speech Noise (SNR = 4 dB) | AEC Interruption (-10 dB) |
---|---|---|---|---|
1 m | 98% | 96% | 94% | 96% |
3 m | 98% | 96% | 94% | 94% |
False trigger rate: 1 per 12 hours
Note
We used the ESP32-S3-Korvo V4.0 development board and the WakeNet9 (Alexa) model in these tests.
MultiNet
Resource Consumption
Model Type | Internal RAM | PSRAM | Average Running Time per Frame | Frame Length |
---|---|---|---|---|
MultiNet 4 | 16.8 KB | 1866 KB | 18 ms | 32 ms |
MultiNet 4 Q8 | 10.5 KB | 1009 KB | 11 ms | 32 ms |
MultiNet 5 Q8 | 16 KB | 2310 KB | 12 ms | 32 ms |
MultiNet 6 | 32 KB | 4100 KB | 12 ms | 32 ms |
Word Error Rate Performance Test
Model Type | aishell test |
---|---|
MultiNet 5_cn | 9.5% |
MultiNet 6_cn | 5.2% |
Note
For Chinese, WER is computed over toneless pinyin units.
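A minimal sketch of what computing WER over toneless pinyin units could look like. It is not the evaluation script behind the figures above. It assumes transcripts are space-separated pinyin syllables with an optional trailing tone digit, and it uses the usual definition of WER as token-level edit distance divided by the reference length. The helper names (`strip_tone`, `edit_distance`, `toneless_wer`) and the sample phrases are hypothetical.

```python
def strip_tone(syllable: str) -> str:
    """Drop a trailing tone digit, e.g. 'kai1' -> 'kai' (assumed input format)."""
    return syllable.rstrip("012345")


def edit_distance(ref: list, hyp: list) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def toneless_wer(ref_pinyin: str, hyp_pinyin: str) -> float:
    """WER over pinyin syllables after removing tone marks."""
    ref = [strip_tone(s) for s in ref_pinyin.split()]
    hyp = [strip_tone(s) for s in hyp_pinyin.split()]
    if not ref:
        raise ValueError("reference must contain at least one unit")
    return edit_distance(ref, hyp) / len(ref)


# Made-up phrases: a tone-only mismatch is not counted as an error.
print(toneless_wer("da3 kai1 kong1 tiao2", "da4 kai1 kong1 tiao2"))  # 0.0
# One wrong syllable out of four reference units.
print(toneless_wer("da3 kai1 kong1 tiao2", "da3 kai1 kong1 diao4"))  # 0.25
```

Scoring on toneless units means a recognition result that differs from the reference only in tone still counts as correct.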
Speech Commands Performance Test (Air Conditioner Control Scenario)
Model Type | Distance | Quiet | Stationary Noise (SNR = 5~10 dB) | Speech Noise (SNR = 5~10 dB) |
---|---|---|---|---|
MultiNet 5_cn | 3 m | 88.9% | 66.1% | 67.5% |
MultiNet 6_cn | 3 m | 98.8% | 88.3% | 88.0% |
MultiNet 6_cn_ac | 3 m | 97.1% | 95.1% | 96.8% |
Note
MultiNet6_cn_ac was further fine-tuned on an air conditioner scenario dataset, so it performs better in the air conditioner control scenario.
TTS
Resource Consumption
Flash image size: 2.2 MB
RAM runtime: 20 KB
Performance Test
CPU load test (ESP32 @ 240 MHz):
Speech Rate | 0 | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|---|
Times faster than real time | 4.5 | 3.2 | 2.9 | 2.5 | 2.2 | 1.8 |
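One way to read the "times faster than real time" row is sketched below. It assumes that a factor of X means T seconds of audio are synthesized in T / X seconds, so streaming playback would keep the CPU busy for roughly 1 / X of the time. This interpretation is an assumption made for the example, not something the table states, and the names `synthesis_seconds` and `cpu_share` are illustrative only.

```python
# Speed-up factors copied from the TTS CPU load table, keyed by speech rate.
speedup_by_rate = {0: 4.5, 1: 3.2, 2: 2.9, 3: 2.5, 4: 2.2, 5: 1.8}


def synthesis_seconds(audio_seconds: float, speedup: float) -> float:
    """Time needed to synthesize `audio_seconds` of speech."""
    return audio_seconds / speedup


def cpu_share(speedup: float) -> float:
    """Approximate CPU share while synthesizing at playback speed."""
    return 1.0 / speedup


for rate, speedup in speedup_by_rate.items():
    print(f"speech rate {rate}: 10 s of audio in "
          f"{synthesis_seconds(10.0, speedup):.1f} s, "
          f"about {cpu_share(speedup):.0%} CPU while streaming")
```

Under this reading, the factor drops as the speech rate setting increases, so the highest setting leaves the least CPU headroom.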
NSNET
Performance Test
Dataset: array_onemic_nnoise_20230608 (test set recorded according to the Amazon acoustic certification standard)
Model | DNSMOS |
---|---|
nsnet1 | 2.4 |
nsnet2 | 2.71 |