Benchmark Results

AFE

Resource Consumption

AFE Configurations and Pipelines

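===================  ====================================================================================
Config               Pipeline
===================  ====================================================================================
MR, SR, LOW_COST     |AEC(SR_LOW_COST)| -> |VAD(vadnet1_medium)| -> |WakeNet(wn9_hilexin,)|
MR, SR, HIGH_PERF    |AEC(SR_HIGH_PERF)| -> |VAD(vadnet1_medium)| -> |WakeNet(wn9_hilexin,)|
MR, VC, LOW_COST     |AEC(VOIP_LOW_COST)| -> |NS(nsnet2)| -> |VAD(vadnet1_medium)|
MR, VC, HIGH_PERF    |AEC(VOIP_HIGH_PERF)| -> |NS(nsnet2)| -> |VAD(vadnet1_medium)|
MMNR, SR, LOW_COST   |AEC(SR_LOW_COST)| -> |SE(BSS)| -> |VAD(vadnet1_medium)| -> |WakeNet(wn9_hilexin,)|
MMNR, SR, HIGH_PERF  |AEC(SR_HIGH_PERF)| -> |SE(BSS)| -> |VAD(vadnet1_medium)| -> |WakeNet(wn9_hilexin,)|
===================  ====================================================================================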

Note

  • MR: one microphone channel and one playback channel

  • MMNR: two microphone channels and one playback channel

  • Models: nsnet2, vadnet1_medium, wn9_hilexin

AFE Configurations and Performance

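===================  =================  ==========  ==========================  ===========================
Config               Internal RAM (KB)  PSRAM (KB)  Feed CPU usage (1 core, %)  Fetch CPU usage (1 core, %)
===================  =================  ==========  ==========================  ===========================
MR, SR, LOW_COST     72.3               732.7       8.4                         15.0
MR, SR, HIGH_PERF    78.0               734.7       9.4                         14.9
MR, VC, LOW_COST     50.3               821.4       60.0                        8.2
MR, VC, HIGH_PERF    93.7               824.0       64.0                        8.2
MMNR, SR, LOW_COST   76.6               1173.9      36.6                        30.0
MMNR, SR, HIGH_PERF  99.0               1173.7      38.8                        30.0
===================  =================  ==========  ==========================  ===========================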

WakeNet

Resource Consumption

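==============================  =====  =======  ==============================  ============
Model Type                      RAM    PSRAM    Average Running Time per Frame  Frame Length
==============================  =====  =======  ==============================  ============
Quantised WakeNet8 @ 2 channel  50 KB  1640 KB  10.0 ms                         32 ms
Quantised WakeNet9 @ 2 channel  16 KB  324 KB   3.0 ms                          32 ms
Quantised WakeNet9 @ 3 channel  20 KB  347 KB   4.3 ms                          32 ms
==============================  =====  =======  ==============================  ============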
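The per-frame running times above can be read as a rough CPU load by dividing them by the frame length. The sketch below does that arithmetic for the WakeNet figures. It assumes the running time is measured on a single core and that load is approximately running time divided by frame length; the function name `cpu_load_percent` is illustrative and not part of any ESP-SR API.

```python
def cpu_load_percent(run_ms_per_frame: float, frame_ms: float) -> float:
    """Approximate share of one core spent on inference, in percent.

    Assumption: load ~= average running time per frame / frame length.
    """
    if frame_ms <= 0:
        raise ValueError("frame length must be positive")
    return 100.0 * run_ms_per_frame / frame_ms


# Average running time per frame (ms), copied from the table above.
wakenet_ms = {
    "WakeNet8 @ 2 channel": 10.0,
    "WakeNet9 @ 2 channel": 3.0,
    "WakeNet9 @ 3 channel": 4.3,
}

FRAME_MS = 32.0  # frame length from the table

loads = {name: cpu_load_percent(ms, FRAME_MS) for name, ms in wakenet_ms.items()}

for name, load in loads.items():
    print(f"{name}: about {load:.1f}% of one core")
```

The same division applies to any table in this document that lists a running time per frame next to a frame length.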

Performance Test

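========  =====  =============================  =========================  =========================
Distance  Quiet  Stationary Noise (SNR = 4 dB)  Speech Noise (SNR = 4 dB)  AEC Interruption (-10 dB)
========  =====  =============================  =========================  =========================
1 m       98%    96%                            94%                        96%
3 m       98%    96%                            94%                        94%
========  =====  =============================  =========================  =========================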

False trigger rate: 1 in 12 hours

Note

We used the ESP32-S3-Korvo V4.0 development board and the WakeNet9 (Alexa) model in our tests.

MultiNet

Resource Consumption

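=============  ============  =======  ==============================  ============
Model Type     Internal RAM  PSRAM    Average Running Time per Frame  Frame Length
=============  ============  =======  ==============================  ============
MultiNet 4     16.8 KB       1866 KB  18 ms                           32 ms
MultiNet 4 Q8  10.5 KB       1009 KB  11 ms                           32 ms
MultiNet 5 Q8  16 KB         2310 KB  12 ms                           32 ms
MultiNet 6     32 KB         4100 KB  12 ms                           32 ms
=============  ============  =======  ==============================  ============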

Word Error Rate Performance Test

=============  ============
Model Type     aishell test
=============  ============
MultiNet 5_cn  9.5%
MultiNet 6_cn  5.2%
=============  ============

Note

For Chinese, WER is computed over toneless pinyin units.
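To make the scoring unit concrete, here is a minimal sketch of a WER computed over pinyin syllables with the tones removed. It assumes transcripts are already given as space-separated pinyin with a trailing tone digit (for example `kong1 tiao2`). The helper names and the input format are illustrative assumptions, not the evaluation script used for the figures above.

```python
import re


def strip_tone(syllable: str) -> str:
    """Drop a trailing tone digit, e.g. 'hao3' -> 'hao' (assumed input format)."""
    return re.sub(r"[0-5]$", "", syllable)


def edit_distance(ref: list, hyp: list) -> int:
    """Levenshtein distance counting substitutions, insertions and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def pinyin_wer(ref: str, hyp: str) -> float:
    """WER over toneless pinyin units: edit distance / reference length."""
    ref_units = [strip_tone(s) for s in ref.split()]
    hyp_units = [strip_tone(s) for s in hyp.split()]
    if not ref_units:
        raise ValueError("reference must contain at least one unit")
    return edit_distance(ref_units, hyp_units) / len(ref_units)


# One substituted syllable out of four; the tone mismatch on 'da' is ignored.
print(pinyin_wer("da3 kai1 kong1 tiao2", "da4 kai1 kong1 diao4"))
```

Because tones are stripped before alignment, a hypothesis that differs from the reference only in tone counts as fully correct under this metric.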

Speech Commands Performance Test (Air Conditioner Control Scenario)

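================  ========  =====  ================================  ============================
Model Type        Distance  Quiet  Stationary Noise (SNR = 5~10 dB)  Speech Noise (SNR = 5~10 dB)
================  ========  =====  ================================  ============================
MultiNet 5_cn     3 m       88.9%  66.1%                             67.5%
MultiNet 6_cn     3 m       98.8%  88.3%                             88.0%
MultiNet 6_cn_ac  3 m       97.1%  95.1%                             96.8%
================  ========  =====  ================================  ============================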

Note

MultiNet 6_cn_ac was further fine-tuned on an air conditioner scenario dataset, so it performs better in the air conditioner control scenario.

TTS

Resource Consumption

Flash image size: 2.2 MB

RAM runtime: 20 KB

Performance Test

CPU load test (ESP32 @ 240 MHz):

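===========================  ===  ===  ===  ===  ===  ===
Speech Rate                  0    1    2    3    4    5
===========================  ===  ===  ===  ===  ===  ===
Times faster than real time  4.5  3.2  2.9  2.5  2.2  1.8
===========================  ===  ===  ===  ===  ===  ===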
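As a rough way to read these figures, the sketch below converts a "times faster than real time" factor into an estimated synthesis time and an approximate CPU share. It assumes the factor means audio duration divided by synthesis time on one core, which the table does not state explicitly; the function names are illustrative only.

```python
# Speed-up factors per speech rate, copied from the table above.
speedup_by_rate = {0: 4.5, 1: 3.2, 2: 2.9, 3: 2.5, 4: 2.2, 5: 1.8}


def synthesis_seconds(audio_seconds: float, speedup: float) -> float:
    """Estimated time to synthesize a clip (assumes speedup = audio / compute time)."""
    return audio_seconds / speedup


def cpu_share_percent(speedup: float) -> float:
    """Approximate share of one core needed to keep up with playback."""
    return 100.0 / speedup


for rate, speedup in speedup_by_rate.items():
    print(f"speech rate {rate}: "
          f"10 s of audio in about {synthesis_seconds(10.0, speedup):.1f} s, "
          f"about {cpu_share_percent(speedup):.0f}% of one core")
```

Under this reading, every listed speech rate stays above 1.0, so synthesis keeps ahead of playback in all six cases.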

NSNET

Performance Test

Dataset: array_onemic_nnoise_20230608 (a test set recorded according to the Amazon acoustic certification standard)

======  ======
Model   DNSMOS
======  ======
nsnet1  2.4
nsnet2  2.71
======  ======