Audio Streams

An Audio Stream is an Audio Element that acquires audio data, processes it, and then sends it out.

The following stream types are supported:

Algorithm Stream
FatFs Stream
HTTP Stream
I2S Stream
PWM Stream
Raw Stream
SPIFFS Stream
TCP Client Stream
Tone Stream
Flash-Embedding Stream
TTS Stream

Each stream is initialized with a configuration structure as input, and the returned audio_element_handle_t handle is used to call the functions in audio_element.h. Most streams have two types, AUDIO_STREAM_READER (reader) and AUDIO_STREAM_WRITER (writer). For example, to set the I2S stream type, use i2s_stream_init() and i2s_stream_cfg_t.

See description below for the API details.

Algorithm Stream 

The algorithm stream integrates front-end algorithms such as acoustic echo cancellation (AEC), automatic gain control (AGC), and noise suppression (NS) to process the received audio. It is often used in audio preprocessing scenarios, including VoIP, speech recognition, and keyword wake-up. The stream calls esp-sr and thus occupies large memory. The stream only supports the AUDIO_STREAM_READER type.

Application Example

Header File

components/audio_stream/include/algorithm_stream.h

Functions

audio_element_handle_t algo_stream_init(algorithm_stream_cfg_t *config)

Initialize algorithm stream.

Parameters: config – The algorithm Stream configuration
Returns: The audio element handle

audio_element_err_t algo_stream_set_delay(audio_element_handle_t el, ringbuf_handle_t ringbuf, int delay_ms)

Set the delay of the playback signal or the recording signal when using ALGORITHM_STREAM_INPUT_TYPE2.

Note

The AEC internal buffering mechanism requires that the recording signal is delayed by around 0 - 10 ms compared to the corresponding reference (playback) signal.

Parameters

el – Handle of element
ringbuf – Handle of ringbuf
delay_ms – The delay between playback and recording, in ms. To tune this value, set debug_input to true in the configuration to obtain the original input data (left channel: the signal captured from the microphone; right channel: the signal played to the speaker), then measure the delay with an audio analysis tool.

Returns

ESP_OK
ESP_FAIL
ESP_ERR_INVALID_ARG
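
The measurement described for delay_ms can also be approximated offline. The following is a minimal sketch and not part of ESP-ADF: it assumes the debug data is interleaved 16-bit stereo with the microphone signal on the left channel and the playback signal on the right, and it estimates the lag with a brute-force cross-correlation. The names estimate_lag_frames and lag_frames_to_ms are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Estimate by how many frames the microphone (left) channel lags the
 * playback (right) channel in interleaved 16-bit stereo data.
 * Returns the lag in [0, max_lag] with the highest cross-correlation. */
static int estimate_lag_frames(const int16_t *stereo, size_t frames, int max_lag)
{
    assert(stereo != NULL && max_lag >= 0 && (size_t)max_lag < frames);
    int best_lag = 0;
    int64_t best_score = INT64_MIN;
    for (int lag = 0; lag <= max_lag; lag++) {
        int64_t score = 0;
        for (size_t i = 0; i + (size_t)lag < frames; i++) {
            int32_t ref = stereo[2 * i + 1];              /* right: played signal */
            int32_t mic = stereo[2 * (i + (size_t)lag)];  /* left: captured signal */
            score += (int64_t)ref * mic;
        }
        if (score > best_score) {
            best_score = score;
            best_lag = lag;
        }
    }
    return best_lag;
}

/* Convert a lag in frames to milliseconds, rounded to the nearest integer. */
static int lag_frames_to_ms(int lag_frames, int sample_rate_hz)
{
    assert(lag_frames >= 0 && sample_rate_hz > 0);
    return (int)(((int64_t)lag_frames * 1000 + sample_rate_hz / 2) / sample_rate_hz);
}
```

The resulting value in milliseconds is only a starting point for delay_ms; the note above about the 0 - 10 ms window still applies.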

esp_err_t algorithm_mono_fix(uint8_t *sbuff, uint32_t len)

Fix I2S mono noise issue.

Note

This API is only for the ESP32 with 16-bit I2S.

Parameters

sbuff – I2S data buffer
len – I2S data len

Returns

ESP_OK

Structures

struct algorithm_stream_cfg_t

Algorithm stream configurations.

Public Members

algorithm_stream_input_type_t input_type: Input type of stream

int task_stack: Task stack size

int task_prio: Task peroid

int task_core: The core that task to be created

int out_rb_size: Size of output ringbuffer

bool stack_in_ext: Try to allocate stack in external memory

int rec_linear_factor: The linear amplification factor of the recording signal

int ref_linear_factor: The linear amplification factor of the reference signal

bool debug_input: Debug algorithm input data

bool swap_ch: Swap left and right channels

int8_t algo_mask: Choose algorithm to use

int sample_rate: The sampling rate of the input PCM (in Hz)

int mic_ch: MIC channel num

int agc_gain: AGC gain (dB) for voice communication

bool aec_low_cost: AEC uses less CPU and RAM resources, but has poor suppression of nonlinear distortion

char *partition_label: Label of the partition that stores the model data
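
To illustrate what a linear amplification factor such as rec_linear_factor or ref_linear_factor means, here is a minimal sketch that multiplies 16-bit PCM samples by an integer factor. The saturation step and the function name amplify_linear are assumptions made for this example; the stream's internal processing may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply every 16-bit sample by `factor`, saturating at the int16_t limits. */
static void amplify_linear(int16_t *samples, size_t count, int factor)
{
    assert(samples != NULL && factor >= 0);
    for (size_t i = 0; i < count; i++) {
        int64_t v = (int64_t)samples[i] * factor;
        if (v > INT16_MAX) {
            v = INT16_MAX;
        }
        if (v < INT16_MIN) {
            v = INT16_MIN;
        }
        samples[i] = (int16_t)v;
    }
}
```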

Macros

ALGORITHM_STREAM_PINNED_TO_CORE

ALGORITHM_STREAM_TASK_PERIOD

ALGORITHM_STREAM_RINGBUFFER_SIZE

ALGORITHM_STREAM_TASK_STACK_SIZE

ALGORITHM_STREAM_DEFAULT_SAMPLE_RATE_HZ

ALGORITHM_STREAM_DEFAULT_SAMPLE_BIT

ALGORITHM_STREAM_DEFAULT_MIC_CHANNELS

ALGORITHM_STREAM_DEFAULT_AGC_GAIN_DB

ALGORITHM_STREAM_DEFAULT_MASK

ALGORITHM_STREAM_CFG_DEFAULT()

Enumerations

enum algorithm_stream_input_type_t

Two types of algorithm stream input method.

Values:

enumerator ALGORITHM_STREAM_INPUT_TYPE1: Type 1 is used by the mini-board by default. The reference signal and the recording signal are read from the left channel and the right channel of the same I2S, respectively.

enumerator ALGORITHM_STREAM_INPUT_TYPE2: When type 2 is chosen, the recording signal and the reference signal must be supplied by the user.
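
The Type 1 layout, where both signals share one I2S frame, can be pictured with a small de-interleaving sketch. It is an illustration only, assuming interleaved 16-bit stereo frames; split_stereo is a hypothetical helper and not an ESP-ADF function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split interleaved 16-bit stereo frames (L, R, L, R, ...) into two mono
 * buffers. With the Type 1 layout described above, `left` would carry the
 * reference signal and `right` the recording signal. */
static void split_stereo(const int16_t *stereo, size_t frames,
                         int16_t *left, int16_t *right)
{
    assert(stereo != NULL && left != NULL && right != NULL);
    for (size_t i = 0; i < frames; i++) {
        left[i]  = stereo[2 * i];
        right[i] = stereo[2 * i + 1];
    }
}
```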

enum algorithm_stream_mask_t

Choose the algorithm to be used.

Values:

enumerator ALGORITHM_STREAM_USE_AEC: Use AEC

enumerator ALGORITHM_STREAM_USE_AGC: Use AGC

enumerator ALGORITHM_STREAM_USE_NS: Use NS

enumerator ALGORITHM_STREAM_USE_VAD: Use VAD
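
Since algo_mask selects several algorithms at once, the enumerators are meant to be combined as bit flags. The sketch below shows the usual flag operations with local stand-in constants; the DEMO_ names and their bit positions are assumptions for illustration, and firmware code should use the enumerators from algorithm_stream.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the algorithm_stream_mask_t enumerators.
 * The bit positions are an assumption; use the real header in firmware. */
enum {
    DEMO_USE_AEC = 1 << 0,
    DEMO_USE_AGC = 1 << 1,
    DEMO_USE_NS  = 1 << 2,
    DEMO_USE_VAD = 1 << 3,
    DEMO_USE_ALL = 0x0F,
};

/* Enable the algorithms given in `flags`. */
static int8_t demo_mask_enable(int8_t mask, int flags)
{
    assert((flags & ~DEMO_USE_ALL) == 0);
    return (int8_t)(mask | flags);
}

/* Disable the algorithms given in `flags`. */
static int8_t demo_mask_disable(int8_t mask, int flags)
{
    assert((flags & ~DEMO_USE_ALL) == 0);
    return (int8_t)(mask & ~flags);
}

/* Check whether any algorithm in `flags` is enabled. */
static bool demo_mask_has(int8_t mask, int flags)
{
    return (mask & flags) != 0;
}
```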

FatFs Stream 

The FatFs stream reads data from and writes data to FatFs. It has two types: “reader” and “writer”. The type is defined by audio_stream_type_t.

Functions

audio_element_handle_t fatfs_stream_init(fatfs_stream_cfg_t *config)

Create a handle to an Audio Element that streams data from FatFs to another Element, or writes data received from other Elements to FatFs, depending on the configured stream type (AUDIO_STREAM_READER or AUDIO_STREAM_WRITER).

Parameters: config – The configuration
Returns: The Audio Element handle

Structures

struct fatfs_stream_cfg_t

FatFs Stream configurations. If any entry is zero, the default value is used for it.

Public Members

audio_stream_type_t type: Stream type

int buf_sz: Audio Element Buffer size

int out_rb_size: Size of output ringbuffer

int task_stack: Task stack size

int task_core: Task running in core (0 or 1)

int task_prio: Task priority (based on freeRTOS priority)

bool ext_stack: Allocate stack in external RAM

bool write_header: Whether to write the amrnb/amrwb header to FatFs (true: write the header; false: do not write it)

Macros

FATFS_STREAM_BUF_SIZE

FATFS_STREAM_TASK_STACK

FATFS_STREAM_TASK_CORE

FATFS_STREAM_TASK_PRIO

FATFS_STREAM_RINGBUFFER_SIZE

FATFS_STREAM_CFG_DEFAULT()

HTTP Stream 

The HTTP stream obtains and sends data through esp_http_client(). The stream has two types: “reader” and “writer”, and the type is defined by audio_stream_type_t. AUDIO_STREAM_READER supports HTTP, HTTPS, HTTP Live Stream, and other protocols. Make sure the network is connected before using the stream.

Application Example

Reader example
- player/pipeline_living_stream
- player/pipeline_http_mp3
Writer example
- recorder/pipeline_raw_http

Header File

components/audio_stream/include/http_stream.h

Functions

audio_element_handle_t http_stream_init(http_stream_cfg_t *config)

Create a handle to an Audio Element that streams data from HTTP to another Element, or sends data received from other Elements over HTTP, depending on the configured stream type (AUDIO_STREAM_READER or AUDIO_STREAM_WRITER).

Parameters: config – The configuration
Returns: The Audio Element handle

esp_err_t http_stream_next_track(audio_element_handle_t el)

Connect to the next track in the playlist.

This function can be used in the event_handler of http_stream. Call it to connect to the next track in the playlist upon receiving the `HTTP_STREAM_FINISH_TRACK` event.

Parameters

el – The http_stream element handle

Returns

ESP_OK on success
ESP_FAIL on errors

esp_err_t http_stream_restart(audio_element_handle_t el)

esp_err_t http_stream_fetch_again(audio_element_handle_t el)

Try to fetch the tracks again.

For a live stream, the URIs need to be fetched repeatedly.

Parameters

el – The http_stream element handle

Returns

ESP_OK on success
ESP_ERR_NOT_SUPPORTED if playlist is finished

esp_err_t http_stream_set_server_cert(audio_element_handle_t el, const char *cert)

Set the SSL server certificate.

Note

PEM format as a string, if the client needs to verify the server.

Parameters

el – The http_stream element handle
cert – Server certificate

Returns

ESP_OK on success

Structures

struct http_stream_event_msg_t

Stream event message.

Public Members

http_stream_event_id_t event_id: Event ID

void *http_client: Reference to the HTTP Client used by this HTTP Stream

void *buffer: Reference to the buffer used by the Audio Element

int buffer_len: Length of buffer

void *user_data: User data context, from http_stream_cfg_t

audio_element_handle_t el: Audio element context

struct http_stream_cfg_t

HTTP Stream configurations. The default value is used for any entry that is zero.

Public Members

audio_stream_type_t type: Type of stream

int out_rb_size: Size of output ringbuffer

int task_stack: Task stack size

int task_core: Task running in core (0 or 1)

int task_prio: Task priority (based on freeRTOS priority)

bool stack_in_ext: Try to allocate stack in external memory

http_stream_event_handle_t event_handle: The hook function for HTTP Stream

void *user_data: User data context

bool auto_connect_next_track: connect next track without open/close

bool enable_playlist_parser: Enable playlist parser

int multi_out_num: The number of multiple output

const char *cert_pem: SSL server certificate, PEM format as a string, if the client needs to verify the server

esp_err_t (*crt_bundle_attach)(void *conf): Function pointer to esp_crt_bundle_attach. Enables the use of the certificate bundle for server verification; must be enabled in menuconfig

int request_size: Size of the data requested from http_client each time. Defaults to DEFAULT_ELEMENT_BUFFER_LENGTH if set to 0. Choose this value with care if the audio frame size is small and low-latency playback is required

int request_range_size: Range size used in the request header (Range: bytes=start-end). The full range of the resource is requested if set to 0. A range size larger than the request size is recommended

const char *user_agent: The User Agent string to send with HTTP requests
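
To make the Range: bytes=start-end notation used by request_range_size concrete, the following sketch formats the header value for one chunk. It is an illustration under stated assumptions, not the stream's actual implementation; format_range is a hypothetical helper. HTTP byte ranges are inclusive at both ends, which is why the end offset is start + range_size - 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format the value of an HTTP Range header for one chunk.
 * Byte ranges are inclusive, so a chunk of `range_size` bytes starting
 * at `start` ends at start + range_size - 1.
 * Returns the number of characters written, or -1 if `out` is too small. */
static int format_range(char *out, size_t out_len, long start, long range_size)
{
    assert(out != NULL && start >= 0 && range_size > 0);
    int n = snprintf(out, out_len, "bytes=%ld-%ld", start, start + range_size - 1);
    return (n < 0 || (size_t)n >= out_len) ? -1 : n;
}
```

For example, two consecutive 4096-byte chunks would use the values bytes=0-4095 and bytes=4096-8191.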

Macros

HTTP_STREAM_TASK_STACK

HTTP_STREAM_TASK_CORE

HTTP_STREAM_TASK_PRIO

HTTP_STREAM_RINGBUFFER_SIZE

HTTP_STREAM_CFG_DEFAULT()

Type Definitions

typedef int (*http_stream_event_handle_t)(http_stream_event_msg_t *msg)

Enumerations

enum http_stream_event_id_t

HTTP Stream hook type.

Values:

enumerator HTTP_STREAM_PRE_REQUEST: The event handler will be called before the HTTP Client makes the connection to the server

enumerator HTTP_STREAM_ON_REQUEST: The event handler will be called when the HTTP Client is requesting data. If the function returns -1 (ESP_FAIL), the HTTP Client will be stopped. If the function returns a value > 0, the HTTP Stream will ignore the post_field. If the function returns 0, the HTTP Stream continues to send data from the post_field (if any)

enumerator HTTP_STREAM_ON_RESPONSE: The event handler will be called when the HTTP Client is receiving data. If the function returns -1 (ESP_FAIL), the HTTP Client will be stopped. If the function returns a value > 0, the HTTP Stream will ignore the read function. If the function returns 0, the HTTP Stream continues to read data from the HTTP Server

enumerator HTTP_STREAM_POST_REQUEST: The event handler will be called after the HTTP Client sends the header and body to the server, before fetching the headers

enumerator HTTP_STREAM_FINISH_REQUEST: The event handler will be called after the HTTP Client fetches the header and is ready to read the HTTP body

enumerator HTTP_STREAM_RESOLVE_ALL_TRACKS

enumerator HTTP_STREAM_FINISH_TRACK

enumerator HTTP_STREAM_FINISH_PLAYLIST

I2S Stream 

The I2S stream receives and transmits audio data through the chip’s I2S, PDM, ADC, and DAC interfaces. To use the ADC and DAC functions, the chip needs to define SOC_I2S_SUPPORTS_ADC_DAC. The stream integrates automatic level control (ALC) to adjust volume, multi-channel output, and sending audio data with extended bit width. The relevant control bits are defined in i2s_stream_cfg_t.

Application Example

Reader example: recorder/pipeline_wav_amr_sdcard
Writer example: get-started/play_mp3_control

Header File

components/audio_stream/include/i2s_stream.h

Functions

audio_element_handle_t i2s_stream_init(i2s_stream_cfg_t *config)

Create a handle to an Audio Element that streams data from I2S to another Element, or sends data received from other Elements to I2S, depending on whether the configured stream type is AUDIO_STREAM_READER or AUDIO_STREAM_WRITER.

Note

If I2S stream is enabled with built-in DAC mode, please don’t use I2S_NUM_1. The built-in DAC functions are only supported on I2S0 for the current ESP32 chip.

Parameters: config – The configuration
Returns: The Audio Element handle

esp_err_t i2s_stream_set_channel_type(i2s_stream_cfg_t *config, i2s_channel_type_t type)

Set I2S stream channel format type.

Note

This function only updates i2s_stream_cfg_t, so it needs to be called before i2s_stream_init.

Parameters

config – [in] The I2S stream configuration
type – [in] I2S channel format type

Returns

ESP_OK
ESP_ERR_INVALID_ARG

esp_err_t i2s_stream_set_clk(audio_element_handle_t i2s_stream, int rate, int bits, int ch)

Set up the clock for the I2S Stream. This function is only used with a handle created by i2s_stream_init.

Parameters

i2s_stream – [in] The i2s element handle
rate – [in] Clock rate (in Hz)
bits – [in] Audio bit width (8, 16, 24, 32)
ch – [in] Number of audio channels (1: Mono, 2: Stereo). In TDM mode, ch is the slot mask (e.g. I2S_TDM_SLOT0 | I2S_TDM_SLOT1 | I2S_TDM_SLOT2 | I2S_TDM_SLOT3)

Returns

ESP_OK
ESP_FAIL
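
Because ch doubles as a slot mask in TDM mode, it helps to see how such a mask is composed and inspected. The sketch below uses a local DEMO_TDM_SLOT macro as a stand-in for the I2S_TDM_SLOTn constants, assuming one bit per slot; the real values come from the ESP-IDF I2S driver headers, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the I2S_TDM_SLOTn constants: one bit per slot. */
#define DEMO_TDM_SLOT(n) (1u << (n))

/* Count the slots enabled in a TDM slot mask. */
static int tdm_active_slots(uint32_t slot_mask)
{
    int count = 0;
    while (slot_mask) {
        slot_mask &= slot_mask - 1;  /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Return 1 if slot `n` (0..15) is enabled in the mask, 0 otherwise. */
static int tdm_slot_enabled(uint32_t slot_mask, int n)
{
    assert(n >= 0 && n < 16);
    return (slot_mask & DEMO_TDM_SLOT(n)) != 0;
}
```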

esp_err_t i2s_alc_volume_set(audio_element_handle_t i2s_stream, int volume)

Set the volume of input audio stream with ALC. Positive value indicates an increase in volume, negative value indicates a decrease in volume, 0 indicates the volume level remains unchanged.

Parameters

i2s_stream – [in] The i2s element handle
volume – [in] The gain of input audio stream:
- Supported range [-64, 63], unit: dB

Returns

ESP_OK
ESP_FAIL

esp_err_t i2s_alc_volume_get(audio_element_handle_t i2s_stream, int *volume)

Get volume of stream.

Parameters

i2s_stream – [in] The i2s element handle
volume – [out] The volume of the stream

Returns

ESP_OK
ESP_FAIL

esp_err_t i2s_stream_sync_delay(audio_element_handle_t i2s_stream, int delay_ms)

Set sync delay of stream.

Parameters

i2s_stream – [in] The i2s element handle
delay_ms – [in] The delay of stream

Returns

ESP_OK
ESP_FAIL

Structures

struct i2s_stream_cfg_t

I2S Stream configurations. The default value is used for any entry that is zero.

Public Members

audio_stream_type_t type: Type of stream

i2s_comm_mode_t transmit_mode: I2S transmit mode

i2s_chan_config_t chan_cfg: I2S controller channel configuration

i2s_std_config_t std_cfg: I2S standard mode major configuration, including the clock/slot/gpio configuration

bool use_alc: Flag for ALC. Set to true to use ALC, false otherwise

int volume: The volume of audio input data will be set.

int out_rb_size: Size of output ringbuffer

int task_stack: Task stack size

int task_core: Task running in core (0 or 1)

int task_prio: Task priority (based on freeRTOS priority)

bool stack_in_ext: Try to allocate stack in external memory

int multi_out_num: The number of multiple output

bool uninstall_drv: Whether to uninstall the I2S driver when the stream is destroyed

bool need_expand: Whether to expand the I2S data

i2s_data_bit_width_t expand_src_bits: The source bits per sample when expanding data

int buffer_len: Buffer length used for an Element. Note: when ‘bits_per_sample’ is 24 bit, the buffer length must be a multiple of 3. The recommended value is 3600
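
The multiple-of-3 rule for 24-bit data follows from each sample occupying 3 bytes. A minimal sketch of a helper that rounds a requested length down to a whole number of samples is shown below; align_buffer_len is a hypothetical name, and extending the rule to the other bit widths is an assumption of this example rather than a documented requirement.

```c
#include <assert.h>

/* Round `requested` down to a whole number of samples for the given bit width. */
static int align_buffer_len(int requested, int bits_per_sample)
{
    assert(requested > 0);
    assert(bits_per_sample == 8 || bits_per_sample == 16 ||
           bits_per_sample == 24 || bits_per_sample == 32);
    int bytes_per_sample = bits_per_sample / 8;
    return requested - (requested % bytes_per_sample);
}
```

With 24-bit samples, the recommended 3600 is left unchanged, while a request of 4096 is reduced to 4095.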

Macros

I2S_STREAM_TASK_STACK

I2S_STREAM_BUF_SIZE

I2S_STREAM_TASK_PRIO

I2S_STREAM_TASK_CORE

I2S_STREAM_RINGBUFFER_SIZE

I2S_STREAM_CFG_DEFAULT()

I2S_STREAM_CFG_DEFAULT_WITH_PARA(port, rate, bits, stream_type)

I2S_STD_PHILIPS_SLOT_DEFAULT_ADF_CONFIG(bits_per_sample, mono_or_stereo)

I2S_STREAM_CFG_DEFAULT_WITH_TYLE_AND_CH(port, rate, bits, stream_type, channel)

Enumerations

enum i2s_channel_type_t

Values:

enumerator I2S_CHANNEL_TYPE_RIGHT_LEFT: Separated left and right channel

enumerator I2S_CHANNEL_TYPE_ALL_RIGHT: Load right channel data in both two channels

enumerator I2S_CHANNEL_TYPE_ALL_LEFT: Load left channel data in both two channels

enumerator I2S_CHANNEL_TYPE_ONLY_RIGHT: Only load data in right channel (mono mode)

enumerator I2S_CHANNEL_TYPE_ONLY_LEFT: Only load data in left channel (mono mode)

PWM Stream 

In some cost-sensitive scenarios, the audio signal is not converted by a DAC but is modulated by PWM (pulse width modulation) and then recovered by a filter circuit. The PWM stream modulates the audio signal with the chip’s PWM and sends out the processed audio. It only has the AUDIO_STREAM_WRITER type. Note that digital-to-analog conversion by PWM has a lower signal-to-noise ratio.
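
As a rough picture of the modulation step, the sketch below maps a signed 16-bit PCM sample to an unsigned duty value for a given duty resolution. It only illustrates the general idea under stated assumptions and is not the PWM stream's actual driver code; pcm16_to_duty is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Map a signed 16-bit PCM sample to an unsigned PWM duty value with
 * `resolution_bits` of resolution (1..16). Silence (0) maps to mid-scale. */
static uint32_t pcm16_to_duty(int16_t sample, int resolution_bits)
{
    assert(resolution_bits >= 1 && resolution_bits <= 16);
    uint32_t unsigned_sample = (uint32_t)((int32_t)sample + 32768);  /* 0..65535 */
    return unsigned_sample >> (16 - resolution_bits);
}
```

Dropping the low bits when the resolution is below 16 is one reason the signal-to-noise ratio is lower than with a DAC.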

Application Example

Writer example: player/pipeline_play_mp3_with_dac_or_pwm

Header File

components/audio_stream/include/pwm_stream.h

Functions

audio_element_handle_t pwm_stream_init(pwm_stream_cfg_t *config)

Initialize the PWM stream. Only the AUDIO_STREAM_WRITER type is supported.

Parameters: config – The PWM Stream configuration
Returns: The audio element handle

esp_err_t pwm_stream_set_clk(audio_element_handle_t pwm_stream, int rate, int bits, int ch)

Set up the clock for the PWM Stream. This function is only used with a handle created by pwm_stream_init.

Parameters

pwm_stream – [in] The pwm element handle
rate – [in] Clock rate (in Hz)
bits – [in] Audio bit width (16, 32)
ch – [in] Number of Audio channels (1: Mono, 2: Stereo)

Returns

ESP_OK
ESP_FAIL

Structures

struct audio_pwm_config_t

PWM audio configurations.

Public Members

timer_group_t tg_num: timer group number (0 - 1)

timer_idx_t timer_num: timer number (0 - 1)

int gpio_num_left: the LEDC output gpio_num, Left channel

int gpio_num_right: the LEDC output gpio_num, Right channel

ledc_channel_t ledc_channel_left: LEDC channel (0 - 7), Corresponding to left channel

ledc_channel_t ledc_channel_right: LEDC channel (0 - 7), Corresponding to right channel

ledc_timer_t ledc_timer_sel: Select the timer source of channel (0 - 3)

ledc_timer_bit_t duty_resolution: ledc pwm bits

uint32_t data_len: ringbuffer size

struct pwm_stream_cfg_t

PWM Stream configurations. The default value is used for any entry that is zero.

Public Members

audio_stream_type_t type: Type of stream

audio_pwm_config_t pwm_config: driver configurations

int out_rb_size: Size of output ringbuffer

int task_stack: Task stack size

int task_core: Task running in core (0 or 1)

int task_prio: Task priority (based on freeRTOS priority)

int buffer_len: pwm_stream buffer length

bool ext_stack: Allocate stack in external RAM

Macros

PWM_STREAM_GPIO_NUM_LEFT

PWM_STREAM_GPIO_NUM_RIGHT

PWM_STREAM_TASK_STACK

PWM_STREAM_BUF_SIZE

PWM_STREAM_TASK_PRIO

PWM_STREAM_TASK_CORE

PWM_STREAM_RINGBUFFER_SIZE

PWM_CONFIG_RINGBUFFER_SIZE

PWM_STREAM_CFG_DEFAULT()

Raw Stream 

The raw stream is used to obtain the output data of the previous element of the connection or to provide the data for the next element of the connection. It does not create a thread. For AUDIO_STREAM_READER, the connection is [i2s] -> [filter] -> [raw] or [i2s] -> [codec-amr] -> [raw]. For AUDIO_STREAM_WRITER, the connection is [raw] -> [codec-mp3] -> [i2s].

Application Example

Reader example: protocols/voip
Writer example: advanced_examples/downmix_pipeline

Header File

components/audio_stream/include/raw_stream.h

Functions

audio_element_handle_t raw_stream_init(raw_stream_cfg_t *cfg)

Initialize RAW stream.

Parameters: cfg – The RAW Stream configuration
Returns: The audio element handle

int raw_stream_read(audio_element_handle_t pipeline, char *buffer, int buf_size)

Read data from Stream.

Parameters

pipeline – The audio pipeline handle
buffer – The buffer
buf_size – Maximum number of bytes to be read.

Returns

Number of bytes actually read.

int raw_stream_write(audio_element_handle_t pipeline, char *buffer, int buf_size)

Write data to Stream.

Parameters

pipeline – The audio pipeline handle
buffer – The buffer
buf_size – Number of bytes to write

Returns

Number of bytes written

Structures

struct raw_stream_cfg_t

Raw Stream configurations.

Raw stream provides APIs to obtain the pipeline data without output stream or fill the pipeline data without input stream. The stream has two types / modes, reader and writer:

AUDIO_STREAM_READER, e.g. [i2s]->[filter]->[raw], [i2s]->[codec-amr]->[raw]
AUDIO_STREAM_WRITER, e.g. [raw]->[codec-mp3]->[i2s]

Public Members

audio_stream_type_t type: Type of stream

int out_rb_size: Size of output ringbuffer

Macros

RAW_STREAM_RINGBUFFER_SIZE

RAW_STREAM_CFG_DEFAULT()

SPIFFS Stream 

The SPIFFS stream reads and writes audio data from or into SPIFFS.

Application Example

player/pipeline_spiffs_mp3

Header File

components/audio_stream/include/spiffs_stream.h

Functions

audio_element_handle_t spiffs_stream_init(spiffs_stream_cfg_t *config)

Create a handle to an Audio Element that streams data from SPIFFS to another Element, or writes data received from other Elements to SPIFFS, depending on the configured stream type (AUDIO_STREAM_READER or AUDIO_STREAM_WRITER).

Parameters: config – The configuration
Returns: The Audio Element handle

Structures

struct spiffs_stream_cfg_t

SPIFFS Stream configuration. If any entry is zero, the default value is used for it.

Public Members

audio_stream_type_t type: Stream type

int buf_sz: Audio Element Buffer size

int out_rb_size: Size of output ringbuffer

int task_stack: Task stack size

int task_core: Task running in core (0 or 1)

int task_prio: Task priority (based on freeRTOS priority)

bool write_header: Whether to write the amrnb/amrwb header to SPIFFS (true: write the header; false: do not write it)

Macros

SPIFFS_STREAM_BUF_SIZE

SPIFFS_STREAM_TASK_STACK

SPIFFS_STREAM_TASK_CORE

SPIFFS_STREAM_TASK_PRIO

SPIFFS_STREAM_RINGBUFFER_SIZE

SPIFFS_STREAM_CFG_DEFAULT()

TCP Client Stream 

The TCP client stream reads and writes server data over TCP.

Application Example

get-started/pipeline_tcp_client

Header File

components/audio_stream/include/tcp_client_stream.h

Functions

audio_element_handle_t tcp_stream_init(tcp_stream_cfg_t *config)

Initialize a TCP stream to/from an audio element. This function creates a TCP stream to/from an audio element depending on the stream type configuration (e.g., AUDIO_STREAM_READER or AUDIO_STREAM_WRITER). The handle of the audio element is then returned.

Parameters: config – The configuration
Returns: The audio element handle

Structures

struct tcp_stream_event_msg

TCP Stream message configuration.

Public Members

void *source: Element handle

void *data: Data of input/output

int data_len: Data length of input/output

esp_transport_handle_t sock_fd: handle of socket

struct tcp_stream_cfg_t

TCP Stream configuration. If any entry is zero, the default value is used for it.

Public Members

audio_stream_type_t type: Type of stream

int timeout_ms: Timeout for read/write, in ms

int port: TCP port

char *host: TCP host

int task_stack: Task stack size

int task_core: Task running in core (0 or 1)

int task_prio: Task priority (based on freeRTOS priority)

bool ext_stack: Allocate stack in external RAM

tcp_stream_event_handle_cb event_handler: TCP stream event callback

void *event_ctx: User context

Macros

TCP_STREAM_DEFAULT_PORT: TCP stream parameters.

TCP_STREAM_TASK_STACK

TCP_STREAM_BUF_SIZE

TCP_STREAM_TASK_PRIO

TCP_STREAM_TASK_CORE

TCP_SERVER_DEFAULT_RESPONSE_LENGTH

TCP_STREAM_CFG_DEFAULT()

Type Definitions

typedef struct tcp_stream_event_msg tcp_stream_event_msg_t: TCP Stream message configuration.

typedef esp_err_t (*tcp_stream_event_handle_cb)(tcp_stream_event_msg_t *msg, tcp_stream_status_t state, void *event_ctx)

Enumerations

enum tcp_stream_status_t

Values:

enumerator TCP_STREAM_STATE_NONE

enumerator TCP_STREAM_STATE_CONNECTED

Tone Stream 

The tone stream reads the data generated by tools/audio_tone/mk_audio_tone.py. It only supports the AUDIO_STREAM_READER type.

Application Example

player/pipeline_flash_tone

Header File

components/audio_stream/include/tone_stream.h

Functions

audio_element_handle_t tone_stream_init(tone_stream_cfg_t *config)

Create an Audio Element handle to stream data from flash to another Element. Only the AUDIO_STREAM_READER type is supported.

Parameters: config – The configuration
Returns: The Audio Element handle

Structures

struct tone_stream_cfg_t

Tone Stream configurations. If any entry is zero, the default value is used for it.

Public Members

audio_stream_type_t type: Stream type

int buf_sz: Audio Element Buffer size

int out_rb_size: Size of output ringbuffer

int task_stack: Task stack size

int task_core: Task running in core (0 or 1)

int task_prio: Task priority (based on freeRTOS priority)

const char *label: Label of tone stored in flash. The default value is flash_tone

bool extern_stack: Allocate the task stack in external RAM

bool use_delegate: Read the tone partition with esp_delegate. If the task stack is in external RAM, this MUST be TRUE

Macros

TONE_STREAM_BUF_SIZE

TONE_STREAM_TASK_STACK

TONE_STREAM_TASK_CORE

TONE_STREAM_TASK_PRIO

TONE_STREAM_RINGBUFFER_SIZE

TONE_STREAM_EXT_STACK

TONE_STREAM_USE_DELEGATE

TONE_STREAM_CFG_DEFAULT()

Flash-Embedding Stream 

The flash-embedding stream reads the data generated by tools/audio_tone/mk_embed_flash.py. It only supports the AUDIO_STREAM_READER type.

Application Example

player/pipeline_embed_flash_tone

Header File

components/audio_stream/include/embed_flash_stream.h

Functions

audio_element_handle_t embed_flash_stream_init(embed_flash_stream_cfg_t *config)

Create an Audio Element handle to stream data from flash to another Element. Only the AUDIO_STREAM_READER type is supported.

Parameters: config – The configuration
Returns: The Audio Element handle

esp_err_t embed_flash_stream_set_context(audio_element_handle_t embed_stream, const embed_item_info_t *context, int max_num)

Set the embed flash context.

This function mainly provides information about the embed flash data.

Parameters

embed_stream – [in] The embed flash element handle
context – [in] The embed flash context
max_num – [in] The number of embed flash context

Returns

ESP_OK
ESP_FAIL

Structures

struct embed_flash_stream_cfg_t

Flash-embedding stream configurations. If any entry is zero, the default value is used for it.

Public Members

int buf_sz: Audio Element Buffer size

int out_rb_size: Size of output ringbuffer

int task_stack: Task stack size

int task_core: Task running in core (0 or 1)

int task_prio: Task priority (based on freeRTOS priority)

bool extern_stack: At present, task stack can only be placed on SRAM, so it should always be set to false

struct embed_item_info

Embed tone information in flash.

Public Members

const uint8_t *address: The corresponding address in flash

int size: Size of corresponding data
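
The context passed to embed_flash_stream_set_context is an array of such address/size pairs together with its element count. The sketch below mirrors the two documented fields in a local demo_embed_item_t type and shows a bounds-checked lookup by index; the type and function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the two documented fields, for illustration only. */
typedef struct {
    const uint8_t *address;  /* start of the embedded data */
    int size;                /* length of the data in bytes */
} demo_embed_item_t;

/* Return the item at `index`, or NULL if the index is outside the table. */
static const demo_embed_item_t *demo_embed_lookup(const demo_embed_item_t *table,
                                                  int max_num, int index)
{
    assert(table != NULL && max_num >= 0);
    if (index < 0 || index >= max_num) {
        return NULL;
    }
    return &table[index];
}
```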

Macros

EMBED_FLASH_STREAM_BUF_SIZE

EMBED_FLASH_STREAM_TASK_STACK

EMBED_FLASH_STREAM_TASK_CORE

EMBED_FLASH_STREAM_TASK_PRIO

EMBED_FLASH_STREAM_RINGBUFFER_SIZE

EMBED_FLASH_STREAM_EXT_STACK

EMBED_FLASH_STREAM_CFG_DEFAULT()

Type Definitions

typedef struct embed_item_info embed_item_info_t: Embed tone information in flash.

TTS Stream 

The text-to-speech stream (TTS stream) obtains the esp_tts_voice data of esp-sr. It only supports the AUDIO_STREAM_READER type.

Application Example

Reader example: player/pipeline_tts_stream

Header File

components/audio_stream/include/tts_stream.h

Functions

audio_element_handle_t tts_stream_init(tts_stream_cfg_t *config)

Create a handle to an Audio Element to stream data from TTS to another Element. Only the AUDIO_STREAM_READER type is supported for now.

Parameters: config – The configuration
Returns: The Audio Element handle

esp_err_t tts_stream_set_strings(audio_element_handle_t el, const char *strings)

Set tts stream strings.

Parameters

el – [in] The audio element handle
strings – [in] The string pointer

Returns

ESP_OK
ESP_FAIL

esp_err_t tts_stream_set_speed(audio_element_handle_t el, tts_voice_speed_t speed)

Set tts stream voice speed.

Parameters

el – [in] The esp_audio instance
speed – [in] The speed to set. Valid values are 0-5; 0 is the slowest speed.

Returns

ESP_OK
ESP_FAIL

esp_err_t tts_stream_get_speed(audio_element_handle_t el, tts_voice_speed_t *speed)

Get tts stream voice speed.

Parameters

el – [in] The esp_audio instance
speed – [out] The returned tts stream speed, in the range [0, 5]

Returns

ESP_OK
ESP_FAIL

Structures

struct tts_stream_cfg_t

TTS Stream configurations. If any entry is zero, the default value is used for it.

Public Members

audio_stream_type_t type: Stream type

int buf_sz: Audio Element Buffer size

int out_rb_size: Size of output ringbuffer

int task_stack: Task stack size

int task_core: Task running in core (0 or 1)

int task_prio: Task priority (based on freeRTOS priority)

bool ext_stack: Allocate stack in external RAM

Macros

TTS_STREAM_BUF_SIZE

TTS_STREAM_TASK_STACK

TTS_STREAM_TASK_CORE

TTS_STREAM_TASK_PRIO

TTS_STREAM_RINGBUFFER_SIZE

TTS_STREAM_CFG_DEFAULT()

Enumerations

enum tts_voice_speed_t

Values:

enumerator TTS_VOICE_SPEED_0

enumerator TTS_VOICE_SPEED_1

enumerator TTS_VOICE_SPEED_2

enumerator TTS_VOICE_SPEED_3

enumerator TTS_VOICE_SPEED_4

enumerator TTS_VOICE_SPEED_5

enumerator TTS_VOICE_SPEED_MAX
