ESP Audio Render

ESP Audio Render is an advanced audio rendering component for Espressif SoCs. It mixes one or more PCM inputs and delivers the result to a user-provided writer callback. Each input is called a stream. A stream processor (per-stream EQ, Sonic speed change, etc.) can be connected before the mixer, and a mixed processor (ALC, limiter, etc.) can be connected after it. The processing chain is built dynamically from ESP-GMF elements to optimize performance. Typical use cases include layering music, TTS, and notification sounds, as well as multi-track audio synthesis.
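The mix-then-write flow described above can be illustrated with a minimal, generic C sketch. It does not use the ESP Audio Render API: `pcm_writer_t`, `mix_and_write`, `saturate16`, and `count_writer` are hypothetical names introduced only for this example, and the sketch assumes 16-bit samples, equal stream lengths, and no per-stream or post-mix processing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical writer callback type (not the component's real API):
 * receives one mixed PCM frame and returns 0 on success. */
typedef int (*pcm_writer_t)(const int16_t *pcm, size_t samples, void *ctx);

/* Clamp a 32-bit accumulator to the signed 16-bit range. */
static int16_t saturate16(int32_t v)
{
    if (v > INT16_MAX) {
        return INT16_MAX;
    }
    if (v < INT16_MIN) {
        return INT16_MIN;
    }
    return (int16_t)v;
}

/* Sum `count` streams sample by sample, saturate the result,
 * and hand the mixed frame to the writer callback. */
static int mix_and_write(const int16_t *const *streams, size_t count,
                         int16_t *out, size_t samples,
                         pcm_writer_t writer, void *ctx)
{
    assert(streams != NULL && out != NULL && writer != NULL);
    for (size_t i = 0; i < samples; i++) {
        int32_t acc = 0;
        for (size_t s = 0; s < count; s++) {
            acc += streams[s][i];
        }
        out[i] = saturate16(acc);
    }
    return writer(out, samples, ctx);
}

/* Example sink: only counts the samples it receives. A real sink
 * would forward the data to I2S, Bluetooth, or a network socket. */
static int count_writer(const int16_t *pcm, size_t samples, void *ctx)
{
    (void)pcm;
    *(size_t *)ctx += samples;
    return 0;
}
```

Accumulating in 32 bits and clamping once at the end avoids integer wrap-around when loud streams overlap. In a real pipeline, a limiter placed after the mixer would normally handle such peaks more gracefully than hard clipping.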

Key Features

Multi-stream mixing: mix multiple PCM inputs into a single output
Optional per-stream processing: each stream can attach its own ESP-GMF element chain for processing before the mixer
Optional post-mix processing: connect a unified mixed processor (e.g., ALC, limiter) after the mixer
Customizable output: deliver the final PCM to the application via a writer callback, which works with any sink such as I2S, a Bluetooth sink, or network streaming
Dynamic pipeline: automatically generates the processing chain based on the number of active streams to avoid idle cycles and save CPU
Runtime control: each stream supports pause / resume / flush / speed change
Solo playback: designate one stream to play solo while the others are muted
Mix control: per-stream mixer gain and fade in/out
Dynamic output format switching: output sample rate, channels, and bit width can be changed at runtime without rebuilding the pipeline
Configurable frame size: the frame length per processing call is adjustable and can be aligned with the downstream sink
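Two of the features above, configurable frame size and fade in/out, come down to simple arithmetic, sketched here in generic C. This is an illustration under stated assumptions, not the component's implementation: `frame_bytes` and `apply_fade_q15` are hypothetical helpers, the fade is assumed to be a linear ramp over one mono 16-bit frame, and gain is expressed in Q15 fixed point (32768 equals 1.0).

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes in one processing frame of `frame_ms` milliseconds of
 * interleaved PCM. Hypothetical helper for sizing sink buffers. */
static size_t frame_bytes(uint32_t sample_rate, uint8_t channels,
                          uint8_t bits, uint32_t frame_ms)
{
    size_t samples_per_channel = (size_t)sample_rate * frame_ms / 1000;
    return samples_per_channel * channels * (bits / 8u);
}

/* Apply a linear gain ramp from `gain_start` to `gain_end`
 * (Q15: 32768 == 1.0) to a mono 16-bit frame, in place.
 * Fade-in: 0 -> 32768. Fade-out: 32768 -> 0. */
static void apply_fade_q15(int16_t *pcm, size_t samples,
                           int32_t gain_start, int32_t gain_end)
{
    for (size_t i = 0; i < samples; i++) {
        int32_t gain = gain_start;
        if (samples > 1) {
            /* Interpolate so the last sample gets exactly gain_end. */
            int64_t span = (int64_t)gain_end - gain_start;
            gain += (int32_t)(span * (int64_t)i / (int64_t)(samples - 1));
        }
        /* Division instead of a shift keeps negative samples portable. */
        int64_t scaled = (int64_t)pcm[i] * gain / 32768;
        if (scaled > INT16_MAX) {
            scaled = INT16_MAX;
        } else if (scaled < INT16_MIN) {
            scaled = INT16_MIN;
        }
        pcm[i] = (int16_t)scaled;
    }
}
```

For example, a 20 ms frame of 48 kHz, stereo, 16-bit audio is 960 samples per channel, or 3840 bytes. Choosing a frame length that is a whole multiple of the sink's DMA buffer size is one way to align the renderer with the downstream sink.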
