ESP-Muxer Component
Note
This document is automatically translated using AI. Please excuse any detailed errors. The official English version is still in progress.
Overview
ESP-Muxer is an embedded audio and video multiplexing (muxing) component from Espressif. It packs encoded audio and video frames (such as AAC, MP3, H.264, MJPEG) into a data stream or file in a standard media container format. It does not encode or decode audio or video; it only orders the input encoded frames in time and encapsulates them.
ESP-Muxer supports two output modes: local file recording, with optional slicing and file naming rules, and streaming output through a callback mechanism. The same multiplexing logic therefore serves both local storage and network transmission scenarios.
In the media processing pipeline, ESP-Muxer sits after the encoder and before storage or network transmission, where it converts the encoded stream into a playable file or a streamable data format.
Main Features and Capabilities
Core Features
Audio and Video Multiplexing: Merge one or more audio and video tracks (provided the container format supports it) into a single output.
Multiple Container Formats: Supports container formats such as MP4, TS, FLV, WAV, CAF, and OGG, which can be selected according to recording, streaming, or compatibility requirements.
Direct File Writing: The multiplexed data can be directly written to the file system (such as an SD card), supporting optional file slicing (for example, slicing by duration) and custom file naming (for example, naming by slice number or time).
Streaming Callback Output: The output data can be delivered to a data callback function instead of, or in addition to, being written to a file. This suits real-time streaming scenarios: the multiplexer generates container data, and the user code pushes it to the network or another receiver.
Custom Write Path: A custom file writer (open / write / seek / close) can be plugged in. This allows data to be written to non-standard storage media (such as custom Flash partitions, RAM buffers, or different file systems) while still using the same multiplexing logic.
Containers Suitable for Streaming: Some containers (such as TS and FLV) structurally support incremental output, which makes them better suited to streaming. MP4 and WAV rely on headers and metadata that depend on the total length, so they are usually not recommended for streaming output but are well suited to file recording.
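The custom write path and the header dependency of MP4 and WAV can be illustrated together. The following is a minimal sketch, not the ESP-Muxer API: `demo_writer_t`, `ram_writer`, and `demo_write_with_header` are hypothetical names, and the "container" is reduced to a 4-byte length field followed by the payload. It shows why a writer for length-dependent containers needs a seek hook: the length is only known after the payload has been written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical writer interface with the four hooks named in the text.
 * Illustrative only; this is NOT the ESP-Muxer API. */
typedef struct {
    void *(*open)(const char *url);
    int   (*write)(void *handle, const void *data, size_t len);
    int   (*seek)(void *handle, size_t pos);
    int   (*close)(void *handle);
} demo_writer_t;

/* RAM-backed "file": fixed buffer, write cursor, and high-water mark. */
typedef struct {
    uint8_t buf[256];
    size_t  pos;
    size_t  size;
} ram_file_t;

static ram_file_t g_ram_file; /* one static instance keeps the sketch allocation-free */

static void *ram_open(const char *url)
{
    (void)url; /* a real writer would map the URL to a partition or buffer */
    memset(&g_ram_file, 0, sizeof(g_ram_file));
    return &g_ram_file;
}

static int ram_write(void *handle, const void *data, size_t len)
{
    ram_file_t *f = handle;
    if (len > sizeof(f->buf) - f->pos) {
        return -1; /* out of space */
    }
    memcpy(f->buf + f->pos, data, len);
    f->pos += len;
    if (f->pos > f->size) {
        f->size = f->pos;
    }
    return 0;
}

static int ram_seek(void *handle, size_t pos)
{
    ram_file_t *f = handle;
    if (pos > f->size) {
        return -1; /* cannot seek past written data */
    }
    f->pos = pos;
    return 0;
}

static int ram_close(void *handle)
{
    (void)handle;
    return 0;
}

static const demo_writer_t ram_writer = { ram_open, ram_write, ram_seek, ram_close };

/* Stand-in for container logic that depends on total length: write a
 * placeholder length, write the payload, then seek back and patch it. */
int demo_write_with_header(const demo_writer_t *w, const void *payload, uint32_t len)
{
    assert(w != NULL && payload != NULL);
    uint32_t placeholder = 0;
    void *h = w->open("ram://demo");
    if (h == NULL) {
        return -1;
    }
    if (w->write(h, &placeholder, sizeof(placeholder)) != 0 ||
        w->write(h, payload, len) != 0 ||
        w->seek(h, 0) != 0 ||
        w->write(h, &len, sizeof(len)) != 0) {
        w->close(h);
        return -1;
    }
    return w->close(h);
}
```

Calling `demo_write_with_header(&ram_writer, "abcd", 4)` leaves eight bytes in the RAM buffer: the patched length followed by the payload. A pure callback sink has no way to go back and patch earlier bytes, which is one reason incremental containers such as TS and FLV fit streaming better.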
Supported Containers and Encoding Formats
See ESP-Muxer README for details on container and encoding format support.
Typical Application Scenarios
Personal / Embedded Video Recorders (PVR): Camera + Microphone → Encoding → Multiplexing → Output as an MP4/TS file on an SD card, with optional slicing by time or size.
HLS Segment Generation: Generate TS (or other format) segments for HLS servers or pipelines for live or on-demand broadcasting.
HTTP FLV and other Real-time Stream Sources: Multiplex into FLV (or TS) and push to HTTP FLV service or other real-time stream access endpoints.
Voice/Audio Recording: Multiplex encoded audio (AAC, MP3, etc.) into WAV, CAF, or OGG files for easy playback and portability.
Custom Storage or Hybrid Architecture: Use a custom file writer to write multiplexed data to non-standard media while maintaining a unified multiplexing processing logic.
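Slicing by time, as used in the recorder and HLS scenarios above, can be sketched as follows. This is an illustration under stated assumptions rather than the component's implementation: `demo_slicer_t`, `demo_slicer_on_frame`, and the `/sdcard/rec_%04d.ts` naming pattern are hypothetical, and the sketch switches slices purely on presentation time, whereas a real recorder would normally also wait for a video key frame.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical slicing state; names are illustrative, not the ESP-Muxer API. */
typedef struct {
    uint32_t slice_duration_ms; /* length of each slice in milliseconds */
    int      slice_index;       /* slice currently open, -1 if none yet */
} demo_slicer_t;

/* Returns 1 and fills `path` when the frame at `pts_ms` starts a new slice,
 * otherwise returns 0 and leaves `path` untouched. */
int demo_slicer_on_frame(demo_slicer_t *s, uint32_t pts_ms, char *path, size_t path_len)
{
    assert(s != NULL && s->slice_duration_ms > 0);
    int index = (int)(pts_ms / s->slice_duration_ms);
    if (index == s->slice_index) {
        return 0; /* still inside the current slice */
    }
    s->slice_index = index;
    /* Name the file by slice number, one of the naming options in the text. */
    snprintf(path, path_len, "/sdcard/rec_%04d.ts", index);
    return 1;
}
```

With a 60-second slice duration, a frame at 0 ms opens slice 0, a frame at 59999 ms stays in it, and a frame at 60000 ms opens slice 1. The caller would close the previous file and open the returned path whenever the function returns 1.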
Boosting Storage Write Speed
When multiplexed data is written to storage media (especially SD cards), the write speed often limits the bit rate that can be sustained or the number of streams that can be recorded simultaneously. ESP-Muxer offers the following ways to improve write performance:
RAM cache for aligned writing: Many file systems and storage drivers perform better with large, aligned writes, while encoded audio and video data usually arrives as small, unaligned packets. ESP-Muxer can aggregate data in an internal RAM cache first and then write it out in larger, aligned blocks. The cache size is configurable (0–64 KB). A larger cache usually gives higher throughput but occupies more RAM.
File slicing: Splitting files by duration (or size) prevents individual files from becoming too large, limits the impact of long continuous writes, and simplifies retrieval and management (for example, deleting the oldest slices in rotation).
Custom writer: If the storage medium has special requirements (such as wear leveling, a custom block size, or an additional buffering mechanism), a custom file writer (open / write / seek / close) can be provided. The container data generated by the multiplexer remains unchanged; only the underlying writing method differs. This mechanism replaces the default file I/O, not the RAM cache: the RAM cache can still be enabled in front of the custom writer to improve throughput.
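The RAM cache idea can be sketched in a few lines. The example below is a simplified illustration, not ESP-Muxer's internal code: `demo_cache_t`, `demo_cache_write`, `demo_cache_flush`, and `demo_count_sink` are hypothetical names, and the cache is only 16 bytes so the behavior is easy to follow. Small writes are copied into the cache, and the sink (standing in for the file system or a custom writer) is called only with full blocks until the final flush.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CACHE_SIZE 16 /* tiny for demonstration; real caches are in the KB range */

/* Hypothetical sink: the slow storage write that should receive large blocks. */
typedef int (*demo_sink_fn)(const void *data, size_t len, void *ctx);

typedef struct {
    uint8_t      buf[DEMO_CACHE_SIZE];
    size_t       used; /* bytes currently held in the cache */
    demo_sink_fn sink;
    void        *ctx;
} demo_cache_t;

/* Copy data into the cache; hand a block to the sink whenever the cache fills. */
int demo_cache_write(demo_cache_t *c, const void *data, size_t len)
{
    assert(c != NULL && c->sink != NULL);
    const uint8_t *p = data;
    while (len > 0) {
        size_t room = DEMO_CACHE_SIZE - c->used;
        size_t n = len < room ? len : room;
        memcpy(c->buf + c->used, p, n);
        c->used += n;
        p += n;
        len -= n;
        if (c->used == DEMO_CACHE_SIZE) {
            if (c->sink(c->buf, DEMO_CACHE_SIZE, c->ctx) != 0) {
                return -1;
            }
            c->used = 0;
        }
    }
    return 0;
}

/* Write out whatever is left, for example when a file or slice is closed. */
int demo_cache_flush(demo_cache_t *c)
{
    if (c->used == 0) {
        return 0;
    }
    int ret = c->sink(c->buf, c->used, c->ctx);
    c->used = 0;
    return ret;
}

/* Example sink that only records how it was called. */
typedef struct {
    int    calls;
    size_t bytes;
    size_t last_len;
} demo_stats_t;

int demo_count_sink(const void *data, size_t len, void *ctx)
{
    (void)data;
    demo_stats_t *st = ctx;
    st->calls++;
    st->bytes += len;
    st->last_len = len;
    return 0;
}
```

Feeding ten 5-byte packets through `demo_cache_write` results in three 16-byte sink calls, and `demo_cache_flush` delivers the remaining 2 bytes in a fourth call, so ten small writes become four larger ones. Because the cache sits in front of the sink, the same aggregation works whether the sink is the default file I/O or a custom writer.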