ESP-Brookesia Programming Guide

Overview

ESP-Brookesia is a human-machine interaction development framework for AIoT devices. It streamlines application development and AI capability integration. Built on ESP-IDF and a component-based architecture, it provides full-stack support from hardware abstraction and system services to AI agents, accelerating time-to-market for HMI and AI products.

Note

“Brookesia” is a genus of chameleons known for camouflage and adaptation, qualities that mirror the goals of ESP-Brookesia. Like its namesake, the framework aims to be a flexible, scalable solution that adapts to diverse hardware and application needs.

Key features of ESP-Brookesia:

  • Native ESP-IDF integration: C/C++ development deeply integrated with ESP-IDF and the ESP Registry component catalog, leveraging Espressif’s open-source ecosystem.

  • Extensible hardware abstraction: Unified hardware interfaces (audio, display, touch, storage) with board-level adaptation for fast porting.

  • Rich system services: System-level services such as Wi-Fi and audio/video, built on a Manager + Helper architecture for decoupling and extension, which also provide support for Agent CLI.

  • Multi-LLM backends: Built-in adapters for OpenAI, Coze, XiaoZhi, and other platforms, with unified agent lifecycle management.

  • MCP support: Function Calling / MCP (Model Context Protocol) exposes device services to large language models for unified LLM–service communication.

  • AI expression: Emoji sets, animations, and visual feedback for anthropomorphic interaction.
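
To make the idea of exposing device services to a language model more concrete, the following is a minimal sketch of a name-to-function tool registry. It is not the ESP-Brookesia or MCP API: `Tool`, `ToolRegistry`, `make_demo_registry`, and the tool name `audio.set_volume` are all hypothetical names chosen for illustration, and arguments are passed as plain strings rather than structured JSON.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical types for illustration only; not the ESP-Brookesia API.
struct Tool {
    std::string description;                                  // text a model would see
    std::function<std::string(const std::string &)> handler;  // argument -> result
};

class ToolRegistry {
public:
    // Register a device service function under a tool name.
    void add(const std::string &name, Tool tool) {
        assert(tool.handler && "handler must be callable");
        tools_[name] = std::move(tool);
    }

    // Dispatch a call by name, the way a model-issued tool call would arrive.
    std::string call(const std::string &name, const std::string &arg) const {
        auto it = tools_.find(name);
        if (it == tools_.end()) {
            return "error: unknown tool '" + name + "'";
        }
        return it->second.handler(arg);
    }

    std::size_t size() const { return tools_.size(); }

private:
    std::map<std::string, Tool> tools_;
};

// Builds a registry holding one example service function.
ToolRegistry make_demo_registry() {
    ToolRegistry registry;
    registry.add("audio.set_volume", {
        "Set speaker volume (0-100)",
        [](const std::string &arg) { return "volume set to " + arg; },
    });
    return registry;
}
```

In this sketch, calling `make_demo_registry().call("audio.set_volume", "40")` routes the request to the registered handler, while an unregistered name yields an error string instead of crashing. A real bridge would also need argument schemas and structured results.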

Architecture

ESP-Brookesia uses a layered design with three levels, shown below: environment & dependencies, service & framework, and application.

Figure: ESP-Brookesia framework overview

Environment & dependencies

The runtime foundation. ESP-IDF provides the toolchain, RTOS, and peripheral drivers; ESP Registry manages distribution and versioning of framework components and third-party dependencies.

Service & framework

The core layer between the environment and applications. It exposes standardized service interfaces to applications and AI agents, covering utilities, HAL, system services, AI agents, and expression.

  • Utils: General utilities (logging, checks, state machine, task scheduler, plugins, profilers) and MCP Utils, which bridges Brookesia services and the MCP engine so that registered service functions become standard MCP tools for LLMs.

  • HAL: Split into three parts. Interface defines the audio, display, touch, status LED, and storage APIs; Adaptor provides board-specific implementations and resource mapping; Boards provides board-level YAML configuration describing the peripheral topology, pin assignments, and driver parameters.

  • General Service: Wi-Fi, Audio, Video, NVS, SNTP, and Custom extensions. All services follow the Manager + Helper architecture and can be invoked through local calls or RPC.

  • AI Agent: Unified agent management with adapters for Coze, OpenAI, and XiaoZhi, plus Function Calling / MCP support for bidirectional LLM–service communication.

  • AI Expression: Visual expression including Emote sets and animation control for anthropomorphic UIs.

  • System framework (planned): GUI, system shell, and app frameworks for phones, speakers, robots, and similar products.

  • Runtime (planned): WebAssembly, Python, and Lua for dynamic loading and execution.
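
The HAL split described above (Interface, Adaptor, Boards) follows a familiar pattern that can be sketched in a few lines of C++. This is an illustration under stated assumptions, not the framework's actual HAL: `DisplayInterface`, `BoardConfig`, `ConfiguredDisplay`, `make_display`, and the board names are hypothetical, and the in-code table stands in for values that a board-level YAML file might carry.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical names throughout; this is not the ESP-Brookesia HAL API.

// "Interface": the API that applications and services program against.
class DisplayInterface {
public:
    virtual ~DisplayInterface() = default;
    virtual int width() const = 0;
    virtual int height() const = 0;
    virtual std::string describe() const = 0;
};

// "Boards": per-board parameters, standing in for a YAML description.
struct BoardConfig {
    std::string name;
    int width;
    int height;
};

// "Adaptor": a board-specific implementation built from the board description.
class ConfiguredDisplay : public DisplayInterface {
public:
    explicit ConfiguredDisplay(BoardConfig config) : config_(std::move(config)) {
        assert(config_.width > 0 && config_.height > 0);
    }
    int width() const override { return config_.width; }
    int height() const override { return config_.height; }
    std::string describe() const override {
        return config_.name + " " + std::to_string(config_.width) + "x" +
               std::to_string(config_.height);
    }

private:
    BoardConfig config_;
};

// Selects the implementation for a board; returns nullptr for unknown boards.
std::unique_ptr<DisplayInterface> make_display(const std::string &board) {
    static const std::map<std::string, BoardConfig> boards = {
        {"demo_board_a", {"demo_board_a", 320, 240}},
        {"demo_board_b", {"demo_board_b", 480, 480}},
    };
    auto it = boards.find(board);
    if (it == boards.end()) {
        return nullptr;
    }
    return std::make_unique<ConfiguredDisplay>(it->second);
}
```

Code written against `DisplayInterface` does not change when a new board is added; only the board table grows. That is the porting benefit the layered HAL is aiming for.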

Application layer

Products and projects built on the layers above:

  • General Projects: Product-oriented templates integrating framework components for direct product development.

  • System Apps (planned): Product-oriented system apps such as Settings, AI assistant, and app store. These are optional and can be integrated as needed.