Model API Reference
This section covers model loading and static memory planning, allowing users to directly load and run ESP-DL models.
Header File
Macros
- DL_LOG_INFER_LATENCY_INIT_WITH_SIZE(size)
- DL_LOG_INFER_LATENCY_INIT()
- DL_LOG_INFER_LATENCY_START()
- DL_LOG_INFER_LATENCY_END()
- DL_LOG_INFER_LATENCY_PRINT(prefix, key)
- DL_LOG_INFER_LATENCY_END_PRINT(prefix, key)
- DL_LOG_INFER_LATENCY_ARRAY_INIT_WITH_SIZE(n, size)
- DL_LOG_INFER_LATENCY_ARRAY_INIT(n)
- DL_LOG_INFER_LATENCY_ARRAY_START(i)
- DL_LOG_INFER_LATENCY_ARRAY_END(i)
- DL_LOG_INFER_LATENCY_ARRAY_PRINT(i, prefix, key)
- DL_LOG_INFER_LATENCY_ARRAY_END_PRINT(i, prefix, key)
Classes
-
class Model
Neural Network Model.
Public Functions
-
Model(const char *rodata_address_or_partition_label_or_path, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, int max_internal_size = 0, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY, const uint8_t *key = nullptr, bool param_copy = true)
Create the Model object from a rodata address, partition label, or file path.
- Parameters
rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the model file path when location is MODEL_LOCATION_IN_SDCARD.
location – The model location.
max_internal_size – In bytes. Limits the maximum internal RAM usage. Only takes effect when PSRAM is present and you want to allocate memory in internal RAM first.
mm_type – Type of memory manager.
key – The key of the encrypted model.
param_copy – Set to false to avoid copying model parameters from FLASH to PSRAM. Only set this to false when PSRAM is very tight; it saves PSRAM but degrades inference performance, because PSRAM runs at a higher frequency than FLASH. Only takes effect with MODEL_LOCATION_IN_FLASH_RODATA (CONFIG_SPIRAM_RODATA not set) or MODEL_LOCATION_IN_FLASH_PARTITION.
-
Model(const char *rodata_address_or_partition_label_or_path, int model_index, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, int max_internal_size = 0, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY, const uint8_t *key = nullptr, bool param_copy = true)
Create the Model object from a rodata address, partition label, or file path.
- Parameters
rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the model file path when location is MODEL_LOCATION_IN_SDCARD.
model_index – The model index within packed models.
location – The model location.
max_internal_size – In bytes. Limits the maximum internal RAM usage. Only takes effect when PSRAM is present and you want to allocate memory in internal RAM first.
mm_type – Type of memory manager.
key – The key of the encrypted model.
param_copy – Set to false to avoid copying model parameters from FLASH to PSRAM. Only set this to false when PSRAM is very tight; it saves PSRAM but degrades inference performance, because PSRAM runs at a higher frequency than FLASH. Only takes effect with MODEL_LOCATION_IN_FLASH_RODATA (CONFIG_SPIRAM_RODATA not set) or MODEL_LOCATION_IN_FLASH_PARTITION.
-
Model(const char *rodata_address_or_partition_label_or_path, const char *model_name, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, int max_internal_size = 0, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY, const uint8_t *key = nullptr, bool param_copy = true)
Create the Model object from a rodata address, partition label, or file path.
- Parameters
rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the model file path when location is MODEL_LOCATION_IN_SDCARD.
model_name – The model name within packed models.
location – The model location.
max_internal_size – In bytes. Limits the maximum internal RAM usage. Only takes effect when PSRAM is present and you want to allocate memory in internal RAM first.
mm_type – Type of memory manager.
key – The key of the encrypted model.
param_copy – Set to false to avoid copying model parameters from FLASH to PSRAM. Only set this to false when PSRAM is very tight; it saves PSRAM but degrades inference performance, because PSRAM runs at a higher frequency than FLASH. Only takes effect with MODEL_LOCATION_IN_FLASH_RODATA (CONFIG_SPIRAM_RODATA not set) or MODEL_LOCATION_IN_FLASH_PARTITION.
-
Model(fbs::FbsModel *fbs_model, int internal_size = 0, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY)
Create the Model object by fbs_model.
- Parameters
fbs_model – The FlatBuffers model.
internal_size – Internal RAM size, in bytes.
mm_type – Type of memory manager.
-
virtual esp_err_t load(const char *rodata_address_or_partition_label_or_path, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, const uint8_t *key = nullptr, bool param_copy = true)
Load the model graph and parameters from FLASH or an SD card.
- Parameters
rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the model file path when location is MODEL_LOCATION_IN_SDCARD.
location – The model location.
key – The key of the encrypted model.
param_copy – Set to false to avoid copying model parameters from FLASH to PSRAM. Only set this to false when PSRAM is very tight; it saves PSRAM but degrades inference performance, because PSRAM runs at a higher frequency than FLASH. Only takes effect with MODEL_LOCATION_IN_FLASH_RODATA (CONFIG_SPIRAM_RODATA not set) or MODEL_LOCATION_IN_FLASH_PARTITION.
- Returns
ESP_OK Success
ESP_FAIL Failed
-
virtual esp_err_t load(const char *rodata_address_or_partition_label_or_path, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, int model_index = 0, const uint8_t *key = nullptr, bool param_copy = true)
Load the model graph and parameters from FLASH or an SD card.
- Parameters
rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the model file path when location is MODEL_LOCATION_IN_SDCARD.
location – The model location.
model_index – The model index within packed models.
key – The key of the encrypted model.
param_copy – Set to false to avoid copying model parameters from FLASH to PSRAM. Only set this to false when PSRAM is very tight; it saves PSRAM but degrades inference performance, because PSRAM runs at a higher frequency than FLASH. Only takes effect with MODEL_LOCATION_IN_FLASH_RODATA (CONFIG_SPIRAM_RODATA not set) or MODEL_LOCATION_IN_FLASH_PARTITION.
- Returns
ESP_OK Success
ESP_FAIL Failed
-
virtual esp_err_t load(const char *rodata_address_or_partition_label_or_path, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, const char *model_name = nullptr, const uint8_t *key = nullptr, bool param_copy = true)
Load the model graph and parameters from FLASH or an SD card.
- Parameters
rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the model file path when location is MODEL_LOCATION_IN_SDCARD.
location – The model location.
model_name – The model name within packed models.
key – The key of the encrypted model.
param_copy – Set to false to avoid copying model parameters from FLASH to PSRAM. Only set this to false when PSRAM is very tight; it saves PSRAM but degrades inference performance, because PSRAM runs at a higher frequency than FLASH. Only takes effect with MODEL_LOCATION_IN_FLASH_RODATA (CONFIG_SPIRAM_RODATA not set) or MODEL_LOCATION_IN_FLASH_PARTITION.
- Returns
ESP_OK Success
ESP_FAIL Failed
-
virtual esp_err_t load(fbs::FbsModel *fbs_model)
Load the model graph and parameters from a FlatBuffers model.
- Parameters
fbs_model – The FlatBuffers model
- Returns
ESP_OK Success
ESP_FAIL Failed
-
virtual void build(size_t max_internal_size, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY, bool preload = false)
Allocate memory for the model.
- Parameters
max_internal_size – In bytes. Limits the maximum internal RAM usage. Only takes effect when PSRAM is present and you want to allocate memory in internal RAM first.
mm_type – Type of memory manager.
preload – Whether to preload the model’s parameters to internal RAM (not implemented yet).
-
virtual void run(runtime_mode_t mode = RUNTIME_MODE_SINGLE_CORE)
Run the model module by module.
- Parameters
mode – Runtime mode.
-
virtual void run(TensorBase *input, runtime_mode_t mode = RUNTIME_MODE_SINGLE_CORE)
Run the model module by module.
- Parameters
input – The model input.
mode – Runtime mode.
-
virtual void run(std::map<std::string, TensorBase*> &user_inputs, runtime_mode_t mode = RUNTIME_MODE_SINGLE_CORE, std::map<std::string, TensorBase*> user_outputs = {})
Run the model module by module.
- Parameters
user_inputs – The model inputs.
mode – Runtime mode.
user_outputs – For debugging: specifies outputs of intermediate layers. Under normal use there is no need to pass this parameter; the default is the graph’s outputs, which can be obtained through Model::get_outputs().
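Putting the constructors, run() overloads, and tensor getters of this class together, a typical single-input inference flow looks like the sketch below. This is ESP-IDF target code and cannot run on a host; the header name, the "model" partition label, and the data-filling steps are illustrative assumptions.

```cpp
#include "dl_model_base.hpp"  // assumed ESP-DL header providing dl::Model

void infer_once()
{
    // Load a model from a FLASH partition labeled "model" (placeholder label).
    dl::Model model("model", fbs::MODEL_LOCATION_IN_FLASH_PARTITION);

    // Single-input models: fill the input tensor in place, then run.
    dl::TensorBase *input = model.get_input();
    // ... quantize and copy your data into the input tensor here ...

    model.run();

    dl::TensorBase *output = model.get_output();
    // ... dequantize and consume the output tensor here ...
}
```

For multi-input models, use `get_inputs()` and the `run(user_inputs, mode, user_outputs)` overload with a `std::map<std::string, TensorBase*>` keyed by tensor name.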
-
void minimize()
Minimize the model.
-
esp_err_t test()
Test whether the model inference result is correct. The model should contain test_inputs and test_outputs; enable the export_test_values option in esp-ppq to use this API.
- Returns
esp_err_t
-
std::map<std::string, mem_info_t> get_memory_info()
Get memory info.
- Returns
Memory usage statistics on internal and PSRAM.
-
std::map<std::string, module_info> get_module_info()
Get module info.
- Returns
Type and latency of each module.
-
void print_module_info(const std::map<std::string, module_info> &info, bool sort_module_by_latency = false)
Print the module info obtained by get_module_info function.
- Parameters
info – The module info map returned by get_module_info.
sort_module_by_latency – True: print modules in decreasing order of latency. False: print modules in ONNX topological order.
-
void profile_memory()
Print model memory summary.
-
void profile_module(bool sort_module_by_latency = false)
Print module info summary. (Name, Type, Latency)
- Parameters
sort_module_by_latency – True: print modules in decreasing order of latency. False: print modules in ONNX topological order.
-
void profile(bool sort_module_by_latency = false)
Combination of profile_memory & profile_module.
- Parameters
sort_module_by_latency – True: print modules in decreasing order of latency. False: print modules in ONNX topological order.
-
virtual std::map<std::string, TensorBase*> &get_inputs()
Get inputs of model.
- Returns
The map of model input’s name and TensorBase*
-
virtual TensorBase *get_input()
Get the only input of model.
- Returns
TensorBase*
-
virtual TensorBase *get_input(const std::string &name)
Get input of model by name.
- Parameters
name – input name
- Returns
TensorBase*
-
virtual TensorBase *get_intermediate(const std::string &name)
Get intermediate TensorBase of model.
Note
When using a memory manager, the content of the TensorBase’s data may be overwritten by the outputs of other operators.
- Parameters
name – The name of the intermediate tensor.
- Returns
The intermediate TensorBase*.
-
virtual std::map<std::string, TensorBase*> &get_outputs()
Get outputs of model.
- Returns
The map of model output’s name and TensorBase*
-
virtual TensorBase *get_output()
Get the only output of model.
- Returns
TensorBase*
-
virtual TensorBase *get_output(const std::string &name)
Get output of model by name.
- Parameters
name – output name
- Returns
TensorBase*
-
virtual void print()
Print the model.
Header File
Macros
-
CONTEXT_PARAMETER_OFFSET
Offset for parameter tensors
Classes
-
class ModelContext
Model Context class including variable tensors and parameters.
Public Functions
-
inline ModelContext()
Constructor for ModelContext. Initializes the PSRAM and internal root pointers to nullptr.
-
inline ~ModelContext()
Destructor for ModelContext. Clears all resources and tensors.
-
int add_tensor(const std::string name, bool is_paramter = false, TensorBase *tensor = nullptr)
Adds a tensor to the parameter or variable list.
- Parameters
name – The name of the tensor.
is_paramter – Whether the tensor is a parameter (default: false).
tensor – Pointer to the TensorBase object (default: nullptr).
- Returns
int Returns the index of the added tensor.
-
int push_back_tensor(TensorBase *tensor, bool is_paramter = false)
Push back a tensor.
- Parameters
tensor – Pointer to the TensorBase object.
is_paramter – Whether the tensor is a parameter (default: false).
- Returns
int Returns the index of the added tensor.
-
void update_tensor(int index, TensorBase *tensor)
Updates the tensor at the specified index.
- Parameters
index – The index of the tensor to update.
tensor – Pointer to the new TensorBase object.
-
TensorBase *get_tensor(int index)
Gets the tensor by its index.
- Parameters
index – The index of the tensor.
- Returns
TensorBase* Returns the pointer to the TensorBase object, or nullptr if the index is invalid.
-
TensorBase *get_tensor(const std::string &name)
Gets the tensor by its name.
- Parameters
name – The name of the tensor.
- Returns
TensorBase* Returns the pointer to the TensorBase object, or nullptr if the name is not found.
-
int get_tensor_index(const std::string &name)
Gets the tensor index by its name.
- Parameters
name – The name of the tensor.
- Returns
int Returns index if the name is found, else -1
-
int get_variable_index(const std::string &name)
Gets the variable tensor index by its name.
- Parameters
name – The name of the tensor.
- Returns
int Returns index if the name is found and is variable tensor, else -1
-
inline int get_variable_count()
Gets the count of variable tensors.
- Returns
int Returns the number of variable tensors.
-
inline int get_parameter_count()
Gets the count of parameter tensors.
- Returns
int Returns the number of parameter tensors.
-
bool root_alloc(size_t internal_size, size_t psram_size, int alignment = 16)
Allocates memory for PSRAM and internal roots.
- Parameters
internal_size – The size of the internal memory in bytes.
psram_size – The size of the PSRAM memory in bytes.
alignment – The alignment of the memory in bytes.
- Returns
bool Returns true if the allocation is successful, false otherwise.
-
inline void *get_psram_root()
Gets the pointer to the PSRAM root.
- Returns
void* Returns the pointer to the PSRAM root.
-
inline void *get_internal_root()
Gets the pointer to the internal root.
- Returns
void* Returns the pointer to the internal root.
-
size_t get_parameter_memory_size(mem_info_t &mem_info, bool copy)
Gets the size of the parameters in bytes.
- Parameters
mem_info – The size of the memory used by the parameters in bytes, filtered by copy option.
copy – Filter the parameters by auto_free.
- Returns
size_t Returns the total size of the parameters memory in bytes.
-
size_t get_variable_memory_size(mem_info_t &mem_info)
Gets the size of the variable memory in bytes.
- Parameters
mem_info – The size of the memory used by the variables in bytes.
- Returns
size_t Returns the total size of the variables memory in bytes.
-
inline void root_free()
Frees the memory allocated for PSRAM and internal roots. This function ensures proper cleanup of allocated memory.
-
inline void minimize()
Minimizes the context by clearing the name-to-index map. This is used to free unnecessary intermediate variables during the inference.
-
inline void clear()
Clears all resources and tensors in the context. This includes clearing variables, parameters, name-to-index map, and freeing memory.
Public Members
-
std::vector<TensorBase*> m_variables
Variable tensors of model, the first one is nullptr
-
std::vector<TensorBase*> m_parameters
Parameters of model, the first one is nullptr
Header File
Classes
-
class MemoryManagerBase
Memory manager base class; each model has its own memory manager. TODO: share the memory manager among different models.
Subclassed by dl::MemoryManagerGreedy
Public Functions
-
inline MemoryManagerBase(int alignment = 16)
Construct a new Memory Manager Base object.
- Parameters
alignment – Memory address alignment
-
inline virtual ~MemoryManagerBase()
Destroys the MemoryManager object and releases its resources.
-
virtual bool alloc(fbs::FbsModel *fbs_model, std::vector<dl::module::Module*> &execution_plan, ModelContext *context) = 0
Allocate memory for each tensor, including all input and output tensors.
Public Members
-
int alignment
The memory address alignment of the root pointer; must be a power of two.
-
class TensorInfo
Tensor info, including tensor name, shape, dtype, size, lifetime range and call times, used to plan model memory.
Public Functions
-
TensorInfo(std::string &name, int time_begin, int time_end, std::vector<int> shape, dtype_t dtype, int exponent, bool is_internal = false)
Construct a new Tensor Info object.
- Parameters
name – Tensor name
time_begin – Tensor lifetime begin
time_end – Tensor lifetime end
shape – Tensor shape
dtype – Tensor dtype
exponent – Tensor exponent
is_internal – Is tensor in internal RAM or not
-
inline ~TensorInfo()
Destroy the Tensor Info object.
-
void set_inplace_leader_tensor(TensorInfo *tensor)
Set the inplace leader tensor object.
- Parameters
tensor – Inplace leader tensor
-
inline void set_inplace_follower_tensor(TensorInfo *tensor)
Set the inplace follower tensor object.
- Parameters
tensor – Inplace follower tensor
-
inline TensorInfo *get_inplace_follower_tensor()
Get the inplace follower tensor object.
- Returns
TensorInfo* Inplace follower tensor
-
void update_time(int new_time)
Update Tensor lifetime.
- Parameters
new_time – new tensor lifetime
-
TensorBase *create_tensor(void *internal_root, void *psram_root)
Create a TensorBase object according to TensorInfo.
- Parameters
internal_root – Internal RAM root pointer
psram_root – PSRAM root pointer
- Returns
TensorBase*
-
inline bool is_inplaced()
Is inplaced or not.
- Returns
true if inplaced else false
-
inline uint32_t get_offset()
Get the tensor offset.
- Returns
uint32_t
-
inline void set_offset(uint32_t offset)
Set the tensor offset.
- Parameters
offset –
-
inline uint32_t get_internal_offset()
Get the internal offset.
- Returns
uint32_t
-
inline bool get_internal_state()
Get the internal state.
- Returns
true if is internal else false
-
inline void set_internal_state(bool is_internal)
Set the internal state.
- Parameters
is_internal –
-
inline void set_internal_offset(uint32_t offset)
Set the internal offset.
- Parameters
offset –
-
inline int get_time_end()
Get the lifetime end.
- Returns
int
-
inline int get_time_begin()
Get the lifetime begin.
- Returns
int
-
inline size_t get_size()
Get the tensor size.
- Returns
size_t
-
inline std::string get_name()
Get the tensor name.
- Returns
std::string
-
inline std::vector<int> get_shape()
Get the tensor shape.
- Returns
std::vector<int>
-
inline void print()
Print the tensor info.
-
class MemoryChunk
Memory chunk, including size, free flag, offset, alignment and tensor, used to simulate memory allocation.
Public Functions
-
MemoryChunk(size_t size, int is_free, int alignment = 16)
Construct a new Memory Chunk object.
- Parameters
size – Memory chunk size
is_free – Whether free or not
alignment – Memory chunk alignment
-
MemoryChunk(TensorInfo *tensor, int alignment = 16)
Construct a new Memory Chunk object.
- Parameters
tensor – TensorInfo
alignment – Memory chunk alignment
-
inline ~MemoryChunk()
Destroy the Memory Chunk object.
-
MemoryChunk *merge_free_chunk(MemoryChunk *chunk)
Merge contiguous free chunks.
- Parameters
chunk –
- Returns
MemoryChunk*
-
MemoryChunk *insert(TensorInfo *tensor)
Insert tensor into free chunk.
- Parameters
tensor –
- Returns
MemoryChunk*
-
MemoryChunk *extend(TensorInfo *tensor)
Extend free chunk and insert tensor.
- Parameters
tensor –
- Returns
MemoryChunk*
-
inline void free()
Free memory chunk, set is_free to true and set tensor to nullptr.
-
size_t get_aligned_size(size_t size)
Get the aligned size, rounded up to alignment bytes (16 by default).
- Parameters
size –
- Returns
size_t
Header File
Classes
-
class MemoryManagerGreedy : public dl::MemoryManagerBase
Greedy memory manager that allocates memory for tensors in execution order, prioritizing internal RAM allocation first.
Public Functions
-
inline MemoryManagerGreedy(int max_internal_size, int alignment = 16)
Constructs a greedy memory manager with specified constraints.
- Parameters
max_internal_size – Maximum allowed internal RAM usage in bytes
alignment – Memory address alignment requirement (default: 16 bytes)
-
inline ~MemoryManagerGreedy()
Destructor that releases all managed memory resources.
-
virtual bool alloc(fbs::FbsModel *fbs_model, std::vector<dl::module::Module*> &execution_plan, ModelContext *context)
Allocates memory for all network tensors following greedy strategy.
- Parameters
fbs_model – FlatBuffer model containing network architecture
execution_plan – Execution graph ordered by computation dependencies
context – The model context that holds the allocated tensors.
- Returns
bool True if successful allocation, false if memory insufficient
-
void free()
Releases all allocated memory including tensor buffers and memory pools.