Model API Reference

This section covers model loading and static memory planning, which make it convenient to load and run ESPDL models directly.

Header File

Macros

DL_LOG_INFER_LATENCY_INIT_WITH_SIZE(size)
DL_LOG_INFER_LATENCY_INIT()
DL_LOG_INFER_LATENCY_START()
DL_LOG_INFER_LATENCY_END()
DL_LOG_INFER_LATENCY_PRINT(prefix, key)
DL_LOG_INFER_LATENCY_END_PRINT(prefix, key)
DL_LOG_INFER_LATENCY_ARRAY_INIT_WITH_SIZE(n, size)
DL_LOG_INFER_LATENCY_ARRAY_INIT(n)
DL_LOG_INFER_LATENCY_ARRAY_START(i)
DL_LOG_INFER_LATENCY_ARRAY_END(i)
DL_LOG_INFER_LATENCY_ARRAY_PRINT(i, prefix, key)
DL_LOG_INFER_LATENCY_ARRAY_END_PRINT(i, prefix, key)

Classes

class Model

Neural Network Model.

Public Functions

Model(const char *rodata_address_or_partition_label_or_path, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, int max_internal_size = 0, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY, uint8_t *key = nullptr, bool param_copy = true)

Create the Model object from a rodata address, partition label, or file path.

Parameters
  • rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the model file path when location is MODEL_LOCATION_IN_SDCARD.

  • location – The model location.

  • max_internal_size – In bytes. Limits the maximum internal RAM usage. Only takes effect when PSRAM is present and you want to allocate memory in internal RAM first.

  • mm_type – Type of memory manager.

  • key – The key of the encrypted model.

  • param_copy – Set to false to avoid copying model parameters from FLASH to PSRAM. Only set this to false when your PSRAM resource is very tight; it saves PSRAM but sacrifices inference performance, because PSRAM runs at a higher frequency than FLASH. Only takes effect for MODEL_LOCATION_IN_FLASH_RODATA (with CONFIG_SPIRAM_RODATA not set) or MODEL_LOCATION_IN_FLASH_PARTITION.

Model(const char *rodata_address_or_partition_label_or_path, int model_index, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, int max_internal_size = 0, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY, uint8_t *key = nullptr, bool param_copy = true)

Create the Model object from a rodata address, partition label, or file path.

Parameters
  • rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the model file path when location is MODEL_LOCATION_IN_SDCARD.

  • model_index – The model index of the packed models.

  • location – The model location.

  • max_internal_size – In bytes. Limits the maximum internal RAM usage. Only takes effect when PSRAM is present and you want to allocate memory in internal RAM first.

  • mm_type – Type of memory manager.

  • key – The key of the encrypted model.

  • param_copy – Set to false to avoid copying model parameters from FLASH to PSRAM. Only set this to false when your PSRAM resource is very tight; it saves PSRAM but sacrifices inference performance, because PSRAM runs at a higher frequency than FLASH. Only takes effect for MODEL_LOCATION_IN_FLASH_RODATA (with CONFIG_SPIRAM_RODATA not set) or MODEL_LOCATION_IN_FLASH_PARTITION.

Model(const char *rodata_address_or_partition_label_or_path, const char *model_name, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, int max_internal_size = 0, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY, uint8_t *key = nullptr, bool param_copy = true)

Create the Model object from a rodata address, partition label, or file path.

Parameters
  • rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the model file path when location is MODEL_LOCATION_IN_SDCARD.

  • model_name – The model name of the packed models.

  • location – The model location.

  • max_internal_size – In bytes. Limits the maximum internal RAM usage. Only takes effect when PSRAM is present and you want to allocate memory in internal RAM first.

  • mm_type – Type of memory manager.

  • key – The key of the encrypted model.

  • param_copy – Set to false to avoid copying model parameters from FLASH to PSRAM. Only set this to false when your PSRAM resource is very tight; it saves PSRAM but sacrifices inference performance, because PSRAM runs at a higher frequency than FLASH. Only takes effect for MODEL_LOCATION_IN_FLASH_RODATA (with CONFIG_SPIRAM_RODATA not set) or MODEL_LOCATION_IN_FLASH_PARTITION.

Model(fbs::FbsModel *fbs_model, int internal_size = 0, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY)

Create the Model object from a FlatBuffers model.

Parameters
  • fbs_model – The FlatBuffers model.

  • internal_size – Internal RAM size, in bytes.

  • mm_type – Type of memory manager.

virtual ~Model()

Destroy the Model object.

virtual esp_err_t load(const char *rodata_address_or_partition_label_or_path, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, uint8_t *key = nullptr, bool param_copy = true)

Load the model graph and parameters from FLASH or an SD card.

Parameters
  • rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the model file path when location is MODEL_LOCATION_IN_SDCARD.

  • location – The model location.

  • key – The key of the encrypted model.

  • param_copy – Set to false to avoid copying model parameters from FLASH to PSRAM. Only set this to false when your PSRAM resource is very tight; it saves PSRAM but sacrifices inference performance, because PSRAM runs at a higher frequency than FLASH. Only takes effect for MODEL_LOCATION_IN_FLASH_RODATA (with CONFIG_SPIRAM_RODATA not set) or MODEL_LOCATION_IN_FLASH_PARTITION.

Returns

  • ESP_OK Success

  • ESP_FAIL Failed

virtual esp_err_t load(const char *rodata_address_or_partition_label_or_path, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, int model_index = 0, uint8_t *key = nullptr, bool param_copy = true)

Load the model graph and parameters from FLASH or an SD card.

Parameters
  • rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the model file path when location is MODEL_LOCATION_IN_SDCARD.

  • location – The model location.

  • model_index – The model index of the packed models.

  • key – The key of the encrypted model.

  • param_copy – Set to false to avoid copying model parameters from FLASH to PSRAM. Only set this to false when your PSRAM resource is very tight; it saves PSRAM but sacrifices inference performance, because PSRAM runs at a higher frequency than FLASH. Only takes effect for MODEL_LOCATION_IN_FLASH_RODATA (with CONFIG_SPIRAM_RODATA not set) or MODEL_LOCATION_IN_FLASH_PARTITION.

Returns

  • ESP_OK Success

  • ESP_FAIL Failed

virtual esp_err_t load(const char *rodata_address_or_partition_label_or_path, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, const char *model_name = nullptr, uint8_t *key = nullptr, bool param_copy = true)

Load the model graph and parameters from FLASH or an SD card.

Parameters
  • rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the model file path when location is MODEL_LOCATION_IN_SDCARD.

  • location – The model location.

  • model_name – The model name of the packed models.

  • key – The key of the encrypted model.

  • param_copy – Set to false to avoid copying model parameters from FLASH to PSRAM. Only set this to false when your PSRAM resource is very tight; it saves PSRAM but sacrifices inference performance, because PSRAM runs at a higher frequency than FLASH. Only takes effect for MODEL_LOCATION_IN_FLASH_RODATA (with CONFIG_SPIRAM_RODATA not set) or MODEL_LOCATION_IN_FLASH_PARTITION.

Returns

  • ESP_OK Success

  • ESP_FAIL Failed

virtual esp_err_t load(fbs::FbsModel *fbs_model)

Load the model graph and parameters from a FlatBuffers model.

Parameters

fbs_model – The FlatBuffers model.

Returns

  • ESP_OK Success

  • ESP_FAIL Failed

virtual void build(size_t max_internal_size, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY, bool preload = false)

Allocate memory for the model.

Parameters
  • max_internal_size – In bytes. Limits the maximum internal RAM usage. Only takes effect when PSRAM is present and you want to allocate memory in internal RAM first.

  • mm_type – Type of memory manager.

  • preload – Whether to preload the model’s parameters into internal RAM (not implemented yet).

virtual void run(runtime_mode_t mode = RUNTIME_MODE_SINGLE_CORE)

Run the model module by module.

Parameters

mode – Runtime mode.

virtual void run(TensorBase *input, runtime_mode_t mode = RUNTIME_MODE_SINGLE_CORE)

Run the model module by module.

Parameters
  • input – The model input.

  • mode – Runtime mode.

virtual void run(std::map<std::string, TensorBase*> &user_inputs, runtime_mode_t mode = RUNTIME_MODE_SINGLE_CORE, std::map<std::string, TensorBase*> user_outputs = {})

Run the model module by module.

Parameters
  • user_inputs – The model inputs.

  • mode – Runtime mode.

  • user_outputs – Used for debugging to specify the outputs of intermediate layers. Under normal use there is no need to pass this parameter; if it is omitted, the default is the graph outputs, which can be obtained through Model::get_outputs().

void minimize()

Minimize the model.

esp_err_t test()

Test whether the model inference result is correct. The model must contain test_inputs and test_outputs; enable the export_test_values option in esp-ppq to use this API.

Returns

esp_err_t

std::map<std::string, mem_info> get_memory_info()

Get memory info.

Returns

Memory usage statistics for internal RAM and PSRAM.

std::map<std::string, module_info> get_module_info()

Get module info.

Returns

Type and latency of each module.

void print_module_info(const std::map<std::string, module_info> &info, bool sort_module_by_latency = false)

Print the module info obtained from the get_module_info function.

Parameters
  • info – The module info map returned by get_module_info.

  • sort_module_by_latency – If true, print modules in decreasing order of latency; if false, in ONNX topological order.

void profile_memory()

Print model memory summary.

void profile_module(bool sort_module_by_latency = false)

Print a module info summary (name, type, latency).

Parameters

sort_module_by_latency – If true, modules are printed in decreasing order of latency; if false, in ONNX topological order.

void profile(bool sort_module_by_latency = false)

Combination of profile_memory and profile_module.

Parameters

sort_module_by_latency – If true, modules are printed in decreasing order of latency; if false, in ONNX topological order.

virtual std::map<std::string, TensorBase*> &get_inputs()

Get inputs of model.

Returns

A map from model input names to TensorBase*.

virtual TensorBase *get_intermediate(const std::string &name)

Get intermediate TensorBase of model.

Note

When using a memory manager, the contents of the TensorBase’s data may be overwritten by the outputs of other operators.

Parameters

name – The name of the intermediate tensor.

Returns

The intermediate TensorBase*.

virtual std::map<std::string, TensorBase*> &get_outputs()

Get outputs of model.

Returns

A map from model output names to TensorBase*.

virtual void print()

Print the model.

inline virtual fbs::FbsModel *get_fbs_model()

Get the fbs model instance.

Returns

fbs::FbsModel *
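Taken together, the functions above support a load–build–run flow. A hedged sketch of typical usage (the header path, the partition label "model", and the input/output handling are assumptions; consult your ESP-DL version for exact names — this fragment requires the ESP-DL SDK and is not compilable standalone):

```cpp
#include "dl_model_base.hpp"  // ESP-DL header providing dl::Model (path may differ by version)

void run_inference_sketch()
{
    // Load a model stored in a flash partition labeled "model" (placeholder label).
    dl::Model model("model", fbs::MODEL_LOCATION_IN_FLASH_PARTITION);

    // Fill the input tensors in place; the names come from the exported ESPDL model.
    auto &inputs = model.get_inputs();
    for (auto &in : inputs) {
        // in.second is a TensorBase*; write the quantized input data into its buffer here.
    }

    model.run();  // single-core inference (RUNTIME_MODE_SINGLE_CORE by default)

    auto &outputs = model.get_outputs();
    for (auto &out : outputs) {
        // Read results from out.second (TensorBase*).
    }

    model.profile();  // optional: print memory usage and per-module latency summaries
}
```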

Header File

Macros

CONTEXT_PARAMETER_OFFSET

Offset for parameter tensors

Classes

class ModelContext

Model Context class including variable tensors and parameters.

Public Functions

inline ModelContext()

Constructor for ModelContext. Initializes the PSRAM and internal root pointers to nullptr.

inline ~ModelContext()

Destructor for ModelContext. Clears all resources and tensors.

int add_tensor(const std::string name, bool is_paramter = false, TensorBase *tensor = nullptr)

Adds a tensor to the parameter or variable list.

Parameters
  • name – The name of the tensor.

  • is_paramter – Whether the tensor is a parameter (default: false).

  • tensor – Pointer to the TensorBase object (default: nullptr).

Returns

int Returns the index of the added tensor.

int push_back_tensor(TensorBase *tensor, bool is_paramter = false)

Push back a tensor.

Parameters
  • tensor – Pointer to the TensorBase object.

  • is_paramter – Whether the tensor is a parameter (default: false).

Returns

int Returns the index of the added tensor.

void update_tensor(int index, TensorBase *tensor)

Updates the tensor at the specified index.

Parameters
  • index – The index of the tensor to update.

  • tensor – Pointer to the new TensorBase object.

TensorBase *get_tensor(int index)

Gets the tensor by its index.

Parameters

index – The index of the tensor.

Returns

TensorBase* Returns the pointer to the TensorBase object, or nullptr if the index is invalid.

TensorBase *get_tensor(const std::string &name)

Gets the tensor by its name.

Parameters

name – The name of the tensor.

Returns

TensorBase* Returns the pointer to the TensorBase object, or nullptr if the name is not found.

int get_tensor_index(const std::string &name)

Gets the tensor index by its name.

Parameters

name – The name of the tensor.

Returns

int Returns the index if the name is found, else -1.

int get_variable_index(const std::string &name)

Gets the variable tensor index by its name.

Parameters

name – The name of the tensor.

Returns

int Returns the index if the name is found and it is a variable tensor, else -1.

inline int get_variable_count()

Gets the count of variable tensors.

Returns

int Returns the number of variable tensors.

inline int get_parameter_count()

Gets the count of parameter tensors.

Returns

int Returns the number of parameter tensors.

bool root_alloc(size_t internal_size, size_t psram_size, int alignment = 16)

Allocates memory for PSRAM and internal roots.

Parameters
  • internal_size – The size of the internal memory in bytes.

  • psram_size – The size of the PSRAM memory in bytes.

  • alignment – The alignment of the memory in bytes.

Returns

bool Returns true if the allocation is successful, false otherwise.

inline void *get_psram_root()

Gets the pointer to the PSRAM root.

Returns

void* Returns the pointer to the PSRAM root.

inline void *get_internal_root()

Gets the pointer to the internal root.

Returns

void* Returns the pointer to the internal root.

size_t get_tensor_memory_size(size_t &internal_size, size_t &param_size, size_t &flash_size)

Gets the size of the tensor memory in bytes.

Parameters
  • internal_size – The size of the internal memory used by the tensors in bytes.

  • param_size – The size of the parameter memory used by the tensors in bytes.

  • flash_size – The size of the flash used by the tensors in bytes.

Returns

size_t Returns the size of the tensor memory in bytes.

size_t get_parameter_memory_size(size_t &internal_size, size_t &param_size, size_t &flash_size)

Gets the size of the parameters in bytes.

Parameters
  • internal_size – The size of the internal memory used by the parameters in bytes.

  • param_size – The size of the parameter memory used by the parameters in bytes.

  • flash_size – The size of the flash used by the parameters in bytes.

Returns

size_t Returns the size of the parameter memory in bytes.

size_t get_variable_memory_size(size_t &internal_size, size_t &param_size, size_t &flash_size)

Gets the size of the variables in bytes.

Parameters
  • internal_size – The size of the internal memory used by the variables in bytes.

  • param_size – The size of the parameter memory used by the variables in bytes.

  • flash_size – The size of the flash used by the variables in bytes.

Returns

size_t Returns the size of the variable memory in bytes.

inline void root_free()

Frees the memory allocated for PSRAM and internal roots. This function ensures proper cleanup of allocated memory.

inline void minimize()

Minimizes the context by clearing the name-to-index map. This is used to free unnecessary intermediate variables during inference.

inline void clear()

Clears all resources and tensors in the context. This includes clearing variables, parameters, name-to-index map, and freeing memory.

Public Members

std::vector<TensorBase*> m_variables

Variable tensors of model, the first one is nullptr

std::vector<TensorBase*> m_parameters

Parameters of model, the first one is nullptr

Header File

Classes

class MemoryManagerBase

Memory manager base class; each model has its own memory manager. TODO: share the memory manager among different models.

Subclassed by dl::MemoryManagerGreedy

Public Functions

inline MemoryManagerBase(int alignment = 16)

Construct a new Memory Manager Base object.

Parameters

alignment – Memory address alignment

inline virtual ~MemoryManagerBase()

Destroy the MemoryManagerBase object and release its resources.

virtual bool alloc(fbs::FbsModel *fbs_model, std::vector<dl::module::Module*> &execution_plan, ModelContext *context) = 0

Allocate memory for each tensor, including all input and output tensors.

Parameters
  • fbs_model – FlatBuffers model

  • execution_plan – Topologically sorted module list

  • context – Model context

Returns

bool Returns true if the allocation is successful, false otherwise.

Public Members

int alignment

The memory address alignment of the root pointer; must be a power of two

class TensorInfo

Tensor info, including tensor name, shape, dtype, size, time range and call times, which is used to plan model memory.

Public Functions

TensorInfo(std::string &name, int time_begin, int time_end, std::vector<int> shape, dtype_t dtype, int exponent, bool is_internal = false)

Construct a new Tensor Info object.

Parameters
  • name – Tensor name

  • time_begin – Tensor lifetime begin

  • time_end – Tensor lifetime end

  • shape – Tensor shape

  • dtype – Tensor dtype

  • exponent – Tensor exponent

  • is_internal – Whether the tensor is in internal RAM

inline ~TensorInfo()

Destroy the Tensor Info object.

void set_inplace_leader_tensor(TensorInfo *tensor)

Set the inplace leader tensor object.

Parameters

tensor – Inplace leader tensor

inline void set_inplace_follower_tensor(TensorInfo *tensor)

Set the inplace follower tensor object.

Parameters

tensor – Inplace follower tensor

inline TensorInfo *get_inplace_follower_tensor()

Get the inplace follower tensor object.

Returns

TensorInfo* Inplace follower tensor

void update_time(int new_time)

Update Tensor lifetime.

Parameters

new_time – The new tensor lifetime

TensorBase *create_tensor(void *internal_root, void *psram_root)

Create a TensorBase object according to TensorInfo.

Parameters
  • internal_root – Internal RAM root pointer

  • psram_root – PSRAM root pointer

Returns

TensorBase*

inline bool is_inplaced()

Is inplaced or not.

Returns

true if inplaced, else false

inline uint32_t get_offset()

Get the tensor offset.

Returns

uint32_t

inline void set_offset(uint32_t offset)

Set the tensor offset.

Parameters

offset – The tensor offset

inline uint32_t get_internal_offset()

Get the internal offset.

Returns

uint32_t

inline bool get_internal_state()

Get the internal state.

Returns

true if internal, else false

inline void set_internal_state(bool is_internal)

Set the internal state.

Parameters

is_internal – Whether the tensor is in internal RAM

inline void set_internal_offset(uint32_t offset)

Set the internal offset.

Parameters

offset – The internal offset

inline int get_time_end()

Get the lifetime end.

Returns

int

inline int get_time_begin()

Get the lifetime begin.

Returns

int

inline size_t get_size()

Get the tensor size.

Returns

size_t

inline std::string get_name()

Get the tensor name.

Returns

std::string

inline std::vector<int> get_shape()

Get the tensor shape.

Returns

std::vector<int>

inline void print()

Print the tensor info.

class MemoryChunk

Memory chunk, including size, free flag, offset, alignment and tensor, which is used to simulate memory allocation.

Public Functions

MemoryChunk(size_t size, int is_free, int alignment = 16)

Construct a new Memory Chunk object.

Parameters
  • size – Memory chunk size

  • is_free – Whether free or not

  • alignment – Memory chunk alignment

MemoryChunk(TensorInfo *tensor, int alignment = 16)

Construct a new Memory Chunk object.

Parameters
  • tensor – TensorInfo

  • alignment – Memory chunk alignment

inline ~MemoryChunk()

Destroy the Memory Chunk object.

MemoryChunk *merge_free_chunk(MemoryChunk *chunk)

Merge contiguous free chunks.

Parameters

chunk – The adjacent free chunk to merge

Returns

MemoryChunk*

MemoryChunk *insert(TensorInfo *tensor)

Insert tensor into free chunk.

Parameters

tensor – The tensor to insert into this free chunk

Returns

MemoryChunk*

MemoryChunk *extend(TensorInfo *tensor)

Extend free chunk and insert tensor.

Parameters

tensor – The tensor to insert after extending

Returns

MemoryChunk*

inline void free()

Free memory chunk, set is_free to true and set tensor to nullptr.

size_t get_aligned_size(size_t size)

Get the aligned size, which is aligned to 16/alignment bytes.

Parameters

size – The size to align

Returns

size_t

Public Members

size_t size

Memory chunk size

bool is_free

Whether memory chunk is free or not

int offset

Offset relative to root pointer

int alignment

Memory address alignment

TensorInfo *tensor

Info of the tensor which occupies the memory

Header File

Classes

class MemoryManagerGreedy : public dl::MemoryManagerBase

Greedy memory manager that allocates memory for tensors in execution order, prioritizing internal RAM allocation first.

Public Functions

inline MemoryManagerGreedy(int max_internal_size, int alignment = 16)

Constructs a greedy memory manager with specified constraints.

Parameters
  • max_internal_size – Maximum allowed internal RAM usage in bytes

  • alignment – Memory address alignment requirement (default: 16 bytes)

inline ~MemoryManagerGreedy()

Destructor that releases all managed memory resources.

virtual bool alloc(fbs::FbsModel *fbs_model, std::vector<dl::module::Module*> &execution_plan, ModelContext *context)

Allocates memory for all network tensors following greedy strategy.

Parameters
  • fbs_model – FlatBuffers model containing the network architecture

  • execution_plan – Execution graph ordered by computation dependencies

  • context – Model context

Returns

bool True if allocation succeeded, false if memory is insufficient

void free()

Releases all allocated memory including tensor buffers and memory pools.