Model API Reference
This section covers model loading and static memory planning, allowing users to directly load and run ESP-DL models.
Header File
Macros
-
DL_LOG_INFER_LATENCY_INIT_WITH_SIZE(size)
-
DL_LOG_INFER_LATENCY_INIT()
-
DL_LOG_INFER_LATENCY_START()
-
DL_LOG_INFER_LATENCY_END()
-
DL_LOG_INFER_LATENCY_PRINT(prefix, key)
-
DL_LOG_INFER_LATENCY_END_PRINT(prefix, key)
-
DL_LOG_INFER_LATENCY_ARRAY_INIT_WITH_SIZE(n, size)
-
DL_LOG_INFER_LATENCY_ARRAY_INIT(n)
-
DL_LOG_INFER_LATENCY_ARRAY_START(i)
-
DL_LOG_INFER_LATENCY_ARRAY_END(i)
-
DL_LOG_INFER_LATENCY_ARRAY_PRINT(i, prefix, key)
-
DL_LOG_INFER_LATENCY_ARRAY_END_PRINT(i, prefix, key)
Classes
-
class Model
Neural Network Model.
Public Functions
-
Model(const char *rodata_address_or_partition_label_or_path, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, int max_internal_size = 0, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY, uint8_t *key = nullptr, bool param_copy = true)
Create the Model object from a rodata address, partition label, or file path.
- Parameters
rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the file path when location is MODEL_LOCATION_IN_SDCARD.
location – The model location.
max_internal_size – In bytes. Limits the maximum internal RAM usage. Only takes effect when PSRAM is present and you want to allocate memory in internal RAM first.
mm_type – Type of memory manager.
key – The key of the encrypted model.
param_copy – Set to false to avoid copying model parameters from flash to PSRAM. Only do so when PSRAM is very tight: it saves PSRAM but sacrifices inference performance, because PSRAM runs at a higher frequency than flash. Only takes effect with MODEL_LOCATION_IN_FLASH_RODATA (when CONFIG_SPIRAM_RODATA is not set) or MODEL_LOCATION_IN_FLASH_PARTITION.
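For instance, loading a model from a flash partition can be sketched as below. This is an illustrative fragment, not a complete program: the header path and the partition label "model" are assumptions, and it only compiles inside an ESP-IDF project with ESP-DL.

```cpp
#include "dl_model_base.hpp"  // assumed ESP-DL header that declares dl::Model

// Load a model from a flash partition labeled "model" (label is an assumption);
// the remaining constructor arguments keep their defaults: greedy memory
// manager, no internal-RAM limit, unencrypted model, parameters copied to PSRAM.
dl::Model *model = new dl::Model("model", fbs::MODEL_LOCATION_IN_FLASH_PARTITION);
```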
-
Model(const char *rodata_address_or_partition_label_or_path, int model_index, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, int max_internal_size = 0, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY, uint8_t *key = nullptr, bool param_copy = true)
Create the Model object from a rodata address, partition label, or file path, selecting a packed model by index.
- Parameters
rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the file path when location is MODEL_LOCATION_IN_SDCARD.
model_index – The model index among the packed models.
location – The model location.
max_internal_size – In bytes. Limits the maximum internal RAM usage. Only takes effect when PSRAM is present and you want to allocate memory in internal RAM first.
mm_type – Type of memory manager.
key – The key of the encrypted model.
param_copy – Set to false to avoid copying model parameters from flash to PSRAM. Only do so when PSRAM is very tight: it saves PSRAM but sacrifices inference performance, because PSRAM runs at a higher frequency than flash. Only takes effect with MODEL_LOCATION_IN_FLASH_RODATA (when CONFIG_SPIRAM_RODATA is not set) or MODEL_LOCATION_IN_FLASH_PARTITION.
-
Model(const char *rodata_address_or_partition_label_or_path, const char *model_name, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, int max_internal_size = 0, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY, uint8_t *key = nullptr, bool param_copy = true)
Create the Model object from a rodata address, partition label, or file path, selecting a packed model by name.
- Parameters
rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the file path when location is MODEL_LOCATION_IN_SDCARD.
model_name – The model name among the packed models.
location – The model location.
max_internal_size – In bytes. Limits the maximum internal RAM usage. Only takes effect when PSRAM is present and you want to allocate memory in internal RAM first.
mm_type – Type of memory manager.
key – The key of the encrypted model.
param_copy – Set to false to avoid copying model parameters from flash to PSRAM. Only do so when PSRAM is very tight: it saves PSRAM but sacrifices inference performance, because PSRAM runs at a higher frequency than flash. Only takes effect with MODEL_LOCATION_IN_FLASH_RODATA (when CONFIG_SPIRAM_RODATA is not set) or MODEL_LOCATION_IN_FLASH_PARTITION.
-
Model(fbs::FbsModel *fbs_model, int internal_size = 0, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY)
Create the Model object by fbs_model.
- Parameters
fbs_model – The FlatBuffers model.
internal_size – Internal RAM size, in bytes.
mm_type – Type of memory manager.
-
virtual esp_err_t load(const char *rodata_address_or_partition_label_or_path, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, uint8_t *key = nullptr, bool param_copy = true)
Load the model graph and parameters from flash or SD card.
- Parameters
rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the file path when location is MODEL_LOCATION_IN_SDCARD.
location – The model location.
key – The key of the encrypted model.
param_copy – Set to false to avoid copying model parameters from flash to PSRAM. Only do so when PSRAM is very tight: it saves PSRAM but sacrifices inference performance, because PSRAM runs at a higher frequency than flash. Only takes effect with MODEL_LOCATION_IN_FLASH_RODATA (when CONFIG_SPIRAM_RODATA is not set) or MODEL_LOCATION_IN_FLASH_PARTITION.
- Returns
ESP_OK Success
ESP_FAIL Failed
-
virtual esp_err_t load(const char *rodata_address_or_partition_label_or_path, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, int model_index = 0, uint8_t *key = nullptr, bool param_copy = true)
Load the model graph and parameters from flash or SD card.
- Parameters
rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the file path when location is MODEL_LOCATION_IN_SDCARD.
location – The model location.
model_index – The model index among the packed models.
key – The key of the encrypted model.
param_copy – Set to false to avoid copying model parameters from flash to PSRAM. Only do so when PSRAM is very tight: it saves PSRAM but sacrifices inference performance, because PSRAM runs at a higher frequency than flash. Only takes effect with MODEL_LOCATION_IN_FLASH_RODATA (when CONFIG_SPIRAM_RODATA is not set) or MODEL_LOCATION_IN_FLASH_PARTITION.
- Returns
ESP_OK Success
ESP_FAIL Failed
-
virtual esp_err_t load(const char *rodata_address_or_partition_label_or_path, fbs::model_location_type_t location = fbs::MODEL_LOCATION_IN_FLASH_RODATA, const char *model_name = nullptr, uint8_t *key = nullptr, bool param_copy = true)
Load the model graph and parameters from flash or SD card.
- Parameters
rodata_address_or_partition_label_or_path – The address of the model data when location is MODEL_LOCATION_IN_FLASH_RODATA, the partition label when location is MODEL_LOCATION_IN_FLASH_PARTITION, or the file path when location is MODEL_LOCATION_IN_SDCARD.
location – The model location.
model_name – The model name among the packed models.
key – The key of the encrypted model.
param_copy – Set to false to avoid copying model parameters from flash to PSRAM. Only do so when PSRAM is very tight: it saves PSRAM but sacrifices inference performance, because PSRAM runs at a higher frequency than flash. Only takes effect with MODEL_LOCATION_IN_FLASH_RODATA (when CONFIG_SPIRAM_RODATA is not set) or MODEL_LOCATION_IN_FLASH_PARTITION.
- Returns
ESP_OK Success
ESP_FAIL Failed
-
virtual esp_err_t load(fbs::FbsModel *fbs_model)
Load the model graph and parameters from a FlatBuffers model.
- Parameters
fbs_model – The FlatBuffers model.
- Returns
ESP_OK Success
ESP_FAIL Failed
-
virtual void build(size_t max_internal_size, memory_manager_t mm_type = MEMORY_MANAGER_GREEDY, bool preload = false)
Allocate memory for the model.
- Parameters
max_internal_size – In bytes. Limits the maximum internal RAM usage. Only takes effect when PSRAM is present and you want to allocate memory in internal RAM first.
mm_type – Type of memory manager.
preload – Whether to preload the model's parameters into internal RAM (not implemented yet).
-
virtual void run(runtime_mode_t mode = RUNTIME_MODE_SINGLE_CORE)
Run the model module by module.
- Parameters
mode – Runtime mode.
-
virtual void run(TensorBase *input, runtime_mode_t mode = RUNTIME_MODE_SINGLE_CORE)
Run the model module by module.
- Parameters
input – The model input.
mode – Runtime mode.
-
virtual void run(std::map<std::string, TensorBase*> &user_inputs, runtime_mode_t mode = RUNTIME_MODE_SINGLE_CORE, std::map<std::string, TensorBase*> user_outputs = {})
Run the model module by module.
- Parameters
user_inputs – The model inputs.
mode – Runtime mode.
user_outputs – For debugging: specifies outputs of intermediate layers. Under normal use there is no need to pass this parameter. If nothing is passed, the default is the graph outputs, which can be obtained through Model::get_outputs().
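A typical inference pass with explicit inputs can be sketched as follows (an illustrative ESP-DL fragment, not a standalone program; the comment placeholder marks where application data would be filled in):

```cpp
// Fetch the model's named input tensors, fill them, run, then read the outputs.
std::map<std::string, dl::TensorBase *> &inputs = model->get_inputs();
// ... quantize and copy application data into each input tensor's buffer ...
model->run(inputs);  // RUNTIME_MODE_SINGLE_CORE by default
std::map<std::string, dl::TensorBase *> &outputs = model->get_outputs();
```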
-
void minimize()
Minimize the model.
-
esp_err_t test()
Test whether the model inference result is correct. The model should contain test_inputs and test_outputs. Enable the export_test_values option in esp-ppq to use this API.
- Returns
esp_err_t
-
std::map<std::string, mem_info> get_memory_info()
Get memory info.
- Returns
Memory usage statistics on internal and PSRAM.
-
std::map<std::string, module_info> get_module_info()
Get module info.
- Returns
Type and latency of each module.
-
void print_module_info(const std::map<std::string, module_info> &info, bool sort_module_by_latency = false)
Print the module info obtained by get_module_info function.
- Parameters
info – The module info obtained from get_module_info().
sort_module_by_latency – Whether to sort the printed modules by latency in decreasing order.
-
void profile_memory()
Print model memory summary.
-
void profile_module(bool sort_module_by_latency = false)
Print module info summary. (Name, Type, Latency)
- Parameters
sort_module_by_latency – True: print modules in decreasing-latency order. False: print modules in ONNX topological order.
-
void profile(bool sort_module_by_latency = false)
Combination of profile_memory & profile_module.
- Parameters
sort_module_by_latency – True: print modules in decreasing-latency order. False: print modules in ONNX topological order.
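After at least one run() call, the profiling helpers can be combined as below (illustrative fragment, not a standalone program):

```cpp
model->profile_memory();       // memory summary only
model->profile_module(true);   // modules sorted by decreasing latency
model->profile();              // both summaries, in ONNX topological order
```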
-
virtual std::map<std::string, TensorBase*> &get_inputs()
Get inputs of model.
- Returns
The map of model input names to TensorBase*.
-
virtual TensorBase *get_intermediate(const std::string &name)
Get intermediate TensorBase of model.
Note
When using a memory manager, the content of the TensorBase's data may be overwritten by the outputs of other operators.
- Parameters
name – The name of the intermediate tensor.
- Returns
The intermediate TensorBase*.
-
virtual std::map<std::string, TensorBase*> &get_outputs()
Get outputs of model.
- Returns
The map of model output names to TensorBase*.
-
virtual void print()
Print the model.
Header File
Macros
-
CONTEXT_PARAMETER_OFFSET
Offset for parameter tensors
Classes
-
class ModelContext
Model Context class including variable tensors and parameters.
Public Functions
-
inline ModelContext()
Constructor for ModelContext. Initializes the PSRAM and internal root pointers to nullptr.
-
inline ~ModelContext()
Destructor for ModelContext. Clears all resources and tensors.
-
int add_tensor(const std::string name, bool is_paramter = false, TensorBase *tensor = nullptr)
Adds a tensor to the parameter or variable list.
- Parameters
name – The name of the tensor.
is_paramter – Whether the tensor is a parameter (default: false).
tensor – Pointer to the TensorBase object (default: nullptr).
- Returns
int Returns the index of the added tensor.
-
int push_back_tensor(TensorBase *tensor, bool is_paramter = false)
Push back a tensor.
- Parameters
tensor – Pointer to the TensorBase object.
is_paramter – Whether the tensor is a parameter (default: false).
- Returns
int Returns the index of the added tensor.
-
void update_tensor(int index, TensorBase *tensor)
Updates the tensor at the specified index.
- Parameters
index – The index of the tensor to update.
tensor – Pointer to the new TensorBase object.
-
TensorBase *get_tensor(int index)
Gets the tensor by its index.
- Parameters
index – The index of the tensor.
- Returns
TensorBase* Returns the pointer to the TensorBase object, or nullptr if the index is invalid.
-
TensorBase *get_tensor(const std::string &name)
Gets the tensor by its name.
- Parameters
name – The name of the tensor.
- Returns
TensorBase* Returns the pointer to the TensorBase object, or nullptr if the name is not found.
-
int get_tensor_index(const std::string &name)
Gets the tensor index by its name.
- Parameters
name – The name of the tensor.
- Returns
int Returns index if the name is found, else -1
-
int get_variable_index(const std::string &name)
Gets the variable tensor index by its name.
- Parameters
name – The name of the tensor.
- Returns
int Returns index if the name is found and is variable tensor, else -1
-
inline int get_variable_count()
Gets the count of variable tensors.
- Returns
int Returns the number of variable tensors.
-
inline int get_parameter_count()
Gets the count of parameter tensors.
- Returns
int Returns the number of parameter tensors.
-
bool root_alloc(size_t internal_size, size_t psram_size, int alignment = 16)
Allocates memory for PSRAM and internal roots.
- Parameters
internal_size – The size of the internal memory in bytes.
psram_size – The size of the PSRAM memory in bytes.
alignment – The alignment of the memory in bytes.
- Returns
bool Returns true if the allocation is successful, false otherwise.
-
inline void *get_psram_root()
Gets the pointer to the PSRAM root.
- Returns
void* Returns the pointer to the PSRAM root.
-
inline void *get_internal_root()
Gets the pointer to the internal root.
- Returns
void* Returns the pointer to the internal root.
-
size_t get_tensor_memory_size(size_t &internal_size, size_t &param_size, size_t &flash_size)
Gets the size of the tensor memory in bytes.
- Parameters
internal_size – The size of the internal memory used by the tensors in bytes.
param_size – The size of the parameter memory used by the tensors in bytes.
flash_size – The size of the flash used by the tensors in bytes.
- Returns
size_t Returns the size of the tensor memory in bytes.
-
size_t get_parameter_memory_size(size_t &internal_size, size_t &param_size, size_t &flash_size)
Gets the size of the parameters in bytes.
- Parameters
internal_size – The size of the internal memory used by the parameters in bytes.
param_size – The size of the parameter memory used by the parameters in bytes.
flash_size – The size of the flash used by the parameters in bytes.
- Returns
size_t Returns the size of the parameters memory in bytes.
-
size_t get_variable_memory_size(size_t &internal_size, size_t &param_size, size_t &flash_size)
Gets the size of the variables in bytes.
- Parameters
internal_size – The size of the internal memory used by the variables in bytes.
param_size – The size of the parameter memory used by the variables in bytes.
flash_size – The size of the flash used by the variables in bytes.
- Returns
size_t Returns the size of the variables memory in bytes.
-
inline void root_free()
Frees the memory allocated for PSRAM and internal roots. This function ensures proper cleanup of allocated memory.
-
inline void minimize()
Minimizes the context by clearing the name-to-index map. This is used to free unnecessary intermediate variables during the inference.
-
inline void clear()
Clears all resources and tensors in the context. This includes clearing variables, parameters, name-to-index map, and freeing memory.
Public Members
-
std::vector<TensorBase*> m_variables
Variable tensors of the model; the first one is nullptr
-
std::vector<TensorBase*> m_parameters
Parameters of the model; the first one is nullptr
Header File
Classes
-
class MemoryManagerBase
Memory manager base class; each model has its own memory manager. TODO: share the memory manager between different models.
Subclassed by dl::MemoryManagerGreedy
Public Functions
-
inline MemoryManagerBase(int alignment = 16)
Construct a new Memory Manager Base object.
- Parameters
alignment – Memory address alignment
-
inline virtual ~MemoryManagerBase()
Destroy the MemoryManager object and release its resources.
-
virtual bool alloc(fbs::FbsModel *fbs_model, std::vector<dl::module::Module*> &execution_plan, ModelContext *context) = 0
Allocate memory for each tensor, including all input and output tensors.
Public Members
-
int alignment
The memory address alignment of the root pointer; must be a power of two
-
class TensorInfo
Tensor info, including the tensor name, shape, dtype, size, time range, and call times, which are used to plan model memory.
Public Functions
-
TensorInfo(std::string &name, int time_begin, int time_end, std::vector<int> shape, dtype_t dtype, int exponent, bool is_internal = false)
Construct a new Tensor Info object.
- Parameters
name – Tensor name
time_begin – Tensor lifetime begin
time_end – Tensor lifetime end
shape – Tensor shape
dtype – Tensor dtype
exponent – Tensor exponent
is_internal – Is tensor in internal RAM or not
-
inline ~TensorInfo()
Destroy the Tensor Info object.
-
void set_inplace_leader_tensor(TensorInfo *tensor)
Set the inplace leader tensor object.
- Parameters
tensor – Inplace leader tensor
-
inline void set_inplace_follower_tensor(TensorInfo *tensor)
Set the inplace follower tensor object.
- Parameters
tensor – Inplace follower tensor
-
inline TensorInfo *get_inplace_follower_tensor()
Get the inplace follower tensor object.
- Returns
TensorInfo* Inplace follower tensor
-
void update_time(int new_time)
Update Tensor lifetime.
- Parameters
new_time – new tensor lifetime
-
TensorBase *create_tensor(void *internal_root, void *psram_root)
Create a TensorBase object according to TensorInfo.
- Parameters
internal_root – Internal RAM root pointer
psram_root – PSRAM root pointer
- Returns
TensorBase*
-
inline bool is_inplaced()
Is inplaced or not.
- Returns
true if inplaced else false
-
inline uint32_t get_offset()
Get the tensor offset.
- Returns
uint32_t
-
inline void set_offset(uint32_t offset)
Set the tensor offset.
- Parameters
offset – The tensor offset.
-
inline uint32_t get_internal_offset()
Get the internal offset.
- Returns
uint32_t
-
inline bool get_internal_state()
Get the internal state.
- Returns
true if is internal else false
-
inline void set_internal_state(bool is_internal)
Set the internal state.
- Parameters
is_internal – Whether the tensor is in internal RAM.
-
inline void set_internal_offset(uint32_t offset)
Set the internal offset.
- Parameters
offset – The internal offset.
-
inline int get_time_end()
Get the lifetime end.
- Returns
int
-
inline int get_time_begin()
Get the lifetime begin.
- Returns
int
-
inline size_t get_size()
Get the tensor size.
- Returns
size_t
-
inline std::string get_name()
Get the tensor name.
- Returns
std::string
-
inline std::vector<int> get_shape()
Get the tensor shape.
- Returns
std::vector<int>
-
inline void print()
Print tensor info.
-
class MemoryChunk
Memory chunk, including its size, free flag, offset, alignment, and tensor, which is used to simulate memory allocation.
Public Functions
-
MemoryChunk(size_t size, int is_free, int alignment = 16)
Construct a new Memory Chunk object.
- Parameters
size – Memory chunk size
is_free – Whether free or not
alignment – Memory chunk alignment
-
MemoryChunk(TensorInfo *tensor, int alignment = 16)
Construct a new Memory Chunk object.
- Parameters
tensor – TensorInfo
alignment – Memory chunk alignment
-
inline ~MemoryChunk()
Destroy the Memory Chunk object.
-
MemoryChunk *merge_free_chunk(MemoryChunk *chunk)
Merge continuous free chunk.
- Parameters
chunk – The adjacent free chunk to merge with.
- Returns
MemoryChunk*
-
MemoryChunk *insert(TensorInfo *tensor)
Insert tensor into free chunk.
- Parameters
tensor – The tensor to insert into this free chunk.
- Returns
MemoryChunk*
-
MemoryChunk *extend(TensorInfo *tensor)
Extend free chunk and insert tensor.
- Parameters
tensor – The tensor to insert after extending the chunk.
- Returns
MemoryChunk*
-
inline void free()
Free memory chunk, set is_free to true and set tensor to nullptr.
-
size_t get_aligned_size(size_t size)
Get the aligned size, rounded up to a multiple of alignment (16 bytes by default).
- Parameters
size – The size to align, in bytes.
- Returns
size_t
Header File
Classes
-
class MemoryManagerGreedy : public dl::MemoryManagerBase
Greedy memory manager that allocates memory for tensors in execution order, prioritizing internal RAM allocation first.
Public Functions
-
inline MemoryManagerGreedy(int max_internal_size, int alignment = 16)
Constructs a greedy memory manager with specified constraints.
- Parameters
max_internal_size – Maximum allowed internal RAM usage in bytes
alignment – Memory address alignment requirement (default: 16 bytes)
-
inline ~MemoryManagerGreedy()
Destructor that releases all managed memory resources.
-
virtual bool alloc(fbs::FbsModel *fbs_model, std::vector<dl::module::Module*> &execution_plan, ModelContext *context)
Allocates memory for all network tensors following greedy strategy.
- Parameters
fbs_model – FlatBuffers model containing the network architecture.
execution_plan – Execution graph ordered by computation dependencies.
context – The model context that stores the allocated tensors.
- Returns
bool True if allocation succeeded, false if memory is insufficient.
-
void free()
Releases all allocated memory including tensor buffers and memory pools.