DVP Camera Solution Introduction

Note

This document is automatically translated using AI. Please excuse any detailed errors. The official English version is still in progress.

Common Application Scenarios

  1. Existing video demos

  • ESP32-S3 Speeding Milliseconds【High Frame Rate 125 fps】: An ESP32-S3 paired with a global shutter camera reaches a frame rate of 125 fps. This is fast enough to capture the trajectory of moving objects, such as the tip of a pen while writing. Combined with the networking capabilities of the ESP32-S3, this makes it a strong choice for IoT smart camera solutions.

  • ESP32-S3 Watching Cute Cats Eating【High Resolution】: The ESP32-S3-EYE AI development board with a 2-megapixel camera provides high-resolution, real-time video of a cat eating, which makes it a good fit for pet monitoring.

  2. QR code scanning function

  • Product barcode scanning, train ticket scanning, local tax invoice scanning, scan-to-download for resources, etc.

  • Auto focus + Internet

  3. Wi-Fi, LCD “Polaroid”

  • Photos are refreshed on the webpage and displayed on the LCD immediately after they are taken.

  4. Scanner, smart car auxiliary navigation

  5. Face recognition

  • A technology or system that confirms or looks up a person's identity from their face

  6. Video surveillance, light-sensitive atmosphere lamp

  7. AI applications

  • Gesture recognition

  • Number recognition

  • Food recognition

  • Posture detection

Reference Materials

Performance Test Data Appendix

Camera-related solutions mainly focus on the following performance indicators:

  1. Sensor initialization time (especially for projects that need to capture immediately after startup)

  2. Supported resolution and data formats

  3. Frame rate under specified resolution and data format

  4. Transmission rate when used with Wi-Fi

  5. Supported image processing functions (currently mainly JPEG encoding and decoding, performed in software)

For related test code, see Test Example.

Sensor Initialization Time

Sensor init time

| Sensor         | Init time (ms) |
|----------------|----------------|
| OV3660, JPEG   | 604            |
| OV3660, RGB565 | 1301           |
| OV2640, JPEG   | 200            |
| OV5640, JPEG   | 240            |

Note

Please note the following when conducting the above tests:

  • Capture a picture at the end of the test to verify that the result is correct.

  • See More Optimization Methods to tune the system configuration as far as possible.

Supported Resolution and Data Formats

The maximum resolution is ultimately determined by the camera sensor. However, the throughput of the DVP interface and of the chip's DMA is limited, so very high resolutions put pressure on data transmission:

  • If the DVP camera sensor can output JPEG pictures, a resolution of no more than 5 megapixels is recommended.

  • If the DVP camera sensor cannot output JPEG pictures (in which case the picture format is YUV422, RGB565, etc.), a resolution of no more than 1 megapixel is recommended.
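The two recommendations above can be turned into a quick pre-check. The following is a minimal sketch that reads the limits as 5 million pixels for sensors with JPEG output and 1 million pixels otherwise. The function and constant names are hypothetical, and the limits are treated as rules of thumb rather than hard hardware limits.

```python
# Pixel-count limits read from the recommendations above.
# Assumption: these are rules of thumb, not hard hardware limits.
MAX_PIXELS_JPEG = 5_000_000  # sensor outputs JPEG itself
MAX_PIXELS_RAW = 1_000_000   # sensor outputs YUV422/RGB565, etc.


def within_recommended_limit(width: int, height: int, sensor_outputs_jpeg: bool) -> bool:
    """Return True if width * height stays within the recommended pixel count."""
    limit = MAX_PIXELS_JPEG if sensor_outputs_jpeg else MAX_PIXELS_RAW
    return width * height <= limit


if __name__ == "__main__":
    cases = [(1600, 1200, True), (1600, 1200, False), (640, 480, False)]
    for width, height, jpeg in cases:
        kind = "JPEG" if jpeg else "uncompressed"
        verdict = "ok" if within_recommended_limit(width, height, jpeg) else "too large"
        print(f"{width}*{height} {kind}: {verdict}")
```

For example, 1600*1200 (1.92 million pixels) passes the check for a sensor with JPEG output but not for an uncompressed format, while 640*480 passes in both cases.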

The available data formats depend mainly on what the camera can output. They mainly include:

  1. RGB

  2. YUV

  3. JPEG

  4. RAW Data

  5. BMP

  6. Only Y/Grayscale

When the camera itself cannot output JPEG data, the ESP32 can compress the frames to JPEG in software.

Note

Specifically, when the required resolution exceeds 1024*720, consider using a camera that supports JPEG encoding. Also keep in mind that JPEG encoding and decoding performed on the ESP32 puts pressure on the CPU and memory.

Frame Rate Under Specified Resolution and Data Format

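Frame rate

| Output format | Chip model        | Sensor model | Resolution | Frame rate |
|---------------|-------------------|--------------|------------|------------|
| YUV422/RGB565 | ESP32-S3          | SC030IOT     | 640*480    | 30 fps     |
| JPEG          | ESP32-S3          | OV5640       | 1600*1200  | 25 fps     |
| Only Y/MONO   | ESP32-S3/ESP32-S2 | SC031GS      | 240*240    | 125 fps    |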

Speed varies greatly between cameras. When testing JPEG speed, specify the JPEG compression parameters and try to shoot colorful scenes. A monochrome scene consists mostly of low-frequency information, so it compresses to very little data and the result is not representative.

The frame rate of the same camera also varies greatly with the data format, resolution, XCLK master clock frequency, and idle interval settings. To sustain the frame rate, fb_count should not be less than 2 when initializing the camera.

Currently, most of the driver parameters for the supported cameras are not optimal, and the configuration methods of different cameras are not unified, so there is considerable room to improve the performance data above. The ESP32-S3 has an independent CAM DVP interface, and its peripheral interface rate is 2 to 3 times that of the ESP32.
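To see how data format, resolution, and frame rate combine into load on the DVP interface, the uncompressed data rate can be estimated as width * height * bytes per pixel * frame rate. The sketch below assumes the usual packing of 2 bytes per pixel for RGB565 and YUV422 and 1 byte per pixel for 8-bit grayscale. The function name and format keys are hypothetical and are not part of any driver API.

```python
# Bytes per pixel for uncompressed formats.
# Assumption: standard packing (RGB565 and YUV422 use 2 bytes, 8-bit grayscale uses 1).
BYTES_PER_PIXEL = {"RGB565": 2, "YUV422": 2, "GRAYSCALE": 1}


def raw_data_rate(width: int, height: int, pixel_format: str, fps: int) -> int:
    """Return the uncompressed stream rate in bytes per second."""
    return width * height * BYTES_PER_PIXEL[pixel_format] * fps


if __name__ == "__main__":
    # 640*480 YUV422 at 30 fps
    print(raw_data_rate(640, 480, "YUV422", 30))      # 18432000
    # 240*240 grayscale at 125 fps
    print(raw_data_rate(240, 240, "GRAYSCALE", 125))  # 7200000
```

Under these assumptions, a 640*480 YUV422 stream at 30 fps moves about 18.4 MB/s, whereas a 240*240 grayscale stream at 125 fps moves about 7.2 MB/s. A small grayscale frame can therefore run at a much higher frame rate for less bandwidth.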

Transmission Rate When Used with Wi-Fi

Once the required data format, resolution, frame rate, and other parameters are known, the Wi-Fi test data gives a preliminary estimate of whether the solution is feasible.

Take JPEG@480*320@20fps as an example. A JPEG@480*320 picture is usually 30 KB ~ 50 KB, so at 20 fps the Wi-Fi link must sustain 600 KB/s ~ 1000 KB/s. Comparing this with the ESP32-S3 Wi-Fi throughput shows that the ESP32-S3 meets the requirement.
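The same estimate can be repeated for other frame sizes and frame rates. The following is a minimal sketch of the arithmetic above with hypothetical function names. The conversion to Mbit/s assumes 1 KB = 1000 bytes, which is only a convenience for comparing against throughput figures quoted in Mbit/s.

```python
def required_throughput_kb_s(frame_kb_min: float, frame_kb_max: float, fps: float):
    """Return the (low, high) Wi-Fi throughput in KB/s needed to stream the frames."""
    return frame_kb_min * fps, frame_kb_max * fps


def kb_s_to_mbit_s(kb_s: float) -> float:
    """Convert KB/s to Mbit/s, assuming 1 KB = 1000 bytes."""
    return kb_s * 8 / 1000


if __name__ == "__main__":
    # JPEG@480*320@20fps with frames of 30 KB to 50 KB
    low, high = required_throughput_kb_s(30, 50, 20)
    print(low, high)                                  # 600 1000
    print(kb_s_to_mbit_s(low), kb_s_to_mbit_s(high))  # 4.8 8.0
```

The result is the payload rate only. Protocol overhead and real-world Wi-Fi conditions are not modeled, so some margin should be left when comparing against measured throughput.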

Currently, the esp-rtsp example running on the ESP32-S3 reaches about 20 fps for a 720p MJPEG video stream.

ESP32-S3’s Encoding and Decoding Performance

The ESP32-S3 has no hardware encoding or decoding capability. Its driver code includes the TinyJPEG software encoding component.
