[WDT-3.15] {idf_target_name} Chip May Enter a Live Lock Under Certain Conditions, Causing an Interrupt Watchdog Issue

Affected revisions: v3.0, v3.1

Description

On ESP32 chip revision v3.0, a live lock occurs when the following conditions are met at the same time. The CPUs then stall in a memory access state and stop executing instructions.

  1. Dual-core system.

  2. Of the four Instruction/Data buses (IBUS/DBUS) that access external memory, three simultaneously initiate access requests to the same cache set, and all three requests result in cache misses.
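To illustrate what "the same cache set" means in condition 2, the following minimal sketch maps an address to a set index. The geometry (32 KB cache, 2 ways, 32-byte lines) and the modulo index mapping are assumptions made for illustration and are not stated in this erratum; the helper names `cache_set_index` and `same_cache_set` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative geometry only (assumption, not taken from this erratum). */
#define CACHE_SIZE_BYTES 32768u
#define CACHE_WAYS       2u
#define CACHE_LINE_BYTES 32u
#define CACHE_NUM_SETS   (CACHE_SIZE_BYTES / (CACHE_WAYS * CACHE_LINE_BYTES))

/* Hypothetical helper: set index under a simple modulo mapping (assumption). */
static uint32_t cache_set_index(uint32_t addr)
{
    return (addr / CACHE_LINE_BYTES) % CACHE_NUM_SETS;
}

/* Hypothetical helper: true when two addresses compete for the same set. */
static bool same_cache_set(uint32_t a, uint32_t b)
{
    return cache_set_index(a) == cache_set_index(b);
}
```

Under these assumed parameters there are 512 sets, so two addresses that differ by a multiple of 16384 bytes fall into the same set. With only two ways per set, three simultaneous misses to one set cannot all be satisfied at once, which is the kind of contention the condition describes.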

Workarounds

When a live lock occurs, software recognizes the cache line contention, either proactively or passively, and releases it. The two cores then complete their respective cache operations one after another on a first-come, first-served basis, which resolves the live lock. The detailed process is as follows:

  1. If the live lock occurs while the instructions being executed by the two cores are outside a critical section, the various system interrupts proactively release the cache line contention and resolve the live lock.

  2. If the live lock occurs while the instructions being executed by the two cores are inside a critical section, interrupts at level 3 and below are masked. Software therefore needs to set up a high-priority interrupt (level 4 or 5) for each core in advance, connect both interrupts to the same timer, and configure an appropriate timeout threshold. When a live lock stalls the cores, the timer times out and its interrupt forces both cores into the high-priority interrupt handler, which releases the IBUS of both cores and resolves the live lock. The resolution process is completed in three stages:

    1. In the first stage, both cores wait for the CPU write buffer to be cleared.

    2. In the second stage, one core (Core 0) waits and the other core (Core 1) executes instructions.

    3. In the third stage, Core 1 waits and Core 0 executes instructions.
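The three stages above can be sketched as a small host-side state machine. This is only a model of the ordering, assuming each core's handler is polled in turn; it does not touch hardware, and the real handler runs at interrupt level 4 or 5 on the chip. All names (`struct recovery`, `recovery_step`, `recovery_run`) are hypothetical.

```c
#include <stdbool.h>

/* Stages of the live lock resolution, in the order listed above. */
enum stage { STAGE_DRAIN, STAGE_CORE1_RUNS, STAGE_CORE0_RUNS, STAGE_DONE };

struct recovery {
    enum stage stage;
    bool drained[2];   /* per-core: write buffer cleared (simulated) */
    int run_order[2];  /* core IDs in the order they executed */
    int runs;          /* number of entries used in run_order */
};

/* One polling step of a core's handler.
 * Returns true if the core executed instructions, false if it waited. */
static bool recovery_step(struct recovery *r, int core_id)
{
    switch (r->stage) {
    case STAGE_DRAIN:
        /* Stage 1: both cores wait until both write buffers are cleared. */
        r->drained[core_id] = true;
        if (r->drained[0] && r->drained[1])
            r->stage = STAGE_CORE1_RUNS;
        return false;
    case STAGE_CORE1_RUNS:
        /* Stage 2: Core 0 waits, Core 1 executes. */
        if (core_id != 1)
            return false;
        r->run_order[r->runs++] = 1;
        r->stage = STAGE_CORE0_RUNS;
        return true;
    case STAGE_CORE0_RUNS:
        /* Stage 3: Core 1 waits, Core 0 executes. */
        if (core_id != 0)
            return false;
        r->run_order[r->runs++] = 0;
        r->stage = STAGE_DONE;
        return true;
    default:
        return false;
    }
}

/* Poll both simulated cores until the resolution sequence has finished. */
static void recovery_run(struct recovery *r)
{
    *r = (struct recovery){ .stage = STAGE_DRAIN };
    while (r->stage != STAGE_DONE) {
        recovery_step(r, 0);
        recovery_step(r, 1);
    }
}
```

After `recovery_run()` returns, `run_order` holds Core 1 followed by Core 0. Serializing the two cores this way means they never issue competing cache requests at the same moment.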

Solution

No fix scheduled.