Common Socket Error Troubleshooting


This section summarizes common socket error logs in ESP-IDF, their causes, and possible solutions.

Example: transport_base: poll_read select error 113, errno = Software caused connection abort, fd = 56

Error Message Analysis:

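  • poll_read select error 113: an error with code 113 was reported on the socket while polling it for readable data.

  • errno = 113 (Software caused connection abort): 113 corresponds to ECONNABORTED.

  • fd = 56: the file descriptor of the affected socket.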

Possible Causes:

  • Error 113 (ECONNABORTED) usually means that a TCP keepalive timeout or a retransmission timeout caused the connection to be dropped abnormally.
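The keepalive timeout mentioned above is governed by per-socket options. The following is a minimal sketch of how they might be set with the standard BSD socket calls; the helper name enable_tcp_keepalive and the parameter values in the usage note are illustrative assumptions, and on ESP-IDF the same declarations are assumed to come from lwip/sockets.h.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable TCP keepalive on a socket.
 *   idle_s     seconds of silence before the first probe is sent
 *   interval_s seconds between probes
 *   count      unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int enable_tcp_keepalive(int fd, int idle_s, int interval_s, int count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) {
        return -1;
    }
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0) {
        return -1;
    }
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof(interval_s)) < 0) {
        return -1;
    }
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0) {
        return -1;
    }
    return 0;
}
```

With enable_tcp_keepalive(fd, 5, 5, 3), a silent peer would be detected after roughly 5 + 5 × 3 = 20 seconds. Values that are too aggressive for the network can themselves trigger error 113, so they should be tuned to the link quality.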

Solutions:

  • Check for abnormal logs on the server side.

  • Ensure that the network connection is stable.

  • Implement a heartbeat mechanism at the application layer so that the connection is not dropped after long periods without data traffic.
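An application-layer heartbeat can be sketched as follows. This is only an illustration under stated assumptions: the heartbeat_t structure, the heartbeat_tick function, and the "PING\n" payload are hypothetical, and the real message format depends on what the server expects.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical heartbeat state; all names are illustrative. */
typedef struct {
    int    fd;          /* connected socket */
    time_t last_tx;     /* time of the last successful send */
    int    idle_limit;  /* seconds of silence allowed before a ping */
} heartbeat_t;

/* Assumed payload; use whatever the server protocol defines. */
static const char HEARTBEAT_MSG[] = "PING\n";

/* Call periodically from the application loop. Sends a ping once the
 * link has been idle for idle_limit seconds. Returns false if the ping
 * could not be sent in full, which the caller should treat as a broken
 * link (close the socket and reconnect). */
bool heartbeat_tick(heartbeat_t *hb, time_t now)
{
    if (now - hb->last_tx < hb->idle_limit) {
        return true;    /* recent traffic, nothing to do */
    }

    size_t len = sizeof(HEARTBEAT_MSG) - 1;
    ssize_t n = send(hb->fd, HEARTBEAT_MSG, len, 0);
    if (n != (ssize_t)len) {
        return false;   /* error or short write */
    }

    hb->last_tx = now;
    return true;
}
```

The application should also update last_tx whenever it sends regular data, so that pings are only sent when the link is otherwise idle.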

Note

When analyzing abnormal TCP traffic, you can use the debug patch script to print TCP sequence numbers and other information that helps locate the problem.

Usage: In a terminal, enter the ESP-IDF root directory and run python <path to at_net_debug.py> to apply the patch. After the patch is applied, subsequent ESP logs include the TCP sequence number (seq), the acknowledgment number (ack), and other debugging information.

Common errno Values and Their Analysis

| Errno Definition | Value | lwIP ERR | Explanation | Cause | Troubleshooting Steps |
|---|---|---|---|---|---|
| ENOMEM | 12 | ERR_MEM | Out of memory error | Memory allocation failed, or the mailbox is full and the message could not be sent | Print the remaining memory and the minimum free memory to check whether memory is insufficient |
| ENOBUFS | 105 | ERR_BUF | Buffer error | 1. Insufficient space: the header length exceeds the allocated length. 2. The socket failed to allocate a netconn. 3. The option length is too long | Add logs to locate the failure |
| EWOULDBLOCK | 11 | ERR_TIMEOUT / ERR_WOULDBLOCK | Timeout / Operation would block | 1. Timeout: receive timeout (SO_RCVTIMEO set). 2. Operation would block: send/receive called too quickly in non-blocking mode. 3. Operation would block: send timeout (SO_SNDTIMEO set) | Add a log at ERR_TIMEOUT. In non-blocking mode, connect/send/receive calls return immediately, so the application layer must handle this errno |
| EHOSTUNREACH | 118 | ERR_RTE | Routing problem | No outgoing route (netif) was found | Add logs to locate the failure; check whether a network interface with a suitable route exists for the destination IP |
| EINPROGRESS | 119 | ERR_INPROGRESS | Operation in progress | The operation is still in progress; occurs with connect in non-blocking mode and also with DNS queries | This errno is normal |
| EINVAL | 22 | ERR_VAL | Illegal value | Parameter error (parameters set through an option, or the fd passed to select/poll) | Check the parameters of the function call |
| EADDRINUSE | 112 | ERR_USE | Address in use | The address or port is already bound when bind is called | Check whether the address or port is already bound; SO_REUSEADDR can resolve this |
| EALREADY | 120 | ERR_ALREADY | Already connecting | The socket is already connecting or listening, and the application layer calls connect or listen again | Check the code logic |
| EISCONN | 127 | ERR_ISCONN | Connection already established | The socket is already connected | Check the code logic |
| ENOTCONN | 128 | ERR_CONN / ERR_CLSD | Not connected / Connection closed | The socket is not connected, or the peer has closed the connection (FIN) | Usually the peer has disconnected while the local side is still sending data on this socket; confirm with a packet capture |
| ECONNABORTED | 113 | ERR_ABRT | Connection aborted | The connection was dropped because of an error state or by the peer, or a TCP timer expired and removed the connection (generally the local end drops the connection) | Add logs to locate the failure |
| ECONNRESET | 104 | ERR_RST | Connection reset | An RST was received from the peer | Confirm with a packet capture |
| EIO | 5 | ERR_ARG | Illegal argument | Invalid input parameter | Check the operation that caused the error; add logs to locate the failure |
| EBADF | 9 | - | Bad file number | The socket is no longer valid | Add logs to locate the failure |
| ENOPROTOOPT | 109 | - | Protocol not available | The protocol is not supported; usually the option passed to getsockopt/setsockopt is not supported | Check whether the option is supported |
| - | -1 | ERR_IF | Low-level netif error | netif interface error | Generally the underlying interface is NULL, for example when the lookup for the interface index passed in while joining a multicast group returns NULL |
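The table can be condensed into a small dispatch helper so that application code reacts consistently to each errno. The sketch below is one possible grouping, not a fixed rule: the sock_action_t enum and the sock_action_for_errno function are hypothetical names, and the assignment of each errno to a group is an assumption derived from the causes listed above.

```c
#include <errno.h>

/* Hypothetical action categories derived from the table. */
typedef enum {
    SOCK_ACT_RETRY,      /* transient: retry later or wait with select() */
    SOCK_ACT_RECONNECT,  /* link or route unavailable: close and reconnect */
    SOCK_ACT_FREE_MEM,   /* resource shortage: back off and check memory */
    SOCK_ACT_FIX_CODE    /* likely a parameter or logic error in the caller */
} sock_action_t;

/* Map an errno value from a failed socket call to a suggested action. */
sock_action_t sock_action_for_errno(int err)
{
    switch (err) {
    case EWOULDBLOCK:    /* timeout or non-blocking call not ready */
    case EINPROGRESS:    /* non-blocking connect still running */
        return SOCK_ACT_RETRY;

    case ENOTCONN:       /* not connected, or peer closed (FIN) */
    case ECONNABORTED:   /* keepalive or retransmission timeout */
    case ECONNRESET:     /* RST received from the peer */
    case EHOSTUNREACH:   /* no outgoing route */
        return SOCK_ACT_RECONNECT;

    case ENOMEM:
    case ENOBUFS:
        return SOCK_ACT_FREE_MEM;

    default:             /* EINVAL, EBADF, EISCONN, EALREADY, ... */
        return SOCK_ACT_FIX_CODE;
    }
}
```

A caller would typically invoke sock_action_for_errno(errno) immediately after a socket call returns -1, before any other call can overwrite errno.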

Summary

  • The errno code helps locate the root cause of a socket problem quickly.

  • Combining the errno with log analysis makes it possible to improve error handling and system stability.

  • Call getsockopt() with SO_ERROR to read the pending error on a socket and obtain more error information.
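Reading SO_ERROR can be sketched as follows. The helper name socket_pending_error is a hypothetical choice; only the standard getsockopt() call is used.

```c
#include <sys/socket.h>

/* Hypothetical helper: return the pending error code on fd (0 if there
 * is none), or -1 if getsockopt() itself failed, for example because fd
 * is not a valid socket. Reading SO_ERROR also clears the pending error. */
int socket_pending_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0) {
        return -1;
    }
    return err;
}
```

This is typically called after select() marks a socket as writable or in error, for instance to learn whether a non-blocking connect succeeded. The returned value can be passed to strerror() for logging.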

We hope this information helps you troubleshoot socket-related issues in ESP-IDF.