I'm running into a weird socket issue with my own crawler. It opens and closes a lot of TCP sockets rapidly due to the protocol design; that's something I have to live with. I'm confident the code closes each socket correctly (verified via strace and debug prints). Yet somehow I still hit the open socket limit on my system. Tools like Netdata also show elevated numbers of open sockets. Upon further inspection, I found a high number of socket fds inside /proc/<pid>/fd/. Here's a sample result:
All commands are executed as root
# ls /proc/248298/fd/ -l | grep socket | wc -l
522
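For what it's worth, each socket fd there is a symlink of the form socket:[inode], so the inode numbers can be pulled out directly. A rough sketch of what I mean, using the same PID (the /tmp/fd_inodes scratch file is just for illustration):
# readlink /proc/248298/fd/* | grep -oP '(?<=socket:\[)[0-9]+' | sort > /tmp/fd_inodes
# wc -l < /tmp/fd_inodes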
But when I run netstat to figure out which remotes the sockets are connected to, also counting system-wide TIME_WAIT and CLOSE_WAIT sockets (since netstat no longer associates those with my process), the number is much lower:
# netstat -tulnap | egrep '(TIME_WAIT|CLOSE_WAIT|248298)' | wc -l
109
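Those inode numbers can also be checked against the kernel's own socket tables, to see whether the extra fds are even TCP sockets at all or something else (UNIX, netlink, etc.) that a TCP-oriented netstat view wouldn't show. A rough sketch, assuming the /tmp/fd_inodes file from above (the inode is the 10th field of each data row in /proc/net/tcp):
# awk 'FNR>1 {print $10}' /proc/net/tcp /proc/net/tcp6 | sort > /tmp/tcp_inodes
# comm -23 /tmp/fd_inodes /tmp/tcp_inodes | wc -l
The comm -23 output counts fd inodes that don't appear in any TCP table.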
I've tried setting net.ipv4.tcp_tw_reuse to 1 as a mitigation, without success.
What's the cause of this? Furthermore, why are my closed sockets still counted as alive? And is there a way to work around this?
OS: Linux
Distro: Ubuntu 22.04
Kernel: 5.15
CPU: x64