
Following on from a problem described in "How is it that I can attach strace to a process that isn't in the output of ps?"

I'm trying to debug a process that hangs part way through.

By using strace -f on my parent process, I was able to determine that I have a bunch of threads that are just showing:

# strace -p 26334
Process 26334 attached - interrupt to quit
epoll_wait(607, {}, 4096, 500)          = 0
epoll_wait(607, {}, 4096, 500)          = 0
epoll_wait(607, {}, 4096, 500)          = 0
epoll_wait(607, {}, 4096, 500)          = 0
epoll_wait(607, ^C <unfinished ...>
Process 26334 detached

Investigating further:

# readlink /proc/26334/fd/607
anon_inode:[eventpoll]

My gut tells me that I've managed to get some threads into a deadlock, but I don't know enough about epoll to move forward. Are there any commands that can give me some insight into what these threads are polling for, or which file descriptors this epoll descriptor maps to?

1 Answer


When you run strace, the lines it prints are system calls. In case it wasn't obvious, epoll_wait() is one of them, and you can run man epoll_wait to read up on it:

   epoll_wait, epoll_pwait - wait for an I/O event on an epoll file descriptor

The description for epoll:

The epoll API performs a similar task to poll(2): monitoring multiple file descriptors to see if I/O is possible on any of them. The epoll API can be used either as an edge-triggered or a level-triggered interface and scales well to large numbers of watched file descriptors.

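So it would seem that your process is waiting on the file descriptors registered with that epoll instance, to see if I/O is possible on any of them. Each call in your trace returning 0 means the 500 ms timeout expired with nothing ready, after which the thread simply calls epoll_wait() again.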
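To make the pattern in the trace concrete, here is a minimal sketch in Python (not the OP's program, which is unknown) that reproduces it on Linux. It registers the read end of a pipe with an epoll instance and polls with a 0.5 second timeout. The function name demo_epoll_timeout is made up for illustration.

```python
import os
import select


def demo_epoll_timeout(timeout_s=0.5):
    """Poll an idle pipe, then poll again after data has been written.

    Returns (read_fd, idle_events, ready_events).
    """
    r, w = os.pipe()
    ep = select.epoll()  # Linux only
    try:
        ep.register(r, select.EPOLLIN)

        # Nothing has been written yet, so this call waits for the
        # timeout and returns an empty list. Under strace it shows up
        # as an epoll_wait() call returning 0.
        idle_events = ep.poll(timeout_s)

        # Once the pipe has data, the same call returns immediately
        # with a (fd, event_mask) pair for the read end.
        os.write(w, b"x")
        ready_events = ep.poll(timeout_s)
        return r, idle_events, ready_events
    finally:
        ep.close()
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    fd, idle, ready = demo_epoll_timeout()
    print("idle poll: ", idle)
    print("ready poll:", ready, "(read end is fd %d)" % fd)
```

A thread that loops on the first kind of call is not necessarily deadlocked. It may just be an idle event loop whose registered descriptors have no activity.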

I would change tactics a bit and use lsof -p <pid> to see if you can narrow down what these files actually are.
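If lsof only reports the epoll descriptor itself, another place to look is /proc/<pid>/fdinfo/<epfd>. On reasonably recent kernels (proc(5) documents this for Linux 3.8 and later, as far as I know) that file contains one tfd: line per descriptor registered with the epoll instance. The sketch below parses those lines and resolves each target through /proc/<pid>/fd. The helper names parse_epoll_targets and describe_epoll are made up for this example, and the exact set of fields after events: varies by kernel version, so only the first two are parsed.

```python
import os
import re
import sys

# Matches lines such as:
#   tfd:        5 events:       1d data: ffffffffffffffff
# The events mask is printed in hexadecimal.
TFD_RE = re.compile(r"^tfd:\s*(\d+)\s+events:\s*([0-9a-fA-F]+)", re.MULTILINE)


def parse_epoll_targets(fdinfo_text):
    """Return [(target_fd, event_mask), ...] from epoll fdinfo text."""
    return [(int(fd), int(ev, 16)) for fd, ev in TFD_RE.findall(fdinfo_text)]


def describe_epoll(pid, epfd, proc_root="/proc"):
    """Return [(target_fd, event_mask, link_target), ...] for one epoll fd."""
    with open(f"{proc_root}/{pid}/fdinfo/{epfd}") as f:
        targets = parse_epoll_targets(f.read())
    described = []
    for tfd, events in targets:
        try:
            link = os.readlink(f"{proc_root}/{pid}/fd/{tfd}")
        except OSError:
            link = "?"  # descriptor closed in the meantime, or no permission
        described.append((tfd, events, link))
    return described


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (script name is arbitrary): python3 epoll_targets.py 26334 607
    for tfd, events, link in describe_epoll(int(sys.argv[1]), int(sys.argv[2])):
        print(f"fd {tfd:5d}  events 0x{events:x}  {link}")
```

Reading another process's /proc entries generally requires running as the same user or as root. The link targets (for example socket:[...] or pipe:[...]) can then be matched against the lsof output to see what the threads are actually waiting on.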

  • any hint how to troubleshoot further? What is being waited on? Commented Apr 14, 2015 at 17:31
  • @ThorstenStaerk - is this directed to me? I haven't heard anything further since the OP posted this almost 1 year ago 8-). Commented Apr 15, 2015 at 3:36
  • this question is directed to anybody who can answer it. I hope our savior will find the way here due to the excellent site, question and answer. Commented Apr 15, 2015 at 6:10
  • @ThorstenStaerk I'd post it as a follow-up Q, if you want something more specific, no one will see this comment here except you and me. Be sure to reference this question and state what further details you're after. Commented Apr 15, 2015 at 10:52
  • lsof -p didn't help me much, I basically see a line like this: node 2281 tony 13u a_inode 0,14 0 11498 [eventpoll:14,16] Commented Jul 6, 2022 at 8:16
