TLDR
When I spin up multiple docker containers that each run npm ci, I start getting pthread_create: Resource temporarily unavailable errors (fewer than 5 docker containers run fine). I deduce that some kind of thread limit is being hit, but I cannot find which one.
configuration
- a Jenkins instance spins up docker containers for each build (connection through ssh into this docker container).
- in each container some build commands are run; I often see the error when using npm ci, since this seems to create quite a few threads, but I don't think the problem is related to npm itself.
- all docker containers run on a single docker-host. Its specifications:
docker-host
- Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz with 12 cores, 220 GB RAM
- CentOS 7
- Docker version 18.06.1-ce, build e68fc7a
- systemd version 219
- kernel 3.10.0-957.5.1.el7.x86_64
errors
The error appears in different forms:
- Jenkins failing to contact the docker container, with errors like: java.lang.OutOfMemoryError: unable to create new native thread
- git clone failing inside the container with ERROR: Error cloning remote repo 'origin' ... Caused by: java.lang.OutOfMemoryError: unable to create new native thread
- npm ci failing inside the container with node[1296]: pthread_create: Resource temporarily unavailable
Things I have investigated or tried
I looked quite a lot at this question.
- docker-host has systemd version 219 and hence does not have the TasksMax attribute.
- /proc/sys/kernel/threads-max = 1798308
- kernel.pid_max = 49152
- number of threads (ps -elfT | wc -l) is typically 700, but with multiple containers running I have seen it climb to 4500.
- all builds run as a user with uid 1001 inside the docker container; however, there is no user with uid 1001 on the docker-host, so I don't know which limits apply to this user.
- I have already increased multiple limits for all users in /etc/security/limits.conf (see below)
- I created a dummy user with uid 1001 on docker-host and made sure its nproc limit was also set to unlimited. Logged in as that user, ulimit -u = unlimited. This still didn't solve the problem.
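The kernel-wide numbers in the list above can be gathered in one go. What follows is a minimal sketch, assuming a Linux /proc layout; the function names count_threads, headroom and report_thread_limits are made up for illustration and do not come from any existing tool. Every thread consumes an ID from the same pool as process IDs, so kernel.pid_max caps threads as well as processes, which is why both ceilings are printed. The headroom figure is only a rough upper bound.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not from any existing tool.

# count_threads [PROC_DIR]: count thread directories (<pid>/task/<tid>).
count_threads() {
    proc_dir=${1:-/proc}
    n=0
    for t in "$proc_dir"/[0-9]*/task/[0-9]*; do
        # An unmatched glob stays literal, so check the path really exists.
        if [ -e "$t" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# headroom LIMIT USED: how many more IDs fit under LIMIT.
headroom() {
    echo $(( $1 - $2 ))
}

# report_thread_limits: print current usage against both kernel-wide ceilings.
report_thread_limits() {
    used=$(count_threads /proc)
    threads_max=$(cat /proc/sys/kernel/threads-max)
    pid_max=$(cat /proc/sys/kernel/pid_max)
    echo "threads in use:     $used"
    echo "kernel.threads-max: $threads_max (headroom $(headroom "$threads_max" "$used"))"
    echo "kernel.pid_max:     $pid_max (headroom $(headroom "$pid_max" "$used"))"
}
```

Calling report_thread_limits on the docker-host while several builds are active shows how close the host is to either ceiling. With the figures quoted above (about 4500 threads against a pid_max of 49152), both leave plenty of headroom, which suggests that the limit being hit is a narrower one, applied per user or per group of processes.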
/etc/security/limits.conf :
*               soft    nproc           unlimited
*               soft    stack           65536
*               soft    nofile          2097152
output of ulimit -a as root:
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 899154
max locked memory       (kbytes, -l) 1048576
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1048576
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 65536
cpu time               (seconds, -t) unlimited
max user processes              (-u) 899154
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
limits of my dockerd process (cat /proc/16087/limits, where 16087 is the pid of dockerd):
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            unlimited            unlimited            bytes     
Max core file size        unlimited            unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             unlimited            unlimited            processes 
Max open files            65536                65536                files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       899154               899154               signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us
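
The limits shown for dockerd are not necessarily the ones the failing process runs with, so it can help to read the same table for a process inside a container. A minimal sketch, assuming the host-side PID of that process is known; limit_of is a hypothetical helper name, and it relies only on the column layout of /proc/<pid>/limits shown above.

```shell
#!/bin/sh
# Sketch only: limit_of is an illustrative helper, not an existing command.

# limit_of FILE NAME: print the soft limit of the row that starts with NAME,
# e.g. NAME="Max processes". FILE is a /proc/<pid>/limits style table.
limit_of() {
    awk -v name="$2" '
        index($0, name) == 1 {
            # Drop the row name, then split the rest on runs of whitespace.
            rest = substr($0, length(name) + 1)
            split(rest, field, " ")
            print field[1]   # first remaining column is the soft limit
            exit
        }
    ' "$1"
}
```

For example, limit_of /proc/16087/limits "Max processes" prints the soft process limit of dockerd. To check a container instead, the host-side PID of its first process can be obtained with docker inspect --format '{{.State.Pid}}' followed by the container name, and that PID substituted in the path.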

TasksMax attribute of systemd (I saw that it is enabled from this bug report). Add TasksMax=infinity to your Docker service file override and see if it helps or prints the warning that it is not available.
systemd-219-42 as per this errata announcement.
systemctl status docker it says Tasks: 135 and there is no maximum between brackets so I still think that this is not the reason. Also my limit seems to lie at 4096 threads and not at 512.
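
TasksMax is enforced through the pids cgroup controller, so any task limit that is actually in effect can also be read directly from the cgroup filesystem. A minimal sketch, assuming the controller is mounted at /sys/fs/cgroup/pids (the usual cgroup v1 location; it may be mounted elsewhere or be absent on a given kernel); list_pids_limits is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: list_pids_limits is an illustrative helper, not an existing command.

# list_pids_limits [ROOT]: for every cgroup under ROOT whose pids.max is a
# number rather than "max" (unlimited), print "<current>/<max> <cgroup path>".
list_pids_limits() {
    root=${1:-/sys/fs/cgroup/pids}
    find "$root" -name pids.max 2>/dev/null | while IFS= read -r f; do
        max=$(cat "$f" 2>/dev/null)
        # "max" means unlimited; an empty value means the file was unreadable.
        if [ -n "$max" ] && [ "$max" != "max" ]; then
            cur=$(cat "${f%pids.max}pids.current" 2>/dev/null)
            echo "${cur:-?}/$max ${f%/pids.max}"
        fi
    done
}
```

Running list_pids_limits on the docker-host while builds are failing lists every group with a finite limit. A group whose current count sits at or near its maximum (for example 4096) would be a candidate for the limit being hit; no output means no finite pids limit was found under that mount point.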