0

I am using xargs to run multiple commands both serially and in parallel (for example, 4 tasks at a time), with the -a command-line option to read the list of commands to execute from a file:

xargs -t -P 4 -L 1 -I '%' -a files.txt runSh.sh

Here files.txt contains the list of configurations, each of which is passed to runSh.sh as a command-line parameter.

My question is: can I append lines to files.txt while xargs is running, and will xargs add these additional commands to its execution queue, or does it read the files.txt input file only once, at startup?

Thanks

2
  • With that command line, the files are not passed to runSh.sh (-I '%' is given, but % never appears in the command), and xargs applies its special quote and blank handling. Combining -L with -I makes little sense. Maybe you meant xargs -t -P4 -n1 -d '\n' -a files.txt runSh.sh Commented Mar 12, 2021 at 18:14
  • Well, it works. It runs 4 commands at a time, when they finish it automatically starts the next one in the file such that there are always 4 running. The question I have is, can I append new lines to files.txt while xargs has already started and have it add them to the running queue, or do I have to wait for xargs to complete and then restart it with a new files.txt. Commented Mar 12, 2021 at 18:45

2 Answers

2

You can run it under strace to see what's happening:

$ seq 10 > files.txt
$ strace -tt -e read xargs -t -P 4 -n1 -d'\n' -a files.txt sleep
[...]
18:19:32.907311 read(3, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n", 512) = 21
sleep 1
18:19:32.908129 read(4, "", 4)          = 0
sleep 2
18:19:32.908830 read(4, "", 4)          = 0
sleep 3
18:19:32.909406 read(4, "", 4)          = 0
sleep 4
18:19:32.909977 read(4, "", 4)          = 0
18:19:33.912774 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=453051, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
sleep 5
18:19:33.914702 read(4, "", 4)          = 0
18:19:34.910440 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=453052, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
sleep 6
18:19:34.911021 read(4, "", 4)          = 0
18:19:35.911315 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=453053, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
sleep 7
18:19:35.912257 read(4, "", 4)          = 0
18:19:36.912158 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=453054, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
sleep 8
18:19:36.912623 read(4, "", 4)          = 0
18:19:38.916348 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=453176, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
sleep 9
18:19:38.917196 read(4, "", 4)          = 0
18:19:40.913135 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=453177, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
sleep 10
18:19:40.914137 read(4, "", 4)          = 0
18:19:40.914808 read(3, "", 512)        = 0
18:19:42.914324 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=453178, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
18:19:44.914685 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=453179, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
18:19:47.919202 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=453272, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
18:19:50.916332 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=453273, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
18:19:50.917068 +++ exited with 0 +++

As you can see, it initially reads up to 512 bytes of data, which in my case is enough to read the whole file (21 bytes), then starts 4 processes.

As soon as the first of those sleep commands returns, it starts the next one.

Once it has started all the commands resulting from what it read initially, it calls read() on file descriptor 3 again. That returns 0 bytes, which means end-of-file, and after that it doesn't read any more.

So, xargs (GNU xargs here, as -P, -d are GNU specific) will only read additional data if it has been appended before xargs has started its last command.
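To illustrate why appended data can still be seen by a reader that tries again: end-of-file on a regular file is only a momentary condition. The sketch below is not xargs itself; it uses the shell's own read builtin on a hypothetical temporary file to show a read hitting end-of-file and a later read on the same descriptor succeeding after data has been appended. xargs simply stops asking once it has seen end-of-file.

```shell
#!/bin/sh
# Hypothetical scratch file, used only for this illustration.
tmp=$(mktemp)
printf 'a\n' > "$tmp"

# Open the file once on descriptor 3 and keep it open.
exec 3< "$tmp"

read -r l1 <&3                 # first line: "a"

# Nothing left to read: read fails, which is how end-of-file shows up.
eof=no
if ! read -r l2 <&3; then
  eof=yes
fi

# Append to the file while descriptor 3 is still open.
printf 'b\n' >> "$tmp"

read -r l3 <&3                 # the same descriptor now yields "b"

printf 'first=%s eof=%s after_append=%s\n' "$l1" "$eof" "$l3"

exec 3<&-
rm -f "$tmp"
```

A reader such as tail -f keeps retrying after end-of-file, which is what makes it useful in front of xargs.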

If you want to always be able to add more data and make sure xargs reads it, you can change it to:

xargs -t -P 4 -n1 -d'\n' -a <(tail -fn +1 files.txt) sleep

(assuming a shell with process-substitution support such as ksh, zsh or bash)

This time, xargs reads from a pipe that never ends (end-of-file is never seen on it): tail -f, and as a result xargs, will wait forever for more data to be appended to that file.
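The effect of reading from a pipe that stays open can be sketched without tail. In this minimal example, a hypothetical producer (a brace group standing in for tail -f) writes two items, pauses, then writes a third. Because the pipe is still open during the pause, xargs does not see end-of-file and runs the late item as well. Unlike tail -f, this producer eventually exits, so xargs terminates.

```shell
#!/bin/sh
# The brace group is a stand-in for "tail -f": it holds the write end of
# the pipe open while pausing, then emits one more item before closing it.
out=$(
  { printf '%s\n' one two; sleep 1; printf '%s\n' three; } |
    xargs -n 1 echo got
)

printf '%s\n' "$out"
# Prints:
# got one
# got two
# got three
```

Only -n is used here so that the commands run one at a time and the output order is fixed; adding -P would run them concurrently, as in the commands above.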

1
  • I did a similar experiment, but without using strace - since I didn't know about it. I got the same basic result. Commented Mar 12, 2021 at 18:58
0

I just did an experiment. I made a small shell script that looks like this

echo $1
sleep 1m

I had a configuration file that looks like this

one
two
three
four

Then I started the command with

xargs -t -P 4 -L 1 -a input_lines.txt ./run2.sh

Once it started running I modified the input_lines.txt file so that it looked like

one
two
three
four
five

The execution completed, and it only output

./run2.sh one
./run2.sh two
one
./run2.sh three
two
./run2.sh four
three
four

Taken together, this shows that xargs read the input file specified with the -a option when it started and used only that content: the line appended to the file during execution was not picked up.

