
Is there a quick, easy, and efficient way of running iterations in this for loop in parallel?

for i in `seq 1 5000`; do
  repid="$(printf "%05d" "$i")"
  inp="${repid}.inp"
  out="${repid}.out"
  /command "$inp" "$out"
done
  • Just put & after the /command line. Are you sure your system can handle 5000 instances of the process, though? Commented Nov 27, 2018 at 19:04
  • I do not want to run all 5000 instances at once, as that is impossible. I am thinking along the lines of how, for instance in R, certain packages handle embarrassingly parallel workloads in loops using sockets or forking (r-bloggers.com/parallel-r-loops-for-windows-and-linux). Is there something like this in bash? Commented Nov 27, 2018 at 19:58
  • See xargs -P or GNU parallel. Commented Nov 27, 2018 at 20:03

2 Answers


If you want to take advantage of all your lovely CPU cores that you paid Intel so handsomely for, turn to GNU Parallel:

seq -f "%05g" 5000 | parallel -k echo command {}.inp {}.out

If you like the look of that, run it again without the -k (which keeps the output in order) and without the echo. You may need to enclose the command in single quotes:

seq -f "%05g" 5000 | parallel '/command {}.inp {}.out'

By default it runs one job per CPU core in parallel. If you want, say, 32 jobs in parallel, use:

seq ... | parallel -j 32 ...

If you want an "estimated time of arrival", use:

parallel --eta ...

If you want a progress meter, use:

parallel --progress ...

If you have bash version 4+, it can zero-pad brace expansions. So, if your ARG_MAX is big enough, you can more simply use:

parallel 'echo command {}.inp {}.out' ::: {00001..05000}

You can check your ARG_MAX with:

sysctl -a kern.argmax

and it tells you how many bytes long your parameter list can be (that is the macOS/BSD form; getconf ARG_MAX is the portable equivalent). You are going to need 5,000 numbers at 5 digits plus a space each, so 30,000 bytes minimum.
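The arithmetic above can be checked with a short sketch. It assumes getconf ARG_MAX is available (it is on Linux and macOS); the variable names count, width, needed and limit are only illustrative. Note that the environment also counts toward the limit, so treat the comparison as a rough guide rather than an exact guarantee.

```shell
count=5000   # how many numbers brace expansion will produce
width=5      # digits per zero-padded number

# Each number costs its digits plus one separating byte.
needed=$(( count * (width + 1) ))

# Upper bound on the combined size of arguments and environment.
limit="$(getconf ARG_MAX)"

echo "needed=$needed bytes, limit=$limit bytes"
if [ "$needed" -lt "$limit" ]; then
  echo "the expanded list should fit on one command line"
else
  echo "too long: pipe the numbers in with seq instead"
fi
```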


If you are on macOS, you can install GNU Parallel with homebrew:

brew install parallel
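If GNU Parallel is not available, the xargs -P route mentioned in the comments gives a similar effect. The following is a minimal sketch rather than a drop-in replacement: run_batch is a hypothetical helper name, the job limit of 4 is an arbitrary assumption, and the echo makes it a dry run that only prints the commands.

```shell
# run_batch N JOBS (hypothetical helper): print the commands for
# inputs 00001..N, keeping at most JOBS processes running at once.
# Remove the "echo" to run /command for real.
run_batch() {
  seq -f "%05g" 1 "$1" | xargs -P "$2" -I{} echo /command {}.inp {}.out
}

# Dry run for the first 8 inputs, 4 at a time.
run_batch 8 4
```

Because the jobs run concurrently, the lines can come out in any order; xargs has no equivalent of parallel's -k.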

1 Comment

echo {00001..050000000} |parallel -d" " 'echo command {}.inp {}.out'
1
for i in `seq 1 5000`; do
  repid="$(printf "%05d" "$i")"
  inp="${repid}.inp"
  out="${repid}.out"
  /command "$inp" "$out" &
done
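As written, the loop above starts all 5000 processes at once, which the asker said is not feasible. One way to cap the number of simultaneous jobs in plain shell is to launch them in batches and wait for each batch to finish. This is only a sketch under assumptions: max_jobs=4 is an arbitrary limit, run_job is a hypothetical stand-in for /command that just logs its arguments, and the loop stops at 20 instead of 5000 to keep the demo short.

```shell
max_jobs=4            # assumed limit; tune to your core count
log="$(mktemp)"       # scratch file used only by this demo

# Hypothetical stand-in for /command: records its two arguments.
run_job() { echo "$1 $2" >> "$log"; }

i=1
while [ "$i" -le 20 ]; do
  repid="$(printf "%05d" "$i")"
  run_job "${repid}.inp" "${repid}.out" &
  # After every max_jobs launches, block until that batch has finished.
  if [ $((i % max_jobs)) -eq 0 ]; then
    wait
  fi
  i=$((i + 1))
done
wait                  # collect the final, possibly partial, batch
```

Batching leaves cores idle while the slowest job in each batch finishes. In bash 4.3 or later, wait -n returns as soon as any one job exits, so it can be used to refill slots one at a time. Delete the scratch file afterwards with rm -f "$log".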

1 Comment

While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value.
