5

I have a 0-byte-delimited file of records.

Record 1, Line 1
Record 1, Line 2
[zero byte]
Record 2, Line 1
Record 2, Line 2

I'd like to run the "process-one-record-stdin.sh" script once for each record, with the record as standard input:

bash process-one-record-stdin.sh <record-contents

Can I do this with xargs, parallel, or some other tool? (I know how to do it with bash scripting, but I'd prefer to use built-in tools where possible.)

Motivation:

magic-xargs-type-command-here -0 all-records.txt -- xargs -d"\n" -- bash process-one-record-arguments.sh

2 Answers

1

If you have GNU Parallel you can do this:

parallel --rrs --recend '\0' -N1 --pipe bash process-one-record-stdin.sh <all-records.txt
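For comparison, when GNU Parallel is not available, the same per-record behaviour can be sketched in plain bash. This is only a minimal sketch: process_records is a hypothetical helper name, and wc -l stands in for process-one-record-stdin.sh so the demo runs on its own. It relies on bash's read -d '' to split on NUL bytes and runs the records one after another, not in parallel.

```shell
#!/usr/bin/env bash
# Sketch: run a command once per NUL-delimited record, with the record on stdin.
# process_records is a hypothetical helper, not part of any existing tool.
process_records() {
    local file=$1; shift
    local record
    # read -d '' stops at each NUL byte; IFS= and -r keep the record verbatim.
    # The "|| [ -n ... ]" part also handles a last record with no trailing NUL.
    while IFS= read -r -d '' record || [ -n "$record" ]; do
        # The pipe gives the command the record as stdin, not the input file.
        printf '%s' "$record" | "$@"
    done < "$file"
}

# Demo: a throwaway file with two records; wc -l stands in for the real script.
tmp=$(mktemp)
printf 'Record 1, Line 1\nRecord 1, Line 2\n\0Record 2, Line 1\n\0' > "$tmp"
result=$(process_records "$tmp" wc -l | tr -d ' ')
rm -f "$tmp"
printf '%s\n' "$result"
```

With the real script, the call would look like: process_records all-records.txt bash process-one-record-stdin.sh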

All new computers have multiple cores, but most programs are serial in nature and will therefore not use the multiple cores. However, many tasks are extremely parallelizable:

  • Run the same program on many files
  • Run the same program for every line in a file
  • Run the same program for every block in a file

GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.

If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU:

[Image: Simple scheduling]

GNU Parallel instead spawns a new process when one finishes - keeping the CPUs active and thus saving time:

[Image: GNU Parallel scheduling]

Installation

If GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:

(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash

For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README

Learn more

See more examples: http://www.gnu.org/software/parallel/man.html

Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html

Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel

4
  • I already subscribe to the development list. Normally I'd accept this as the answer because of the line at the top which does answer the question, but this really is utter spam and I don't want to encourage that kind of thing. I appreciate the effort you've gone to, sorry. Commented Oct 25, 2014 at 3:48
  • I am sorry you feel that way. Stackexchange encourages not to simply give 1 line answers, but more thorough answers. I can see why you would find most of the answer redundant to you, as you apparently know most of this; but since you did not describe your knowledge, it would be equally fair to assume that you had only heard of Parallel, but never actually used it. To avoid answers like this in the future, describe what you already know. In this case you could have written that you had read the man page, watched the intro videos and walked through the tutorial without finding an answer. Commented Oct 26, 2014 at 12:31
  • (BTW: Welcome to Unix/Stackexchange - just noticed this is your first question). Commented Oct 26, 2014 at 21:06
  • Ole -- the problem is not that I know the information, but that the information (esp about parallelism) is irrelevant to the question asked. It feels copy-pasted with a small bit at the top answering the question. Commented Oct 28, 2014 at 7:23
1

Can I do this with xargs

With xargs, the options to use are:

--null, -0: Input items are terminated by a null character instead of by whitespace.

-n max-args: Use at most max-args arguments per command line.

$ echo -ne "line 111\0000line 222\0000\0000line 333\0000\0000" | \
     xargs -I '{}' --null -n 1 bash -c "echo handling this input: '{}'. OK"
handling this input: line 111. OK
handling this input: line 222. OK
handling this input: . OK
handling this input: line 333. OK
handling this input: . OK
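Note that the command above pastes each record into the bash -c string, so quotes or $(...) inside a record are interpreted by the shell, and the record never reaches a program's standard input. A hedged sketch of one way around both problems: pass the record as the positional parameter "$1" and pipe it on from there. Here sed stands in for process-one-record-stdin.sh, and the sample records are made up for illustration. Because each record travels as a command-line argument, its size is limited by the system's argument-length limit.

```shell
# Sketch: xargs hands each NUL-terminated record to sh as "$1" (the "_" fills $0),
# and printf pipes it, unmodified, into the stdin of the per-record command.
# sed is a stand-in for "bash process-one-record-stdin.sh".
out=$(printf '%s\n\0' 'it is $(echo injected)' 'line 222' |
    xargs -0 -n 1 sh -c 'printf "%s" "$1" | sed "s/^/got: /"' _)
printf '%s\n' "$out"
```

The $(echo injected) text arrives at sed literally, because the record is data in "$1" and is never parsed as shell code.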
1
  • This answer does not feed each record into standard input of a program--it's not answering the question. (Also, on my sample, I'm not just getting echo--there's some bash injection going on) Commented Oct 23, 2014 at 10:17
