Running multiple instances of perl via xargs

I have a script, dataProcessing.pl, that accepts a tab-delimited .txt file and performs extensive processing on the contained data. There are multiple input files (file1.txt, file2.txt, file3.txt), which are currently looped over by a bash script that invokes perl on each iteration (i.e. the input files are processed one at a time).
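
For reference, the sequential loop currently looks roughly like this (a simplified sketch, assuming dataProcessing.pl takes the file path as its first argument):

for f in file1.txt file2.txt file3.txt; do
    perl dataProcessing.pl "$f"
done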

However, I wish to run multiple instances of Perl (if possible) and process all input files simultaneously via xargs. I'm aware that you can run something akin to:

perl -e 'print "Test" x 100' | xargs -P 100

However, I want to pass a different file to each parallel instance of Perl (one instance works on file1.txt, another works on file2.txt, and so forth). A file handle or file path can be passed to Perl as an argument. How can I do this? I'm not sure how I would pass the file names to xargs, for example.
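
Something like the following is the shape I have in mind, though I'm not sure it's the right approach (assuming dataProcessing.pl takes the file path as its only argument; -n 1 hands each perl instance a single file name, and -P 3 runs up to three instances in parallel):

printf '%s\n' file1.txt file2.txt file3.txt | xargs -n 1 -P 3 perl dataProcessing.pl

Is this the correct way to feed the file names to xargs, or is there a better way?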