I would like to utilize all 48 cores in AWS to run my job. I have 6 million lists to run, and each job runs for less than a second [real 0m0.004s, user 0m0.005s, sys 0m0.000s]. The following execution uses all the cores, but NOT at 100%:
gnu_parallel -a list.lst --load 100% --joblog process.log sh job_run.sh {} >>score.out
job_run.sh
#!/bin/bash
# Job id passed in by parallel, e.g. one line from list.lst
i=$1
TMP_DIR=/home/ubuntu/test/$i
mkdir -p "$TMP_DIR"
cd "$TMP_DIR/" || exit 1
# Pull the 2nd and 3rd hyphen-separated fields out of the job id
m=$(echo "$i" | awk -F '-' '{print $2}')
n=$(echo "$i" | awk -F '-' '{print $3}')
cp "/home/ubuntu/aligned/$m" "$TMP_DIR/"
cp "/home/ubuntu/aligned/$n" "$TMP_DIR/"
printf '%s ' "$i"
/home/ubuntu/test/prog -s1 "$m" -s2 "$n" | grep 'GA'
cd "$TMP_DIR/../" || exit 1
rm -rf "$TMP_DIR"
exit 0
Comments:

/home/ubuntu/test/prog: How are we supposed to know how to speed that up?

--load 100%: AFAIK that is meant as a throttle to potentially slow things down, rather than a target to speed things up to. By default it will use all cores fully anyway. Get rid of the 2 awk processes too, and use bash Parameter Substitution instead.
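A minimal sketch of what that last comment suggests: replacing the two awk subprocesses with pure-bash parameter expansion. The job-id format here (a hyphen-separated triple like pair-seq1-seq2) is an assumption inferred from the awk -F '-' '{print $2}' / '{print $3}' calls in the script; the actual ids in list.lst may differ.

```shell
#!/bin/bash
# Hypothetical job id; assumed to be three hyphen-separated fields,
# matching the awk -F '-' usage in job_run.sh.
i="pair-seq1-seq2"

rest=${i#*-}    # strip up to and including the first '-'  -> "seq1-seq2"
m=${rest%-*}    # drop the last '-' and what follows        -> "seq1" (field 2)
n=${rest#*-}    # drop up to and including the next '-'     -> "seq2" (field 3)

echo "$m $n"    # prints: seq1 seq2
```

This spawns no extra processes per job, which matters when each job only runs for a few milliseconds: with 6 million jobs, two awk forks per job is 12 million process launches of pure overhead.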