I would like to use the following routine in my submission script for GAMESS calculations. I am not entirely sure whether this is the best way to go.
The function relies on one of my messaging routines, message, to work properly. Here I have put echo in front of each call, since printing is essentially all it does anyway. The routine is taken from an earlier review of mine. I am not posting the whole script because of its length and because it is not finished yet. To avoid rewriting it later, I hope to get some insight here. I need to write similar routines for other keywords of the program, so I would like to use this one as a template.
The function parses an input file for the GAMESS program and, if a $SYSTEM group is present, replaces part of it with values supplied through the script, so that the input matches the request that is later passed to the queueing system. I'll include one input file in this post.
What I am most unsure about is that I am calling tr on more than one occasion, and whether there is a better way to extract the command group(s) in the first place. Note that a group can occur multiple times throughout the input.
#!/bin/bash
# Values obtained from main script
file=$1
mem=$2
#
# Parsing the specified input file
#
modifySystemGroup ()
{
    # $1 is the input file to parse
    # $2 is the requested memory in mwords
    # The result is stored in the global variable writeSystemGroup
    local readSystemGroup
    local truncSystemGroup
    # Extract every $SYSTEM ... $END group and delete newlines
    # (\+ is a GNU sed extension)
    readSystemGroup=$(sed -n '/[[:space:]]\+\$[sS][yY][sS][tT][eE][mM][ ]*/,/\$[eE][nN][dD]/p' "$1" \
        | tr -d '\r\n')
    if [ -z "$readSystemGroup" ]; then
        echo message "No \$SYSTEM group detected. Default will be added."
    else
        echo read : \'"$readSystemGroup"\'
        # Delete any $SYSTEM and $END words
        truncSystemGroup=${readSystemGroup//\$[sS][yY][sS][tT][eE][mM]/}
        truncSystemGroup=${truncSystemGroup//\$[eE][nN][dD]/}
        # Delete any MWORDS statement since it will be replaced by script values
        if [[ $truncSystemGroup =~ [mM][wW][oO][rR][dD][sS]=[[:digit:]]*[[:space:]] ]]; then
            # Quoted so that the matched text is taken literally, not as a pattern
            truncSystemGroup=${truncSystemGroup//"${BASH_REMATCH[0]}"/}
        fi
    fi
    # Squeeze runs of whitespace into a single character
    writeSystemGroup=$(echo " \$SYSTEM MWORDS=$2 $truncSystemGroup \$END" | tr -s '[:space:]')
    echo message "Applied 'MWORDS=$2' to the input file."
}
modifySystemGroup "$file" "$mem"
echo write: \'"$writeSystemGroup"\'
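As a possible alternative to the sed and tr pipeline, the extraction could be done in a single awk pass. The following is only a minimal sketch: the function name extractSystemBody is hypothetical, it assumes a POSIX awk, and it prints the contents of all $SYSTEM groups on one line without the $SYSTEM and $END words.

```bash
#!/bin/bash
# Hypothetical helper: print the contents of every $SYSTEM ... $END group
# found in file $1 on a single line, separated by single blanks.
extractSystemBody ()
{
    awk '
    {
        sub(/\r$/, "")                  # tolerate DOS line endings
        rest = $0
        while (rest != "") {
            upper = toupper(rest)       # case-insensitive keyword search
            if (!inGroup) {
                # keyword must follow a blank or tab, as in the sed pattern
                pos = match(upper, /[ \t]\$SYSTEM/)
                if (!pos) break
                rest = substr(rest, pos + RLENGTH)
                inGroup = 1
            } else {
                pos = index(upper, "$END")
                if (!pos) { body = body " " rest; break }
                body = body " " substr(rest, 1, pos - 1)
                rest = substr(rest, pos + 4)
                inGroup = 0
            }
        }
    }
    END {
        n = split(body, word)           # default splitting squeezes whitespace
        out = ""
        for (i = 1; i <= n; i++) out = out (i > 1 ? " " : "") word[i]
        print out
    }' "$1"
}
```

Because the loop keeps consuming the rest of each line, a group that opens and closes on the same line, or several groups spread over the file, should be handled by the same code path. An input without any $SYSTEM group gives an empty line.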
Sample input file (It is very important that the $... keywords are indented by at least one space; GAMESS would ignore them otherwise.)
 $CONTRL
 EXETYP=CHECK
 $END
 $SYSTEM MWORDS=100
 $END
 $CONTRL
 SCFTYP=RHF
 RUNTYP=OPTIMIZE
 $END
 $SYSTEM Parall=.t.
 $END
 $GUESS GUESS=HUCKEL $END
 $DATA
Water
C1
HYDROGEN 1.0 -0.754909 0.000000 0.563845
S 3
1 13.0107010 0.19682158E-01
2 1.9622572 0.13796524
3 0.44453796 0.47831935
S 1
1 0.12194962 1.0000000
P 1
1 0.8000000 1.0000000
HYDROGEN 1.0 0.754909 0.000000 0.563845
S 3
1 13.0107010 0.19682158E-01
2 1.9622572 0.13796524
3 0.44453796 0.47831935
S 1
1 0.12194962 1.0000000
P 1
1 0.8000000 1.0000000
OXYGEN 8.0 0.000000 0.000000 -0.087201
S 5
1 2266.1767785 -0.53431809926E-02
2 340.87010191 -0.39890039230E-01
3 77.363135167 -0.17853911985
4 21.479644940 -0.46427684959
5 6.6589433124 -0.44309745172
S 1
1 0.80975975668 1.0000000
S 1
1 0.25530772234 1.0000000
P 3
1 17.721504317 0.43394573193E-01
2 3.8635505440 0.23094120765
3 1.0480920883 0.51375311064
P 1
1 0.27641544411 1.0000000
D 1
1 1.2000000 1.0000000
 $END
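Because GAMESS ignores group keywords that start in the first column, it may be worth checking an input for this before submitting it. A minimal sketch, with the hypothetical helper name findUnindentedGroups, assuming that a group keyword is a $ followed by a letter:

```bash
#!/bin/bash
# Hypothetical helper: list group keywords in file $1 that start in column 1.
# Exit status is 0 if at least one such line exists, 1 if none was found.
findUnindentedGroups ()
{
    # -n prefixes every hit with its line number
    grep -n '^\$[[:alpha:]]' "$1"
}
```

In a submission script this could be used as a guard, for example by refusing to submit when the function returns 0.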
Here are alternative headers (the part up to the empty line) that must all result in the same line being returned from the function:
$CONTRL
EXETYP=CHECK
$END
$SYSTEM
MWORDS=100
$END
$CONTRL
SCFTYP=RHF
RUNTYP=OPTIMIZE
$END
$CONTRL
EXETYP=CHECK
$END
$SYSTEM
MWORDS=100 Parall=.t.
$END
$CONTRL
SCFTYP=RHF
RUNTYP=OPTIMIZE
$END
$CONTRL
EXETYP=CHECK
$END
$SYSTEM
MWORDS=100 Parall=.t. $END
$CONTRL
SCFTYP=RHF
RUNTYP=OPTIMIZE
$END
$CONTRL
EXETYP=CHECK
$END
$SYSTEM
MWORDS=100 Parall=.t. $END
$CONTRL
SCFTYP=RHF
RUNTYP=OPTIMIZE
$END
$CONTRL
EXETYP=CHECK
$END
$SYSTEM
MWORDS=100
Parall=.t.
$END
$CONTRL
SCFTYP=RHF
RUNTYP=OPTIMIZE
$END
There could be many more cases that split the command group into different statements, add other keywords, span multiple lines, and more that I cannot think of.
For my purposes, MWORDS=... needs to be inserted even when the statement is missing; hence the following must also produce the same result as the other examples:
 $CONTRL
 EXETYP=CHECK
 $END
 $CONTRL
 SCFTYP=RHF
 RUNTYP=OPTIMIZE
 $END
 $SYSTEM Parall=.t.
 $END
Expected output
The output of the function itself must be the same in every case. The examples obviously differ in what is read from the file, so echo read : \'"$readSystemGroup"\' prints different things, but this is just a control statement and can be ignored. I will probably delete it in the final version anyway to keep the log file cleaner.
The important information is what is stored in $writeSystemGroup, which must be the same in every case.
Currently the above routine produces this, given $mem=500:
read : ' ... '
message Applied 'MWORDS=500' to the input file.
write: ' $SYSTEM MWORDS=500 Parall=.t. $END'
And that is the behaviour I desire.
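To check that every header variant really yields the same line, a small harness could loop over the test inputs. This is a sketch only: checkVariants is a hypothetical name, and it assumes a command that prints the new group line to standard output when called with the file and the memory value, which differs from the routine above, where the line is stored in $writeSystemGroup.

```bash
#!/bin/bash
# Hypothetical harness.
# $1 = command that prints the new group line when run as: cmd file mem
# $2 = memory in mwords
# $3 = expected line
# remaining arguments = input files to test
checkVariants ()
{
    local cmd="$1" mem="$2" expected="$3" file actual failures=0
    shift 3
    for file in "$@"; do
        actual=$("$cmd" "$file" "$mem")
        if [ "$actual" != "$expected" ]; then
            echo "MISMATCH in $file: '$actual'" >&2
            failures=$((failures + 1))
        fi
    done
    # Exit status is 0 only if every file produced the expected line
    [ "$failures" -eq 0 ]
}
```

With one file per header variant, a single call reports every input whose result deviates from the expected line.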