I am working on a somewhat complicated shell script for the first time. Below is what it should do:
- During startup, it figures out what my `clientid` is by looking at the `host-mapping.txt` file. If I cannot find a `clientid` for my hostname, then I need to exit from the shell script with a non-zero status code and log an error message.
- Once I have a valid `clientid`, I will extract the `primary files` from the `primary-mappings.txt` file and the `secondary files` from the `secondary-mappings.txt` file for that `clientid`. If for whatever reason I cannot find either primary or secondary files for that `clientid` in those files, then I will exit from the shell script and log an error message.
- Once I have valid primary and secondary files for that `clientid`, I will start copying those files in parallel using `gnu-parallel` from `local_server`. All primary files will go to the `primary` folder and all secondary files will go to the `secondary` folder. If the files are not in the `hold1` folder on the remote servers, then they should be in the `hold2` folder.
- Once all the files are copied, I will verify at the end that all the primary and secondary files for that `clientid` are present in those two folders. If for whatever reason I cannot find those files, then I want to exit from the shell script with a message that tells me which files are missing.
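The last step, reporting every missing file before giving up, could be sketched roughly as follows. This is a minimal sketch under assumptions: `report_missing` is a hypothetical helper name, the file name pattern `hello_monthly_<num>_999_1.data` is the one used in the script, and the demo runs against a throwaway temporary directory instead of the real folders.

```shell
#!/bin/bash
# Hypothetical helper: report every expected file that is absent from a folder.
# Usage: report_missing <dir> <num>...
# Prints one line per missing file to stderr and returns 1 if any are missing.
report_missing() {
    local dir=$1 num name missing=0
    shift
    for num in "$@"; do
        name="hello_monthly_${num}_999_1.data"
        if [ ! -f "$dir/$name" ]; then
            echo "$name not found in $dir" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Demo in a throwaway directory: two of the three expected files exist.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
touch "$demo_dir/hello_monthly_343_999_1.data" "$demo_dir/hello_monthly_0_999_1.data"

if report_missing "$demo_dir" 343 0 686; then
    echo "all files present"
else
    echo "verification failed"   # a real script would exit 1 here
fi
```

Because the loop keeps going after the first miss, the error output lists all missing files rather than only the first one.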
Below is my script. It does the job, but since this is my first time writing a somewhat complicated script, I would like to know whether there is a better or more efficient way to do the above. As of now, I don't have a mechanism to exit the shell script if I cannot find primary or secondary files for that clientid, and I also don't have a mechanism to exit the shell script if some files are missing during the verification phase.
#!/bin/bash
path=/home/goldy/scripts
mapfiles=(primary-mappings.txt secondary-mappings.txt)
hostfile=host-mapping.txt
machines=(machine1769.abc.host.com proctek5461.def.host.com letyrs87541.pqr.host.com)
# folders on local box where to copy files
primary=/data01/primary
secondary=/data02/secondary
# folders on remote servers from where to copy files
export hold1=/data/snapshot/$1
export hold2=/data/snapshot/$2
date1=$(date +"%s")
# this will tell me what's my clientid given my current hostname
getProperty () {
prop_value=$(hostname -f)
# -F matches the hostname as a fixed string, so its dots are not regex wildcards
prop_key=$(grep -F -- "$prop_value" "$path/$hostfile" | cut -d'=' -f1)
echo "$prop_key" | tr -dc '0-9'
}
# if I can't find clientid for my hostname, then I will log a message
# and exit out of shell script with non zero status code
clientid=$(getProperty)
[ -z "$clientid" ] && { echo "cannot find clientid for $(hostname -f)" >&2; exit 1; }
# now once I have valid clientid, then I will get primary and secondary mapping
# from the "primary-mappings.txt" and "secondary-mappings.txt" files
declare -a arr
mappingsByClientID () {
id=$1 # 1 to 5
file=$path/${mapfiles[$2]} # 0 to 1
arr=($(sed -r "s/.*\b${id}=\[([^]\]+).*/\1/; s/,/ /g" $file))
echo "${arr[@]}"
}
# assign output of function to an array
pri=($(mappingsByClientID $clientid 0))
snd=($(mappingsByClientID $clientid 1))
echo "primary files: ${pri[@]}"
echo "secondary files: ${snd[@]}"
# figure out which machine you want to use to start copying files from
case $(hostname -f) in
*abc.host.com)
local_server=("${machines[0]}")
;;
*def.host.com)
local_server=("${machines[1]}")
;;
*pqr.host.com)
local_server=("${machines[2]}")
;;
*) echo "unknown host: $(hostname -f), exiting." >&2; exit 1 ;;
esac
export local="$local_server"
# deleting files before we start copying
find "$primary" -maxdepth 1 -type f -exec rm -fv {} \;
find "$secondary" -maxdepth 1 -type f -exec rm -fv {} \;
do_copy() {
el=$1
primsec=$2
(scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold1"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1) || (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)
}
export -f do_copy
# copy files in parallel
parallel -j "$3" do_copy {} "$primary" ::: "${pri[@]}" &
parallel -j "$3" do_copy {} "$secondary" ::: "${snd[@]}" &
wait
echo "all files copied"
# this is for verification to see all files got copied or not
# in primary and secondary folder
set -- "$primary" "$secondary"
typeset -n array
for array in pri snd; do
for num in "${array[@]}"; do
name="hello_monthly_${num}_999_1.data"
if [ ! -f "$1/$name" ]; then
{ echo "$name" not found in "$1" >&2 && exit 1; }
fi
done
shift
done
date2=$(date +"%s")
diff=$((date2 - date1))
echo "Total Time Taken - $((diff / 3600)) hours and $(((diff / 60) % 60)) minutes and $((diff % 60)) seconds elapsed."
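One gap in the script above is that a bare `wait` returns 0 regardless of how the background jobs ended, so a failed copy goes unnoticed until verification. A minimal sketch of collecting each job's status instead is shown below. The functions `copy_primary` and `copy_secondary` are hypothetical stand-ins for the two `parallel ...` invocations, and one of them fails on purpose for the demo.

```shell
#!/bin/bash
# Stand-ins for the two background "parallel ..." invocations (assumption:
# any command could go here; these simply succeed or fail on purpose).
copy_primary()   { sleep 0.1; return 0; }
copy_secondary() { sleep 0.1; return 3; }

copy_primary &
pid_primary=$!
copy_secondary &
pid_secondary=$!

# "wait <pid>" returns that job's exit status, unlike a bare "wait".
status_primary=0
status_secondary=0
wait "$pid_primary"   || status_primary=$?
wait "$pid_secondary" || status_secondary=$?

if [ "$status_primary" -ne 0 ] || [ "$status_secondary" -ne 0 ]; then
    echo "copy failed: primary=$status_primary secondary=$status_secondary" >&2
    # a real script would exit 1 here
fi
```

As far as I know, GNU parallel itself exits with a non-zero status when any of its jobs fail, so the same pattern should carry over to the real commands, but that is worth confirming against its manual.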
Below is my host-mapping.txt file; the real one will have many more entries. Each value is a valid hostname and each key is the string "k" followed by a number, and that number should also be present in the mapping files.
k1=machineA.abc.com
k2=machineB.abc.com
k3=machineC.def.com
k4=machineD.pqr.com
k5=machineO.abc.com
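Given this `key=hostname` layout, the clientid lookup could also be done with an exact comparison on the hostname field, which avoids substring matches (for example `machineA.abc.com` also matching a longer hostname that contains it). The following is only a sketch: `client_id_for_host` is a hypothetical helper name, and the demo writes a few of the sample lines to a temporary file.

```shell
#!/bin/bash
# Hypothetical helper: print the numeric clientid for an exact hostname match.
# Returns 1 when the hostname is not listed in the mapping file.
client_id_for_host() {
    local host=$1 file=$2
    awk -F= -v h="$host" '
        $2 == h { sub(/^k/, "", $1); print $1; found = 1; exit }
        END     { exit !found }
    ' "$file"
}

# Demo data: a few lines copied from the sample host-mapping.txt.
hosts=$(mktemp)
trap 'rm -f "$hosts"' EXIT
printf '%s\n' 'k1=machineA.abc.com' 'k2=machineB.abc.com' 'k3=machineC.def.com' > "$hosts"

if clientid=$(client_id_for_host machineC.def.com "$hosts"); then
    echo "clientid: $clientid"
else
    # a real script would exit 1 here
    echo "cannot find clientid for machineC.def.com" >&2
fi
```

The function's exit status tells the caller whether the host was found, so the "log and exit" decision can stay in one place in the main script.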
And below are my sample mapping files:
primary-mappings.txt
{1=[343, 0, 686, 1372, 882, 196], 2=[687, 1, 1373, 883, 197, 736, 1030, 1569], 3=[1374, 2, 884, 737, 198, 1570], 4=[1375, 1032, 1424, 3, 885, 1228], 5=[1033, 1425, 4, 200, 886]}
secondary-mappings.txt
{1=[1152, 816, 1488, 336, 1008], 2=[1153, 0, 817, 337, 1489, 1009, 1297], 3=[1, 1154, 1490, 338], 4=[1155, 2, 339, 1491, 819, 1299, 1635], 5=[820, 1492, 340, 3, 1156]}
For example, clientid 1 has primary files 343, 0, 686, 1372, 882, 196 and secondary files 1152, 816, 1488, 336, 1008. The same applies to the other clientids.
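The lookup of one clientid in a mapping line like the samples above could be sketched as follows. This is a sketch under assumptions: `files_for_client` is a hypothetical helper name, the whole map is assumed to sit on a single line, and the demo writes a shortened copy of the sample data to a temporary file. A plain `sed` substitution still prints the line (with commas turned into spaces) when the id is absent, whereas this helper returns a non-zero status so the caller can log an error and exit.

```shell
#!/bin/bash
# Hypothetical helper: print the file numbers for one clientid, one per line.
# Returns 1 when the clientid has no entry (or an empty list) in the file.
files_for_client() {
    local id=$1 file=$2 list
    # Require "{" or a space before the id so that id 1 does not match "11=[".
    list=$(grep -oE "(^|[{ ])${id}=\[[^]]*\]" "$file" | head -n 1)
    list=${list#*\[}     # drop everything up to and including "["
    list=${list%\]}      # drop the trailing "]"
    [ -n "${list//[, ]/}" ] || return 1
    tr -s ', ' '\n' <<< "$list"
}

# Demo data: a shortened copy of the sample primary mapping.
demo=$(mktemp)
trap 'rm -f "$demo"' EXIT
echo '{1=[343, 0, 686, 1372, 882, 196], 2=[687, 1, 1373, 883, 197, 736, 1030, 1569]}' > "$demo"

if pri_list=$(files_for_client 1 "$demo"); then
    mapfile -t pri <<< "$pri_list"
    echo "primary files: ${pri[*]}"
else
    # a real script would exit 1 here
    echo "no primary files for clientid 1" >&2
fi
```

The same call with the secondary mapping file would fill the second array, and a failed lookup in either file gives the script a clear point at which to stop.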