Revisions to How can I create a multidimensional array, or something similar, with bash?

semicolon -> comma; srsly

edited Apr 3, 2023 at 15:01

There are of course situations in which a shell script is simpler/more practical than a Python or Perl ... script, such as when a script serves as "glue" between an input source and subsequent processing.

OP didn't mention what they wanted to do with the data, so the following is a summary of a few possibilities for dealing with key-value data in a shell script. I've also assumed that the fields in the input are comma-separated (as would be the case with an en_US-locale *.csv, for example).

Possibility #1: linear processing with an array per line from an input file

# generic shell script
rownum=0
while IFS=',' read -r loca locb ccode ncode gend; do
    # do something with fields 'loca',... etc per row
    process_row "$rownum" "$loca" "$locb" "$ccode" "$ncode" "$gend" || break
    rownum=$(( rownum + 1 ))
done < sourcefile.csv
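
To make the loop above concrete, here is a minimal sketch with a stand-in `process_row`. The function body, the "empty country code stops the loop" rule and the two sample records are all made up for illustration; the original only says that `process_row` does "something" with the fields.

```shell
#!/bin/sh
# Hypothetical process_row: formats one record per line and returns
# non-zero (which ends the loop via '|| break') if field 4 is empty.
process_row() {
    [ -n "$4" ] || return 1
    printf 'row %s: %s -> %s (%s/%s, %s)\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

rownum=0
# The here-document stands in for sourcefile.csv; the values are invented.
output=$(
    while IFS=',' read -r loca locb ccode ncode gend; do
        process_row "$rownum" "$loca" "$locb" "$ccode" "$ncode" "$gend" || break
        rownum=$(( rownum + 1 ))
    done <<'EOF'
Berlin,Paris,DE,49,f
Lima,Quito,PE,51,m
EOF
)
printf '%s\n' "$output"
```

Because `IFS=','` is set only for the `read` command, the rest of the script keeps its normal field splitting.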

Possibility #2: linear processing using the positional element list, with e.g. input from a function/program

# generic shell script
oifs="$IFS"; newline="
"; IFS="$newline"
for var in $(your_input_source_command_that_emits_csv); do
    IFS=','; set -f; set -- $var; set +f; IFS="$oifs"
    process_row "$@" || break
    IFS="$newline"
done
IFS="$oifs"
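
A small sketch of the same pattern with the placeholders filled in. `emit_csv` and this `process_row` are hypothetical stand-ins (not part of the original), and one record deliberately contains a `*` field to show that `set -f` keeps `set -- $var` from glob-expanding it.

```shell
#!/bin/sh
# Hypothetical input source: two CSV records, one with a '*' field.
emit_csv() {
    printf '%s\n' 'a,b,*' 'c,d,e'
}

# Hypothetical consumer: remembers the field count and third field per row.
seen=''
process_row() {
    seen="${seen}${seen:+ }$#:$3"
}

oifs="$IFS"; newline="
"; IFS="$newline"
for var in $(emit_csv); do
    # Split the record on commas into $1, $2, ... with globbing disabled.
    IFS=','; set -f; set -- $var; set +f; IFS="$oifs"
    process_row "$@" || break
    IFS="$newline"
done
IFS="$oifs"
printf '%s\n' "$seen"
```

Note that the loop overwrites the script's own positional parameters, so save them first if they are still needed.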

Possibility #3: load data into a multidimensional array for subsequent non-linear processing

# can be done with ksh, zsh or even, uh, bash.
declare -A arry=()  # inside a function, use 'local -A arry=()' instead
rownum=0; # <= first dimension is rownum
while read -r lin; do
    for key in 'loca' 'locb' 'ccode' 'ncode' 'gend'; do
        arry["$rownum.$key"]="${lin%%,*}"; lin="${lin#*,}"
    done
    rownum=$(( rownum + 1 ))
done < fsource # or < <(function_or_program)

Here, the "dimension separator" is a dot, but any other character would do (including $'\n', $'\a', etc.), as long as it isn't a digit and doesn't appear in any dimension name.
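
Once the data is loaded, "non-linear processing" just means building the composite key yourself. The following is a sketch for bash 4+ only (associative arrays), with invented sample records fed through a here-document instead of `fsource`; the variable names `second_ccode` and `codes` are mine.

```shell
#!/usr/bin/env bash
# Assumes bash 4 or later; the sample data below is made up.
declare -A arry=()
rownum=0
while read -r lin; do
    for key in loca locb ccode ncode gend; do
        # Take everything before the first comma, then drop it from the line.
        arry["$rownum.$key"]="${lin%%,*}"; lin="${lin#*,}"
    done
    rownum=$(( rownum + 1 ))
done <<'EOF'
Berlin,Paris,DE,49,f
Lima,Quito,PE,51,m
EOF

# Random access by (row, field):
second_ccode="${arry[1.ccode]}"
printf 'row 1 ccode: %s\n' "$second_ccode"

# Walk one "column" across all rows:
codes=''
r=0
while [ "$r" -lt "$rownum" ]; do
    codes="${codes}${arry[$r.ncode]} "
    r=$(( r + 1 ))
done
printf 'ncodes: %s\n' "$codes"
```

Since `rownum` ends up holding the number of rows, it doubles as the loop bound for the first dimension.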

added 7 characters in body

edited Apr 3, 2023 at 13:17

user567692

There are of course situations in which a shell script is simpler/more practical than a Python or Perl ... script, such as when a script serves as "glue" between an input source and subsequent processing.

OP didn't mention what they wanted to do with the data, so the following is a summary of a few of the possibilities of dealing with key-value data in shell script. I've also assumed that the fields in the input are semicolon-separated (as would be the case with a *.csv for example).

Possibility #1: linear processing with an array per line from an input file

# generic shell script
rownum=0
while IFS=';' read -r loca locb ccode ncode gend; do 
    # do something with fields 'loca',... etc per row
    process_row "$rownum" "$loca" "$locb" "$ccode" "$ncode" "$gend" || break
    rownum=$(( rownum + 1 ))
done < sourcefile.csv

Possibility #2: linear processing using the positional element list, with e.g. input from a function/program

# generic shell script
oifs="$IFS"; newline="
"; IFS="$newline"
for var in $(your_input_source_command_that_emits_csv); do
    IFS=';'; set -f; set -- $var; set +f; IFS="$oifs"
    process_row "$@" || break
    IFS="$newline"
done
IFS="$oifs"

Possibility #3: load data into a multidimensional array for subsequent non-linear processing

# can be done with ksh, zsh or even, uh, bash.
declare/local -A arry=()
rownum=0; # <= first dimension is rownum
while read -r lin; do
    for key in 'loca' 'locb' 'ccode' 'ncode' 'gend'; do
        arry["$rownum.$key"]="${lin%%";"*}"; lin="${lin#*";"}"
    done
    rownum=$(( rownum + 1 ))
done < fsource # or < <(function_or_program)

Here, the "dimension separator" is a dot, but any other non-digit char would be ok too. One could use any character (including $'\n',$'\a' etc) that doesn't appear in any dimension name.

answered Apr 3, 2023 at 13:00

user567692

There are of course situations in which a shell script is simpler/more practical than a Python or Perl ... script, such as when a script serves as "glue" between an input source and subsequent processing.

OP didn't mention what they wanted to do with the data, so the following is a summary of a few of the possibilities of dealing with key-value data in shell script. I've also assumed that the fields in the input are semicolon-separated (as would be the case with a *.csv for example).

Possibility #1: linear processing with an array per line from an input file

# generic shell script
rownum=0
while IFS=';' read -r loca locb ccode ncode gend; do 
    # do something with fields 'loca',... etc per row
    process_row "$rownum" "$loca" "$locb" "$ccode" "$ncode" "$gend" || break
    rownum=$(( rownum + 1 ))
done < sourcefile.csv

Possibility #2: linear processing using the positional element list, with e.g. input from a function/program

# generic shell script
oifs="$IFS"; newline="
"; IFS="$newline"
for var in $(your_input_source_command_that_emits_csv); do
    IFS=';'; set -f; set -- $var; set +f; IFS="$oifs"
    process_row "$@" || break
    IFS="$newline"
done
IFS="$oifs"

Possibility #3: load data into a multidimensional array for subsequent non-linear processing

# can be done with ksh, zsh or even, uh, bash.
declare/local -A arry=()
rownum=0; # <= first dimension is rownum
while read -r lin; do
    for 'loca' 'locb' 'ccode' 'ncode' 'gend'; do
        arry["$rownum.$key"]="${lin%%";"*}"; lin="${lin#*";"}"
    done
    rownum=$(( rownum + 1 ))
done < fsource # or < <(function_or_program)

Here, the "dimension separator" is a dot, but any other non-digit char would be ok too. One could use any character (including $'\n',$'\a' etc) that doesn't appear in any dimension name.
