FelixJN

This can be done with awk and a GNU-only feature (!): controlling the order of array traversal via PROCINFO["sorted_in"]. Note: the script holds the whole file in RAM at once, but you said "over 100 volumes", so I assume the file is not incredibly large.

The idea is:

  1. separate records by empty lines (two newlines in a row, assuming the "empty" line contains no TAB)
  2. use parentheses as field separators and store each record in an array, indexed by its volume number. Since the field reads "volume X", the "volume " prefix has to be stripped with sub to leave just the number
  3. sort the output by that volume-number index
  4. replace the leading number (293G etc.) of each entry with a running counter, in sorted order
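
As a minimal sketch of steps 1 and 2 in isolation, the snippet below runs only the record and field splitting on made-up input. The sample records (a TAB, a size such as 293G, then "(volume N)", followed by a detail line) are an assumption about the file layout, not taken from the question. It needs no GNU extension, since nothing is sorted yet.

```shell
# Hypothetical sample: two blank-line-separated records, each with "(volume N)" on its first line.
out=$(printf '\t293G\t(volume 12)\n\tsome detail\n\n\t294G\t(volume 3)\n\tother detail\n' |
awk 'BEGIN { RS="" ; FS="[()]" }                      # paragraph mode; fields split at ( and )
     { id=$2 ; sub(/volume /,"",id)                   # $2 is "volume N"; keep only N
       print NR ": id=" id }')
printf '%s\n' "$out"
```

With this input, record 1 yields id 12 and record 2 yields id 3, so the array index is purely numeric and can be sorted numerically later.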

Script:

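# Paragraph mode: records are separated by blank lines; fields are split at parentheses
BEGIN { RS="" ; ORS="\n\n" ; FS="[()]" }

# $2 is expected to hold the text inside the parentheses, e.g. "volume 12"; keep only the number
{id=$2 ; sub(/volume /,"",id) ; vol[id]=$0}

END {PROCINFO["sorted_in"]="@ind_num_asc"   # GNU awk only: traverse indices in ascending numeric order
    n=292                                   # first number to assign
    for ( id in vol ) { gsub(/^\t.../,"\t"n++,vol[id]) ; print vol[id] } }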

Run via

awk -f script inputfile
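
For illustration, here is a small end-to-end run of the same script on made-up data. The three sample records are hypothetical, and the sketch calls gawk explicitly on the assumption that GNU awk is installed under that name (PROCINFO["sorted_in"] has no effect in other awk implementations).

```shell
# Hypothetical input: three records in unsorted volume order, each starting with TAB + 3-digit number.
out=$(printf '\t293G\t(volume 12)\n\n\t294G\t(volume 3)\n\n\t295G\t(volume 7)\n' |
gawk '
BEGIN { RS="" ; ORS="\n\n" ; FS="[()]" }
{ id=$2 ; sub(/volume /,"",id) ; vol[id]=$0 }
END { PROCINFO["sorted_in"]="@ind_num_asc"
    n=292
    for ( id in vol ) { gsub(/^\t.../,"\t"n++,vol[id]) ; print vol[id] } }
')
printf '%s\n' "$out"
```

The records come out in the order volume 3, volume 7, volume 12, and their leading numbers become 292, 293 and 294. Note that the pattern /^\t.../ replaces exactly three characters after the TAB, so it relies on the original numbers being three digits wide.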