edited Apr 10, 2018 at 10:54

A simple awk one-liner solves your example:

awk '/^Entry/{k=$0;next}{g[k]=g[k]"\n"$0}END{for(k in g)print k g[k]}' file1 file2

As you probably know, awk processes its input line by line according to a program. Here the program is given as the first argument. It builds an array named g, indexed by the entry titles, appends every subsequent line to the value stored under the current title, and prints the whole array at the end. It consists of three statements. Let’s analyze them one by one:

/^Entry/{k=$0;next} means: if the current line matches /^Entry/, store it in the variable k, skip the remaining statements, and move on to the next input line.

{g[k]=g[k]"\n"$0} has no preceding condition, so it runs for every line that reaches it, that is, every line that is not a title. It means: update the value stored in the array g under the key k. The new value is the concatenation of the (possibly empty) previous value g[k], a newline "\n", and the current line.
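To see why the newline goes in front of each appended line, here is a minimal sketch that runs the same concatenation by hand. The title "Entry 1" and the two body lines are made up for illustration.

```shell
# Build one value the same way the second statement does, without any input files.
out=$(awk 'BEGIN {
  k = "Entry 1"              # hypothetical title
  g[k] = g[k] "\n" "first"   # g[k] was empty, so the value is now "\nfirst"
  g[k] = g[k] "\n" "second"  # now "\nfirst\nsecond"
  print k g[k]               # title, then each collected line on its own row
}')
printf '%s\n' "$out"
```

Because every stored value starts with "\n", printing the key immediately followed by the value puts the title on the first row and each collected line below it.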

END{for(k in g)print k g[k]} has an END condition and is therefore executed once all input lines have been processed. It says: for each key in g, that is, for each title that has appeared in the input files, print the title followed by the associated value, which is the concatenation of all lines found in the input files under that title.
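The three statements can also be written out one per line with comments. The sketch below is a variation, not the one-liner itself: awk does not specify the order in which for (k in g) visits the keys, so I have added two helper arrays, seen and order (my own names), to print titles in the order they first appear. Unlike the one-liner, this variant also prints a title that has no lines under it. The sample files are hypothetical.

```shell
# Hypothetical sample inputs, made up for illustration
tmp=$(mktemp -d)
printf 'Entry A\nfoo\nEntry B\nbar\n' > "$tmp/file1"
printf 'Entry A\nbaz\nEntry C\nqux\n' > "$tmp/file2"

merged=$(awk '
  # Statement 1: a title line becomes the current key, then skip to the next line.
  # Added for this sketch: remember each title the first time it is seen.
  /^Entry/ { k = $0; if (!seen[k]++) order[++n] = k; next }

  # Statement 2: any other line is appended to the text stored under the current key.
  { g[k] = g[k] "\n" $0 }

  # Statement 3: after the last input line, print each title followed by its lines.
  END { for (i = 1; i <= n; i++) print order[i] g[order[i]] }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$merged"
rm -rf "$tmp"
```

With these sample files, "Entry A" is printed once, followed by foo and baz, then "Entry B" with bar, then "Entry C" with qux.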

To use it on your real data, you have to replace /^Entry/ with the correct pattern (probably /^\$\$\$/).
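As a quick check of the escaped pattern, here is a sketch that assumes the real titles start with three literal dollar signs. That is only a guess about your data, and the sample files are made up. Each $ has to be escaped in the regex, because an unescaped $ matches the end of the line.

```shell
# Hypothetical inputs whose title lines start with a literal "$$$"
tmp2=$(mktemp -d)
printf '$$$ alpha\none\n' > "$tmp2/file1"
printf '$$$ alpha\ntwo\n' > "$tmp2/file2"

# Same one-liner, with /^Entry/ replaced by the escaped pattern
out2=$(awk '/^\$\$\$/{k=$0;next}{g[k]=g[k]"\n"$0}END{for(k in g)print k g[k]}' "$tmp2/file1" "$tmp2/file2")

printf '%s\n' "$out2"
rm -rf "$tmp2"
```

Both files contribute to the single "$$$ alpha" entry, so the output is the title followed by one and two.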

answered Apr 10, 2018 at 9:14

Dario
