added 827 characters in body
Source Link
Ed Morton
  • 35.8k
  • 6
  • 25
  • 60

Using GNU awk for multi-char RS and \s shorthand:

$ awk 'BEGIN{RS=ORS="\";\n"; FS="\\s*\n\\s*"} /^pattern2/{$1=$1} 1' file
pattern2 "xxx xxxxxx xxxxxxxx";
pattern2 "xxxx xxxxxxx xxxxxxxxx xxxxxxxxx";
pattern2 "xxxx xxxxxxx xxxxxxxxx xxxxxxxxx yyyy yyyyyy yy yyyyyyyyyy yyyyyyy";
pattern3
"xxx xxxxxx xxxxxxxx
xxx xxxxxx xxxxxxxx";
pattern2 "xxx xxxxxx xxxxxxxx";

Note that the above produces the expected output from the question, while the current sed answers do not: their third output line would start with pattern2<blank><blank>"xxxx instead of pattern2<blank>"xxxx. It also works even if a quoted string contains a ; at the end of a line, which the sed answers would fail on. For example, given this input (note the ; at the end of the 5th line, inside a quoted string):

$ cat file
pattern2
"xxx xxxxxx xxxxxxxx";
pattern2 "xxxx xxxxxxx xxxxxxxxx xxxxxxxxx";
pattern2
"xxxx xxxxxxx xxxxxxxxx xxxxxxxxx;
yyyy yyyyyy yy yyyyyyyyyy yyyyyyy";
pattern3
"xxx xxxxxx xxxxxxxx
xxx xxxxxx xxxxxxxx";
pattern2
"xxx xxxxxx xxxxxxxx";

$ awk 'BEGIN{RS=ORS="\";\n"; FS="\\s*\n\\s*"} /^pattern2/{$1=$1} 1' file
pattern2 "xxx xxxxxx xxxxxxxx";
pattern2 "xxxx xxxxxxx xxxxxxxxx xxxxxxxxx";
pattern2 "xxxx xxxxxxx xxxxxxxxx xxxxxxxxx; yyyy yyyyyy yy yyyyyyyyyy yyyyyyy";
pattern3
"xxx xxxxxx xxxxxxxx
xxx xxxxxx xxxxxxxx";
pattern2 "xxx xxxxxx xxxxxxxx";
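As a rough illustration of why this works, here is the same idea spread over several lines with comments. It is only a sketch under a few assumptions: the temporary file and the `input`/`result` variable names are hypothetical, and `\s` has been swapped for `[ \t]` so it does not depend on the GNU-only shorthand (a multi-character RS is still needed, which POSIX does not guarantee).

```shell
#!/bin/sh
# Hypothetical sample input, copied from the example above.
input=$(mktemp)
cat > "$input" <<'EOF'
pattern2
"xxx xxxxxx xxxxxxxx";
pattern2 "xxxx xxxxxxx xxxxxxxxx xxxxxxxxx";
pattern2
"xxxx xxxxxxx xxxxxxxxx xxxxxxxxx;
yyyy yyyyyy yy yyyyyyyyyy yyyyyyy";
pattern3
"xxx xxxxxx xxxxxxxx
xxx xxxxxx xxxxxxxx";
pattern2
"xxx xxxxxx xxxxxxxx";
EOF

result=$(awk '
BEGIN {
    # A record ends at a closing quote followed by ";" and a newline,
    # so a bare ";" at the end of a line inside the quotes does not split it.
    # ORS puts the same terminator back on output.
    RS = ORS = "\";\n"
    # Fields are separated by a newline plus any blanks around it.
    # (Assumption: [ \t] stands in for the \s used in the answer.)
    FS = "[ \t]*\n[ \t]*"
}
# Assigning to a field rebuilds the record, joining the fields with
# OFS (a single space by default). Only pattern2 records are rebuilt.
/^pattern2/ { $1 = $1 }
# Print every record, rebuilt or not.
1
' "$input")
rm -f "$input"

printf '%s\n' "$result"
```

Because the newline and its surrounding blanks are consumed together as one field separator, each joined pattern2 record gets exactly one space at the join, and the pattern3 record passes through unchanged.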

deleted 2 characters in body
Ed Morton

Using GNU awk for multi-char RS and \s shorthand:

$ awk 'BEGIN{RS=ORS=";\n"; FS="\\s*\n\\s*"} /^pattern2/{$1=$1} 1' file
pattern2 "xxx xxxxxx xxxxxxxx";
pattern2 "xxxx xxxxxxx xxxxxxxxx xxxxxxxxx";
pattern2 "xxxx xxxxxxx xxxxxxxxx xxxxxxxxx yyyy yyyyyy yy yyyyyyyyyy yyyyyyy";
pattern3
"xxx xxxxxx xxxxxxxx
xxx xxxxxx xxxxxxxx";
pattern2 "xxx xxxxxx xxxxxxxx";

Note that the above produces the expected output from the question, while the current sed answers do not: their third output line would start with pattern2<blank><blank>"xxxx instead of pattern2<blank>"xxxx.

added 206 characters in body
Ed Morton

Using GNU awk for multi-char RS and \s shorthand:

$ awk 'BEGIN{RS=ORS=";\n"; FS="\\s*\n\\s*"} /^pattern2/{$1=$1} 1' file
pattern2 "xxx xxxxxx xxxxxxxx";
pattern2 "xxxx xxxxxxx xxxxxxxxx xxxxxxxxx";
pattern2 "xxxx xxxxxxx xxxxxxxxx xxxxxxxxx yyyy yyyyyy yy yyyyyyyyyy yyyyyyy";
pattern3
"xxx xxxxxx xxxxxxxx
xxx xxxxxx xxxxxxxx";
pattern2 "xxx xxxxxx xxxxxxxx";

Note that the above produces the expected output from the question, while the 2 current sed answers do not: their third output line would start with pattern2<blank><blank>"xxxx instead of pattern2<blank>"xxxx.
