Timeline for Can a shell script find and replace patterns inside regions that match a regex?
Current License: CC BY-SA 4.0
7 events
| when | what | by | license | comment | |
|---|---|---|---|---|---|
| Jan 6, 2021 at 15:53 | comment | added | Ed Morton | @Quasímodo I see that functionality and bug in the spec is now specifically addressed in the gawk manual (gnu.org/software/gawk/manual/gawk.html#Multiple-Line): When RS is set to the empty string and FS is set to a single character, the newline character always acts as a field separator. This is in addition to whatever field separations result from FS. ... Note that language in the POSIX specification implies that this special feature should apply when FS is a regexp. However, Unix awk has never behaved that way, nor has gawk. This is essentially a bug in POSIX. | |
| Jan 6, 2021 at 15:22 | comment | added | Ed Morton | @Quasímodo Ah, now I remember - I had raised the issue with the gawk providers (see lists.gnu.org/archive/html/bug-gawk/2019-04/msg00029.html) and THEY were going to follow up with the standards folks to get it fixed there. Unfortunately, at that point I lost interest and didn't pursue the standard change; I expect it is in the queue somewhere. | |
| Jan 6, 2021 at 15:14 | comment | added | Ed Morton | Right, and the current version of the POSIX spec is unfortunately wrong where it says a <newline> shall always be a field separator, no matter what the value of FS is. That's not quite true: it should end with "...if FS is a single char", as that's how all awks actually behave. I'm pretty sure I have a bug report open against the spec about that, let me check... | |
| Jan 6, 2021 at 15:13 | comment | added | Quasímodo | Oh, that's it. Quoting verbatim: "The newline shall always be a field separator." So the newline being a field separator does not mean it is the only one. What a tricky wording in the specification! | |
| Jan 6, 2021 at 15:09 | comment | added | Ed Morton | @Quasímodo null RS doesn't cause the field separator to be a newline, it causes it to include a newline. It still also includes blank and tab (or whatever else the FS is set to if it's a single char). | |
| Jan 6, 2021 at 13:13 | comment | added | Quasímodo | Any idea why omitting -F'\n' slightly changes the output even though the specification says a null RS causes the field separator to always be a newline? | |
| Jan 6, 2021 at 1:14 | history | answered | Ed Morton | CC BY-SA 4.0 |