Revisions to sed inplace in selective block

added 6 characters in body

edited Aug 23 at 8:36

With -I{} alone, xargs splits its input on unquoted newlines, removes trailing blanks, and handles some forms of quotes, so it would choke on file paths that contain those. It also means running one sed invocation for each file. You're also not using the {} placeholder in the command.

added 4 characters in body

edited Aug 23 at 8:30

Stéphane Chazelas

-p enables the sed-like mode where files are processed one record at a time and the equivalent of sed's pattern space is the $_ variable (on which s{pattern}{replacement}flags operates by default, like sed's s).
-i is for in-place editing (since copied by some sed implementations).
-0777 changes the record separator from the default of newline (like in sed) to some impossible byte value, so the files are processed as a whole (the slurp mode). Same as -g in newer versions of perl.
Then we have an s{pattern}{replacement}gmxe where:
- x allows adding whitespace (and comments) in the pattern to improve legibility.
- m makes it so that ^ matches at the start of every line in the subject instead of just at the start of the subject.
- e is for the replacement to be evaluated as perl code.
- \s is for any whitespace (well, ASCII only by default) including newline, similar to [[:space:]] in POSIX regexps, and \h is for horizontal whitespace (ISO8859-1 ones by default, that is space, tab, and non-breaking space encoded as 0xA0, but importantly not newline; similar to POSIX's [[:blank:]]).
- ++ is like + but non-backtracking. It can help the matcher not get lost in a backtracking maze if there are unmatched {/}s.
- The important part in there is (?1) which recalls the regexp in the first (...) capture group so allows for recursive regexps.
- The replacement applies another s{pattern}{replacement}gmrx to $& which is what was matched by the first regexp with:
  - r returns the result of the substitution instead of applying it in place to $&
  - \K marks the start of what's to Keep from the match, so we don't discard what's matched by what's to the left of it.

By default grep and sed take Basic Regular Expressions, and Extended Regular Expressions with -E. A few grep implementations (such as GNU grep when built with optional PCRE2 support) can support perl-like regexps with a -P option, but very few sed implementations do.
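
To make the difference between the regexp flavours concrete, here is a small sketch using interval expressions (the sample lines and variable names are invented; the -P check is guarded because not every grep has that option):

```shell
# Two sample lines: one with two consecutive a's, one with literal braces.
sample=$(printf 'aa\na{2}\n')

# BRE (the default): an unescaped { is an ordinary character.
bre_plain=$(printf '%s\n' "$sample" | grep 'a{2}')
# BRE interval: the braces need backslashes.
bre_interval=$(printf '%s\n' "$sample" | grep 'a\{2\}')
# ERE (-E): unescaped braces form an interval.
ere_interval=$(printf '%s\n' "$sample" | grep -E 'a{2}')

printf 'BRE a{2}     selects: %s\n' "$bre_plain"
printf 'BRE a\\{2\\}   selects: %s\n' "$bre_interval"
printf 'ERE a{2}     selects: %s\n' "$ere_interval"

# Perl-like operators such as the possessive ++ need -P where available.
if printf 'aa\n' | grep -qP '^a++$' 2>/dev/null; then
  echo 'this grep supports -P'
else
  echo 'this grep has no working -P'
fi
```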

added 6 characters in body

edited Aug 23 at 6:55

Stéphane Chazelas

Some grep and sed implementations support \s and/or *? with -E though. The latter is now specified by POSIX for extended regular expressions since the 2024 edition, but few implementations support it yet as of 2025.

No grep implementation that I know of has a slurp mode that matches the regexp against the whole file, though pcre2grep comes close with its -M for multiline mode, and GNU grep with its -z to process NUL-delimited records (text files are not meant to contain NULs). By default, like sed, they work on one line at a time. GNU sed also has -z for NUL-delimited records, or you can load the whole input into the pattern space programmatically in sed code with a -e :1 -e '$!{N;b1' -e '}', though beware some seds have a relatively low limit on the size of their pattern space.
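
As a minimal sketch of that sed slurp loop (the sample text and the final s command are invented for the demo), the first three -e fragments keep appending lines with N until the last one, so the substitution that follows can match across a newline:

```shell
# :1 defines a label; on every line but the last ($!), N appends the next
# input line to the pattern space and b1 jumps back to the label.
# Once the last line is reached, the whole input sits in the pattern space.
result=$(
  printf 'start\nmiddle\nend\n' |
    sed -e :1 -e '$!{N;b1' -e '}' -e 's/start\nmiddle/joined/'
)
printf '%s\n' "$result"
```

With the whole input in the pattern space, \n in the regexp matches the newline between the first two lines, so they are replaced by the single word joined and the end line is left alone.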

With -I{}, xargs splits its input on unquoted newlines, removes trailing blanks, and handles some forms of quotes, so it would choke on file paths that contain those. It also means running one sed invocation for each file. You're also not using the {} placeholder in the command.

It should be grep -rlZ ... | xargs -r0 sed ... (-Z like -r being a non-standard GNU extension).
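
A small sketch of that NUL-delimited pipeline, assuming GNU grep, xargs and sed, plus mktemp (the directory, file names and the foo/baz substitution are all hypothetical):

```shell
# Scratch directory with hypothetical file names, including awkward ones.
dir=$(mktemp -d) || exit 1
printf 'foo=1\n' > "$dir/plain.conf"
printf 'foo=1\n' > "$dir/with space.conf"
printf 'foo=1\n' > "$dir/it's.conf"
printf 'bar=1\n' > "$dir/other.conf"

# -l lists the files that match, -Z ends each name with a NUL byte.
# xargs -0 splits on NUL only; -r skips sed entirely if nothing matched.
grep -rlZ 'foo' "$dir/" | xargs -r0 sed -i 's/foo/baz/'

spaced=$(cat "$dir/with space.conf")
quoted=$(cat "$dir/it's.conf")
other=$(cat "$dir/other.conf")
printf '%s\n' "$spaced" "$quoted" "$other"
rm -rf "$dir"
```

Because the names travel NUL-delimited, spaces and quotes in file paths need no special treatment, and sed is started once for many files instead of once per file.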

added 6 characters in body

edited Aug 23 at 6:46

Stéphane Chazelas

added 6 characters in body

edited Aug 23 at 6:25

Stéphane Chazelas

added 6 characters in body

edited Aug 23 at 6:19

Stéphane Chazelas

added 311 characters in body

edited Aug 23 at 5:59

Stéphane Chazelas

simplified regex

edited Aug 23 at 5:49

Stéphane Chazelas

simplified regex

edited Aug 23 at 5:24

Stéphane Chazelas

Better to use a / suffix than -H, as we likely don't want to process conf_dir if it's itself a symlink to a regular file (as perl -i, like sed -i where supported, breaks symlinks)

edited Aug 22 at 18:04

Stéphane Chazelas

added 1642 characters in body

edited Aug 22 at 16:23

Stéphane Chazelas

added 1642 characters in body

edited Aug 22 at 16:15

Stéphane Chazelas

answered Aug 22 at 16:02

Stéphane Chazelas
