Stéphane Chazelas

This is more a job for perl, whose regexps can more easily find matching pairs. In any case, the few sed implementations that have a -i option for in-place editing copied it from perl, and in ways that are incompatible with each other.

If we assume that { and } are always matched, even inside "..." strings, that could be:

perl -0777 -i -pe '
  s{
    ^ \h* filter \s* \{
      ( (?: [^{}]++ | \{ (?1) \} ) * )
    \}
  }{
    $& =~ s{ ^ \h* \K \# \h* (year_n \h* =) }{$1}gmrx
  }gmxe' -- your-file

Where:

  • -p enables the sed-like mode, where files are processed one record at a time and the equivalent of sed's pattern space is the $_ variable, on which s{pattern}{replacement}flags operates by default, like sed's s.
  • -i for in-place editing (copied by some sed implementations).
  • -0777 changes the record separator from the default of newline (like in sed) to some impossible byte value, so each file is processed as a whole. Same as -g in newer versions of perl (5.36 and later).
  • then we have a s{pattern}{replacement}gmxe where:
    • x allows adding whitespace (and comments) in the pattern to improve legibility.
    • m makes ^ match at the start of every line in the subject instead of just at the start of the subject.
    • e is for the replacement to be interpreted as perl code.
    • \s matches any whitespace (ASCII only by default) including newline, similar to [[:space:]] in POSIX regexps, and \h matches horizontal whitespace (ISO8859-1 ones by default, that is space, tab, and the non-breaking space encoded as 0xA0, but importantly not newline), similar to POSIX's [[:blank:]].
    • The important part in there is (?1), which recalls the regexp in the first (...) capture group and so allows for recursive regexps; that is how nested {...} pairs are matched.
    • The replacement applies another s{pattern}{replacement}gmrx to $&, which is what was matched by the first regexp, where:
      • r returns the result of the substitution instead of applying it in place to $&
      • \K (for Keep) resets the start of the matched portion, so what's matched to the left of it (here the leading whitespace) is kept as-is instead of being replaced.