TL;DR
sed -En 'y/,/\n/;/^[^\n]*foo[^\n]*(\n|$)/P;D'
or, portable but less elegant:
sed -En 'y/,/\n/;y/_\n/\n_/;/^[^_]*foo[^_]*(_|$)/{y/\n_/_\n/;P;y/_\n/\n_/;}
y/\n_/_\n/;D;y/_\n/\n_/'
Full answer
The sed(1p) manual page in the POSIX standard describes the s command:
Substitute the replacement string for the first instance
of the regular expression RE in the pattern space.
and
g Make the substitution for all non-overlapping matches of the regular expression, not just the first one.
Also, the definition of pattern space:
In default operation, sed cyclically shall append a line of input, less its terminating <newline> character, into the pattern space.
So the pattern space is a buffer in memory which initially holds the entire input line. The s/RE/replacement/ command, even with the g flag, does not split the pattern space; it only substitutes the replacement for each match of the RE. After executing
printf "a,b,c\n" | sed 's/,/\
/g'
the pattern space will hold the literal text:
a
b
c
To prove this point:
$ printf "one,two,three\nfour,five,six\n" | sed 's/,/\
/g;s/^\(.*\)$/[\1]/'
[one
two
three]
[four
five
six]
or, using GNU sed (installed as gsed on OpenBSD):
$ printf "a,b,c\n" | gsed --debug 'y/,/\n/'
SED PROGRAM:
y/,/
/
INPUT: 'STDIN' line 1
PATTERN: a,b,c
COMMAND: y/,/
/
PATTERN: a\nb\nc
END-OF-CYCLE:
a
b
c
When a later command matches against the pattern space, its RE is applied to the entire pattern space as shown above, not to its individual "lines".
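A quick check of this, as a minimal sketch: it assumes a sed in which $ matches only at the end of the pattern space (the POSIX behavior; GNU sed's M flag would change that). The variable names are only for illustration.

```shell
# Both commands first turn "a,b,c" into a three-"line" pattern space.

# /^b$/ is anchored to the whole pattern space, so it does not match.
anchored=$(printf 'a,b,c\n' | sed -n 'y/,/\n/;/^b$/p')

# An unanchored /b/ matches, and p writes the whole pattern space.
unanchored=$(printf 'a,b,c\n' | sed -n 'y/,/\n/;/b/p')

printf 'anchored:   [%s]\n' "$anchored"
printf 'unanchored: [%s]\n' "$unanchored"
# anchored:   []
# unanchored: [a
# b
# c]
```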
What can manipulate the "lines" in the pattern space, however, are the commands P and D:
[2addr]P Write the pattern space, up to the first <newline>, to standard output.
and
[2addr]D If the pattern space contains no <newline>, delete the pattern space and start a normal new cycle as if the d command was issued. Otherwise, delete the initial segment of the pattern space through the first <newline>, and start the next cycle with the resultant pattern space and without reading any new input.
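To see P and D working together, here is a small sketch (the function name prefix_each_field is made up for this illustration). Because D restarts the script without reading new input, s/^/> / ends up being applied to each "line" in turn rather than only once per input line:

```shell
# Hypothetical helper: split on commas, then loop over the pieces with P and D.
prefix_each_field() {
    # y   turns every comma into a newline
    # s   prefixes the start of the pattern space
    # P   writes the pattern space up to the first newline
    # D   deletes through the first newline and restarts the script
    sed -n 'y/,/\n/;s/^/> /;P;D'
}

printf 'a,b,c\n' | prefix_each_field
# > a
# > b
# > c
```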
So, we can use the following command:
$ printf "a,b,c foo\nd foo,e,foo f\ng,h,i\n" |
sed -En 'y/,/\n/;/^[^\n]*foo[^\n]*(\n|$)/P;D'
c foo
d foo
foo f
or, in the more readable form:
$ printf "a,b,c foo\nd foo,e,foo f\ng,h,i\n" |
sed -En 'y/,/\n/
/^[^\n]*foo[^\n]*(\n|$)/P
D'
c foo
d foo
foo f
Explanation
y/,/\n/ - Replace every , character with a newline (similar to the tr(1) utility).
/^[^\n]*foo[^\n]*(\n|$)/P - Print the initial portion of pattern space up to the newline, but only if the initial part of the pattern space up to the newline contains foo.
D - Delete the initial portion of the pattern space up to and including the first newline and start the next cycle. A new input line is read into the pattern space only if there was no newline left, i.e. at the final "line" in the pattern space.
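One way to watch these cycles is to add the l command, which writes the pattern space in an unambiguous form (newlines shown as \n, the end marked with $). This is only a sketch: trace_cycles is a hypothetical name, and it assumes a sed that accepts [^\n] (see the portability note below).

```shell
# Hypothetical helper: same script as above, plus l to dump the
# pattern space at the start of every cycle.
trace_cycles() {
    sed -En 'y/,/\n/
l
/^[^\n]*foo[^\n]*(\n|$)/P
D'
}

printf 'a,b,c foo\n' | trace_cycles
# a\nb\nc foo$     <- cycle 1: first "line" is a, no foo, nothing printed
# b\nc foo$        <- cycle 2: first "line" is b, no foo, nothing printed
# c foo$           <- cycle 3: only c foo is left
# c foo            <- written by P
```

The lines ending in $ come from l; only the last line is the actual result of P.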
Notes
- As @Kusalananda noted, if the intention is to parse CSV, sed is not a good tool for the job. For just the case from this question, awk(1) could be used (note: @Ed Morton's solution is more concise):
printf "a,b foo,c\nd,e,foo g\n" | awk '
BEGIN{FS=","}
{
    # Print every comma-separated field that contains foo.
    for (i = 1; i <= NF; i++)
    {
        if ($i ~ /foo/)
        {
            print $i
        }
    }
}'
but it doesn't handle quoted fields (e.g. for printf "a,b foo,\"not a foo, delimiter\"\nd,e,foo g\n" it would output "not a foo instead of not a foo, delimiter), etc.
- Regarding the comment by @Stéphane Chazelas about portability: \n inside [] might not match a newline on some systems. However, as of this writing (2024-09-01), both OpenBSD 7.5 sed and GNU sed 4.9 on OpenBSD (without POSIXLY_CORRECT set) interpret [^\n] as "match any character except newline". The method from the accepted answer to "sed: Portable solution to match "any character but newline"" avoids the issue by swapping the characters _ and newline with y. It is less elegant, but gives:
$ printf "a,b,c foo\nd foo,e,foo f\ng,h,i\n" |
sed -En 'y/,/\n/;y/_\n/\n_/;/^[^_]*foo[^_]*(_|$)/{y/\n_/_\n/;P;y/_\n/\n_/;}
y/\n_/_\n/;D;y/_\n/\n_/'
- \n in the replacement part of s is not portable. Using a backslash followed by a newline should work in most implementations of sed, since it is defined by POSIX.
- s/,/\n/g can also be replaced by the (portable) y/,/\n/.
- sed could be replaced by tr , '\n'. However, I have a hunch that the reason they want to change commas into newlines might be to process CSV fields (which means they should be using other tools entirely), but that's just my mind-reading neurons flashing randomly.
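The _/newline swap in the portable command from the notes can be checked on input that itself contains underscores. A minimal sketch, with a made-up function name (portable_foo_fields) and made-up sample input; the sed script is the one given above, unchanged:

```shell
# Hypothetical wrapper around the portable variant from the notes.
portable_foo_fields() {
    # While the regex is tested, _ and newline are swapped, so [^_]
    # means "anything but a field separator". They are swapped back
    # before P and before D, so original underscores survive.
    sed -En 'y/,/\n/;y/_\n/\n_/;/^[^_]*foo[^_]*(_|$)/{y/\n_/_\n/;P;y/_\n/\n_/;}
y/\n_/_\n/;D;y/_\n/\n_/'
}

printf 'a_foo,b_c,d foo\n' | portable_foo_fields
# a_foo
# d foo
```

The final y after D is never reached, because D always starts a new cycle; it is kept here only to match the command as posted.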