grep with continuation lines

Asked 10 years, 5 months ago

Modified 9 years, 6 months ago

Viewed 3k times

6

How can I grep/awk/sed a file looking for some pattern, and print the entire line (including continuation lines, if the matched line ends with \)?

File foo.txt contains:

something
whatever
thisXXX line \
    has a continuation line
blahblah
a \
multipleXXX \
continuation \
line

What should I execute to get (not necessarily in one line, not necessarily removing multiple spaces):

thisXXX line has a continuation line
a multipleXXX continuation line

BTW, I'm using bash on Fedora 21, so the solution does not need to be POSIX-compliant (but I'd appreciate one that is).

edited Jun 4, 2015 at 16:37

cuonglm


asked Jun 4, 2015 at 15:16

Carlos Campderrós


Do you want the search to span over continuation lines? i.e. if you're searching for hello, does hel\␤lo match?

– Gilles 'SO- stop being evil'
Commented Jun 4, 2015 at 22:08
@gilles, yes, same as with sh

– Carlos Campderrós
Commented Jun 5, 2015 at 13:43
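
Since the joining rule asked for here is the shell's own, one possible sketch is to let read do the work: without -r, POSIX read treats a backslash followed by a newline as a line continuation and removes both. The function name grep_cont is hypothetical, and the sketch assumes the file contains no other backslashes that must be preserved, because read without -r consumes those as well.

```shell
# grep_cont PATTERN FILE  (hypothetical helper, not from the thread)
# Prints each logical line of FILE that contains PATTERN as a fixed substring.
grep_cont() {
    # IFS= preserves leading and trailing blanks; omitting -r on purpose
    # makes read join backslash-newline continuations.
    # The "|| [ -n ... ]" part also handles a last line with no newline.
    while IFS= read line || [ -n "$line" ]; do
        case $line in
            *"$1"*) printf '%s\n' "$line" ;;
        esac
    done < "$2"
}
```

Called as grep_cont XXX foo.txt, this should print the two joined lines from the question, with the original runs of spaces left in place. Because the join happens before the match, a pattern split across a continuation (hel\␤lo for hello) is found as well.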


7 Answers

6

Another approach, using perl to replace each backslash-newline pair with a space:

$ perl -pe 's/\\\n/ /' file | grep XXX
thisXXX line      has a continuation line
a  multipleXXX  continuation  line

To remove extra spaces, pass it through sed:

$ perl -pe 's/\\\n/ /' file | grep XXX | sed 's/  */ /g'
thisXXX line has a continuation line
a multipleXXX continuation line

edited Jun 4, 2015 at 22:12

Gilles 'SO- stop being evil'


answered Jun 4, 2015 at 16:05

terdon♦


According to the shell rules for continuation lines, 's/\\\n/ /' should be changed to 's/\\\n//', i.e. backslash + newline should be replaced by nothing, not by a space.

– vinc17
Commented Jan 13, 2022 at 13:53
@vinc17 if there is no trailing space in the first line, you will change hello\nworld to helloworld instead of hello world. That's why I want to replace with a space.

– terdon ♦
Commented Jan 13, 2022 at 15:38
@terdon But this is not the correct rule. For instance, if you type echo "foo\, then Enter, then bar" in a shell, you get foobar, not foo bar. If you want a space between foo and bar, then put one either before the backslash or at the beginning of the second line (just before bar).

– vinc17
Commented Jan 15, 2022 at 0:09
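
The shell rule described in this comment can be checked directly. A minimal sketch; the variable name joined is only illustrative:

```shell
# Inside double quotes, a backslash immediately followed by a newline
# is removed entirely, so the two halves are joined with no space.
joined="foo\
bar"
printf '%s\n' "$joined"
```

This prints foobar, which is the behaviour that replacing backslash + newline with nothing reproduces.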
@vinc17 but why are you thinking about shell commands? The question doesn't mention shell commands, the file's extension doesn't suggest a shell script and the OP's example is words, not code. You would be right for code, of course, but this doesn't seem to be about code.

– terdon ♦
Commented Jan 15, 2022 at 10:43
Nice, this did the trick. I wrapped the perl in a shell script so that I could invoke it easily from find(1): find . -type f -exec mysearch.sh {} +. To squeeze spaces (and tabs), use sed 's/[[:space:]][[:space:]]*/ /g'

– alls0rts
Commented Sep 24, 2024 at 14:16
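
The contents of mysearch.sh are not shown in the comment. A hypothetical stand-in might look like the sketch below; the function name mysearch, the pattern argument and the file-name prefix on each hit are all assumptions made for illustration.

```shell
# Hypothetical stand-in for the mysearch.sh mentioned above.
# Usage: mysearch PATTERN FILE...
mysearch() {
    pattern=$1
    shift
    for f in "$@"; do
        # Join continuation lines, keep the matching logical lines,
        # squeeze runs of whitespace, then prefix each hit with the file name.
        perl -pe 's/\\\n/ /' "$f" |
            grep -e "$pattern" |
            sed 's/[[:space:]][[:space:]]*/ /g' |
            while IFS= read -r hit; do
                printf '%s: %s\n' "$f" "$hit"
            done
    done
}
```

Saved in a script whose last line is mysearch XXX "$@", it could be run through find . -type f -exec ... {} + as in the comment.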


5

With POSIX sed:

$ sed -e '
:1
/\\$/{N
  s/\n//              
  t1
}
/\\/!d 
s/\\[[:blank:]]*//g
' file

edited Jun 4, 2015 at 16:24

answered Jun 4, 2015 at 16:18

cuonglm


@don_crissti Pipe this into grep XXX

– Gilles 'SO- stop being evil'
Commented Jun 4, 2015 at 22:10
@don_crissti: I don't see matching XXX in requirement.

– cuonglm
Commented Jun 5, 2015 at 1:07
@Gilles - no, it doesn't work like that. Change OP's input replacing something with XXX (without a trailing backslash) on first line and then try piping this sed command to grep. You won't get the XXX line in the final output. choroba's solution fails in a similar manner while jimmij's prints the second line too (it shouldn't).

– don_crissti
Commented Jun 5, 2015 at 17:04
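
The failure described in this comment can be reproduced with a short sketch. The function name join_continued and the temporary file are illustrative; the sed script is the one from the answer, with only trailing whitespace removed.

```shell
# join_continued FILE: the sed script from the answer above.
join_continued() {
    sed -e '
:1
/\\$/{N
  s/\n//
  t1
}
/\\/!d
s/\\[[:blank:]]*//g
' "$1"
}

# Sample input from the question, with the first line replaced by XXX
# (no trailing backslash), as the comment suggests.
sample=$(mktemp)
printf '%s\n' 'XXX' 'whatever' 'thisXXX line \' \
    '    has a continuation line' 'blahblah' \
    'a \' 'multipleXXX \' 'continuation \' 'line' > "$sample"

join_continued "$sample" | grep XXX
```

Only the two joined lines are printed. The bare XXX line is dropped because /\\/!d deletes every line that never contained a backslash, before grep sees it.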


5

With pcregrep without changing structure of the lines:

pcregrep -M '^(.|\\\n)*XXX(.|\n)*?[^\\]$' file

answered Jun 4, 2015 at 16:32

jimmij


As long as a backslash + newline cannot appear in the regexp XXX, this is a nice solution as it can be used recursively with -r (with the other solutions, one would need to use find in such a case). There is only a possible issue with [^\\]$ if the matched text is at the end of the file. I think that this should be corrected to '^(\\\n|.)*XXX(\\\n|.)*' (with \\\n before .).

– vinc17
Commented Jan 13, 2022 at 13:34
I meant: As long as a backslash + newline cannot appear in the text that should be matched. Indeed, as said in a comment to the question, when searching for hello, hel\␤lo should match. In such a case, this would need to introduce (\\\n)* at each "point" of the regexp!

– vinc17
Commented Jan 13, 2022 at 13:50


5

Perl to the rescue:

perl -ne 'if (/\\$/) { $l .= $_ }
          else { $l .= $_;                 # include the final line in the search
                 print $l if $l =~ /XXX/;
                 $l = "";
          }' foo.txt

$l works as an accumulator. -n processes the input line by line (cf. sed). If the line ends in a backslash, it is appended to the accumulator; otherwise, the accumulator plus the current line is printed if it matches XXX, and the accumulator is emptied.

edited Jun 5, 2015 at 7:23

answered Jun 4, 2015 at 15:24

choroba


4

My twist:

perl -0777 -ne '                           # read the entire file into $_
    s{ [[:blank:]]* \\ \n [[:blank:]]* }   # join continued lines
     { }gx;
    print grep {/XXX/} split /(?<=\n)/     # print the matching lines
' foo.txt

thisXXX line has a continuation line
a multipleXXX continuation line

answered Jun 4, 2015 at 19:25

glenn jackman


3

I'd say Perl is the simplest here. It isn't POSIX, though it's in the default installation of most non-embedded unices. If you want POSIX, use awk.

awk '{if (/\\$/) printf "%s", $0; else print}'

This collapses continuation lines. If you want to find patterns that spread over a continuation, pipe this into grep. If you want to match only uninterrupted patterns, let awk accumulate continued lines and do the matching.

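awk '{
    if (sub(/\\$/,"")) {
        # continued line: drop the backslash and keep accumulating
        line = line $0;
    } else {
        # last physical line: match against the whole logical line
        line = line $0;
        if (line ~ /XXX/) print line;
        line = "";
    }
}'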

edited Apr 13, 2017 at 12:36


answered Jun 5, 2015 at 1:06

Gilles 'SO- stop being evil'


0

This is a small improvement to Gilles' awk solution (thanks Gilles!), but does require nawk:

nawk '{if (/\\$/) {$0=substr($0,1,length($0)-2); printf "%s",$0} else print}'

This will create a continuous line if the line wraps, but does not include the "\" and space character. (I found this helpful when grepping for PATH statements since the "\" can lead to confusion when interpreting the results.)
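
A quick check against the question's sample shows what stripping two characters does in practice. In this sketch awk stands in for nawk (an assumption; the constructs used are plain POSIX awk), and the function name strip_two and the temporary file are illustrative.

```shell
# strip_two FILE: the command from this answer, with awk in place of nawk.
strip_two() {
    awk '{if (/\\$/) {$0=substr($0,1,length($0)-2); printf "%s",$0} else print}' "$1"
}

# Sample input from the question.
input=$(mktemp)
printf '%s\n' 'something' 'whatever' 'thisXXX line \' \
    '    has a continuation line' 'blahblah' \
    'a \' 'multipleXXX \' 'continuation \' 'line' > "$input"

strip_two "$input" | grep XXX
```

The first match keeps its words apart only because the continuation line starts with spaces; the second comes out as amultipleXXXcontinuationline, since the one space before each backslash is removed along with it. Whether that matters depends on the data: for PATH-style values the fused form may be exactly what is wanted.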

edited Apr 28, 2016 at 19:03

answered Apr 28, 2016 at 18:56

newscripter
