Search for words that start with b, end with o, and contain an i or g in a text file.

I need a command, typed in the terminal, that displays the words meeting the specification above.

I've used the following, but it does not stop at one word, so the match contains white space:

~$  egrep -n '\bb.*(i|g).*o\b'

I'm using Ubuntu Linux and am unsure how to write the expression. I've tried several variations, all of which failed. Can anyone help me unravel the regex, seeing as I struggle to do so?

An example: Say I have the following random words in a text file:

boo djhg
bio jdjjf
dgdhd bgo
ghhh

Then the words 'boo', 'bio' and 'bgo' need to be highlighted.

  • Sounds like homework? Commented May 29, 2017 at 18:57
  • I am a third-year IT student who needs to learn how to use specific Linux functions. I've been surfing the web for hours trying to figure out the command, but without any success, so I turned to this question-and-answer site for help. I couldn't find the directory either, so I used a text file instead. My problem is that I struggle with the regex functions. Commented May 29, 2017 at 19:45
  • boo does not fulfil the criteria, surely? Commented May 29, 2017 at 20:33

3 Answers

The command you're looking for is grep, and the regular expression you want is b[[:alnum:]]*[ig][[:alnum:]]*o.

  • [[:alnum:]] will match a single alphanumeric character.
  • * will match any number (including zero) of the previous expression.
  • [ig] will match a single i or g.
  • All other characters (b and o) in this particular regular expression match themselves.

The use of [[:alnum:]]* rather than .* avoids matching words that contain spaces.
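As a minimal sketch of that difference, the snippet below runs both patterns over one line. The sample line and the variable names (line, greedy, strict) are made up for illustration and do not come from the question's file.

```shell
# Hypothetical sample line; not taken from the question's file.
line='bio and also bravo'

# With .* the match may run across spaces, so a single match can cover several words.
greedy=$(printf '%s\n' "$line" | grep -o 'b.*[ig].*o')

# With [[:alnum:]]* the match cannot cross a space, so it stays inside one word.
strict=$(printf '%s\n' "$line" | grep -o 'b[[:alnum:]]*[ig][[:alnum:]]*o')

printf 'greedy: %s\nstrict: %s\n' "$greedy" "$strict"
```

Here the first pattern should return the whole line, while the second should return only bio.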

grep is used like

grep OPTIONS 'EXPRESSION' INPUT-FILES

and will output the lines matching EXPRESSION to its standard output (the terminal, in this case).

In this case, you would want to use the -w and -o options, which force the expression to match whole words (strings of word characters surrounded by non-word characters) and return only the matched text (not the whole line).

$ grep -w -o 'b[[:alnum:]]*[ig][[:alnum:]]*o' words
bio
bgo
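To see what -w contributes, here is a small sketch comparing the same expression with and without it. The input 'bingo bingos' and the variable names are invented for the example, and GNU grep (as shipped with Ubuntu) is assumed.

```shell
# Hypothetical input: one word that fits, and one that merely starts with a fitting string.
sample='bingo bingos'

# Without -w, the leading "bingo" of "bingos" is reported as well.
without_w=$(printf '%s\n' "$sample" | grep -o 'b[[:alnum:]]*[ig][[:alnum:]]*o')

# With -w, a match must be a whole word, so "bingos" contributes nothing.
with_w=$(printf '%s\n' "$sample" | grep -o -w 'b[[:alnum:]]*[ig][[:alnum:]]*o')

printf 'without -w:\n%s\nwith -w:\n%s\n' "$without_w" "$with_w"
```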

You mentioned that you wanted to highlight the matched words. This is something that GNU grep can do. I'm dropping the -o option here to get the whole line of each match, otherwise you'll just get the same result as previously, but highlighted, which would be boring.

$ grep --color -w 'b[[:alnum:]]*[ig][[:alnum:]]*o' words
bio jdjjf
dgdhd bgo

As you can see, this only shows the matches on lines that contain matches. To see the full input (even lines with no match), with the matches highlighted, we have to drop the -w option and do

$ grep --color -E '\bb[[:alnum:]]*[ig][[:alnum:]]*o\b|$' words
boo djhg
bio jdjjf
dgdhd bgo
ghhh

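We had to add the -E option since | (alternation) is an extended regular expression operator. The \b matches at any word boundary and takes over the job of -w, while the |$ alternative matches the empty string at the end of every line, so every line is printed but only the real matches are highlighted.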
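A quick way to check the effect of the |$ alternative is to count the selected lines with grep -c. This is only a sketch: it recreates the question's four sample lines in a temporary file, the variable names are illustrative, and \b is a GNU grep extension, so GNU grep is assumed.

```shell
# Recreate the sample data from the question in a temporary file.
words=$(mktemp)
printf '%s\n' 'boo djhg' 'bio jdjjf' 'dgdhd bgo' 'ghhh' > "$words"

re='\bb[[:alnum:]]*[ig][[:alnum:]]*o\b'

# Lines containing a real match.
matching_only=$(grep -c -E "$re" "$words")

# Adding |$ also matches the end of every line, so all lines are selected.
all_lines=$(grep -c -E "$re|\$" "$words")

printf 'lines with a match: %s\nlines selected with |$: %s\n' "$matching_only" "$all_lines"
rm -f "$words"
```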

I would use grep to do this:

egrep -i "^b.*(i|g)+.*o$" /usr/share/dict/words
  • ^b start with "b"
  • .* any character, any number of times
  • (i|g)+ "i" or "g" one or more times
  • o$ end with "o"
  • Will not work for lines with more than one word Commented May 29, 2017 at 20:24
  • That's true, it only works for this specific case. We can replace ^ and $ with \b to get it to work, I guess :) Commented May 29, 2017 at 20:44
0
set -f   # disable filename expansion so words containing * or ? are not globbed
for w in `cat /usr/share/dict/words`; do
   case $w in b*[ig]*o ) echo "$w" ;; esac
done
# In bash, $(< /usr/share/dict/words) can be used in place of the backquoted cat.

We split the words file into individual words $w and then test each one against a wildcard pattern.

The wildcard pattern is b*[ig]*o, which is to be read as:

  • $w must begin with the letter "b".
  • $w must end with the letter "o".
  • $w must contain either an "i" or a "g" somewhere in between for it to match.
  • Upon a successful match, we display the word.
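The same check can be sketched as a function that reads text from standard input, which makes it easy to try on the question's sample lines instead of the dictionary file. The function name match_words and the variable result are hypothetical, and this is one possible arrangement rather than the answer's own code.

```shell
# Print every whitespace-separated word on stdin that fits the glob b*[ig]*o.
match_words() {
    while read -r line; do
        set -f                      # no filename expansion while splitting $line
        for w in $line; do
            case $w in
                b*[ig]*o) printf '%s\n' "$w" ;;
            esac
        done
        set +f                      # restore filename expansion
    done
}

# Try it on the sample lines from the question.
result=$(printf '%s\n' 'boo djhg' 'bio jdjjf' 'dgdhd bgo' 'ghhh' | match_words)
printf '%s\n' "$result"
```

Note that a glob * matches any characters, so unlike [[:alnum:]]* it would also accept a word such as big-o.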
  • The $(<...) ksh feature was added to bash in 2.02 released 20 years ago, so it's quite safe to assume it will be supported. Contrary to when using $(cat) however, read errors are not reported (and that's also the case for all other shells supporting the feature I've tested: ksh93, zsh, mksh) Commented May 29, 2017 at 20:56
