3

I am using grep for the first time, and after reading the manual I decided to use [:digit:] instead of \d or [0-9] to match digits. I found out that in grep it is actually [[:digit:]] that matches digits. While I managed to understand why the double brackets are needed, I cannot figure out how to match several consecutive digits.

echo 'i100s'|grep -o '[[:digit:]]'

will print (as expected):

1
0
0

But if I try

echo 'i100s'|grep -o '[[:digit:]]+'

or

echo 'i100s'|grep -o '[[:digit:]]{0,3}'

or

echo 'i100s'|grep -o '[[:digit:]]\+'
echo 'i100s'|grep -o '[[:digit:]]\{0,3\}'

It will fail to match anything. Why?

3
  • 2
    What OS / what implementation of grep are you using? Commented Jul 3, 2019 at 23:33
  • 1
    Some of your examples will work when using GNU grep. Are you on macOS? In any case, just add the -E option: echo 'i100s'|grep -Eo '[[:digit:]]+' and echo 'i100s'|grep -Eo '[[:digit:]]{0,3}' should work on any platform. Commented Jul 3, 2019 at 23:46
  • It is CentOS. Adding the -E option makes it work. It's quite interesting to know that -E is not necessary for GNU grep to understand the '+' sign. Commented Jul 10, 2019 at 18:35

1 Answer

5

The + operator was added after the very oldest versions of grep. Before that, you had to express + as a single instance followed by the same instance with a *, which is not very elegant. The unescaped {m,n} interval operator is in the same category. You'll run into the same problem in Vim when searching, unless you preface the pattern with \v (roughly the Vim equivalent of -E).
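As a minimal sketch of the workaround described above, the "one instance followed by the same instance with *" form works in plain basic regular expressions, with no -E needed. This assumes a grep that supports the non-standard -o option (GNU and BSD grep do); the variable names are only for illustration.

```shell
# BRE: one digit, then zero or more further digits.
# This means the same thing as the ERE pattern '[[:digit:]]+'.
bre_match=$(echo 'i100s' | grep -o '[[:digit:]][[:digit:]]*')

# ERE: -E enables the unescaped '+' operator.
ere_match=$(echo 'i100s' | grep -Eo '[[:digit:]]+')

echo "BRE: $bre_match"
echo "ERE: $ere_match"
```

Both commands should report 100 for this input.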

So as John1024 points out, on a Mac,

$ echo 'i100s' | grep -Eo '[[:digit:]]+'
100
$ echo 'i100s' | egrep -o '[[:digit:]]+'
100

The documentation for GNU grep (which is available on the Mac through Homebrew or other package managers) says its default behavior is -F (assumes basic regular expression), but as John1024 says, my experience is that it supports the advanced expressions without -E or the egrep variant. If you install GNU grep with Homebrew, it is helpfully installed as ggrep, so if scripts behave differently under the Mac's grep and ggrep, you can resolve the incompatibility by pointing a symbolic link at one executable or the other.

3
  • 1
    -F is for fixed strings. The default in any grep implementation, including GNU grep, is BRE (-G in GNU grep, which like -o is not a standard option), not -F. [[:digit:]]+ is standard in ERE; the standard BRE equivalent is [[:digit:]]\{1,\}. [[:digit:]]{1,} is also standard in grep -E, but won't work in the original egrep (nor in some current egrep implementations, such as the one on Solaris; egrep is not a standardized command). Commented Jul 4, 2019 at 5:21
  • The -E extension is needed in order to match for 100 in my case. Likely it is due to the version of my OS... Commented Jul 10, 2019 at 18:30
  • Yeah, Stéphane makes a good point. I was unaware that egrep is a moving target in terms of portability so grep -E seems more reliable. But overall, the problem is simply that you need something beyond the original grep reg-ex language to use even features as common as the '+' operator, let alone some of the others you tried. Nothing wrong with your expressions. You just needed to let grep know that you were going to use a more modern set of operators. Commented Jul 10, 2019 at 19:24
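
The standard forms mentioned in the comments above can be checked side by side. This is only a sketch, again assuming a grep with the non-standard -o option; the variable names are hypothetical.

```shell
input='i100s'

# BRE interval: the braces must be escaped.
bre_interval=$(echo "$input" | grep -o '[[:digit:]]\{1,\}')

# ERE interval: the braces are written unescaped under -E.
ere_interval=$(echo "$input" | grep -Eo '[[:digit:]]{1,}')

# Without -E, an unescaped '+' is a literal plus sign in a BRE,
# so this pattern finds nothing in the input (grep exits non-zero).
bre_literal_plus=$(echo "$input" | grep -o '[[:digit:]]+' || true)

echo "BRE interval: $bre_interval"
echo "ERE interval: $ere_interval"
echo "BRE with bare +: '$bre_literal_plus'"
```

The last case is the one from the question: the pattern itself is fine, but without -E the + is not treated as an operator.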
