
Why is the output of grep -o the same with or without LC_ALL=C? For grep with no flags the output differs, as expected, but for grep -o it does not. Does grep -o always behave as if LC_ALL=C were set, or is something else going on?

[aa@bb grep-test]$ cat input.txt
aa bb
CC cc
dd ee

[aa@bb grep-test]$ LC_ALL=C grep -o [A-Z] input.txt
C
C
[aa@bb grep-test]$ grep -o [A-Z] input.txt
C
C
[aa@bb grep-test]$ LC_ALL=C grep [A-Z] input.txt
CC cc
[aa@bb grep-test]$ grep [A-Z] input.txt
aa bb
CC cc
dd ee
[aa@bb grep-test]$ grep -V
GNU grep 2.6.3
...
[aa@bb src]$ ./grep -V
grep (GNU grep) 2.27
...
[aa@bb src]$ ./grep [A-Z] ../../test
CC cc
[aa@bb src]$ 
[aa@bb grep-test]$ grep a input.txt
aa bb
[aa@bb grep-test]$ grep C input.txt
CC cc
[aa@bb grep-test]$ locale
LANG=en_IE
LC_CTYPE="en_IE"
LC_NUMERIC="en_IE"
LC_TIME="en_IE"
LC_COLLATE="en_IE"
LC_MONETARY="en_IE"
LC_MESSAGES="en_IE"
LC_PAPER="en_IE"
LC_NAME="en_IE"
LC_ADDRESS="en_IE"
LC_TELEPHONE="en_IE"
LC_MEASUREMENT="en_IE"
LC_IDENTIFICATION="en_IE"
LC_ALL=
[aa@bb grep-test]$ bash --version
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
...
[aa@bb grep-test]$ xxd input.txt
0000000: 6161 2062 620a 4343 2063 630a 6464 2065  aa bb.CC cc.dd e
0000010: 650a 0a                                  e..
[aa@bb grep-test]$ cat -A input.txt
aa bb$
CC cc$
dd ee$
$
[aa@bb grep-test]$
  • I think the line-ending character of input.txt is not being recognized as such. I'm not sure why else grep [A-Z] input.txt would match every line of input.txt. How was input.txt created? What happens if you match something else, say 'grep a input.txt'? Does that also match every line of input.txt? Commented Jan 18, 2017 at 21:52
  • grep 2.6.3 is pretty old. Can you try a newer version, to distinguish between an (already fixed) bug in grep and legitimate behavior caused by the specifics of your environment? Commented Jan 18, 2017 at 22:12
  • It was created with cat > input.txt, and I typed the contents into the terminal. Commented Jan 18, 2017 at 23:09
  • Well, the Bug fixes section of the NEWS file for GNU grep 2.8 rather cryptically notes that "grep's interpretation of range expression is now more consistent with that of other tools. [bug present since multi-byte character set support was introduced in 2.5.2, though the steps needed to reproduce it changed in grep-2.6]" - maybe that gives you a lead? Commented Jan 19, 2017 at 0:05
  • It is also fixed in 2.7, the next release after 2.6.3. Commented Jan 19, 2017 at 0:35
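
The NEWS entry quoted in the comments concerns how range expressions such as [A-Z] are interpreted, which in non-C locales can depend on the locale rather than on plain byte values. As a minimal sketch, assuming a POSIX shell and a grep that supports POSIX character classes, here are two ways to ask for uppercase ASCII letters that should not depend on the current locale. The variable names (sample, upper_class, upper_c) are illustrative and not from the original post, and the temporary file stands in for input.txt.

```shell
# Recreate the sample data from the question in a temporary file
# (three lines plus a trailing blank line, matching the xxd dump).
sample=$(mktemp)
printf 'aa bb\nCC cc\ndd ee\n\n' > "$sample"

# Option 1: a POSIX character class instead of a range expression.
# The pattern is quoted so the shell cannot expand it as a glob.
upper_class=$(grep '[[:upper:]]' "$sample")

# Option 2: force the C locale for this one command, so that [A-Z]
# is the byte range 0x41 through 0x5A.
upper_c=$(LC_ALL=C grep '[A-Z]' "$sample")

echo "$upper_class"
echo "$upper_c"

rm -f "$sample"
```

Both commands select only the line CC cc from this input. Quoting the pattern also guards against a separate problem: an unquoted [A-Z] is a shell glob, and it would be replaced by a matching single-character file name in the current directory before grep ever sees it.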
