I have a text file and I want to count the number of times the word Orange or orange appears in it. orange or Orange should not be part of a longer word (e.g. "oranges", "orangeade", etc.).
I was thinking about using grep.
Any suggestions?
With GNU grep:
grep -wo '[Oo]range' filename | wc -l
Here -w makes grep match only whole words, and -o prints each match on its own line while suppressing the rest of the line. The expression matches either "Orange" or "orange".
(If you want it to be wholly case-insensitive, and also match "ORANGE" and "OraNGe", etc., you can add the -i flag and simply use 'orange' for the pattern.)
The output is then piped to wc, which counts the matches. Since each match is a single word on its own line, wc -l and wc -w give the same result.
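To see how the whole-word and case-insensitive variants behave, here is a small sketch. The sample text and the variable names (sample, count, count_i) are made up for illustration, and it assumes GNU grep is available.

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' \
  'Orange juice and orange peel' \
  'two oranges and an orangeade' \
  'an ORANGE sign' > "$sample"

# Whole-word matches of "Orange" or "orange": one match per output line, then counted.
count=$(grep -wo '[Oo]range' "$sample" | wc -l)

# Case-insensitive variant, which also picks up "ORANGE".
count_i=$(grep -wio 'orange' "$sample" | wc -l)

echo "exact-case: $count, any-case: $count_i"
rm -f "$sample"
```

With this sample, "oranges" and "orangeade" are skipped by both commands, so the first count should be 2 and the second 3.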
Using Perl's g modifier on the match operator:
perl -lne 'push @C,/\b[Oo]range\b/g;}{print ++$#C' file
grep can find and count (with the -c option) only known patterns. So if you are interested in how many times the word orange appears in the text:
grep -c -e "\borange\b" file
But if you are interested in building a dictionary from an unknown set of words, you would need to switch to awk or perl.
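As a rough illustration of the dictionary idea, the following sketch uses awk to tally every word in a file. The sample text and the names sample, freq and count are assumptions for the example. It lowercases each word and strips anything that is not a letter a-z, which is a simplification (for instance, "don't" becomes "dont").

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf '%s\n' 'Orange juice and orange peel' 'orange oranges' > "$sample"

# Tally each whitespace-separated word, ignoring case and non-letter characters.
freq=$(awk '{
  for (i = 1; i <= NF; i++) {
    w = tolower($i)
    gsub(/[^a-z]/, "", w)        # drop punctuation and digits
    if (w != "") count[w]++
  }
}
END { for (w in count) print count[w], w }' "$sample" | sort -rn)

echo "$freq"
rm -f "$sample"
```

Each output line has the form "count word", sorted so the most frequent words come first. Here "orange" and "oranges" are counted as separate entries.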
The -c option counts matching lines only, even if -o is used. This won't give the right number if "orange" occurs multiple times on the same line.
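The difference is easy to check on a small file. This is only a sketch with invented sample text and variable names (sample, lines, occurrences), assuming a grep that supports -o and -w, such as GNU grep.

```shell
# Hypothetical sample file: three matches on the first line, one on the last.
sample=$(mktemp)
printf '%s\n' 'orange orange orange' 'no fruit here' 'one orange' > "$sample"

# -c reports how many lines contain at least one match.
lines=$(grep -c -w 'orange' "$sample")

# -o prints every match on its own line, so wc -l counts occurrences.
occurrences=$(grep -o -w 'orange' "$sample" | wc -l)

echo "lines: $lines, occurrences: $occurrences"
rm -f "$sample"
```

For this sample the line count should be 2 while the occurrence count is 4.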