I'm not sure how this affects performance (with very large files I would expect this command to be slow):

grep -RFh '*' | tr -s ' ' | sort | uniq -c

More portable:

grep -Fh '*' * 2>/dev/null | tr -s ' ' | sort | uniq -c

And if you have sub-directories with more files you want to search inside:

grep -Fh '*' **/* 2>/dev/null | tr -s ' ' | sort | uniq -c | sed 's/.$//'
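
A caveat about the **/* form: it relies on recursive globbing, which zsh enables by default but bash (4.0 or later) does not, so under bash you would turn it on first (this assumes bash; adjust for your shell):

shopt -s globstar   # make ** descend into sub-directories (bash 4+)
grep -Fh '*' **/* 2>/dev/null | tr -s ' ' | sort | uniq -c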

Or to avoid using 2>/dev/null:

find . -type f -exec grep -Fh '*' {} + | tr -s ' ' | sort | uniq -c | sed 's/.$//'
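
If the tree also contains binary files, the recursive variants may print lines like "Binary file ... matches" that end up in the counts. Many grep implementations accept -I to skip binary files (treat that flag as an assumption about your grep; it is present in GNU grep):

find . -type f -exec grep -IFh '*' {} + | tr -s ' ' | sort | uniq -c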

In these commands, grep -Fh '*' matches any line that contains a *: -h suppresses printing the name of the file each match comes from, -F makes grep take the pattern as a literal string (so the '*' is matched as a plain character instead of being treated as a regular-expression operator), and in the first command -R makes the search recursive, so it looks at every file under the current directory.
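
Note that -F '*' matches a * anywhere in the line. If you only want lines whose last character is a * (an assumption about what you need, not something the commands above check), you can drop -F and anchor a regular expression instead, where \* is a literal asterisk and $ marks the end of the line:

grep -Rh '\*$' | tr -s ' ' | sort | uniq -c
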
With tr -s ' ' I'm squeezing the repeated spaces in every line. For example, given this input:

Need *
Word   buzz *
Need *
More   *
More *
Word   *
More   *
More *
Word   *
Word   *
Need *
More *

the tr command turns it into:

Need *
Word buzz *
Need *
More *
More *
Word *
More *
More *
Word *
Word *
Need *
More *

That output is piped to sort so that identical lines end up next to each other:

More *
More *
More *
More *
More *
Need *
Need *
Need *
Word *
Word *
Word *
Word buzz *

And finally uniq -c prefixes every line with its number of occurrences, which is what you want. According to the output above, the final result will be:

5 More *
3 Need *
3 Word *
1 Word buzz *

The sort command is important: uniq -c only merges adjacent duplicate lines, so if you leave it out the result will be different.
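
For example, running the same pipeline without sort, as in grep -Fh '*' * 2>/dev/null | tr -s ' ' | uniq -c, merges only adjacent repeats (assuming the twelve sample lines above are the only matches), so the same word is counted several times:

1 Need *
1 Word buzz *
1 Need *
2 More *
1 Word *
2 More *
2 Word *
1 Need *
1 More *
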
If you want to remove the trailing *, you can pipe to sed to delete either the last character of each line or the * itself:

grep -Fh '*' * | tr -s ' ' | sort | uniq -c | sed 's/.$//'
#or
grep -Fh '*' * | tr -s ' ' | sort | uniq -c | sed 's/\*//'
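
One more caveat: tr -s ' ' only squeezes space characters. If some of your lines separate the words with tabs (an assumption, in case your files mix tabs and spaces), a variant is to translate every blank to a single space before counting:

grep -Fh '*' * 2>/dev/null | tr -s '[:blank:]' ' ' | sort | uniq -c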

I think (and hope) there are better ways to achieve this, because here I'm chaining several commands to get the desired output, so, as I said, it may be slow.
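
For what it's worth, one possible improvement (a sketch, not part of the original commands) is to let a single awk process do both the whitespace squeezing and the counting, which drops sort and uniq from the pipeline; the output order is then arbitrary, so pipe it to sort afterwards if you need it ordered:

grep -RFh '*' | awk '{ $1 = $1; count[$0]++ } END { for (line in count) print count[line], line }'

The assignment $1 = $1 forces awk to rebuild the line with single spaces between fields, and the array keeps one counter per distinct line.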
