- Split the input into words, one per line.
- Sort the resulting list of words (lines).
- Squash multiple occurrences of the same word into one line, counting them.
- Sort by occurrence count, highest first.
To split the input into words, replace every character that you deem to be a word separator with a newline (here, anything that is not a letter):
<input_file \
tr -sc '[:alpha:]' '\n' |   # Add digits, -, \', ... if you consider them word constituents
sort |
uniq -c |
sort -k 1nr
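
The same steps can be wrapped in a small function if you need them more than once. This is only a sketch: `word_freq` is a made-up name, and the case folding, the removal of empty lines, and the alphabetical tie-break are my own additions, so drop them if you do not want them.

```shell
# word_freq: hypothetical helper. Reads text on stdin, prints "count word" lines.
word_freq() {
    tr '[:upper:]' '[:lower:]' |   # assumption: "The" and "the" should count as one word
    tr -sc '[:alpha:]' '\n' |      # each run of non-letters becomes a single newline
    sed '/^$/d' |                  # drop the empty line left when input starts with a separator
    sort |
    uniq -c |
    sort -k1,1nr -k2               # highest count first, ties in alphabetical order
}

printf 'The cat and the dog. The end.\n' | word_freq
```

For the sample sentence, "the" comes out on top with a count of 3 and the other four words follow with a count of 1 each. The width of the count column depends on your `uniq` implementation.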