I have an array of words and I want to get a hash where the keys are the words and the values are the word counts.
Is there a more beautiful way than mine:
result = Hash.new(0)
words.each { |word| result[word] += 1 }
return result
The imperative approach you used is probably the fastest implementation in Ruby. With a bit of refactoring, you can write a one-liner:
wf = Hash.new(0).tap { |h| words.each { |word| h[word] += 1 } }
Another imperative approach using Enumerable#each_with_object:
wf = words.each_with_object(Hash.new(0)) { |word, acc| acc[word] += 1 }
A functional/immutable approach using existing abstractions:
wf = words.group_by(&:itself).map { |w, ws| [w, ws.length] }.to_h
Note that this is still O(n) in time, but it traverses the collection three times and creates two intermediate objects along the way.
Finally: a frequency counter/histogram is a common abstraction that you'll find in some libraries like Facets: Enumerable#frequency.
require 'facets'
wf = words.frequency
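As a quick sanity check, the pure-Ruby variants above can be run side by side to confirm they build the same hash. This is only a sketch: the sample `words` array and the variable names `imperative`, `with_object`, and `functional` are made up for illustration, and the Facets version is left out because it needs an external gem.

```ruby
# Hypothetical sample input; any array of strings works.
words = %w[apple pear apple fig pear apple]

# Imperative: one pass, mutating a hash whose default value is 0.
imperative = Hash.new(0)
words.each { |word| imperative[word] += 1 }

# Same idea with each_with_object, which returns the accumulator.
with_object = words.each_with_object(Hash.new(0)) { |word, acc| acc[word] += 1 }

# Functional: group equal words, then replace each group by its size.
# Object#itself requires Ruby 2.2 or later.
functional = words.group_by(&:itself).map { |w, ws| [w, ws.length] }.to_h

# Hash#== compares keys and values only, so the different defaults do not matter.
puts imperative == with_object && with_object == functional
p imperative
```

Note that the two `Hash.new(0)` results still return 0 for unknown words, while the `group_by` result returns nil for them.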
str.split(" ").reduce(Hash.new(0)) { |h,w| puts h[w] += 1; h }?
100.times { words.inject(Hash.new 0) { |h, w| h[w] += 1; h } }: avg 1.17s. Imperative: 100.times { hist = Hash.new 0; words.each { |w| hist[w] += 1 } }: avg 1.09s. words was an array of 10k random words; generating the array alone took 0.2s on average, i.e. the imperative version was about 9% faster.
group_by(&:itself)
each_with_object fits better here than reduce IMO.
Posted on a related question, but posting here for visibility as well:
Ruby 2.7 and later include the Enumerable#tally method, which solves this.
From the trunk documentation:
Tallys the collection. Returns a hash where the keys are the elements and the values are numbers of elements in the collection that correspond to the key.
["a", "b", "c", "b"].tally #=> {"a"=>1, "b"=>2, "c"=>1}
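For code that may still run on interpreters older than 2.7, one possible approach is to use tally when it is available and fall back to the each_with_object version otherwise. The helper name `word_counts` below is hypothetical and only serves the illustration.

```ruby
# Hypothetical helper: uses Enumerable#tally where the interpreter provides it
# and falls back to the each_with_object version on older Rubies.
def word_counts(words)
  if words.respond_to?(:tally)
    words.tally
  else
    words.each_with_object(Hash.new(0)) { |word, acc| acc[word] += 1 }
  end
end

counts = word_counts(%w[a b c b])
p counts
```

One difference to keep in mind: the hash returned by tally has no default value, so looking up a word that never occurred gives nil, whereas the fallback hash gives 0.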
With inject:
str = 'I have array of words and I want to get a hash, where keys are words'
result = str.split.inject(Hash.new(0)) { |h,v| h[v] += 1; h }
=> {"I"=>2, "have"=>1, "array"=>1, "of"=>1, "words"=>2, "and"=>1, "want"=>1, "to"=>1, "get"=>1, "a"=>1, "hash,"=>1, "where"=>1, "keys"=>1, "are"=>1}
I don't know about the efficiency.
result[word] doesn't exist it'll throw an exception because there's no + for nil.
result is initialized with 0, so if key doesn't exist it will be 0, not nil
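The exchange above comes down to the hash's default value. A small sketch of both cases, with a made-up `words` array and the illustrative names `plain` and `counted`:

```ruby
words = %w[a b a]

# Plain hash: a missing key returns nil, and nil has no + method.
plain = {}
begin
  words.each { |word| plain[word] += 1 }
rescue NoMethodError => e
  puts "plain hash failed: #{e.class}"
end

# Hash.new(0): a missing key returns 0, so += 1 works the first time a word is seen.
counted = Hash.new(0)
words.each { |word| counted[word] += 1 }
p counted

# Reading a missing key returns the default without storing the key.
puts counted["missing"]
puts counted.key?("missing")
```

So the concern about nil applies only to a hash created with `{}` or `Hash.new`; with `Hash.new(0)` the increment is safe.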