Using a hash to collect statistics on word usage

Question

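I have an empty hash that I want to fill with word frequencies. When the code encounters a new word, it must add that word as a key. When it finds a word it has seen before, it only increments that key's value. Is there a refactored version of this code?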

my_hash = {}
@huge_word_list.words.each do |word|
  my_hash[word] ? my_hash[word] += 1 : my_hash[word] = 1
end

Answer 1

You can initialize the hash with a default value, in this case 0:

my_hash = Hash.new(0)
@huge_word_list.words.each { |word| my_hash[word] += 1 }
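
For anyone who wants to try this in isolation, here is a minimal sketch of the same idea. The method name `word_frequencies` and the sample array are made up for illustration and stand in for `@huge_word_list.words`, whose real type is not shown in the question:

```ruby
# Hypothetical helper; `words` stands in for @huge_word_list.words.
def word_frequencies(words)
  counts = Hash.new(0)                     # a missing key reads as 0 instead of nil
  words.each { |word| counts[word] += 1 }  # so += 1 works on the first sighting too
  counts
end

sample = %w[a b a c b a]                   # made-up sample data
word_frequencies(sample)
  #=> {"a"=>3, "b"=>2, "c"=>1}
```

Reading a key that was never counted returns 0 but does not add that key to the hash, so lookups for unseen words do not change the set of keys.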

Answer 2

@huge_word_list.words.inject({}) { |h, word| h[word] = h[word].to_i + 1; h }
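
This one-liner comes with no explanation, so here is a sketch of what it appears to rely on: `nil.to_i` is 0, which lets a plain `{}` be used without a default value. The method name `count_with_inject` and the sample words are assumptions for illustration only:

```ruby
# Hypothetical wrapper around the inject approach; `words` is any array of strings.
def count_with_inject(words)
  words.inject({}) do |h, word|
    h[word] = h[word].to_i + 1  # h[word] is nil for a new word, and nil.to_i is 0
    h                           # the block must return the hash for the next step
  end
end

count_with_inject(%w[x y x])
  #=> {"x"=>2, "y"=>1}
```

Unlike the `Hash.new(0)` version, the result here is an ordinary hash, so looking up an unseen word returns nil rather than 0.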

Answer 3

This is my preferred way, explicit and concise:

@huge_word_list.words.each_with_object(Hash.new(0)) { |word, h| h[word] += 1 }

Answer 4

I would fill a hash created with Hash.new(0), but since there are already 237 answers that do that, I'll try something different:

def count_words(words)
  h = words.group_by { |w| w }
  h.merge(h) { |*_,a| a.size }
end

text =<<_
Peter Piper picked a peck of pickled peppers for the piper whose
name is also Peter. After doing so, Peter pickled a peck of peppers
he picked.
_

words = text.tr(".,;:?'\"()",'').downcase.split
  #=> ["peter", "piper", "picked", "a", "peck", "of", "pickled",
  #    "peppers", "for", "the", "piper", "whose", "name", "is",
  #    "also", "peter", "after", "doing", "so", "peter", "pickled",
  #    "a", "peck", "of", "peppers", "he", "picked"] 

count_words(words)
  #=> {"peter"=>3, "piper"=>2, "picked"=>2, "a"=>2, "peck"=>2,
  #    "of"=>2, "pickled"=>2, "peppers"=>2, "for"=>1, "the"=>1,
  #    "whose"=>1, "name"=>1, "is"=>1, "also"=>1, "after"=>1,
  #    "doing"=>1, "so"=>1, "he"=>1}


ruby

hash