How to modify arrays of hashes in ruby 2.1.2?

I have an array of hashes named array_of_hash:

array_of_hash = [
 {:name=>"1", :address=>"USA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
 {:name=>"5", :address=>"UK", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC"},
 {:name=>"6", :address=>"CANADA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"CD"},
 {:name=>"29", :address=>"GERMANY", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE"},
 {:name=>"30", :address=>"CHINA", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"FG"}
]

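I want to group these hashes by consecutive values of the key :name. The first group will be "1" alone, because no hash has :name => "1".succ #=> "2". The second group will contain the hashes whose :name values are "5" and "6". The third group will be the last two hashes in the array, where :name=>"29" and :name=>"30".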

The array of hashes I want should look like this:

[
   {:name=>"1", :address=>"USA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
   {:name=>"5-6", :address=>"UK,CANADA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC,CD"},
   {:name=>"29-30", :address=>"GERMANY,CHINA", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE, FG"},
]

Use case II

array_of_hash = [
 {:name=>"1", :address=>"USA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
 {:name=>"2", :address=>"UK", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC"},
 {:name=>"3", :address=>"CANADA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"CD"},
 {:name=>"29", :address=>"GERMANY", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE"},
 {:name=>"30", :address=>"CHINA", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"FG"}
]

Expected result for use case II

[
   {:name=>"1-3", :address=>"USA,UK,CANADA", :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB,BC,CD"},
   {:name=>"29-30", :address=>"GERMANY,CHINA", :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE, FG"},
]

What I have done so far:

new_array_of_hashes = []
new_array_of_hashes << { name: array_of_hash.map {|h| h[:name].to_i}} << {address: array_of_hash.map {|h| h[:address]}} << {collection: array_of_hash.map {|h| h[:collection]}} << {sequence: array_of_hash.map {|h| h[:sequence]}}

[{:name=>[1, 5, 6, 29, 30]},
 {:address=>["USA", "UK", "CANADA", "GERMANY", "CHINA"]},
 {:collection=>
[["LAND", "WATER", "OIL", "TREE", "SAND"],
["LAND", "WATER", "OIL", "TREE", "SAND"],
["LAND", "WATER", "OIL", "TREE", "SAND"],
["LAPTOP", "SHIP", "MOUNTAIN"],
["LAPTOP", "SHIP", "MOUNTAIN"]]},
 {:sequence=>["AB", "BC", "CD", "DE", "FG"]}]

I was only able to combine the values for each key.

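First, let's build an array of the groups we ultimately want. We'll use Ruby's Enumerable#slice_when method (added in Ruby 2.2), which walks the array passing each element and its successor to the block so that we can compare the two. Our condition tells Ruby to slice the array whenever the names (converted to integers) are not consecutive or the collections are not identical.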

>> groups = array_of_hash.slice_when { |i, j| i[:name].to_i + 1 != j[:name].to_i || i[:collection] != j[:collection] }.to_a

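But since you are using Ruby 2.1, you will need to use slice_before instead, with local variables to keep track of the previous element. Following the documentation, we can do this by first initializing a local variable: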

>> prev = array_of_hash[0]

Then we reassign it, together with a second local variable, while iterating over the array:

>> groups = array_of_hash.slice_before { |e| prev, prev2 = e, prev; prev2[:name].to_i + 1 != prev[:name].to_i || prev2[:collection] != prev[:collection] }.to_a

Either way, groups should now look like this:

=> [[{:name=>"1",
   :address=>"USA",
   :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
   :sequence=>"AB"}],
 [{:name=>"5",
   :address=>"UK",
   :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
   :sequence=>"BC"},
  {:name=>"6",
   :address=>"CANADA",
   :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
   :sequence=>"CD"}],
 [{:name=>"29",
   :address=>"GERMANY",
   :collection=>["LAPTOP", "SHIP", "MOUNTAIN"],
   :sequence=>"DE"},
  {:name=>"30",
   :address=>"CHINA",
   :collection=>["LAPTOP", "SHIP", "MOUNTAIN"],
   :sequence=>"FG"}]]
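On Ruby 2.1, another option is to build the groups explicitly instead of keeping state inside a slice_before block. The following is only a sketch of that idea, not the approach used above: group_consecutive is a hypothetical helper name, and the sample data is trimmed to the two keys the condition reads.

```ruby
# Hypothetical helper: groups adjacent hashes whose :name values are
# consecutive integers and whose :collection values are equal.
# Uses only methods that exist in Ruby 2.1.
def group_consecutive(hashes)
  hashes.each_with_object([]) do |h, groups|
    last = groups.last && groups.last.last
    if last &&
       last[:name].to_i + 1 == h[:name].to_i &&
       last[:collection] == h[:collection]
      groups.last << h   # extends the current run
    else
      groups << [h]      # starts a new run
    end
  end
end

# Illustrative data only.
sample = [
  { name: "1",  collection: ["LAND"] },
  { name: "5",  collection: ["LAND"] },
  { name: "6",  collection: ["LAND"] },
  { name: "29", collection: ["SHIP"] },
  { name: "30", collection: ["SHIP"] }
]

group_consecutive(sample).map { |g| g.map { |h| h[:name] } }
# => [["1"], ["5", "6"], ["29", "30"]]
```

The result is a plain Array of Arrays, so it can be fed to the same map step as the groups produced by slice_before.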

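Now we take the resulting array and map each group to a new hash, formatted as you specified.

For :name, we take the names of the group's first and last elements, call .uniq to eliminate a duplicate, and join them with a hyphen. (If there is only one element, join returns that single element unchanged.)

For :collection, we simply use the collection found in the group's first element.

For :sequence, we join the sequences of every element in the group with commas. (Again, a single element is returned unchanged.)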

>> groups.map { |group| {name: [group.first[:name], group.last[:name]].uniq.join('-'), 
                         collection: group.first[:collection], 
                         sequence: group.map { |e| e[:sequence] }.join(',') } }

=> [{:name=>"1",
  :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
  :sequence=>"AB"},
 {:name=>"5-6",
  :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"],
  :sequence=>"BC,CD"},
 {:name=>"29-30",
  :collection=>["LAPTOP", "SHIP", "MOUNTAIN"],
  :sequence=>"DE,FG"}]
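The map above leaves out :address, which the expected output in the question includes. Here is a minimal sketch of one way to add it, assuming addresses should be joined with commas in the same way as :sequence. merge_group is a hypothetical helper name and the sample group is illustrative.

```ruby
# Hypothetical helper: collapses one group of hashes into a single hash.
def merge_group(group)
  {
    name: [group.first[:name], group.last[:name]].uniq.join('-'),
    address: group.map { |e| e[:address] }.join(','),
    collection: group.first[:collection],
    sequence: group.map { |e| e[:sequence] }.join(',')
  }
end

pair = [
  { name: "5", address: "UK",     collection: ["LAND", "WATER"], sequence: "BC" },
  { name: "6", address: "CANADA", collection: ["LAND", "WATER"], sequence: "CD" }
]

merge_group(pair)
# => {:name=>"5-6", :address=>"UK,CANADA", :collection=>["LAND", "WATER"], :sequence=>"BC,CD"}
```

With this helper, the final step would read groups.map { |group| merge_group(group) }.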
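
# Hand-rolled stand-in for Enumerable#slice_when, for Rubies older than 2.2.
# Yields each element with its successor; a truthy block result ends the
# current slice after the first of the two elements.
def slice_when(array)
  big = []    # all finished slices
  small = []  # the slice currently being built
  last_index = array.size - 1
  (0..last_index).each do |i|
    small << array[i]
    # Close the slice at the last element or when the block asks for a cut.
    if last_index == i || yield(array[i], array[i + 1])
      big << small
      small = []
    end
  end
  big
end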

You can try this slice_when if you don't want to use slice_before. Keep in mind that it already returns an Array, not an Enumerator.
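As a rough illustration, the helper can be called with the same grouping condition used in the first answer. The helper is repeated here only so the snippet runs by itself, and the sample hashes are made up and trimmed to the keys the condition reads.

```ruby
# Same helper as above, repeated so this snippet is self-contained.
def slice_when(array)
  big = []
  small = []
  last_index = array.size - 1
  (0..last_index).each do |i|
    small << array[i]
    if last_index == i || yield(array[i], array[i + 1])
      big << small
      small = []
    end
  end
  big
end

# Illustrative data only.
sample = [
  { name: "1",  collection: ["LAND"] },
  { name: "5",  collection: ["LAND"] },
  { name: "6",  collection: ["LAND"] },
  { name: "29", collection: ["SHIP"] },
  { name: "30", collection: ["SHIP"] }
]

# Cut whenever the names are not consecutive or the collections differ.
groups = slice_when(sample) do |i, j|
  i[:name].to_i + 1 != j[:name].to_i || i[:collection] != j[:collection]
end

groups.map { |g| g.map { |h| h[:name] } }
# => [["1"], ["5", "6"], ["29", "30"]]
```

Because the helper takes the array as an argument, no .to_a call is needed on the result.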

Code

def aggregate(array_of_hash)
  array_of_hash.chunk_while { |g,h| h[:name] == g[:name].succ }.
    flat_map { |a| a.chunk { |g| g[:collection] }.map { |_c,b| combine(b) } }
end

def combine(arr)
  names     = values_for_key(arr, :name)
  addresses = values_for_key(arr, :address)
  sequences = values_for_key(arr, :sequence)
  arr.first.merge(
    name: names.size==1 ? names.first : "%s-%s" % [names.first, names[-1]],
    address:  addresses.join(','),
    sequence: sequences.join(',')
  )
end

def values_for_key(arr, key)
  arr.map { |h| h[key] }
end

Example

aggregate(array_of_hash)
  #=> [{:name=>"1", :address=>"USA",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
  #    {:name=>"5-6", :address=>"UK,CANADA",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC,CD"},
  #    {:name=>"29-30", :address=>"GERMANY,CHINA",
  #     :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE,FG"}]   

Here is a second example.

array_of_hash[2][:collection] = ['dog', 'cat', 'pig']
aggregate(array_of_hash)
  #=> [{:name=>"1", :address=>"USA",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"},
  #    {:name=>"5", :address=>"UK",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC"},
  #    {:name=>"6", :address=>"CANADA",
  #     :collection=>["dog", "cat", "pig"], :sequence=>"CD"},
  #    {:name=>"29-30", :address=>"GERMANY,CHINA",
  #     :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE,FG"}]

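In this example the hashes with :name=>"5" and :name=>"6" cannot be grouped, because their :collection values differ. The question does not say whether this can happen. If it cannot, the code is still correct, but it can be simplified to the following.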

def aggregate(array_of_hash)
  array_of_hash.chunk_while { |g,h| h[:name] == g[:name].succ }.
    map { |a| combine(a) }
end
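The chunk_while condition compares the :name strings with String#succ rather than converting them to integers. The sketch below shows that comparison on bare strings. chunk_by_succ is a hypothetical name, and the snippet needs Ruby 2.3 or later because of chunk_while.

```ruby
# Hypothetical helper: chunks an array of strings into runs in which each
# string is the String#succ of the one before it.
def chunk_by_succ(names)
  names.chunk_while { |g, h| h == g.succ }.to_a
end

chunk_by_succ(%w[1 5 6 9 10 29 30])
# => [["1"], ["5", "6"], ["9", "10"], ["29", "30"]]

# A string comparison is stricter than an integer one: "5".succ is "6",
# which is not equal to "06", so a zero-padded name starts a new chunk.
chunk_by_succ(%w[5 06])
# => [["5"], ["06"]]
```

If the names may be zero-padded or otherwise formatted inconsistently, comparing g.to_i + 1 with h.to_i, as the first answer does, might be the safer condition.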

Explanation

For the first example above, the steps are as follows.

e0 = array_of_hash.chunk_while { |g,h| h[:name] == g[:name].succ }
  #=> #<Enumerator: #<Enumerator::Generator:0x007fa25e022f30>:each>

See Enumerable#chunk_while, which made its debut in Ruby v2.3 and is therefore not available in Ruby 2.1.2.

This enumerator generates the following elements, which are passed to Enumerable#flat_map.

e0.to_a
  #=> [[{:name=>"1", :address=>"USA",
  #      :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}],
  #    [{:name=>"5", :address=>"UK",
  #      :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"BC"},
  #     {:name=>"6", :address=>"CANADA",
  #      :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"CD"}],
  #    [{:name=>"29", :address=>"GERMANY",
  #      :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"DE"},
  #     {:name=>"30", :address=>"CHINA",
  #      :collection=>["LAPTOP", "SHIP", "MOUNTAIN"], :sequence=>"FG"}]
  #   ] 

e0.flat_map { |a| a.chunk { |g| g[:collection] }.map { |_,b| combine(b) } }

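returns the array of hashes shown in the first example. Consider the first element generated by e0, which flat_map passes to its block and assigns to the block variable a.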

a = e0.next
  #=> [{:name=>"1", :address=>"USA",
  #     :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}] 

The block calculation then proceeds as follows.

e1 = a.chunk { |g| g[:collection] }
  #=> #<Enumerator: #<Enumerator::Generator:0x007fa25c857158>:each> 
e1.to_a
  #=> [[["LAND", "WATER", "OIL", "TREE", "SAND"],
  #     [{:name=>"1", :address=>"USA",
  #       :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}]
  #    ]
  #   ] 

_c,b = e1.next
  #=> [["LAND", "WATER", "OIL", "TREE", "SAND"],
  #    [{:name=>"1", :address=>"USA",
  #      :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}]
  #   ] 
  # _c
  #   #=> ["LAND", "WATER", "OIL", "TREE", "SAND"] 
  # b #=> [{:name=>"1", :address=>"USA",
  #         :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}] 
combine(b)
  #=> {:name=>"1", :address=>"USA",
  #    :collection=>["LAND", "WATER", "OIL", "TREE", "SAND"], :sequence=>"AB"}

The remaining calculations are similar.
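One detail of Enumerable#chunk that matters here is that it only groups adjacent elements with the same block value; equal values that are separated by a different one end up in separate chunks. A small sketch, with collection_runs as a hypothetical helper name and made-up data:

```ruby
# Hypothetical helper: lists, for each run of equal :collection values,
# the collection and the names of the hashes in that run.
def collection_runs(hashes)
  hashes.chunk { |h| h[:collection] }.map do |coll, run|
    [coll, run.map { |h| h[:name] }]
  end
end

# Illustrative data only.
sample = [
  { name: "5", collection: ["LAND"] },
  { name: "6", collection: ["dog"] },
  { name: "7", collection: ["LAND"] }
]

collection_runs(sample)
# => [[["LAND"], ["5"]], [["dog"], ["6"]], [["LAND"], ["7"]]]
```

This is why names "5" and "7" are not merged even though their collections are equal.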