Merging & Summing nested hashes in Ruby

What I am trying to do is very similar to this question about Chartkick, but I have one additional problem: the nested values of my hash need to have their dates grouped and the values for each date summed. The goal is to create a Multiple Series Graph.

The query, which fetches a date range of about a month, for example:

arr = LineItem.includes(:order, :product)
              .where(orders: {order_date: Date.parse("Jan 1 2020")..Date.parse("Feb 1 2020")})
              .map { |line_item| { name: line_item.product.model_number, data: { line_item.order.order_date.strftime('%a %b %d, %Y') => line_item.order_quantity } } }

The resulting array of hashes:

 => [
{:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>2}}, 
{:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>5}}, 
{:name=>"FR-GP02", :data=>{"Tue Jan 21, 2020"=>1}}, 
{:name=>"FR-GP02", :data=>{"Tue Jan 21, 2020"=>3}}, 
{:name=>"FR-GP02", :data=>{"Wed Jan 22, 2020"=>1}}, 
{:name=>"FR-GP04", :data=>{"Mon Jan 20, 2020"=>2}}, 
{:name=>"FR-GP04", :data=>{"Tue Jan 21, 2020"=>4}}, 
{:name=>"FR-GP04", :data=>{"Tue Jan 21, 2020"=>3}}, 
{:name=>"FR-GP04", :data=>{"Tue Jan 21, 2020"=>6}}, 
{:name=>"FR-GP04", :data=>{"Wed Jan 22, 2020"=>3}}, 
{:name=>"FR-GP01", :data=>{"Tue Jan 21, 2020"=>5}}, 
{:name=>"FR-GP01", :data=>{"Thu Jan 23, 2020"=>3}}, 
{:name=>"FR-GP01", :data=>{"Thu Jan 23, 2020"=>1}}, 
...

The result I expect: the hashes should be grouped by name, then by date, with the values for each date summed:

 => [
{:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>7, "Tue Jan 21, 2020"=>4, "Wed Jan 22, 2020"=>1}}, 
{:name=>"FR-GP04", :data=>{"Mon Jan 20, 2020"=>2, "Tue Jan 21, 2020"=>13, "Wed Jan 22, 2020"=>3}}, 
{:name=>"FR-GP01", :data=>{"Tue Jan 21, 2020"=>5, "Thu Jan 23, 2020"=>4}}, 
...

However, after running this code:

arr.group_by {|h| h[:name]}.map { |k,v| { name: k, data: v.map {|h| h[:data]}.reduce(&:merge)}}

this is the output:

 => [
{:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>2, "Tue Jan 21, 2020"=>1, "Wed Jan 22, 2020"=>1}},
{:name=>"FR-GP04", :data=>{"Mon Jan 20, 2020"=>2, "Tue Jan 21, 2020"=>4, "Wed Jan 22, 2020"=>3}}, 
{:name=>"FR-GP01", :data=>{"Tue Jan 21, 2020"=>5, "Thu Jan 23, 2020"=>3}},
...

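The resulting output does group the names and the dates, but it does not sum the quantities. As an example I am grouping by day here, but I would also like the option to group by week and by month. I have also tried using Groupdate over the past 8 hours, to no avail.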
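For reference, the quantities are lost because Hash#merge, when called without a block, keeps only the value from the hash passed as the argument whenever both hashes share a key. The following is a minimal sketch of that behavior and of the block form of merge, which lets the caller decide how duplicate keys are combined. The sample rows are illustrative only and are not the full data set from the question.

```ruby
a = { "Mon Jan 20, 2020" => 2 }
b = { "Mon Jan 20, 2020" => 5 }

# Without a block, the value from the argument (b) replaces the one in a.
overwritten = a.merge(b)

# With a block, Ruby calls it for each duplicate key and stores its result.
summed = a.merge(b) { |_date, old, new| old + new }

# The same idea applied to rows shaped like the ones in the question
# (illustrative sample data).
rows = [
  { name: "FR-GP02", data: { "Mon Jan 20, 2020" => 2 } },
  { name: "FR-GP02", data: { "Mon Jan 20, 2020" => 5 } },
  { name: "FR-GP02", data: { "Tue Jan 21, 2020" => 1 } }
]

result = rows.group_by { |h| h[:name] }.map do |name, group|
  { name: name,
    data: group.map { |h| h[:data] }
               .reduce { |acc, d| acc.merge(d) { |_date, x, y| x + y } } }
end
```

This keeps the shape of the original attempt and changes only how duplicate dates are merged.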

There are many ways to obtain the desired return value. Here are two. First I define arr.

arr = [
  {:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>2}}, 
  {:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>5}}, 
  {:name=>"FR-GP02", :data=>{"Tue Jan 21, 2020"=>1}}, 
  {:name=>"FR-GP02", :data=>{"Tue Jan 21, 2020"=>3}}, 
  {:name=>"FR-GP02", :data=>{"Wed Jan 22, 2020"=>1}}, 
  {:name=>"FR-GP04", :data=>{"Mon Jan 20, 2020"=>2}}, 
  {:name=>"FR-GP04", :data=>{"Tue Jan 21, 2020"=>4}}, 
  {:name=>"FR-GP04", :data=>{"Tue Jan 21, 2020"=>3}}, 
  {:name=>"FR-GP04", :data=>{"Tue Jan 21, 2020"=>6}}, 
  {:name=>"FR-GP04", :data=>{"Wed Jan 22, 2020"=>3}}, 
  {:name=>"FR-GP01", :data=>{"Tue Jan 21, 2020"=>5}}, 
  {:name=>"FR-GP01", :data=>{"Thu Jan 23, 2020"=>3}}, 
  {:name=>"FR-GP01", :data=>{"Thu Jan 23, 2020"=>1}}]

The first calculation uses the methods Enumerable#group_by and Hash#transform_values.

arr.group_by { |h| h[:name] }
   .map do |k,v|
     { name: k,
       data: v.group_by do |h|
               h[:data].keys.first
             end.transform_values { |a| a.sum { |h| h[:data].values.first }}
     }
end
  #=> [{:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>7,
  #                               "Tue Jan 21, 2020"=>4,
  #                               "Wed Jan 22, 2020"=>1}},
  #    {:name=>"FR-GP04", :data=>{"Mon Jan 20, 2020"=>2,
  #                               "Tue Jan 21, 2020"=>13,
  #                               "Wed Jan 22, 2020"=>3}},
  #    {:name=>"FR-GP01", :data=>{"Tue Jan 21, 2020"=>5,
  #                               "Thu Jan 23, 2020"=>4}}]

Note:

arr.group_by { |h| h[:name] }
  #=> {"FR-GP02"=>[{:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>2}},
  #                {:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>5}},
  #                {:name=>"FR-GP02", :data=>{"Tue Jan 21, 2020"=>1}},
  #                {:name=>"FR-GP02", :data=>{"Tue Jan 21, 2020"=>3}},
  #                {:name=>"FR-GP02", :data=>{"Wed Jan 22, 2020"=>1}}],
  #    "FR-GP04"=>[{:name=>"FR-GP04", :data=>{"Mon Jan 20, 2020"=>2}},
  #                {:name=>"FR-GP04", :data=>{"Tue Jan 21, 2020"=>4}},
  #                {:name=>"FR-GP04", :data=>{"Tue Jan 21, 2020"=>3}},
  #                {:name=>"FR-GP04", :data=>{"Tue Jan 21, 2020"=>6}},
  #                {:name=>"FR-GP04", :data=>{"Wed Jan 22, 2020"=>3}}],
  #    "FR-GP01"=>[{:name=>"FR-GP01", :data=>{"Tue Jan 21, 2020"=>5}},
  #                {:name=>"FR-GP01", :data=>{"Thu Jan 23, 2020"=>3}},
  #                {:name=>"FR-GP01", :data=>{"Thu Jan 23, 2020"=>1}}]}

The block variables of map are initially set to the following:

k = "FR-GP02"
v = [{:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>2}},
     {:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>5}},
     {:name=>"FR-GP02", :data=>{"Tue Jan 21, 2020"=>1}},
     {:name=>"FR-GP02", :data=>{"Tue Jan 21, 2020"=>3}},
     {:name=>"FR-GP02", :data=>{"Wed Jan 22, 2020"=>1}}]

The value of :data in the first hash created is then computed as follows:

f = v.group_by do |h|
      h[:data].keys.first
    end
  #=> {"Mon Jan 20, 2020"=>[
  #      {:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>2}},
  #      {:name=>"FR-GP02", :data=>{"Mon Jan 20, 2020"=>5}}],
  #    "Tue Jan 21, 2020"=>[
  #      {:name=>"FR-GP02", :data=>{"Tue Jan 21, 2020"=>1}},
  #      {:name=>"FR-GP02", :data=>{"Tue Jan 21, 2020"=>3}}],
  #    "Wed Jan 22, 2020"=>[
  #      {:name=>"FR-GP02", :data=>{"Wed Jan 22, 2020"=>1}}]}

Finally,

f.transform_values { |a| a.sum { |h| h[:data].values.first }}
  #=> {"Mon Jan 20, 2020"=>7, "Tue Jan 21, 2020"=>4, "Wed Jan 22, 2020"=>1}
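The question also asks about grouping by week or by month. One possible way to extend the group_by and transform_values approach, sketched here under assumptions, is to parse each date-string key back into a Date and map it to a bucket label before grouping. The names bucket_for, sum_by_period and DATE_FORMAT are hypothetical helpers introduced for illustration, weeks are assumed to start on Monday, and the bucket label formats are arbitrary choices. The sample rows are illustrative only.

```ruby
require 'date'

# Same format string the question's query passes to strftime.
DATE_FORMAT = '%a %b %d, %Y'

# Hypothetical helper: maps a date string to the label of its day, week or month.
def bucket_for(date_str, period)
  date = Date.strptime(date_str, DATE_FORMAT)
  case period
  when :week  then (date - (date.cwday - 1)).strftime(DATE_FORMAT) # Monday of that week
  when :month then date.strftime('%b %Y')
  else date_str                                                    # :day, unchanged
  end
end

# Hypothetical helper: groups by name, then by bucket, and sums the quantities.
def sum_by_period(arr, period)
  arr.group_by { |h| h[:name] }.map do |name, rows|
    data = rows.group_by { |h| bucket_for(h[:data].keys.first, period) }
               .transform_values { |a| a.sum { |h| h[:data].values.first } }
    { name: name, data: data }
  end
end

sample = [
  { name: "FR-GP02", data: { "Mon Jan 20, 2020" => 2 } },
  { name: "FR-GP02", data: { "Tue Jan 21, 2020" => 3 } },
  { name: "FR-GP02", data: { "Mon Jan 27, 2020" => 4 } },
  { name: "FR-GP04", data: { "Wed Jan 22, 2020" => 1 } }
]

by_week  = sum_by_period(sample, :week)
by_month = sum_by_period(sample, :month)
```

Passing :day (or any other symbol) leaves the keys untouched, so the same helper would reproduce the daily totals.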

Here is a second way to obtain the desired result.

arr.each_with_object(Hash.new(0)) do |g,h|
  d, n = g[:data].flatten
  h[[g[:name], d]] += n
end.group_by { |(name, _),_| name }
   .map do |name,arr|
     { name: name, data: arr.each_with_object({}) { |((_,d),t),h| h[d] = t } }
   end
  #=> (as above)

The steps are as follows.

s = arr.each_with_object(Hash.new(0)) do |g,h|
  d, n = g[:data].flatten
  h[[g[:name], d]] += n
end
  #=> {["FR-GP02", "Mon Jan 20, 2020"]=>7,
  #    ["FR-GP02", "Tue Jan 21, 2020"]=>4,
  #    ["FR-GP02", "Wed Jan 22, 2020"]=>1,
  #    ["FR-GP04", "Mon Jan 20, 2020"]=>2,
  #    ["FR-GP04", "Tue Jan 21, 2020"]=>13,
  #    ["FR-GP04", "Wed Jan 22, 2020"]=>3,
  #    ["FR-GP01", "Tue Jan 21, 2020"]=>5,
  #    ["FR-GP01", "Thu Jan 23, 2020"]=>4}

This uses the form of Hash::new that takes an argument, called its default value (often zero, as here), and no block. If a hash is defined as

h = Hash.new(0)

and, possibly after key-value pairs have been added, h has no key k, then h[k] returns the default value. This means that in the expression

h[[g[:name], d]] += n

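if h does not have the key [g[:name], d], the value for that key is taken to be zero before n is added. If h does have that key, the current value for that key is increased by n.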
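The behavior of the default value can be checked in isolation. This is a small sketch with an illustrative key; note that merely reading a missing key returns the default without storing anything in the hash.

```ruby
h = Hash.new(0)
key = ["FR-GP02", "Mon Jan 20, 2020"]

missing_value  = h[key]      # the default value, since the key is absent
stored_on_read = h.key?(key) # reading did not add the key

# h[key] += 2 expands to h[key] = h[key] + 2, so the first use starts from 0.
h[key] += 2
h[key] += 5
total = h[key]
```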

Continuing the calculation,

t = s.group_by { |(name,_),_| name }
  #=> {"FR-GP02"=>[[["FR-GP02", "Mon Jan 20, 2020"], 7],
  #                [["FR-GP02", "Tue Jan 21, 2020"], 4],
  #                [["FR-GP02", "Wed Jan 22, 2020"], 1]],
  #    "FR-GP04"=>[[["FR-GP04", "Mon Jan 20, 2020"], 2],
  #                [["FR-GP04", "Tue Jan 21, 2020"], 13],
  #                [["FR-GP04", "Wed Jan 22, 2020"], 3]],
  #    "FR-GP01"=>[[["FR-GP01", "Tue Jan 21, 2020"], 5],
  #                [["FR-GP01", "Thu Jan 23, 2020"], 4]]}

Finally,

t.map do |name,arr|
  { name: name, data: arr.each_with_object({}) { |((_,d),t),h| h[d] = t } }
end
  #=> (as above)

Here and earlier I have made extensive use of Ruby's powerful technique of array decomposition. See also this article.
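As a brief illustration of array decomposition, parentheses in an assignment or in block parameters mirror the nesting of the array being unpacked. The sample values below are illustrative only.

```ruby
pair = [["FR-GP02", "Mon Jan 20, 2020"], 7]

# Parentheses on the left-hand side unpack the inner array.
(name, date), qty = pair

s = { ["FR-GP02", "Mon Jan 20, 2020"] => 7,
      ["FR-GP04", "Mon Jan 20, 2020"] => 2 }

# Each element yielded by a hash is [key, value]; here the key is itself an array.
# An underscore marks a part that is not needed.
names = s.map { |(n, _), _| n }

# each_with_object yields [element, memo]; the element is [[name, date], total].
by_date = [pair].each_with_object({}) { |((_, d), t), h| h[d] = t }
```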