Rails sync ranges inside of array

I'm currently having trouble synchronizing time periods in Ruby on Rails.

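I have two arrays, each containing anywhere from zero to n hashes, that I need to compare so that I end up with the synchronized time periods in a "result" array.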

For example:

time_periods_a = [
  { start_date: '01/10/2021', end_date: '31/10/2021',
    additional_attribute: 5 },
  { start_date: '01/11/2021', end_date: '30/11/2021',
    additional_attribute: 10 }
]

time_periods_b = [
  { start_date: '01/10/2021', end_date: '31/12/2021',
    additional_attribute: 20 }
]

The result should be:

[
  { start_date: '01/10/2021', end_date: '31/10/2021',
    additional_attribute_a: 5, additional_attribute_b: 20 },
  { start_date: '01/11/2021', end_date: '30/11/2021',
    additional_attribute_a: 10, additional_attribute_b: 20 },
  { start_date: '01/12/2021', end_date: '31/12/2021',
    additional_attribute_a: 0, additional_attribute_b: 20 }
]

Maybe you have a simple solution; I've been struggling with this for two days.

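Edit:

Since I didn't explain the problem very well, here is a clarification:

Both arrays may contain hashes whose periods overlap. In general, these have to be split into several periods.

Here is another example, shorter but more complex.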

time_periods_a: [1-2, 3-6, 7-9]
time_periods_b: [2-5, 6-9]

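This needs to result in: [1-1, 2-2, 3-5, 6-6, 7-9]

The numbers are just month indices, because start_date is always at the beginning of a month and end_date at the end of a month.

The two arrays are not required to contain the same periods, but they can. That's why I included the additional attribute. If a period is contained in one array but not in the other, I just need to add that period with additional_attribute: 0 for the missing side.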

Edit #2: the code I used before

  def format_date_ranges(example_model)
    first_array = example_model.first_array
    second_array = example_model.second_array
    return second_array.map { |hash| hash.merge('additional_attribute_a' => 0, 'additional_attribute_b' => hash['additional_attribute']) } if first_array.blank?
    return first_array.map { |hash| hash.merge('additional_attribute_a' => hash['additional_attribute'], 'additional_attribute_b' => 0) } if second_array.blank?

    iterate_count = [first_array.size, second_array.size].max
    result_array = []
    iterate_count.times do |index|
      relevant_first_hash = first_array[index]
      relevant_second_hash = second_array[index]
      if relevant_first_hash.nil?
        result_array << result_hash(0,
                                    relevant_second_hash['additional_attribute'],
                                    relevant_second_hash['start_date'],
                                    relevant_second_hash['end_date'],
                                    '')
        next
      end

      if relevant_second_hash.nil?
        result_array << result_hash(relevant_first_hash['additional_attribute'],
                                    0,
                                    relevant_first_hash['start_date'],
                                    relevant_first_hash['end_date'],
                                    relevant_first_hash['interval'])
        next
      end

      first_start_date = relevant_first_hash['start_date'].to_date
      first_end_date = relevant_first_hash['end_date'].to_date
      second_start_date = relevant_second_hash['start_date'].to_date
      second_end_date = relevant_second_hash['end_date'].to_date

      if first_start_date == second_start_date
        start_date = second_start_date
      else
        start_date = [first_start_date, second_start_date].min
        end_date = [first_start_date, second_start_date].max - 1.day
        result_array << result_hash(relevant_first_hash['additional_attribute'],
                                    relevant_second_hash['additional_attribute'],
                                    start_date,
                                    end_date,
                                    relevant_first_hash['interval'])
        start_date = [first_start_date, second_start_date].max
      end
      if first_end_date == second_end_date
        end_date = second_end_date
      else
        end_date = [first_end_date, second_end_date].min
        result_array << result_hash(relevant_first_hash['additional_attribute'],
                                    relevant_second_hash['additional_attribute'],
                                    start_date,
                                    end_date,
                                    relevant_first_hash['interval'])
        start_date = end_date + 1.day
        end_date = [first_end_date, second_end_date].max
      end
      result_array << result_hash(relevant_first_hash['additional_attribute'],
                                  relevant_second_hash['additional_attribute'],
                                  start_date,
                                  end_date,
                                  relevant_first_hash['interval'])
    end
    result_array
  end

  def result_hash(additional_attribute_a, additional_attribute_b, start_date, end_date, interval)
    { additional_attribute_a: additional_attribute_a, additional_attribute_b: additional_attribute_b, start_date: start_date, end_date: end_date, interval: interval }
  end

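I don't think there is any way around the complexity of the problem other than rethinking your data structure, which you might consider doing. I have tried to explain most of the steps below, but you may need to run the code with some puts statements added to gain a thorough understanding.

I have generalized the problem to one with an arbitrary number of arrays of hashes, which actually simplifies the code.

Given data and generalization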

a = [
  { start_date: '01/10/2021', end_date: '31/10/2021',
    additional_attribute: 5 },
  { start_date: '01/11/2021', end_date: '30/11/2021',
    additional_attribute: 10 }
]

b = [
  { start_date: '01/10/2021', end_date: '31/12/2021',
    additional_attribute: 20 }
]

array = [a, b]
  #=> [[{:start_date=>"01/10/2021", :end_date=>"31/10/2021",
  #      :additional_attribute=>5},
  #     {:start_date=>"01/11/2021", :end_date=>"30/11/2021",
  #      :additional_attribute=>10}],
  #    [{:start_date=>"01/10/2021", :end_date=>"31/12/2021",
  #      :additional_attribute=>20}]]   

Create two helper methods

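require 'date'

# Parse a 'dd/mm/yyyy' string into a DateTime.
def date_str_to_date(date_str)
  DateTime.strptime(date_str, '%d/%m/%Y')
end

# Format a DateTime back into a 'dd/mm/yyyy' string.
def date_to_date_str(date_time)
  date_time.strftime('%d/%m/%Y')
end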

For example:

date_str_to_date('01/10/2021')
  #=> #<DateTime: 2021-10-01T00:00:00+00:00 ((2459489j,0s,0n),+0s,2299161j)>
date_to_date_str(date_str_to_date('31/12/2021'))
  #=> "31/12/2021"

See DateTime::strptime.
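Since only whole days matter here, the same two helpers could probably be written with Date instead of DateTime. The following is a minimal sketch under that assumption; the method names date_str_to_plain_date and plain_date_to_date_str are invented for illustration and are not used elsewhere in this answer.

```ruby
require 'date'

# Hypothetical Date-based variants of the two helpers above.
def date_str_to_plain_date(date_str)
  Date.strptime(date_str, '%d/%m/%Y')
end

def plain_date_to_date_str(date)
  date.strftime('%d/%m/%Y')
end

d = date_str_to_plain_date('31/12/2021')
d.class                    #=> Date
plain_date_to_date_str(d)  #=> "31/12/2021"
```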

Convert array to an array of hashes

arr =
  array.each_with_index.with_object([]) do |(ar,i),a2|
    ar.each do |h|
      a2 << h.merge(start_date: date_str_to_date(h[:start_date]),
                    end_date: date_str_to_date(h[:end_date]), idx: i)
    end
  end
  #=> [{:start_date=>#<DateTime: 2021-10-01T00:00:00+00:00 ((2459489j,0s,0n),+0s,2299161j)>,
  #     :end_date=>#<DateTime: 2021-10-31T00:00:00+00:00 ((2459519j,0s,0n),+0s,2299161j)>,
  #     :additional_attribute=>5, :idx=>0},
  #    {:start_date=>#<DateTime: 2021-11-01T00:00:00+00:00 ((2459520j,0s,0n),+0s,2299161j)>,
  #     :end_date=>#<DateTime: 2021-11-30T00:00:00+00:00 ((2459549j,0s,0n),+0s,2299161j)>,
  #     :additional_attribute=>10, :idx=>0},
  #    {:start_date=>#<DateTime: 2021-10-01T00:00:00+00:00 ((2459489j,0s,0n),+0s,2299161j)>,
  #     :end_date=>#<DateTime: 2021-12-31T00:00:00+00:00 ((2459580j,0s,0n),+0s,2299161j)>,
  #     :additional_attribute=>20, :idx=>1}]

Note that each hash in a and b is mapped to a hash in arr, with the hashes from a having :idx => 0 and those from b having :idx => 1.

Compute the earliest start date and the latest end date

start, finish = arr.flat_map { |h| [h[:start_date], h[:end_date]] }.minmax
  #=> [#<DateTime: 2021-10-01T00:00:00+00:00 ((2459489j,0s,0n),+0s,2299161j)>,
  #    #<DateTime: 2021-12-31T00:00:00+00:00 ((2459580j,0s,0n),+0s,2299161j)>]

So

start
  #=> #<DateTime: 2021-10-01T00:00:00+00:00 ((2459489j,0s,0n),+0s,2299161j)>
finish
  #=> #<DateTime: 2021-12-31T00:00:00+00:00 ((2459580j,0s,0n),+0s,2299161j)>

See Enumerable#flat_map and Array#minmax.

For each date between start and finish, construct the set of indices of the elements of arr whose ranges include that date

require 'set'
coverage_by_date = (start..finish).map do |date|
  [date,
   arr.each_index.select do |i|
    (arr[i][:start_date]..arr[i][:end_date]).cover?(date)
   end.to_set
  ]
end
  #=> [[#<DateTime: 2021-10-01T00:00:00+00:00 ((2459489j,0s,0n),+0s,2299161j)>, #<Set: {0, 2}>],
  #    [#<DateTime: 2021-10-02T00:00:00+00:00 ((2459490j,0s,0n),+0s,2299161j)>, #<Set: {0, 2}>],
  #    ...
  #    [#<DateTime: 2021-10-31T00:00:00+00:00 ((2459519j,0s,0n),+0s,2299161j)>, #<Set: {0, 2}>],
  #    [#<DateTime: 2021-11-01T00:00:00+00:00 ((2459520j,0s,0n),+0s,2299161j)>, #<Set: {1, 2}>],
  #    ...
  #    [#<DateTime: 2021-11-30T00:00:00+00:00 ((2459549j,0s,0n),+0s,2299161j)>, #<Set: {1, 2}>],
  #    [#<DateTime: 2021-12-01T00:00:00+00:00 ((2459550j,0s,0n),+0s,2299161j)>, #<Set: {2}>],
  #    ...
  #    [#<DateTime: 2021-12-31T00:00:00+00:00 ((2459580j,0s,0n),+0s,2299161j)>, #<Set: {2}>]]

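Finally, remove the dates that have empty coverage sets, slice the resulting array of dates into runs having equal coverage sets, and map each run to the desired hash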

coverage_by_date.reject { |_,set| set.empty? }
                .slice_when { |(_,set1),(_,set2)| set1 != set2 }
                .map do |ar|
                   attributes = ar.first
                                  .last
                                  .each_with_object(Hash.new(0)) do |i,h|
                                     g = arr[i] 
                                     h[g[:idx]] = g[:additional_attribute]
                                   end.values_at(*0..array.size-1)
                   { start_date: date_to_date_str(ar.first.first),
                     end_date: date_to_date_str(ar.last.first), 
                     attributes: attributes }                       
                 end
  #=> [{:start_date=>"01/10/2021", :end_date=>"31/10/2021", :attributes=>[5, 20]},
  #    {:start_date=>"01/11/2021", :end_date=>"30/11/2021", :attributes=>[10, 20]},
  #    {:start_date=>"01/12/2021", :end_date=>"31/12/2021", :attributes=>[0, 20]}]
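As a cross-check, the same coverage-and-slice steps can be applied to the question's shorter month-index example, with plain integer ranges standing in for dates. This is only a sketch; the names periods_a, periods_b, all_periods, and month_ranges are made up here for illustration.

```ruby
require 'set'

# Month indices from the question's second example (inclusive ranges).
periods_a = [1..2, 3..6, 7..9]
periods_b = [2..5, 6..9]
all_periods = periods_a + periods_b

lo, hi = all_periods.flat_map { |r| [r.begin, r.end] }.minmax

# For each month, the set of indices of the periods that cover it.
coverage = (lo..hi).map do |month|
  [month, all_periods.each_index.select { |i| all_periods[i].cover?(month) }.to_set]
end

# Start a new slice whenever the covering set changes.
month_ranges = coverage.reject { |_, set| set.empty? }
                       .slice_when { |(_, s1), (_, s2)| s1 != s2 }
                       .map { |slice| slice.first.first..slice.last.first }
  #=> [1..1, 2..2, 3..5, 6..6, 7..9]
```

This matches the expected result [1-1, 2-2, 3-5, 6-6, 7-9] given in the question.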

When (for example)

ar.first.last
  #=> #<Set: {0, 2}>

we find that

attributes
  #=> [5, 20]
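For the December slice the covering set is {2}, so only the entry for b is written; the 0 for a comes from the hash's default value. A tiny standalone sketch of that behavior (the variable h here exists only for illustration):

```ruby
# A hash whose default value is 0; only index 1 (array b) is assigned.
h = Hash.new(0)
h[1] = 20

h.values_at(0, 1)  #=> [0, 20]
h.key?(0)          #=> false (the default is returned, not stored)
```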

Note:

coverage_by_date.reject { |_,set| set.empty? }
                .slice_when { |(_,set1),(_,set2)| set1 != set2 }
                .map do |ar|
                   { start_date: date_to_date_str(ar.first.first),
                     end_date: date_to_date_str(ar.last.first),
                     set: ar.first.last }
                 end
  #=> [{:start_date=>"01/10/2021", :end_date=>"31/10/2021", :set=>#<Set: {0, 2}>},
  #    {:start_date=>"01/11/2021", :end_date=>"30/11/2021", :set=>#<Set: {1, 2}>},
  #    {:start_date=>"01/12/2021", :end_date=>"31/12/2021", :set=>#<Set: {2}>}]

See Enumerable#slice_when, the form of Hash::new that takes an argument (the default value) and no block, and Hash#values_at.
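Putting the steps together, and renaming the output keys to the additional_attribute_a / additional_attribute_b form the question asks for, might look like the following sketch. The method name sync_periods, the constant DATE_FORMAT, and the letter-suffix scheme are my own assumptions, and it uses Date rather than DateTime; treat it as a starting point rather than a finished implementation.

```ruby
require 'date'
require 'set'

DATE_FORMAT = '%d/%m/%Y'

# Hypothetical wrapper around the steps above; accepts any number of arrays.
def sync_periods(*arrays)
  # Flatten all hashes into one array, parsing dates and remembering the source array.
  arr = arrays.each_with_index.flat_map do |ar, i|
    ar.map do |h|
      { range: Date.strptime(h[:start_date], DATE_FORMAT)..Date.strptime(h[:end_date], DATE_FORMAT),
        value: h[:additional_attribute], idx: i }
    end
  end
  return [] if arr.empty?

  start, finish = arr.flat_map { |h| [h[:range].begin, h[:range].end] }.minmax

  (start..finish)
    .map { |date| [date, arr.each_index.select { |i| arr[i][:range].cover?(date) }.to_set] }
    .reject { |_, set| set.empty? }
    .slice_when { |(_, s1), (_, s2)| s1 != s2 }
    .map do |slice|
      result = { start_date: slice.first.first.strftime(DATE_FORMAT),
                 end_date: slice.last.first.strftime(DATE_FORMAT) }
      # Default every source array to 0, then fill in the ones covering this slice.
      arrays.each_index { |i| result[:"additional_attribute_#{('a'.ord + i).chr}"] = 0 }
      slice.first.last.each do |i|
        result[:"additional_attribute_#{('a'.ord + arr[i][:idx]).chr}"] = arr[i][:value]
      end
      result
    end
end

a = [
  { start_date: '01/10/2021', end_date: '31/10/2021', additional_attribute: 5 },
  { start_date: '01/11/2021', end_date: '30/11/2021', additional_attribute: 10 }
]
b = [
  { start_date: '01/10/2021', end_date: '31/12/2021', additional_attribute: 20 }
]

synced = sync_periods(a, b)
  #=> [{:start_date=>"01/10/2021", :end_date=>"31/10/2021",
  #     :additional_attribute_a=>5, :additional_attribute_b=>20},
  #    {:start_date=>"01/11/2021", :end_date=>"30/11/2021",
  #     :additional_attribute_a=>10, :additional_attribute_b=>20},
  #    {:start_date=>"01/12/2021", :end_date=>"31/12/2021",
  #     :additional_attribute_a=>0, :additional_attribute_b=>20}]
```

The letter suffix assumes at most 26 arrays. If two periods from the same array overlap, the later one wins for the overlapping dates, as in the steps above. Because every date between start and finish is visited, very long spans may be slow.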