最大间隔交点

Max interval intersection point

我正在尝试实现 python 中的逻辑。给定一组区间,找到具有最大交点数的区间。如果输入(1,6) (2,3) (4,11),那么(1,6)应该是returned。这已在下面得到解答,但我无法在 python 中实现它。 given-a-set-of-intervals-find-the-interval-which-has-the-maximum-number-of-inte.

到目前为止,我使用的是以下逻辑。任何帮助将不胜感激。

def interval_intersection(intervals):
    if len(intervals) ==1:
        return intervals
    intervals.sort(key=lambda x: x[0])
    
    res =[intervals[0]]
    for i in range(1,len(intervals)):
        if intervals[i][0] > res[-1][1]:
            res.append(intervals[i])
        else:
            res[-1] = [min(res[-1][0],intervals[i][0]),max(res[-1][1],intervals[i][1])]
    return res

Examples:

  1. [[1,5],[5,10],[5,5]] ans 应该是 [5,5] 在领带 [5,5] 的情况下,元素数量最少的区间。这里 [5,5] 在范围内只有 1 个元素,即 5 因此 ans
  2. [[1,2],[3,5]] 没有交集 return -1

这是 David Eisenstat's 算法的一个相当简单的实现。唯一的细微差别是:

  1. 我假设所有间隔在两端都是闭合的,这意味着如果排序事件同时发生,则排序事件应将开始放在结束之前。如果你想要完全打开的间隔,或者在右侧打开,这个顺序需要颠倒。
  2. 返回的区间具有最多的交叉点,最短的长度首先打破连接,然后是最早的开始。
def interval_solve(intervals: Sequence[Sequence[int]]) -> Union[Sequence[int], int]:
    start_type = -1  # Assumes all intervals are closed
    end_type = 1

    events = [(s, start_type, i) for i, (s, e) in enumerate(intervals)]
    events.extend((e, end_type, i) for i, (s, e) in enumerate(intervals))
    events.sort()

    inter_count: Dict[int, int] = {}

    start_count = 0
    stop_count = 0

    for event_time, event_type, event_id in events:
        if event_type == start_type:
            start_count += 1
            inter_count[event_id] = -(stop_count + 1)
        else:
            stop_count += 1
            inter_count[event_id] += start_count

    # Find max by most intersections, then by shortest interval, then by earliest start
    answer = max(range(len(intervals)),
                 key=lambda j: (inter_count[j], intervals[j][0] - intervals[j][1]))
    if inter_count[answer] == 0:
        return -1
    return intervals[answer]

实际的想法很简单,我们对间隔进行排序,并用索引和布尔键存储其中的一些,用于指示开始或结束事件。

然后,我们遍历它,同时统计索引之前的结束事件和开始事件。对于任何索引 i,间隔重叠计数很简单,之前的开始事件数 - (-1) 之前的结束事件数。

最后,如果有多个解决方案,我们可以只检查哪一个具有最小长度。

# 

def max_interval_count(intervals):
    interval_sorted = []
    for idx, interval in enumerate(intervals):
        s, e = interval
        interval_sorted.append([s, idx, 0]) # 0 for start
        interval_sorted.append([e, idx, 1]) # 1 for end
    interval_sorted.sort(key = lambda x: x[0])

    print(interval_sorted)

    number_of_starts = 0
    number_of_ends = 0

    overlap_count = {} 
    for event in interval_sorted:
        _, idx, start_end = event
        if start_end == 0: # start event
            # subtract all the ending before it
            overlap_count[idx] = - (number_of_ends)
            number_of_starts += 1
        else: # end event
            overlap_count[idx] += (number_of_starts - 1) # -1 as we should not include the start from the same interval
            number_of_ends += 1
    print(overlap_count)
    ans_idx = -1
    max_over_count = 0
    min_len_interval = 99999999999
    for idx, overl_cnt in overlap_count.items():
        if overl_cnt > max_over_count:
            ans_idx = idx
            max_over_count = overl_cnt
        elif overl_cnt == max_over_count and overl_cnt > 0 and (intervals[idx][1] - intervals[idx][0] + 1) < min_len_interval:
            min_len_interval = (intervals[idx][1] - intervals[idx][0] + 1)
            ans_idx = idx
    if ans_idx == -1:
        return ans_idx
    return intervals[ans_idx]


if __name__ == "__main__":
    test_1 = [[1,5],[5,10],[5,5]]
    test_2 = [[1,2],[3,5]]
    test_3 = [(1,6), (2,3), (4,11)]
    ans = max_interval_count(test_1)
    print(ans)
    print("---------")
    ans = max_interval_count(test_2)
    print(ans)
    print("---------")
    ans = max_interval_count(test_3)
    print(ans)
    print("---------")

[[1, 0, 0], [5, 0, 1], [5, 1, 0], [5, 2, 0], [5, 2, 1], [10, 1, 1]]
{0: 0, 1: 1, 2: 1}
[5, 5]
---------
[[1, 0, 0], [2, 0, 1], [3, 1, 0], [5, 1, 1]]
{0: 0, 1: 0}
-1
---------
[[1, 0, 0], [2, 1, 0], [3, 1, 1], [4, 2, 0], [6, 0, 1], [11, 2, 1]]
{0: 2, 1: 1, 2: 1}
(1, 6)
---------