如何获得一组不重叠的区间的笛卡尔积?
How can I get the cartesian product of a set of intervals with no overlapping?
给定一个包含一组区间的字典:
intervals = {'561801/03/08': [[1081, 1156], [1141, 1216], [1201, 1276], [1741, 1816], [1801, 1876], [1861, 1936], [1921, 1996], [1981, 2056], [2041, 2116]], '563301/03/08': [[1170, 1250], [1230, 1310], [1770, 1850], [1830, 1910], [1890, 1970], [1950, 2030], [2010, 2090], [2070, 2150], [2130, 2210]], '688002/03/08': [[1790, 1850], [1850, 1910], [1910, 1970], [1970, 2030], [2090, 2150], [2150, 2210], [2210, 2270], [2270, 2330], [2330, 2390], [2390, 2450], [2450, 2510], [2510, 2570], [2570, 2630], [2630, 2690], [2690, 2750]], '690102/03/08': [[1900, 1960], [1960, 2020], [2020, 2080], [2080, 2140], [2200, 2260], [2260, 2320], [2320, 2380], [2380, 2440], [2440, 2500], [2500, 2560], [2560, 2620], [2620, 2680], [2680, 2740]], '559402/03/08': [[2015, 2090], [2075, 2150], [2135, 2210], [2195, 2270], [2255, 2330], [2315, 2390], [2375, 2450], [2435, 2510], [2495, 2570], [2555, 2630], [2615, 2690], [2675, 2750]], '561302/03/08': [[2310, 2390], [2370, 2450], [2430, 2510], [2490, 2570], [2550, 2630], [2610, 2690], [2670, 2750]], '572602/03/08': [[2435, 2505], [2495, 2565], [2555, 2625], [2615, 2685], [2675, 2745]], '572502/03/08': [[2560, 2640], [2620, 2700]]}
可以使用以下方法获得笛卡尔积:
prod = itertools.product(*intervals)
这个笛卡尔积的大小是9915131275* 2 = 13,267,800
我希望通过不允许两个或多个域重叠的组合来减少它。这个组合可以:
[1081, 1156], [1170, 1250], [1790, 1850], [1900, 1960], [2015, 2090],
[2310, 2390], [2435, 2505], [2560, 2640] OK
这个组合不行
[1141, 1216], [1170, 1250], [1790, 1850], [1900, 1960], [2015, 2090],
[2310, 2390], [2435, 2505], [2560, 2640] not OK
以及任何以以下开头的组合:
[1141, 1216], [1170, 1250]
不应该考虑。这不包括 15131275*2 = 163,800 个组合
目的是显着减少笛卡尔积的大小,只有不重叠的区间。
提前致谢
我对这个问题的解释如下:从每个区间列表中恰好选择一个区间,并且只接受那些没有任何重叠的组合。
下面是 CPMPy (https://github.com/CPMpy/cpmpy) 中的约束规划模型。它使用 Element
约束从第 i
个间隔列表中选择选定的第 x[i]
个间隔。
基本约束是:
~( (starts[i] >= starts[j]) & (starts[i] <= ends[j]))
~( (starts[j] >= starts[i]) & (starts[j] <= ends[i]))
这确保第 i
个选定的间隔不会与第 j
个间隔重叠(反之亦然)。 (注:~
表示not
。)
from cpmpy import *
from cpmpy.solvers import *
from cpmpy_hakank import * # See http://hakank.org/cpmpy/cpmpy_hakank.py
def print_solution(a):
"""
Print the solution.
"""
# The selected intervals, as indices in each interval list
xval = a[0].value()
n = len(xval)
print(xval)
# The selected intervals, as intervals
sols = [intervals[i][xval[i]] for i in range(n)]
print(sols)
print(flush=True)
#
# Note: intervals is a list of list of intervals (not a dictionary)
#
def reduce_overlaps(intervals):
# Convert the list of intervals to a list of flattened lists
# for use with Element below.
intervals_flatten = []
for interval in intervals:
intervals_flatten.append(cpm_array(flatten_lists(interval)))
intervals_flatten = cpm_array(intervals_flatten)
# We need all values to create the domains of the selected interval
# values
all_values = flatten_lists(intervals_flatten)
max_val = max(all_values)
min_val = min(all_values)
n = len(intervals)
lens = [len(interval) for interval in intervals]
#
# Decision variables
#
model = Model()
# x[i] is the selected interval for the i'th interval list
x = intvar(0,max(lens),shape=n,name="x")
# Reduce the domain (the possible values) of each interval list
# (since they have different lengths)
for i in range(n):
model += [x[i] < lens[i]]
# starts[i] is the start value of the i'th selected interval
starts = intvar(min_val,max_val,shape=n,name="starts")
# ends[i] is the end value of the i'th selected interval
ends = intvar(min_val,max_val,shape=n,name="ends")
#
# Main constraints:
# - Pick exactly one of the intervals from each intervals list
# - Ensure that there are no overlaps between any of selected intervals.
#
# get the values of the selected intervals
for i in range(n):
# Use Element to obtain the start and end values of the selected
# interval. We have to use the following construct with Element
# since CPMPy does not (yet) support this syntax:
# starts[i] = intervals[x[i],0]
# ends[i] = intervals[x[i],1]
model += [starts[i] == Element(intervals_flatten[i],x[i]*2+0), # corresponds to: starts[i] = intervals[x[i],0]
ends[i] == Element(intervals_flatten[i],x[i]*2+1), # corresponds to: ends[i] = intervals[x[i],1]
]
# Ensure that the i'th selected interval don't overlap with
# the rest of the intervals (the j'th interval)
for i in range(n):
for j in range(i+1,n):
# Ensure that the start value of one interval is not inside the other interval
model += [~( (starts[i] >= starts[j]) & (starts[i] <= ends[j])),
~( (starts[j] >= starts[i]) & (starts[j] <= ends[i])) ]
# Print all solutions.
# This method is defined in http://hakank.org/cpmpy/cpmpy_hakank.py
# ortools_wrapper(model,[x],print_solution)
# Collect the solutions in an array
solutions = []
def get_solution(a):
xval = a[0].value()
# print(xval)
sol = [intervals[i][xval[i]] for i in range(n)]
# print("sol:",sol)
solutions.append(sol)
ortools_wrapper2(model,[x],get_solution)
return np.array(solutions)
intervals_dict = {
'561801/03/08': [[1081, 1156], [1141, 1216], [1201, 1276], [1741, 1816], [1801, 1876], [1861, 1936], [1921, 1996], [1981, 2056], [2041, 2116]],
'563301/03/08': [[1170, 1250], [1230, 1310], [1770, 1850], [1830, 1910], [1890, 1970], [1950, 2030], [2010, 2090], [2070, 2150], [2130, 2210]],
'688002/03/08': [[1790, 1850], [1850, 1910], [1910, 1970], [1970, 2030], [2090, 2150], [2150, 2210], [2210, 2270], [2270, 2330], [2330, 2390], [2390, 2450], [2450, 2510], [2510, 2570], [2570, 2630], [2630, 2690], [2690, 2750]],
'690102/03/08': [[1900, 1960], [1960, 2020], [2020, 2080], [2080, 2140], [2200, 2260], [2260, 2320], [2320, 2380], [2380, 2440], [2440, 2500], [2500, 2560], [2560, 2620], [2620, 2680], [2680, 2740]],
'559402/03/08': [[2015, 2090], [2075, 2150], [2135, 2210], [2195, 2270], [2255, 2330], [2315, 2390], [2375, 2450], [2435, 2510], [2495, 2570], [2555, 2630], [2615, 2690], [2675, 2750]],
'561302/03/08': [[2310, 2390], [2370, 2450], [2430, 2510], [2490, 2570], [2550, 2630], [2610, 2690], [2670, 2750]],
'572602/03/08': [[2435, 2505], [2495, 2565], [2555, 2625], [2615, 2685], [2675, 2745]],
'572502/03/08': [[2560, 2640], [2620, 2700]]
}
# Convert to a list of lists since this is needed for the output
intervals = [intervals_dict[a] for a in intervals_dict]
solutions = reduce_overlaps(intervals)
# print("Solutions:",solutions)
print("Num solutions:",len(solutions))
注意:该程序使用我的实用程序包 http://hakank.org/cpmpy/cpmpy_hakank.py .
该模型给出了 12201 个解决方案,显示了所选区间的索引以及区间。以下是其中一些解决方案:
sol #1
[7 7 2 4 5 2 4 0]
[[1981, 2056], [2070, 2150], [1910, 1970], [2200, 2260], [2315, 2390], [2430, 2510], [2675, 2745], [2560, 2640]]
sol #2
[7 7 0 4 5 2 4 0]
[[1981, 2056], [2070, 2150], [1790, 1850], [2200, 2260], [2315, 2390], [2430, 2510], [2675, 2745], [2560, 2640]]
sol #3
[7 7 1 4 5 2 4 0]
[[1981, 2056], [2070, 2150], [1850, 1910], [2200, 2260], [2315, 2390], [2430, 2510], [2675, 2745], [2560, 2640]]
....
sol #12200
[4 8 2 5 0 1 1 1]
[[1801, 1876], [2130, 2210], [1910, 1970], [2260, 2320], [2015, 2090], [2370, 2450], [2495, 2565], [2620, 2700]]
sol #12201
[6 8 1 5 0 1 1 1]
[[1921, 1996], [2130, 2210], [1850, 1910], [2260, 2320], [2015, 2090], [2370, 2450], [2495, 2565], [2620, 2700]]
ExitStatus.OPTIMAL (3.59288788 seconds)
Nr solutions: 12201
Num conflicts: 302
NumBranches: 135035
WallTime: 3.59288788
更新
这里有两个CP模型:
- CPMpy模型(与上面略有不同):http://hakank.org/cpmpy/reduce_overlapping_intervals_cp.py
- Picat 模型(我用它来制作问题原型):http://hakank.org/picat/reduce_overlapping_intervals.pi
更新 2
问题还有另一种解释:从每个区间列表中删除区间,使这些剩余区间列表的任意组合(从每个区间列表中取出一个区间)不重叠。并且我们要求每个区间列表至少保留一个区间。
这样说,那么一共有(根据我的Picat模型,见下)608599种不同的配置。
也许更有趣的是只使用最优解,即具有最大保持间隔数的配置。那么保持间隔的最佳数量是 15(同样根据我的 Picat 模型),并且有 170 个这样的配置。 (令我惊讶的是,保持间隔的最佳数量仅为 15 ,它是可能的 72 个间隔中的一小部分)。
以下是其中一些最佳解决方案(保留 15 个间隔):
interval = 1 = [[1081,1156],[1741,1816],[1801,1876],[1861,1936],[1921,1996],[1981,2056]]
interval = 2 = [[1170,1250],[1230,1310]]
interval = 3 = [[2150,2210],[2210,2270]]
interval = 4 = [[2080,2140]]
interval = 5 = [[2675,2750]]
interval = 6 = [[2310,2390]]
interval = 7 = [[2435,2505]]
interval = 8 = [[2560,2640]]
interval = 1 = [[1081,1156],[1141,1216],[1741,1816],[1801,1876]]
interval = 2 = [[1230,1310]]
interval = 3 = [[2150,2210],[2210,2270]]
interval = 4 = [[1900,1960],[1960,2020],[2020,2080],[2080,2140]]
interval = 5 = [[2435,2510]]
interval = 6 = [[2310,2390]]
interval = 7 = [[2675,2745]]
interval = 8 = [[2560,2640]]
interval = 1 = [[1081,1156],[1741,1816],[1801,1876]]
interval = 2 = [[1170,1250],[1230,1310]]
interval = 3 = [[2090,2150],[2150,2210],[2210,2270]]
interval = 4 = [[1900,1960],[1960,2020],[2020,2080]]
interval = 5 = [[2435,2510]]
interval = 6 = [[2310,2390]]
interval = 7 = [[2675,2745]]
interval = 8 = [[2560,2640]]
此方法的 Picat 模型在这里:http://hakank.org/picat/reduce_overlapping_interval2.pi。
给定一个包含一组区间的字典:
intervals = {'561801/03/08': [[1081, 1156], [1141, 1216], [1201, 1276], [1741, 1816], [1801, 1876], [1861, 1936], [1921, 1996], [1981, 2056], [2041, 2116]], '563301/03/08': [[1170, 1250], [1230, 1310], [1770, 1850], [1830, 1910], [1890, 1970], [1950, 2030], [2010, 2090], [2070, 2150], [2130, 2210]], '688002/03/08': [[1790, 1850], [1850, 1910], [1910, 1970], [1970, 2030], [2090, 2150], [2150, 2210], [2210, 2270], [2270, 2330], [2330, 2390], [2390, 2450], [2450, 2510], [2510, 2570], [2570, 2630], [2630, 2690], [2690, 2750]], '690102/03/08': [[1900, 1960], [1960, 2020], [2020, 2080], [2080, 2140], [2200, 2260], [2260, 2320], [2320, 2380], [2380, 2440], [2440, 2500], [2500, 2560], [2560, 2620], [2620, 2680], [2680, 2740]], '559402/03/08': [[2015, 2090], [2075, 2150], [2135, 2210], [2195, 2270], [2255, 2330], [2315, 2390], [2375, 2450], [2435, 2510], [2495, 2570], [2555, 2630], [2615, 2690], [2675, 2750]], '561302/03/08': [[2310, 2390], [2370, 2450], [2430, 2510], [2490, 2570], [2550, 2630], [2610, 2690], [2670, 2750]], '572602/03/08': [[2435, 2505], [2495, 2565], [2555, 2625], [2615, 2685], [2675, 2745]], '572502/03/08': [[2560, 2640], [2620, 2700]]}
可以使用以下方法获得笛卡尔积:
prod = itertools.product(*intervals)
这个笛卡尔积的大小是9915131275* 2 = 13,267,800
我希望通过不允许两个或多个域重叠的组合来减少它。这个组合可以:
[1081, 1156], [1170, 1250], [1790, 1850], [1900, 1960], [2015, 2090], [2310, 2390], [2435, 2505], [2560, 2640] OK
这个组合不行
[1141, 1216], [1170, 1250], [1790, 1850], [1900, 1960], [2015, 2090], [2310, 2390], [2435, 2505], [2560, 2640] not OK
以及任何以以下开头的组合:
[1141, 1216], [1170, 1250]
不应该考虑。这不包括 15131275*2 = 163,800 个组合 目的是显着减少笛卡尔积的大小,只有不重叠的区间。
提前致谢
我对这个问题的解释如下:从每个区间列表中恰好选择一个区间,并且只接受那些没有任何重叠的组合。
下面是 CPMPy (https://github.com/CPMpy/cpmpy) 中的约束规划模型。它使用 Element
约束从第 i
个间隔列表中选择选定的第 x[i]
个间隔。
基本约束是:
~( (starts[i] >= starts[j]) & (starts[i] <= ends[j]))
~( (starts[j] >= starts[i]) & (starts[j] <= ends[i]))
这确保第 i
个选定的间隔不会与第 j
个间隔重叠(反之亦然)。 (注:~
表示not
。)
from cpmpy import *
from cpmpy.solvers import *
from cpmpy_hakank import * # See http://hakank.org/cpmpy/cpmpy_hakank.py
def print_solution(a):
"""
Print the solution.
"""
# The selected intervals, as indices in each interval list
xval = a[0].value()
n = len(xval)
print(xval)
# The selected intervals, as intervals
sols = [intervals[i][xval[i]] for i in range(n)]
print(sols)
print(flush=True)
#
# Note: intervals is a list of list of intervals (not a dictionary)
#
def reduce_overlaps(intervals):
# Convert the list of intervals to a list of flattened lists
# for use with Element below.
intervals_flatten = []
for interval in intervals:
intervals_flatten.append(cpm_array(flatten_lists(interval)))
intervals_flatten = cpm_array(intervals_flatten)
# We need all values to create the domains of the selected interval
# values
all_values = flatten_lists(intervals_flatten)
max_val = max(all_values)
min_val = min(all_values)
n = len(intervals)
lens = [len(interval) for interval in intervals]
#
# Decision variables
#
model = Model()
# x[i] is the selected interval for the i'th interval list
x = intvar(0,max(lens),shape=n,name="x")
# Reduce the domain (the possible values) of each interval list
# (since they have different lengths)
for i in range(n):
model += [x[i] < lens[i]]
# starts[i] is the start value of the i'th selected interval
starts = intvar(min_val,max_val,shape=n,name="starts")
# ends[i] is the end value of the i'th selected interval
ends = intvar(min_val,max_val,shape=n,name="ends")
#
# Main constraints:
# - Pick exactly one of the intervals from each intervals list
# - Ensure that there are no overlaps between any of selected intervals.
#
# get the values of the selected intervals
for i in range(n):
# Use Element to obtain the start and end values of the selected
# interval. We have to use the following construct with Element
# since CPMPy does not (yet) support this syntax:
# starts[i] = intervals[x[i],0]
# ends[i] = intervals[x[i],1]
model += [starts[i] == Element(intervals_flatten[i],x[i]*2+0), # corresponds to: starts[i] = intervals[x[i],0]
ends[i] == Element(intervals_flatten[i],x[i]*2+1), # corresponds to: ends[i] = intervals[x[i],1]
]
# Ensure that the i'th selected interval don't overlap with
# the rest of the intervals (the j'th interval)
for i in range(n):
for j in range(i+1,n):
# Ensure that the start value of one interval is not inside the other interval
model += [~( (starts[i] >= starts[j]) & (starts[i] <= ends[j])),
~( (starts[j] >= starts[i]) & (starts[j] <= ends[i])) ]
# Print all solutions.
# This method is defined in http://hakank.org/cpmpy/cpmpy_hakank.py
# ortools_wrapper(model,[x],print_solution)
# Collect the solutions in an array
solutions = []
def get_solution(a):
xval = a[0].value()
# print(xval)
sol = [intervals[i][xval[i]] for i in range(n)]
# print("sol:",sol)
solutions.append(sol)
ortools_wrapper2(model,[x],get_solution)
return np.array(solutions)
intervals_dict = {
'561801/03/08': [[1081, 1156], [1141, 1216], [1201, 1276], [1741, 1816], [1801, 1876], [1861, 1936], [1921, 1996], [1981, 2056], [2041, 2116]],
'563301/03/08': [[1170, 1250], [1230, 1310], [1770, 1850], [1830, 1910], [1890, 1970], [1950, 2030], [2010, 2090], [2070, 2150], [2130, 2210]],
'688002/03/08': [[1790, 1850], [1850, 1910], [1910, 1970], [1970, 2030], [2090, 2150], [2150, 2210], [2210, 2270], [2270, 2330], [2330, 2390], [2390, 2450], [2450, 2510], [2510, 2570], [2570, 2630], [2630, 2690], [2690, 2750]],
'690102/03/08': [[1900, 1960], [1960, 2020], [2020, 2080], [2080, 2140], [2200, 2260], [2260, 2320], [2320, 2380], [2380, 2440], [2440, 2500], [2500, 2560], [2560, 2620], [2620, 2680], [2680, 2740]],
'559402/03/08': [[2015, 2090], [2075, 2150], [2135, 2210], [2195, 2270], [2255, 2330], [2315, 2390], [2375, 2450], [2435, 2510], [2495, 2570], [2555, 2630], [2615, 2690], [2675, 2750]],
'561302/03/08': [[2310, 2390], [2370, 2450], [2430, 2510], [2490, 2570], [2550, 2630], [2610, 2690], [2670, 2750]],
'572602/03/08': [[2435, 2505], [2495, 2565], [2555, 2625], [2615, 2685], [2675, 2745]],
'572502/03/08': [[2560, 2640], [2620, 2700]]
}
# Convert to a list of lists since this is needed for the output
intervals = [intervals_dict[a] for a in intervals_dict]
solutions = reduce_overlaps(intervals)
# print("Solutions:",solutions)
print("Num solutions:",len(solutions))
注意:该程序使用我的实用程序包 http://hakank.org/cpmpy/cpmpy_hakank.py .
该模型给出了 12201 个解决方案,显示了所选区间的索引以及区间。以下是其中一些解决方案:
sol #1
[7 7 2 4 5 2 4 0]
[[1981, 2056], [2070, 2150], [1910, 1970], [2200, 2260], [2315, 2390], [2430, 2510], [2675, 2745], [2560, 2640]]
sol #2
[7 7 0 4 5 2 4 0]
[[1981, 2056], [2070, 2150], [1790, 1850], [2200, 2260], [2315, 2390], [2430, 2510], [2675, 2745], [2560, 2640]]
sol #3
[7 7 1 4 5 2 4 0]
[[1981, 2056], [2070, 2150], [1850, 1910], [2200, 2260], [2315, 2390], [2430, 2510], [2675, 2745], [2560, 2640]]
....
sol #12200
[4 8 2 5 0 1 1 1]
[[1801, 1876], [2130, 2210], [1910, 1970], [2260, 2320], [2015, 2090], [2370, 2450], [2495, 2565], [2620, 2700]]
sol #12201
[6 8 1 5 0 1 1 1]
[[1921, 1996], [2130, 2210], [1850, 1910], [2260, 2320], [2015, 2090], [2370, 2450], [2495, 2565], [2620, 2700]]
ExitStatus.OPTIMAL (3.59288788 seconds)
Nr solutions: 12201
Num conflicts: 302
NumBranches: 135035
WallTime: 3.59288788
更新
这里有两个CP模型:
- CPMpy模型(与上面略有不同):http://hakank.org/cpmpy/reduce_overlapping_intervals_cp.py
- Picat 模型(我用它来制作问题原型):http://hakank.org/picat/reduce_overlapping_intervals.pi
更新 2
问题还有另一种解释:从每个区间列表中删除区间,使这些剩余区间列表的任意组合(从每个区间列表中取出一个区间)不重叠。并且我们要求每个区间列表至少保留一个区间。
这样说,那么一共有(根据我的Picat模型,见下)608599种不同的配置。
也许更有趣的是只使用最优解,即具有最大保持间隔数的配置。那么保持间隔的最佳数量是 15(同样根据我的 Picat 模型),并且有 170 个这样的配置。 (令我惊讶的是,保持间隔的最佳数量仅为 15 ,它是可能的 72 个间隔中的一小部分)。
以下是其中一些最佳解决方案(保留 15 个间隔):
interval = 1 = [[1081,1156],[1741,1816],[1801,1876],[1861,1936],[1921,1996],[1981,2056]]
interval = 2 = [[1170,1250],[1230,1310]]
interval = 3 = [[2150,2210],[2210,2270]]
interval = 4 = [[2080,2140]]
interval = 5 = [[2675,2750]]
interval = 6 = [[2310,2390]]
interval = 7 = [[2435,2505]]
interval = 8 = [[2560,2640]]
interval = 1 = [[1081,1156],[1141,1216],[1741,1816],[1801,1876]]
interval = 2 = [[1230,1310]]
interval = 3 = [[2150,2210],[2210,2270]]
interval = 4 = [[1900,1960],[1960,2020],[2020,2080],[2080,2140]]
interval = 5 = [[2435,2510]]
interval = 6 = [[2310,2390]]
interval = 7 = [[2675,2745]]
interval = 8 = [[2560,2640]]
interval = 1 = [[1081,1156],[1741,1816],[1801,1876]]
interval = 2 = [[1170,1250],[1230,1310]]
interval = 3 = [[2090,2150],[2150,2210],[2210,2270]]
interval = 4 = [[1900,1960],[1960,2020],[2020,2080]]
interval = 5 = [[2435,2510]]
interval = 6 = [[2310,2390]]
interval = 7 = [[2675,2745]]
interval = 8 = [[2560,2640]]
此方法的 Picat 模型在这里:http://hakank.org/picat/reduce_overlapping_interval2.pi。