Efficient way to generate partitions with value pairs for a list of more than 15 elements
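I am generating a list of partitions from a list of elements (that is, partitions of a set, or set partitions). For each of these partitions I need to assign a random number indicating its value, so that I can later run some computations on the output data, which consists of partition = value pairs.
An example would be a CSV with entries such as: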
p,v
"[[1, 2, 3, 4]]",0.3999960625186746
"[[1], [2, 3, 4]]",0.49159520559753156
"[[1, 2], [3, 4]]",0.12658202037597555
"[[1, 3, 4], [2]]",0.11670775560336522
"[[1], [2], [3, 4]]",0.006059031164368345
Here is the code I put together for this:
from collections import defaultdict
import random
import csv

partitions = []
elements = input('Please specify number of elements: ')
size = int(elements)
fileheader = str(size)

# simple menu
if size == 1:
    partitionlist = range(1,size+1)
    print ('A one element list have 1 partition')
elif size < 28:
    partitionlist = range(1,size+1)
elif size >= 28:
    partitionlist = [0]
    print ("Invalid number. Try again...")

# generate all partitions
def partition(elements):
    if len(elements) == 1:
        yield [ elements ]
        return
    first = elements[0]
    for smaller in partition(elements[1:]):
        # insert `first` in each of the subpartition's subsets
        for n, subset in enumerate(smaller):
            yield smaller[:n] + [[ first ] + subset] + smaller[n+1:]
        # put `first` in its own subset
        yield [ [ first ] ] + smaller

for p in partition(partitionlist):
    partitions.append([sorted(p)] + [random.uniform(0,1)])

# write the generated input to CSV file
data = partitions
def partition_value_data(size):
    with open( size+'-elem-normaldist.csv','w') as out:
        csv_out=csv.writer(out)
        csv_out.writerow(['p','v'])
        for row in data:
            csv_out.writerow(row)

partition_value_data(fileheader)
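The problem I am facing is that once the number of elements goes above 13, I get a memory error. Is that because of my computer's memory, or is it a limitation of Python itself? I am using Python 2.7.12.
For a list of 15 elements, the number of partitions is 1382958545.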
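I am trying to generate partitions for lists of up to 30 elements. For 27 elements, the largest size the script above accepts, the number of partitions is 545717047947902329359.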
Any suggestions are much appreciated. Thanks.
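Your problem is that you combine a generator with converting its output into a list, which completely negates any benefit of creating a generator.
Instead, you should write out each row directly from the generator.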
import random
import csv

elements = input('Please specify number of elements: ')
size = int(elements)
fileheader = str(size)

# simple menu
if size == 1:
    partitionlist = list(range(1, size + 1))
    print('A one element list has 1 partition')
elif size < 28:
    # list(...) keeps the slicing and concatenation below working on Python 3 too
    partitionlist = list(range(1, size + 1))
elif size >= 28:
    partitionlist = [0]
    print("Invalid number. Try again...")

# generate all partitions lazily, one at a time
def partition(elements):
    if len(elements) == 1:
        yield [elements]
        return
    first = elements[0]
    for smaller in partition(elements[1:]):
        # insert `first` in each of the subpartition's subsets
        for n, subset in enumerate(smaller):
            yield smaller[:n] + [[first] + subset] + smaller[n + 1:]
        # put `first` in its own subset
        yield [[first]] + smaller

def partition_value_data(size):
    with open(size + '-elem-normaldist.csv', 'w') as out:
        csv_out = csv.writer(out)
        csv_out.writerow(['p', 'v'])
        # consume the generator row by row; no list of partitions is built up
        for row in partition(partitionlist):
            csv_out.writerow([sorted(row)] + [random.uniform(0, 1)])

partition_value_data(fileheader)
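Writing from the generator keeps memory use flat, but the number of rows is unchanged, so it may be worth trying the pipeline on a capped number of rows before committing to a full run. Here is a sketch of that idea, written for Python 3. The helper `write_partitions` and its `limit` parameter are hypothetical names introduced for illustration; `itertools.islice` is used to stop the generator early.

```python
import csv
import io
import itertools
import random


def partition(elements):
    """Lazily yield every set partition of a non-empty list."""
    if len(elements) == 1:
        yield [elements]
        return
    first = elements[0]
    for smaller in partition(elements[1:]):
        # put `first` into each existing subset in turn
        for n, subset in enumerate(smaller):
            yield smaller[:n] + [[first] + subset] + smaller[n + 1:]
        # put `first` into a subset of its own
        yield [[first]] + smaller


def write_partitions(out, size, limit=None):
    """Write up to `limit` (partition, value) rows to the file-like `out`.

    `limit=None` means no cap. Returns the number of data rows written.
    """
    writer = csv.writer(out)
    writer.writerow(['p', 'v'])
    # islice pulls at most `limit` items from the generator, one at a time
    rows = itertools.islice(partition(list(range(1, size + 1))), limit)
    written = 0
    for p in rows:
        writer.writerow([sorted(p), random.uniform(0, 1)])
        written += 1
    return written


# Small demonstration: write to an in-memory buffer instead of a real file.
demo = io.StringIO()
demo_rows = write_partitions(demo, 4)
```

With four elements this writes all 15 partitions. Passing, say, `limit=1000` with `size=15` stops after the first thousand rows, since only the partitions actually requested are ever generated.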