Generating a random sample list from a population that has no elements in common with another list

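I want to sample elements from a list so that none of the sampled elements appears in another given list. I want to keep generating new samples until I get one that is disjoint from that list. The code below is what I came up with, but it fails whenever the initial sample intersects the other list: it enters an infinite loop, and the printed output shows that every generated sample is the same.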

import random 
unique_entities=['100','1001','10001','100001','11111']
pde_fin= ['2151', '2146', '2153', '2135', '2158', '2160', '2137', '2169', '2147', '2015', '2022', '2173', '2028', '2014', '2018', '2009', '1140', '1085', '1136', '1132', '1007', '1080', '1078', '1131', '1106', '1164', '1092', '1108', '1118', '1045', '1051', '1006','1001']
random_entities=random.sample(unique_entities,3) #choses 5 unique entities 
while(not(set(random_entities).isdisjoint(pde_fin))):
       random_entites=random.sample(unique_entities,5)
       print(random_entities,"random_entites")

print(unique_entities)

Can you help me see what is going wrong?

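You can filter unique_entities before sampling. Mathematically, filtering before sampling and rejecting samples afterwards give the same distribution of results.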

import random

unique_entities=['100','1001','10001','100001','11111']
pde_fin= ['2151', '2146', '2153', '2135', '2158', '2160', '2137', '2169', '2147', '2015', '2022', '2173', '2028', '2014', '2018', '2009', '1140', '1085', '1136', '1132', '1007', '1080', '1078', '1131', '1106', '1164', '1092', '1108', '1118', '1045', '1051', '1006','1001']
# keep only the entities that do not appear in pde_fin
unique_entities_unique = [i for i in unique_entities if i not in pde_fin]
# any sample drawn from the filtered list is disjoint from pde_fin
random_entities=random.sample(unique_entities_unique,3)
print(random_entities,"random_entities")
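
The filter-then-sample idea can be wrapped in a small helper. This is only a sketch: the function name `sample_excluding` and its parameters are made up for illustration, and the `pde_fin` list is shortened here. It converts the excluded list to a set for faster membership tests and raises an error when too few eligible elements remain.

```python
import random

def sample_excluding(population, excluded, k):
    """Return k random items from population, none of which is in excluded.

    Hypothetical helper, not part of the random module.
    """
    excluded_set = set(excluded)  # average O(1) membership tests
    candidates = [item for item in population if item not in excluded_set]
    if k > len(candidates):
        raise ValueError(f"only {len(candidates)} eligible items, cannot sample {k}")
    return random.sample(candidates, k)

unique_entities = ['100', '1001', '10001', '100001', '11111']
pde_fin = ['2151', '1006', '1001']  # shortened for brevity
print(sample_excluding(unique_entities, pde_fin, 3))
```

With the data from the question, four elements remain after filtering, so any sample size up to 4 works and larger sizes raise ValueError.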

The line random_entites=random.sample(unique_entities,5) has two problems:

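  • First, there is a typo: you wrote random_entites instead of random_entities, so the variable tested by the while condition is never updated.
  • Second, you draw a sample of 5 elements from unique_entities, which contains only 5 elements in total. The sample therefore always contains '1001', which is also an element of pde_fin, so it can never be disjoint.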
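
The second point can be checked directly. The short sketch below (variable names `draws` and `always_contains_1001` are only illustrative) draws many full-size samples and confirms that each one contains '1001', because sampling as many items as the population holds just returns the whole population in random order.

```python
import random

unique_entities = ['100', '1001', '10001', '100001', '11111']

# Sampling len(population) items without replacement returns a shuffled copy
# of the whole population, so every draw must contain '1001'.
draws = [random.sample(unique_entities, 5) for _ in range(1000)]
always_contains_1001 = all('1001' in draw for draw in draws)
print(always_contains_1001)  # True
```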

Here is a working version of the program, with a few other adjustments:

import random

unique_entities = ['100', '1001', '10001', '100001', '11111']
pde_fin = ['2151', '2146', '2153', '2135', '2158', '2160', '2137', '2169', '2147', '2015', '2022', '2173', '2028',
           '2014', '2018', '2009', '1140', '1085', '1136', '1132', '1007', '1080', '1078', '1131', '1106', '1164',
           '1092', '1108', '1118', '1045', '1051', '1006', '1001']

sample_size = 3

random_entities = set(random.sample(unique_entities, sample_size))
print(f"{random_entities=}")
while not random_entities.isdisjoint(pde_fin):
    random_entities = set(random.sample(unique_entities, sample_size))
    print(f"{random_entities=}")

print(f"Result: {random_entities}")
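
One caveat with the retry loop above: if sample_size is larger than the number of elements not in pde_fin, no sample can ever be disjoint and the loop never ends. A possible safeguard, sketched here with a hypothetical function name and an arbitrary `max_tries` limit, is to cap the number of attempts.

```python
import random

def sample_until_disjoint(population, excluded, k, max_tries=1000):
    """Redraw samples of size k until one shares no element with excluded.

    Hypothetical helper; max_tries is an arbitrary cap chosen for illustration.
    """
    excluded_set = set(excluded)
    for _ in range(max_tries):
        candidate = set(random.sample(population, k))
        if candidate.isdisjoint(excluded_set):
            return candidate
    raise RuntimeError(f"no disjoint sample found in {max_tries} tries")

unique_entities = ['100', '1001', '10001', '100001', '11111']
pde_fin = ['2151', '1006', '1001']  # shortened for brevity
print(sample_until_disjoint(unique_entities, pde_fin, 3))
```

With a sample size of 5 this raises RuntimeError instead of looping forever.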