Birthday paradox python - incorrect probability output

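I'm having trouble coding the birthday paradox in Python. The birthday paradox says that in a class of 23 people, the probability that at least two of them share a birthday is about 50%.

I tried to code this paradox in Python, but it keeps coming out with a probability close to 25%. I'm new to Python, so no doubt there is a simple solution to this problem. Here is my code: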
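For reference, the 50% figure can be checked exactly before simulating anything. The following is a minimal sketch, assuming 365 equally likely birthdays and no leap years; the function name `exact_shared_birthday_probability` is made up for illustration and is not part of the code in this thread.

```python
def exact_shared_birthday_probability(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for k in range(people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


print(round(exact_shared_birthday_probability(23), 4))  # 0.5073
```

A simulation over many classes of 23 should therefore land near 50.7%, so a result near 25% points to a bug rather than to random noise.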

import random


def random_birthdays():
    bdays = []
    bdays = [random.randint(1, 365) for i in range(23)]
    bdays.sort()
    for x in bdays:
        while x < len(bdays)-1:
            if bdays[x] == bdays[x+1]:
                print(bdays[x])
                return True
            x+=1
        return False

count = 0
for i in range(1000):
    if random_birthdays() == True:
        count = count + 1


print('In a sample of 1000 classes each with 23 pupils, there were', count, 'classes with individuals with the same birthday')

This line is wrong:

for x in bdays:

It should be

for x in range(len(bdays)):

because you need to iterate over the indices of the birthdays, not the birthdays themselves.
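To see why this matters: in the original loop `x` is a birthday value between 1 and 365, so `while x < len(bdays)-1` only runs when the smallest birthday happens to be below 22, and even then the comparison starts partway through the list. Here is a rough sketch of the index-based check written as a single loop. The helper names `has_adjacent_duplicate` and `random_birthdays_by_index` are hypothetical and only for illustration.

```python
import random


def has_adjacent_duplicate(sorted_values):
    """Return True if any two neighbouring items in a sorted list are equal."""
    for i in range(len(sorted_values) - 1):  # i is an index, not a birthday
        if sorted_values[i] == sorted_values[i + 1]:
            return True
    return False


def random_birthdays_by_index(pupils=23):
    """Simulate one class and report whether any birthday repeats."""
    bdays = sorted(random.randint(1, 365) for _ in range(pupils))
    return has_adjacent_duplicate(bdays)
```

Sorting first means equal birthdays end up next to each other, so comparing each element with its neighbour is enough.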

There is also an optimization:

count = 0
for i in range(1000):
    if random_birthdays() == True:
        count = count + 1

can be replaced with

count = sum(random_birthdays() for _ in range(1000))

Also, your function should be implemented like this:

import random

def random_birthdays(pupils):
    bdays = [random.randint(1, 365) for _ in range(pupils)]
    return pupils > len(set(bdays))

This eliminates many sources of error.

It can be called as @Zefick pointed out:

count = sum(random_birthdays(23) for _ in range(1000))
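
With only 1000 classes the estimate can wander by a percentage point or two between runs. As a hedged sketch of the same set-based idea with more trials and a fixed seed for repeatability, something like the following could be used. The function name `estimate_shared_birthday_rate`, the trial count, and the seed are all arbitrary choices for illustration.

```python
import random


def estimate_shared_birthday_rate(pupils=23, trials=20_000, seed=0):
    """Fraction of simulated classes in which at least two pupils share a birthday."""
    rng = random.Random(seed)  # private generator so the run is repeatable
    hits = 0
    for _ in range(trials):
        bdays = [rng.randint(1, 365) for _ in range(pupils)]
        if len(set(bdays)) < pupils:  # a set drops duplicates
            hits += 1
    return hits / trials


print(estimate_shared_birthday_rate())
```

The printed fraction should sit close to one half for 23 pupils.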

import math

def find(p):
    # Approximate number of people needed for a shared-birthday probability of p
    return math.ceil(math.sqrt(2 * 365 * math.log(1 / (1 - p))))

print(find(0.25))  # prints 15
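
The formula above follows from the common approximation P ≈ 1 - exp(-n²/(2·365)), solved for n. As a quick sanity check, the sketch below compares it against the exact smallest group size, assuming 365 equally likely birthdays. The names `approx_group_size` and `exact_group_size` are hypothetical.

```python
import math


def approx_group_size(p, days=365):
    """Group size from the approximation P ~ 1 - exp(-n^2 / (2 * days))."""
    return math.ceil(math.sqrt(2 * days * math.log(1 / (1 - p))))


def exact_group_size(p, days=365):
    """Smallest group whose exact shared-birthday probability is at least p."""
    p_all_distinct = 1.0
    people = 0
    while 1.0 - p_all_distinct < p:
        # Adding one more person who must avoid all birthdays taken so far.
        p_all_distinct *= (days - people) / days
        people += 1
    return people


for target in (0.25, 0.5):
    print(target, approx_group_size(target), exact_group_size(target))
```

For these two targets the approximation and the exact count agree: about 15 people for 25% and 23 people for 50%.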

This is how I wrote it.

# Check the probability of a repeated birthday in a class of 23 students (the birthday paradox)
import random as r

def check_date(students):
    date = []
    count = 0

    for i in range(students):  # generate a random birthday for each student
        date += [r.randint(1, 365)]  # build the full sample list of birthdays

    for day in date:  # check whether this birthday appears anywhere else
        if date.count(day) >= 2:  # list.count keeps this simple
            count += 1

    return count  # number of students who share a birthday with someone else

def simulations(s, students):
    result = []  # per-simulation match counts
    simulation_match = 0

    for i in range(s):
        matches = check_date(students)  # run each simulated class exactly once
        result += [matches]

        if matches > 1:  # at least 2 students share a birthday in this simulation
            simulation_match += 1

    return simulation_match, s, int(simulation_match / s * 100), '%'

simulations(1000, 23)  # 1000 simulations with a sample size of 23 students

OUT: (494, 1000, 49, '%') **the match count and percentage vary from run to run with the random integers generated**