Implementing a probability function involving heavy combinatorics in Python
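This is about an answer to a question I asked on Math Stack Exchange.
I would like to convert that question/solution into Python, but I am having trouble interpreting all of the notation used there.
I realize this post is a bit too "gimme the code" to be a great question, but my aim in asking is to understand the math involved. I can't follow all of the mathematical notation used there at once, but I read Python well enough to conceptualize the answer once I see it in code.
The problem can be set up like this: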
import numpy as np

bag = np.hstack((
    np.repeat(0, 80),
    np.repeat(1, 21),
    np.repeat(3, 5),
    np.repeat(7, 1),
))
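Read as code, the setup amounts to drawing items from this bag without replacement and asking how the sum of the drawn values is distributed. The following is only a rough Monte Carlo sketch of that reading, not part of the original post: the draw size of 10 is an assumption taken from the answer's code further down, and `trials`, `draws`, and the seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the estimate is repeatable

# Same contents as the bag above: 80 zeros, 21 ones, 5 threes, 1 seven.
bag = np.repeat([0, 1, 3, 7], [80, 21, 5, 1])

draws = 10       # assumed draw size (taken from the answer's code)
trials = 50_000  # illustrative number of simulated draws

# In each trial, draw 10 items without replacement and record their sum.
sums = np.array([
    rng.choice(bag, size=draws, replace=False).sum()
    for _ in range(trials)
])

print("estimated P(sum == 6):", np.mean(sums == 6))
print("estimated P(sum >= 6):", np.mean(sums >= 6))
```

An estimate like this only converges slowly, so it is mainly useful as a sanity check against an exact calculation.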
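I'm not sure if this is exactly what you are after, but this is how I would calculate, for example, the probability of getting a sum == 6.
It is more practical than mathematical and only addresses this particular problem, so I'm not sure it will help you understand the math.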
import itertools
from collections import Counter
from math import factorial

import numpy as np
import pandas as pd

bag = np.hstack((
    np.repeat(0, 80),
    np.repeat(1, 21),
    np.repeat(3, 5),
    np.repeat(7, 1),
))

# Ordered draws of 10 items from 107:
# 107*106*105*104*103*102*101*100*99*98 = 127506499163211168000
# That is far too many to enumerate, so reduce the number of items to sample
# from without changing which combinations of 10 are possible.
reduced_bag = np.hstack((
    np.repeat(0, 10),  # 0 can be chosen all 10 times
    np.repeat(1, 10),  # 1 can be chosen all 10 times
    np.repeat(3, 5),   # 3 can be chosen up to 5 times
    np.repeat(7, 1),   # 7 can be chosen once
))

# Sorted list of the unique combinations (there are 96).
# Enumerating every 10-item combination of the reduced bag takes a while.
unique_combinations = sorted(set(itertools.combinations(reduced_bag.tolist(), 10)))
number_unique_combinations = len(unique_combinations)

# Sum of each unique combination
sums_list = [sum(uc) for uc in unique_combinations]

# Probability of each unique combination
probability_dict = {0: 80, 1: 21, 3: 5, 7: 1}  # how many of each value the full bag holds
probability_list = []
for x in unique_combinations:
    p = 1              # probability of drawing x in one specific order
    n = 107            # start with a full bag for each combination
    drawn = Counter()  # how many of each value have been drawn so far
    for i in x:
        i_left = probability_dict[i] - drawn[i]  # number of that value left in the bag
        p *= i_left / n
        n -= 1         # no replacement
        drawn[i] += 1
    # Multiply by the number of distinct orderings of x. This multinomial
    # coefficient gives the same number as
    # len(set(itertools.permutations(x, 10))), which is VERY slow to run.
    orderings = factorial(10)
    for c in Counter(x).values():
        orderings //= factorial(c)
    p *= orderings
    probability_list.append(p)

# sum(probability_list) is 1 up to floating-point rounding error
# (the original run printed 1.0000000000000002).

# Put the combinations into a dataframe with named columns. Building it from
# a dict avoids stacking sequences of different shapes into one NumPy array.
df = pd.DataFrame({
    "combination": unique_combinations,
    "sum": sums_list,
    "probability": probability_list,
})

# Probability that sum is >= 6
df.loc[df["sum"] >= 6, "probability"].sum()
# 0.24139909236232826

# Probability that sum is == 6
df.loc[df["sum"] == 6, "probability"].sum()
# 0.06756408790812335
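For the math behind those numbers, one way to read the calculation is as a multivariate hypergeometric probability: drawing k_v items of each value v (with the k_v adding up to 10) has probability C(80, k_0) * C(21, k_1) * C(5, k_3) * C(1, k_7) / C(107, 10). The sketch below is an alternative to the enumeration above, not the method used in this answer, and it assumes the same draw size of 10. The names `counts`, `draws`, and `sum_distribution` are made up for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

counts = {0: 80, 1: 21, 3: 5, 7: 1}  # value -> how many items with that value are in the bag
draws = 10                           # assumed draw size, as in the code above

values = list(counts)
total_ways = comb(sum(counts.values()), draws)  # C(107, 10) unordered draws

sum_distribution = {}  # sum of drawn values -> exact probability
n_compositions = 0     # how many distinct (k_0, k_1, k_3, k_7) tuples are possible

# Try every way of choosing how many items of each value are drawn.
for picked in product(*(range(min(counts[v], draws) + 1) for v in values)):
    if sum(picked) != draws:
        continue  # not a draw of exactly 10 items
    n_compositions += 1
    ways = 1
    for v, k in zip(values, picked):
        ways *= comb(counts[v], k)  # ways to pick k of the counts[v] items with value v
    s = sum(v * k for v, k in zip(values, picked))
    sum_distribution[s] = sum_distribution.get(s, 0) + Fraction(ways, total_ways)

p_eq_6 = float(sum_distribution[6])
p_ge_6 = float(sum(p for s, p in sum_distribution.items() if s >= 6))
print(n_compositions, p_eq_6, p_ge_6)
```

Because it uses `Fraction`, the probabilities add up to exactly 1, and the loop only visits a few thousand candidate tuples instead of enumerating combinations or permutations.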