How to insert a condition when defining a function

I have a function that combines 3 distributions. The fractions of the 3 distributions are assumed to sum to 1.

First, I simulated 3 data sets and plotted a histogram of each one independently:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

s1 = np.random.lognormal(2.0, 0.4, size = (20000, 1))
s2 = np.random.lognormal(1.2, 0.2, size = (20000, 1))
s3 = np.random.lognormal(1.5, 0.4, size = (20000, 1))

mb = np.max([s1,s2,s3])
X = np.arange(1,mb,0.1)
 #histogram population 1
Y11, bins1 = np.histogram(s1, X)
Y1 = Y11/Y11.sum()
X1 = bins1[:-1]

 #histogram population 2
Y22, bins2 = np.histogram(s2, X)
Y2 = Y22/Y22.sum()
X2 = bins2[:-1]

 #histogram population 3
Y33, bins3 = np.histogram(s3, X)
Y3 = Y33/Y33.sum()
X3 = bins3[:-1]

Then I concatenated the 3 data sets into one and made a histogram of the combined data:

 #all mixed populations
S =  np.concatenate((s1, s2, s3), axis=0)
Yi, bins = np.histogram(S, X)
Y = Yi/Yi.sum() #Data is normalized so that the bin fractions sum to 1
X = bins[:-1]
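
A side note on the normalization above: dividing by `Yi.sum()` makes the bin fractions sum to 1, which is not the same thing as an area of 1 under the curve. Here is a minimal sketch of the difference, using NumPy's `density=True` option; the seed, the sample data and the variable names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for illustration only
samples = rng.lognormal(1.5, 0.4, size=20000)
edges = np.arange(1, samples.max(), 0.1)
widths = np.diff(edges)

counts, _ = np.histogram(samples, edges)
fractions = counts / counts.sum()                        # bin fractions, sum to 1
density, _ = np.histogram(samples, edges, density=True)  # area under the curve is 1

print(fractions.sum())             # sum of the bin fractions
print((density * widths).sum())    # area under the density histogram
print((fractions * widths).sum())  # area under the fraction histogram
```

A probability density such as the lognormal pdf is comparable to `density` (equivalently `fractions / widths`) rather than to `fractions` directly.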

I defined a function describing the 3 distributions (sorry for my ugly code):

def logN(x, mu1, mu2, mu3, sigma1, sigma2, sigma3, P1, P2, P3 ):
    P1 = 1 - P2 -P3 #Here I define that the sum of three fractions is one
      
    return  P1*(np.exp(-(np.log(x) - mu1)**2 / (2 * sigma1 **2)) / (x * sigma1 * np.sqrt(2 * np.pi)))+ P2*(np.exp(-(np.log(x) - mu2)**2 / (2 * sigma2 **2)) / (x * sigma2 * np.sqrt(2 * np.pi)))+ P3*(np.exp(-(np.log(x) - mu3)**2 / (2 * sigma3 **2)) / (x * sigma3 * np.sqrt(2 * np.pi)))#lognormal function

params, pcov = curve_fit(logN, X,Y, method="trf", bounds=((0,0,0,0,0,0,0,0,0),(2, 2, 2, np.inf, np.inf,np.inf,1,1,1)), p0=(1,1,1,0.5,0.5,0.5,0.3,0.3,0.3), maxfev=4000)
print(params)

The fit seems to work (at least graphically):

x = np.arange(0, mb, 0.1)

plt.figure(figsize=(10, 6))  #size of graph
plt.plot(X1, Y1, 'o', alpha=0.2)
plt.plot(X2, Y2, 'o', alpha=0.2)
plt.plot(X3, Y3, 'o', alpha=0.2)
plt.plot(X, Y, 'r', linewidth=2)
plt.plot(X, logN(X ,params[0], params[1],params[2], params[3], params[4], params[5], params[6], params[7], params[8]),'b', linewidth=2) 
plt.xlim([-5, mb+5])
plt.ylim([0, 0.08])

But the problem appears when I look at the fraction of each distribution (the sum is supposed to be 1):

params[8]+params[7]+params[6]
Out[155]: 0.4285989056828722

It seems the function ignores my condition P1 = 1 - P2 - P3. Can someone help me see what is wrong with my approach?
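
The symptom can be reproduced without any fitting. In the minimal sketch below, `mixture_weights` is a hypothetical helper that mirrors only the first line of `logN`: whatever value is passed for `P1` is discarded, so an optimizer that varies it sees no change in the model.

```python
def mixture_weights(P1, P2, P3):
    # Same reassignment as in logN: the incoming P1 is overwritten here
    P1 = 1 - P2 - P3
    return P1, P2, P3

# Two very different values passed for P1 give identical results
low = mixture_weights(0.0, 0.2, 0.3)
high = mixture_weights(0.9, 0.2, 0.3)
print(low == high)  # → True
```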

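The P1 argument is overwritten on the first line of the function, so the value curve_fit returns for it is never used by the model. If you drop that unused impostor argument, your code looks like this: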

# the unused argument P1 is gone; the real P1 is still computed inside the function
def logN(x, mu1, mu2, mu3, sigma1, sigma2, sigma3, P2, P3 ):
    P1 = 1 - P2 -P3 #Here I define that the sum of three fractions is one
      
    return  P1*(np.exp(-(np.log(x) - mu1)**2 / (2 * sigma1 **2)) / (x * sigma1 * np.sqrt(2 * np.pi)))+ P2*(np.exp(-(np.log(x) - mu2)**2 / (2 * sigma2 **2)) / (x * sigma2 * np.sqrt(2 * np.pi)))+ P3*(np.exp(-(np.log(x) - mu3)**2 / (2 * sigma3 **2)) / (x * sigma3 * np.sqrt(2 * np.pi)))#lognormal function

# You have one less parameter to fit
params, pcov = curve_fit(logN, X,Y, method="trf",
                         bounds=((0, 0, 0, 0,      0,     0,     0,  0),
                                 (2, 2, 2, np.inf, np.inf,np.inf,1,  1)),
                         p0=     (1, 1, 1, 0.5,    0.5,   0.5,   0.3,0.3),
                         maxfev=4000)

# yada plota yada

# *params in the function call is a simpler way to write params[0], params[1], ..., params[7]
plt.plot(X, logN(X , *params),'b', linewidth=2) 

# if you want to know the real P1
P2 = params[6]
P3 = params[7]
P1 = 1-P2-P3

Try it; it should still work just as before.
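
Since the question apologizes for the long return expression, one possible tidy-up is sketched below. The helper name `lognormal_pdf` and the example parameter values are assumptions for illustration, not part of the original code. `logN` keeps the same eight fit parameters, so it should be usable with the `curve_fit` call above unchanged.

```python
import numpy as np

def lognormal_pdf(x, mu, sigma):
    """Lognormal density with log-scale mean mu and log-scale std sigma."""
    return np.exp(-(np.log(x) - mu) ** 2 / (2 * sigma ** 2)) / (x * sigma * np.sqrt(2 * np.pi))

def logN(x, mu1, mu2, mu3, sigma1, sigma2, sigma3, P2, P3):
    P1 = 1 - P2 - P3  # the three fractions sum to 1 by construction
    return (P1 * lognormal_pdf(x, mu1, sigma1)
            + P2 * lognormal_pdf(x, mu2, sigma2)
            + P3 * lognormal_pdf(x, mu3, sigma3))

# Example evaluation with made-up parameter values
x = np.arange(1, 20, 0.1)
y = logN(x, 2.0, 1.2, 1.5, 0.4, 0.2, 0.4, 0.3, 0.3)
```

The helper only factors out the repeated expression; the value returned by `logN` is the same as in the one-line version.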