Using coefficients from logistic regression to write a function in Python
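I have just fitted a logistic regression. The data can be downloaded from the link below:
Please click this link to download the data
Below is the code for the logistic regression.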
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import roc_auc_score
import pandas as pd
scaler = StandardScaler()
data = pd.read_csv('data.csv')
dataX = data.drop('outcome',axis =1).values.astype(float)
X = scaler.fit_transform(dataX)
# Use a 1-D target array; a column vector triggers a DataConversionWarning in fit()
Y = data['outcome'].values
X_train,X_test,y_train,y_test = train_test_split (X,Y,test_size = 0.25, random_state = 33)
lr = LogisticRegression()
lr.fit(X_train,y_train)
# Predict the probability of the testing samples to belong to 0 or 1 class
predicted_probs = lr.predict_proba(X_test)
print(predicted_probs[0:3])
print(lr.coef_)
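I can print the logistic regression coefficients, and I can compute the probability of the outcome being 1 or 0.
However, when I write a Python function using these coefficients and compute the probability of outcome 1, I do not get the same answer as from lr.predict_proba(X_test).
The function I wrote is as follows: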
def xG(bodyPart,shotQuality,defPressure,numDefPlayers,numAttPlayers,shotdist,angle,chanceRating,type):
    coeff = [0.09786083,2.30523761, -0.05875112,0.07905136,
             -0.1663424 ,-0.73930942,-0.10385882,0.98845481,0.13175622]
    return (coeff[0]*bodyPart+ coeff[1]*shotQuality+coeff[2]*defPressure+coeff[3]*numDefPlayers+coeff[4]*numAttPlayers+coeff[5]*shotdist+ coeff[6]*angle+coeff[7]*chanceRating+coeff[8]*type)
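I get strange answers. I know the function's calculation is wrong.
I am new to machine learning and statistics and would appreciate your advice.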
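I think you missed the intercept_ in xG. You can retrieve it from lr.intercept_, and it should be added to the weighted sum, which is then passed through the logistic function in the final formula: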
    # Requires `import math` and intercept = lr.intercept_[0]
    return 1/(1+math.exp(-(intercept + coeff[0]*bodyPart+ coeff[1]*shotQuality+coeff[2]*defPressure+coeff[3]*numDefPlayers+coeff[4]*numAttPlayers+coeff[5]*shotdist+ coeff[6]*angle+coeff[7]*chanceRating+coeff[8]*type)))
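One more thing to watch: the model in the question was fitted on features transformed by StandardScaler, so the coefficients apply to standardized values, not raw ones. A raw input therefore has to be scaled the same way before the formula is applied. Below is a minimal sketch of the whole calculation, assuming you pass in the values from lr.coef_[0], lr.intercept_[0], scaler.mean_ and scaler.scale_. The function name predict_probability and the numbers in the usage lines are hypothetical and only for illustration; they are not taken from the question's data.

```python
import math

def predict_probability(raw_features, coeffs, intercept, means, scales):
    """Probability of class 1 for one sample of raw (unscaled) features."""
    # Start from the intercept, then add each weighted, standardized feature.
    z = intercept
    for x, w, mu, sigma in zip(raw_features, coeffs, means, scales):
        # Same transform StandardScaler applies: (x - mean) / scale
        z += w * (x - mu) / sigma
    # The logistic (sigmoid) function maps the linear score to a probability.
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical two-feature model, for illustration only.
coeffs = [0.5, -1.0]      # would come from lr.coef_[0]
intercept = 0.0           # would come from lr.intercept_[0]
means = [1.0, 2.0]        # would come from scaler.mean_
scales = [2.0, 4.0]       # would come from scaler.scale_

p_at_mean = predict_probability([1.0, 2.0], coeffs, intercept, means, scales)
p_shifted = predict_probability([3.0, 2.0], coeffs, intercept, means, scales)
print(p_at_mean, p_shifted)
```

With the real fitted values plugged in, the result for a given row should agree with the second column of lr.predict_proba, up to floating-point rounding. If you already have a scaled row such as X_test[0], skip the standardization step and apply only the intercept, coefficients and sigmoid.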