Not getting correct contour plot of coefficients from my Logistic Regression implementation?
I implemented logistic regression and used it on a data set. (This is an exercise from Week 3 of Coursera's ML course, which is normally done in MATLAB and Octave; I used Python, so this is not cheating.)

I started with the implementation in sklearn to classify the data set used in week three of this course (http://pastie.org/10872959). Here is a small, reproducible example that anyone can try (it depends only on numpy and sklearn).

It takes the data set, splits it into a feature matrix and an output matrix, and then constructs 26 more features from the original 2 (i.e. the products x1^(i-j) * x2^j up to degree 6). I then use logistic regression from sklearn, but this does not give the desired contour plot (see below).
from sklearn.linear_model import LogisticRegression as expit
import numpy as np

def thetaFunc(y, theta, x):
    # evaluate the weighted sum theta[k] * x^(i-j) * y^j over all degree combinations
    deg = 6
    spot = 0
    sum = 0
    for i in range(1, deg + 1):
        for j in range(i + 1):
            sum += theta[spot] * x**(i - j) * y**(j)
            spot += 1
    return sum

def constructVariations(X, deg):
    # build the 27 polynomial features x1^(i-j) * x2^j for 1 <= i <= deg, 0 <= j <= i
    features = np.zeros((len(X), 27))
    spot = 0
    for i in range(1, deg + 1):
        for j in range(i + 1):
            features[:, spot] = X[:, 0]**(i - j) * X[:, 1]**(j)
            spot += 1
    return features

if __name__ == '__main__':
    data = np.loadtxt("ex2points.txt", delimiter=",")
    X, Y = np.split(data, [len(data[0, :]) - 1], 1)
    X = constructVariations(X, 6)
    oneArray = np.ones((len(X), 1))
    X = np.hstack((oneArray, X))   # prepend a bias column of ones
    trial = expit(solver='sag')
    trial = trial.fit(X=X, y=np.ravel(Y))
    print(trial.coef_)
    # everything below has been edited in
    from matplotlib import pyplot as plt
    txt = open("RegLogTheta", "r").read()
    txt = txt.split()
    theta = np.array(txt, float)
    x = np.linspace(-1, 1.5, 100)
    y = np.linspace(-1, 1.5, 100)
    z = np.empty((100, 100))
    xx, yy = np.meshgrid(x, y)
    for i in range(len(x)):
        for j in range(len(y)):
            z[i][j] = thetaFunc(yy[i][j], theta, xx[i][j])
    plt.contour(xx, yy, z, levels=[0])
    plt.show()
Here are the coefficients it produces for the generic feature terms (http://pastie.org/10872957), and the contour they generate:
One potential source of error is that I'm misinterpreting the 7 x 4 coefficient matrix stored in trial._coeff. I believe the 28 values are the coefficients of the 28 "variations" above, and I have tried mapping the coefficients to the variations both column-wise and row-wise. By column-wise I mean that [:][0] maps to the first 7 variations, [:][1] to the next 7, and so on; my function constructVariations shows how the variations are systematically created. Now, the API maintains that an array of shape (n_classes, n_features) is stored in trial._coeff, so should I infer that fit classified the data into four classes? Or have I run through this problem poorly in another way?
UPDATE

There is definitely something wrong with my interpretation (and/or use) of the weights. Instead of relying on sklearn's built-in prediction, I tried to compute the x and y values that set the logistic hypothesis

1 / (1 + exp(-theta^T x))

equal to 1/2. The values of theta are those found from printing trial._coeff, and x and y are scalars. Those x, y pairs are then plotted to give the contour.

The code I used (but did not originally add) attempts to do this. What is wrong with the math behind it?
One potential source of error is that I'm misinterpreting the 7 X 4 matrix coefficient matrix stored in trial._coeff
This matrix is not 7 x 4, it is 1 x 28 (check print(trial.coef_.shape)). There is one coefficient for each of your 28 features (constructVariations returns 27, and you add 1 manually).
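A minimal sanity-check sketch (assuming trial is the fitted model from the code above) that shows there is a single row of 28 weights plus one separate intercept:

# inspect the fitted model's parameter shapes
print(trial.coef_.shape)       # (1, 28): one row, one weight per feature
print(trial.intercept_.shape)  # (1,): one bias term, stored separately from coef_
print(trial.classes_)          # the two class labels inferred from Y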
so should I infer that fit classified the data into four classes?
No, you misread the array: it has just one row (and for binary classification, having two rows would make no sense).
Or have I run through this problem poorly in another way?
The code is fine; the interpretation is not. In particular, look at the actual decision boundary of your model (drawn by calling predict and plotting the contour):
from sklearn.linear_model import LogisticRegression as expit
import numpy as np

def constructVariations(X, deg):
    features = np.zeros((len(X), 27))
    spot = 0
    for i in range(1, deg + 1):
        for j in range(i + 1):
            features[:, spot] = X[:, 0]**(i - j) * X[:, 1]**(j)
            spot += 1
    return features

if __name__ == '__main__':
    data = np.loadtxt("ex2points.txt", delimiter=",")
    X, Y = np.split(data, [len(data[0, :]) - 1], 1)
    rawX = np.copy(X)   # keep the original 2 features for plotting
    X = constructVariations(X, 6)
    oneArray = np.ones((len(X), 1))
    X = np.hstack((oneArray, X))
    trial = expit(solver='sag')
    trial = trial.fit(X=X, y=np.ravel(Y))
    print(trial.coef_)

    from matplotlib import pyplot as plt
    h = 0.01
    x_min, x_max = rawX[:, 0].min() - 1, rawX[:, 0].max() + 1
    y_min, y_max = rawX[:, 1].min() - 1, rawX[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    # expand every grid point into the same 28-feature representation used for training
    data = constructVariations(np.c_[xx.ravel(), yy.ravel()], 6)
    oneArray = np.ones((len(data), 1))
    data = np.hstack((oneArray, data))
    Z = trial.predict(data)
    Z = Z.reshape(xx.shape)
    plt.figure()
    plt.scatter(rawX[:, 0], rawX[:, 1], c=Y, linewidth=0, s=50)
    plt.contourf(xx, yy, Z, cmap=plt.cm.Paired, alpha=0.8)
    plt.show()
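The same boundary can also be drawn from the predicted probabilities instead of the hard labels; a small sketch (reusing xx, yy, and data from the block above), where the 0.5 level set of predict_proba coincides with the boundary of predict:

# probability of the positive class at every grid point
P = trial.predict_proba(data)[:, 1].reshape(xx.shape)
plt.contour(xx, yy, P, levels=[0.5])  # same boundary as the predict-based plot
plt.show()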
UPDATE

In the code provided you forgot (in the visualization) that you added a column of "1"s to your data representation, thus your thetas are "off" by one, since theta[0] is the bias, theta[1] relates to your 0'th variable, and so on:
def thetaFunc(y, theta, x):
    deg = 6
    spot = 0
    sum = theta[spot]   # theta[0] is the coefficient of the manually added bias column
    spot += 1
    for i in range(1, deg + 1):
        for j in range(i + 1):
            sum += theta[spot] * x**(i - j) * y**(j)
            spot += 1
    return sum
You also forgot about the intercept term of the logistic regression itself, thus:
xx, yy = np.meshgrid(x, y)
for i in range(len(x)):
    for j in range(len(y)):
        z[i][j] = thetaFunc(yy[i][j], theta, xx[i][j])
z -= trial.intercept_
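To spell out the algebra (a sketch in the notation above, where $\theta_0$ is the weight of the manually added ones column and $b$ = trial.intercept_ is sklearn's separate bias), the model's linear term and sigmoid are

$$t(x, y) = \theta_0 + \sum_{i=1}^{6}\sum_{j=0}^{i} \theta_{k(i,j)}\, x^{\,i-j} y^{\,j} + b, \qquad \sigma(t) = \frac{1}{1 + e^{-t}},$$

and $\sigma(t) = \tfrac{1}{2}$ exactly when $t(x, y) = 0$, so both $\theta_0$ and $b$ must be included when drawing the level set. Here $k(i,j)$ is the running index spot from constructVariations, shifted by one for the bias column.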
(image generated using your fixed code)
import numpy as np
from sklearn.linear_model import LogisticRegression as expit

def thetaFunc(y, theta, x):
    deg = 6
    spot = 0
    sum = theta[spot]   # bias column coefficient
    spot += 1
    for i in range(1, deg + 1):
        for j in range(i + 1):
            sum += theta[spot] * x**(i - j) * y**(j)
            spot += 1
    return np.exp(-sum)

def constructVariations(X, deg):
    features = np.zeros((len(X), 27))
    spot = 0
    for i in range(1, deg + 1):
        for j in range(i + 1):
            features[:, spot] = X[:, 0]**(i - j) * X[:, 1]**(j)
            spot += 1
    return features

if __name__ == '__main__':
    data = np.loadtxt("ex2points.txt", delimiter=",")
    X, Y = np.split(data, [len(data[0, :]) - 1], 1)
    X = constructVariations(X, 6)
    rawX = np.copy(X)   # first two columns are the original x and y
    oneArray = np.ones((len(X), 1))
    X = np.hstack((oneArray, X))
    trial = expit(solver='sag')
    trial = trial.fit(X=X, y=np.ravel(Y))

    from matplotlib import pyplot as plt
    theta = trial.coef_.ravel()
    x = np.linspace(-1, 1.5, 100)
    y = np.linspace(-1, 1.5, 100)
    z = np.empty((100, 100))
    xx, yy = np.meshgrid(x, y)
    for i in range(len(x)):
        for j in range(len(y)):
            z[i][j] = thetaFunc(yy[i][j], theta, xx[i][j])
    z -= trial.intercept_
    plt.contour(xx, yy, z > 1, cmap=plt.cm.Paired, alpha=0.8)
    plt.scatter(rawX[:, 0], rawX[:, 1], c=Y, linewidth=0, s=50)
    plt.show()
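For completeness, a slightly cleaner equivalent (a sketch reusing trial, constructVariations, rawX, and Y from the script above): LogisticRegression.decision_function returns theta^T x + intercept directly, so the boundary is simply its zero level set and no manual dot product is needed.

# build the same 28-feature representation on a grid and contour the zero level set
gx = np.linspace(-1, 1.5, 100)
gy = np.linspace(-1, 1.5, 100)
gxx, gyy = np.meshgrid(gx, gy)
grid = constructVariations(np.c_[gxx.ravel(), gyy.ravel()], 6)
grid = np.hstack((np.ones((len(grid), 1)), grid))
Z = trial.decision_function(grid).reshape(gxx.shape)  # theta . x + intercept
plt.contour(gxx, gyy, Z, levels=[0])
plt.scatter(rawX[:, 0], rawX[:, 1], c=Y, linewidth=0, s=50)
plt.show()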