What is the issue with this basic linear regression?
I'm currently trying to run a very basic linear regression on some test data points in a Jupyter notebook. My code is below. If you run it, you can see that the fitted line definitely moves toward where it should be, but at some point it stops improving and I'm not sure why. Can anyone help me?
[Plots: starting weights, final weights, loss]
import matplotlib.pyplot as plt
import numpy as np
%matplotlib notebook
plt.style = "ggplot"
y = np.array([30,70,90,120,150,160,190,220])
x = np.arange(2,len(y)+2)
N = len(y)
weights = np.array([0.2,0.2])
x_ticks = np.array([[1,x*0.1] for x in range(100)])
y_hat = []
for j in range(len(x_ticks)):
    y_hat.append(np.dot(weights, x_ticks[j]))

plt.figure()
plt.scatter(x, y, color="red")
plt.plot(y_hat)
def plot_model(x, y, weights, loss):
    x_ticks = np.array([[1,x*0.1] for x in range(100)])
    y_hat = []
    for j in range(len(x_ticks)):
        y_hat.append(np.dot(weights, x_ticks[j]))
    plt.figure()
    plt.scatter(x, y, color="red")
    plt.plot(y_hat)

    plt.figure()
    plt.plot(loss)
def calculate_grad(weights, N, x_proc, y, loss):
    residuals = np.sum(y.reshape(N,1) - weights*x_proc, 1)
    loss.append(sum(residuals**2)/2)
    #print(residuals, x_proc)
    return -np.dot(residuals, x_proc)

def adjust_weights(weights, grad, learning_rate):
    weights -= learning_rate*grad
    return weights
learning_rate = 0.006
epochs = 2000
loss = []
x_processed = np.array([[1,i] for i in x])
for j in range(epochs):
    grad = calculate_grad(weights, N, x_processed, y, loss)
    weights = adjust_weights(weights, grad, learning_rate)
    if j % 200 == 0:
        print(weights, grad)

plot_model(x, y, weights, loss)
There are a few issues.
First, let's talk about how you are trying to find the parameters. You have a few problems with the matrix and vector multiplications. I like to picture the weights and y as column vectors.
From there, you know we need to take the dot product of your processed x matrix with the column vector of weights. That is step 1.
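To make the shapes concrete, here is a quick sketch (just an illustration with the same data, not your code) of that first matrix-vector product:

import numpy as np

y = np.array([[30,70,90,120,150,160,190,220]]).T    # column vector, shape (8, 1)
x = np.arange(2, len(y)+2)
x_proc = np.array([[1, i] for i in x])               # design matrix, shape (8, 2)
weights = np.array([0.2, 0.2]).reshape(2, 1)         # column vector, shape (2, 1)

y_hat = np.dot(x_proc, weights)                      # predictions, shape (8, 1)
print(x_proc.shape, weights.shape, y_hat.shape)      # (8, 2) (2, 1) (8, 1)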
Now, remember your chain rule! Your gradient is on the right track, but you need to remember to multiply x_proc * residuals by (-2/n), where n is the number of observations you have!
Here is the code:
y = np.array([[30,70,90,120,150,160,190,220]]).T
x = np.arange(2,len(y)+2)
N = len(y)
weights = np.array([0.2,0.2])
def calculate_grad(weights, N, x_proc, y, loss):
    y_hat = np.dot(x_proc, weights.reshape(2,1))
    residuals = y - y_hat
    gradient = (-2/float(len(x_proc)))*sum(x_proc * residuals)
    return gradient

def adjust_weights(weights, grad, learning_rate):
    weights -= learning_rate*grad
    return weights
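If you ever want to double-check a gradient like this, you can compare it against a finite-difference estimate of the mean squared error. A quick sketch, assuming the calculate_grad above and the y, x, N and weights defined just before it:

def mse_loss(w, x_proc, y):
    # mean squared error for a given weight vector
    y_hat = np.dot(x_proc, w.reshape(2,1))
    return float(np.mean((y - y_hat)**2))

x_proc = np.array([[1, i] for i in x])
analytic = calculate_grad(weights, N, x_proc, y, [])

eps = 1e-6
numeric = np.zeros(2)
for k in range(2):
    w_plus, w_minus = weights.copy(), weights.copy()
    w_plus[k] += eps
    w_minus[k] -= eps
    numeric[k] = (mse_loss(w_plus, x_proc, y) - mse_loss(w_minus, x_proc, y)) / (2*eps)

print(analytic, numeric)    # the two should match to several decimal places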
Now for the plotting problem.
There is no need to step x by 0.1. You should simply use x_proc, just as you did when finding the weights. Like this:
def plot_model(x, y, weights, loss):
    y_hat = []
    for j in x:
        y_hat.append(np.dot(weights, [1, j]))
    plt.figure()
    plt.scatter(x, y, color="red")
    plt.plot(x, y_hat)
    plt.show()
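As a side note, you can draw the same line without the Python loop by reusing the design matrix. A quick sketch, assuming x, y and the trained weights from above:

x_proc = np.array([[1, i] for i in x])
y_hat = np.dot(x_proc, weights)      # all predictions at once, shape (8,)
plt.figure()
plt.scatter(x, y, color="red")
plt.plot(x, y_hat)
plt.show()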
And tada! After 2000 iterations, the weights you get are [-12.80036278 25.75042317], which is very close to the actual solution of [-13.33333 25.833333].
Here is the working code:
import numpy as np
import matplotlib.pyplot as plt
y = np.array([[30,70,90,120,150,160,190,220]]).T
x = np.arange(2,len(y)+2)
N = len(y)
weights = np.array([0.2,0.2])
def plot_model(x, y, weights, loss):
    y_hat = []
    for j in x:
        y_hat.append(np.dot(weights, [1, j]))
    plt.figure()
    plt.scatter(x, y, color="red")
    plt.plot(x, y_hat)
    plt.show()

def calculate_grad(weights, N, x_proc, y, loss):
    y_hat = np.dot(x_proc, weights.reshape(2,1))
    residuals = y - y_hat
    gradient = (-2/float(len(x_proc)))*sum(x_proc * residuals)
    return gradient

def adjust_weights(weights, grad, learning_rate):
    weights -= learning_rate*grad
    return weights
learning_rate = 0.006
epochs = 2000
loss = []
x_processed = np.array([[1,i] for i in x])
for j in range(epochs):
    grad = calculate_grad(weights, N, x_processed, y, loss)
    weights = adjust_weights(weights, grad, learning_rate)

plot_model(x, y, weights, loss)
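One thing to note: in this version the loss list is passed around but never filled, so there is nothing to plot for a loss curve. If you still want that diagnostic from your original code, one option is to track the mean squared error inside the training loop. A quick sketch, assuming the functions and data defined above:

weights = np.array([0.2,0.2])    # reset before re-training
loss = []
for j in range(epochs):
    grad = calculate_grad(weights, N, x_processed, y, loss)
    weights = adjust_weights(weights, grad, learning_rate)
    y_hat = np.dot(x_processed, weights.reshape(2,1))
    loss.append(float(np.mean((y - y_hat)**2)))   # MSE after this epoch

plt.figure()
plt.plot(loss)                                     # should fall quickly, then flatten out
plt.show()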
I would like to explain how I approached this problem.
I started with pen and paper. Getting the exact numerical values is not so important, but you do have to understand the order of operations (for example, multiplying the matrix x by the column vector of weights, and not the other way around). Your code confused me because I was not sure what mental model you had for the order of operations.
After that, writing the code is straightforward.
If you want to check that your solution is correct, you can use the closed-form solution to the least-squares problem (when one exists ;)): beta = inverse(X.T * X) * (X.T * y). In your case:
y = np.array([[30,70,90,120,150,160,190,220]]).T
x = np.arange(2,len(y)+2)
x = np.matrix([[1, x] for x in x])
beta_pt1 = np.linalg.inv(x.T*x)
beta_pt2 = x.T*y
beta = beta_pt1*beta_pt2
print(beta)
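Alternatively, NumPy's built-in least-squares solver gives the same coefficients without forming the inverse explicitly. A quick sketch with the same data:

X = np.array([[1, i] for i in np.arange(2, len(y)+2)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # solves min ||X*beta - y||^2
print(beta)                                    # roughly [[-13.333], [25.833]]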