Gradient Descent implementation in Python?

I tried to implement gradient descent. It works correctly when I test it on a sample dataset, but it doesn't work on the Boston dataset.

Can you check what is wrong with the code, and why I am not getting the correct theta vector?

import numpy as np
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split

X = load_boston().data
y = load_boston().target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
X_train1 = np.c_[np.ones((len(X_train), 1)), X_train]
X_test1 = np.c_[np.ones((len(X_test), 1)), X_test]

eta = 0.0001
n_iterations = 100
m = len(X_train1)
tol = 0.00001

theta = np.random.randn(14, 1)

for i in range(n_iterations):
    gradients = 2/m * X_train1.T.dot(X_train1.dot(theta) - y_train)
    if np.linalg.norm(X_train1) < tol:
        break
    theta = theta - (eta * gradients)

I get a weight vector of shape (14, 354). What am I doing wrong here?

Consider this (with some statements expanded for better visibility):

for i in range(n_iterations):
    y_hat = X_train1.dot(theta)
    error = y_hat - y_train[:, None]
    gradients = 2/m * X_train1.T.dot(error)

    if np.linalg.norm(gradients) < tol:  # check the gradient norm; norm(X_train1) is constant and never triggers
        break
    theta = theta - (eta * gradients)

Because y_hat is (n_samples, 1) while y_train is (n_samples,), where n_samples is 354 here, you need the dummy-axis trick y_train[:, None] to bring y_train to the same dimensionality.
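To see the shape problem concretely, here is a minimal sketch with shapes chosen to match the 354 training samples in the question:

```python
import numpy as np

y_hat = np.zeros((354, 1))   # 2-D column, like X_train1.dot(theta)
y_train = np.zeros(354)      # 1-D, as returned by train_test_split

# Without the dummy axis, broadcasting pairs (354, 1) with (354,)
# and produces a full (354, 354) matrix instead of elementwise errors:
print((y_hat - y_train).shape)           # (354, 354)

# With the dummy axis the shapes line up and subtraction is elementwise:
print((y_hat - y_train[:, None]).shape)  # (354, 1)
```

That (354, 354) error matrix is what propagates through X_train1.T.dot(...) and produces the (14, 354) theta from the question.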

y_train here is a one-dimensional NumPy array (ndim=1), while X_train1.dot(theta) is a two-dimensional NumPy array (ndim=2). When you do the subtraction, broadcasting expands y_train against the other operand, turning the (354, 1) minus (354,) subtraction into a (354, 354) result. To fix this, convert y_train into a two-dimensional array as well, which you can do via y_train.reshape(-1, 1).
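As a quick sanity check, reshape(-1, 1) and the dummy-axis trick from the other answer produce the same column vector (a minimal sketch):

```python
import numpy as np

y = np.arange(4)          # 1-D, shape (4,)
a = y[:, None]            # dummy-axis trick, shape (4, 1)
b = y.reshape(-1, 1)      # reshape, shape (4, 1)
print(np.array_equal(a, b))  # True
```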

for i in range(n_iterations):
    gradients = 2/m * X_train1.T.dot(X_train1.dot(theta) - y_train.reshape(-1,1))
    if np.linalg.norm(gradients) < tol:  # check the gradient norm; norm(X_train1) is constant and never triggers
        break
    theta = theta - (eta * gradients)
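For reference, here is a self-contained sketch of the corrected loop end to end. Two assumptions here: load_boston was removed from scikit-learn (in version 1.2), so this uses randomly generated features of the same width (13) instead, and the features are standardized so a single, larger learning rate converges for every column:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the Boston data: 506 samples, 13 features.
X = rng.normal(size=(506, 13))
true_theta = rng.normal(size=(14, 1))
y = np.c_[np.ones((506, 1)), X].dot(true_theta).ravel() + rng.normal(scale=0.1, size=506)

# Standardize features so one learning rate suits every column.
X = (X - X.mean(axis=0)) / X.std(axis=0)
X1 = np.c_[np.ones((len(X), 1)), X]

eta = 0.1
n_iterations = 1000
m = len(X1)
tol = 1e-5

theta = rng.normal(size=(14, 1))
for i in range(n_iterations):
    error = X1.dot(theta) - y.reshape(-1, 1)   # keep both operands 2-D
    gradients = 2 / m * X1.T.dot(error)
    if np.linalg.norm(gradients) < tol:        # stop once updates become tiny
        break
    theta = theta - eta * gradients

print(theta.shape)  # (14, 1), a proper column of weights
```

With the reshape in place, error stays (506, 1), gradients stays (14, 1), and theta keeps its (14, 1) shape through every iteration instead of exploding to (14, 354).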