Understanding the gradient of the gradient descent algorithm in NumPy

Question

I am trying to understand Python code for multivariate gradient descent and have found several implementations like this one:

import numpy as np

# m denotes the number of examples here, not the number of features
def gradientDescent(x, y, theta, alpha, m, numIterations):
    xTrans = x.transpose()
    for i in range(0, numIterations):
        hypothesis = np.dot(x, theta)
        loss = hypothesis - y
        cost = np.sum(loss ** 2) / (2 * m)
        print("Iteration %d | Cost: %f" % (i, cost))
        # avg gradient per example
        gradient = np.dot(xTrans, loss) / m
        # update
        theta = theta - alpha * gradient
    return theta

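From the definition of gradient descent, the update rule is:

$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$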

In the NumPy code, however, the gradient is computed as np.dot(xTrans, loss) / m. Can someone explain how we arrive at this NumPy expression?

Answer 1

The code is actually quite straightforward, and spending a little more time reading it will help.

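hypothesis - y is the first part of the squared-loss gradient (as a vector with one component per example), and it is stored in the loss variable. The hypothesis itself, np.dot(x, theta), is the prediction of a linear regression model.
xTrans is the transpose of x, so np.dot(xTrans, loss) gives, for each feature, the sum over all examples of that feature's value times the example's loss.
Dividing by m then turns each sum into an average over the examples.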
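The three steps above can be checked numerically. The following is a minimal sketch, not part of the original code: the array sizes, the random data, and the names grad_vectorized, grad_loops, and grad_numeric are assumptions chosen for illustration. It compares the vectorized expression with the same sums written as explicit loops, and with a finite-difference estimate of the gradient of the cost sum(loss ** 2) / (2 * m).

```python
import numpy as np

# Hypothetical small problem: m examples, n features (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
m, n = 5, 3
x = rng.normal(size=(m, n))
y = rng.normal(size=m)
theta = rng.normal(size=n)

loss = np.dot(x, theta) - y  # one entry per example, shape (m,)

# Vectorized form used in the question.
grad_vectorized = np.dot(x.transpose(), loss) / m

# The same quantity written as explicit sums:
# grad[j] = (1/m) * sum_i loss[i] * x[i, j]
grad_loops = np.zeros(n)
for j in range(n):
    for i in range(m):
        grad_loops[j] += loss[i] * x[i, j]
    grad_loops[j] /= m


def cost(t):
    """Squared-error cost, matching the question's cost line."""
    return np.sum((np.dot(x, t) - y) ** 2) / (2 * m)


# Central finite differences as an independent check of the derivative.
eps = 1e-6
identity = np.eye(n)
grad_numeric = np.array([
    (cost(theta + eps * identity[j]) - cost(theta - eps * identity[j])) / (2 * eps)
    for j in range(n)
])

print(np.allclose(grad_vectorized, grad_loops))
print(np.allclose(grad_vectorized, grad_numeric, atol=1e-6))
```

If both comparisons agree, the matrix product is just a compact way of writing the per-feature sums in the update rule.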

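Beyond that, the code has some Python style issues. In Python we usually use under_score names rather than camelCase, so the function should be called gradient_descent, for example. More legible than Java, isn't it? :)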
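As an illustration of the naming point, here is one possible snake_case version of the function. It is a sketch rather than a drop-in replacement: taking m from x.shape[0] instead of passing it in, and dropping the per-iteration print, are my own assumptions, as are the toy data and the learning rate in the usage lines.

```python
import numpy as np


def gradient_descent(x, y, theta, alpha, num_iterations):
    """Batch gradient descent for linear regression with squared loss."""
    m = x.shape[0]  # number of examples (assumption: read from x, not passed in)
    x_trans = x.transpose()
    for _ in range(num_iterations):
        loss = np.dot(x, theta) - y
        # Average gradient per example.
        gradient = np.dot(x_trans, loss) / m
        theta = theta - alpha * gradient
    return theta


# Toy usage: recover y = 1 + 2 * x1 from noise-free data.
x = np.column_stack([np.ones(50), np.linspace(0.0, 1.0, 50)])
y = 1.0 + 2.0 * x[:, 1]
theta = gradient_descent(x, y, np.zeros(2), alpha=0.5, num_iterations=5000)
print(theta)
```

On this noise-free data the returned theta should end up close to the intercept 1 and slope 2 used to generate y.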


python

numpy

gradient-descent