Linear regression in theano

Question

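What is the purpose of T.mean in this example? I think T.mean would make sense if the implementation were vectorized. Here, the inputs x and y to train(x, y) are scalars, so cost computes the squared error of a single sample, and the loop iterates over the data one sample at a time.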

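# y is the model's symbolic output; Y is the symbolic target
cost = T.mean(T.sqr(y - Y))
gradient = T.grad(cost=cost, wrt=w)
# Plain gradient descent step with learning rate 0.01
updates = [[w, w - gradient * 0.01]]

train = theano.function(inputs=[X, Y], outputs=cost, updates=updates, allow_input_downcast=True)

# 100 passes over the data, one (x, y) sample per update
for i in range(100):
    for x, y in zip(trX, trY):
        train(x, y)

print(w.get_value())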

Removing T.mean has no effect on the output.

Answer 1

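You're right, T.mean serves no purpose here. The cost function operates on a single training sample at a time, so the "mean squared error" is really just the squared error of that one sample.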
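A quick way to convince yourself is to compare the two costs on one scalar sample. The sketch below uses NumPy rather than Theano, and it assumes the model is y = w * x (the question does not show the model definition); the function names are made up for illustration.

```python
import numpy as np

def cost_with_mean(w, x, y_true):
    # NumPy analogue of T.mean(T.sqr(y - Y)); the model y = w * x is an assumption
    return np.mean(np.square(w * x - y_true))

def cost_without_mean(w, x, y_true):
    # Same cost with the mean dropped
    return np.square(w * x - y_true)

# One scalar training sample, as in the question's loop
w, x, y_true = 0.5, 2.0, 3.0
same = np.isclose(cost_with_mean(w, x, y_true), cost_without_mean(w, x, y_true))
print(same)
```

The mean of a single number is that number, so both costs (and therefore their gradients) are identical for scalar inputs.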

This example implements linear regression via stochastic gradient descent, an algorithm for online optimization. SGD iterates over samples one by one, as is the case in this example. However, in more complex scenarios, the dataset is often processed in mini-batches, which yields better performance and convergence properties.
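For comparison, a mini-batch version might look roughly like the following. This is a NumPy sketch, not the tutorial's code: the model y = w * x, the batch size, the noise-free toy data, and the name minibatch_sgd are all assumptions made for illustration. The learning rate 0.01 and the 100 passes mirror the question.

```python
import numpy as np

def minibatch_sgd(xs, ys, batch_size=10, lr=0.01, epochs=100, seed=0):
    """Fit y ~ w * x by mini-batch gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        order = rng.permutation(n)  # visit the samples in a new order each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            xb, yb = xs[idx], ys[idx]
            residual = w * xb - yb
            # d/dw mean((w*x - y)^2) = mean(2 * (w*x - y) * x)
            grad = np.mean(2.0 * residual * xb)
            w -= lr * grad
    return w

# Toy data with true slope 2 and no noise (an assumption for illustration)
xs = np.linspace(-1, 1, 101)
ys = 2.0 * xs
w_fit = minibatch_sgd(xs, ys)
print(w_fit)
```

Here the mean does real work: it averages the per-sample gradients within each batch before the update is applied.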

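I think T.mean was left in the example either as an artifact of mini-batch gradient descent, or to make it more explicit that the cost function is MSE.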
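To see why the mean would matter once inputs are vectors, compare the gradient of the mean with the gradient of the sum on a small batch. Again this is a NumPy sketch that assumes the model y = w * x; the helper names are hypothetical.

```python
import numpy as np

def grad_mean(w, xb, yb):
    # Gradient of mean((w*x - y)^2) with respect to w
    return np.mean(2.0 * (w * xb - yb) * xb)

def grad_sum(w, xb, yb):
    # Gradient of sum((w*x - y)^2) with respect to w
    return np.sum(2.0 * (w * xb - yb) * xb)

xb = np.array([1.0, 2.0, 3.0, 4.0])
yb = 2.0 * xb
w = 0.0

# With a batch of 4, the two gradients differ by a factor of 4
ratio = grad_sum(w, xb, yb) / grad_mean(w, xb, yb)
print(ratio)

# With a single sample, they coincide
print(grad_sum(w, xb[:1], yb[:1]) == grad_mean(w, xb[:1], yb[:1]))
```

With the mean, the step size does not grow with the batch size, so the same learning rate can be reused when the batch size changes. With one sample per update, as in the question, the distinction disappears.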
