Simple linear regression in Tensorflow produces near zero coefficient

I am attempting a simple linear regression in Tensorflow with only one independent variable. A plot of my data suggests the coefficient should be close to 1, and indeed if I run it with sklearn.linear_model.LinearRegression I get a sensible result of about 0.90.

However, running a linear regression set up in Tensorflow following this tutorial produces a coefficient very close to zero. I was able to get a sensible result out of Tensorflow with random numbers. I have tried tweaking the learning rate and the number of epochs, but nothing has any meaningful effect.

The MRE below contains the actual data and should produce a coefficient of 0.8975 according to sklearn, but of 0.00045 according to Tensorflow. I have considered that it might be getting trapped in a local minimum, but none of the examples of that kind of problem I could find made any difference in my case.

import numpy as np
import tensorflow as tf
from sklearn import linear_model

learning_rate = 0.1
epochs = 100

x_train = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348, 
                0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182, 
                -0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268, 
                -0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212, 
                -0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534, 
                0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158, 
                0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231, 
                0.00159, -0.00463, 0.00174, 0, -0.0029, 
                -0.00349, 0.01372, -0.00302])

y_train = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441, 
                0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416, 
                -0.00191, -0.00607, 0.00161, 0.00289, -0.00416, 
                0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032, 
                -0.00387, -0.00162, -0.00292, -0.01367, 0.00198, 
                0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164, 
                0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
                -0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196, 
                -0.00065, -0.00391, -0.0108, 0.01291, -0.00098])

regr = linear_model.LinearRegression()
regr.fit(x_train.reshape(-1, 1), y_train.reshape(-1, 1))
print ('Coefficients: ', regr.coef_)

weight = tf.Variable(0.)
bias = tf.Variable(0.)

for e in range(epochs):
    with tf.GradientTape() as tape:
        y_pred = weight*x_train + bias
        loss = tf.reduce_mean(tf.square(y_pred - y_train))
        gradients = tape.gradient(loss, [weight,bias])
        weight.assign_sub(gradients[0]*learning_rate)
        bias.assign_sub(gradients[1]*learning_rate)

print(weight.numpy(), 'weight', bias.numpy(), 'bias')

In the posted example, the x and y values of the training dataset are very small, which makes the gradients very small as well; so although the model does train correctly on the data, it could take a few million iterations to converge.
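
To see why this happens, look at the very first gradient step. The sketch below is not part of the original answer; it reuses x_train and y_train from the question's MRE and simply prints the initial weight gradient. With x and y values of order 1e-3, the gradient is of order 1e-5, so a learning rate of 0.1 moves the weight by only about 1e-6 per epoch.

    import tensorflow as tf

    # Rough illustration (assumes x_train, y_train from the MRE above are in scope):
    # at weight = bias = 0 the MSE gradient for the weight is -2*mean(x*y),
    # which is tiny for data of this magnitude.
    weight = tf.Variable(0.)
    bias = tf.Variable(0.)
    with tf.GradientTape() as tape:
        y_pred = weight * x_train + bias
        loss = tf.reduce_mean(tf.square(y_pred - y_train))
    grad_w, grad_b = tape.gradient(loss, [weight, bias])
    print('first-step weight gradient:', grad_w.numpy())       # order 1e-5
    print('weight change in one step:', 0.1 * grad_w.numpy())  # order 1e-6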

The scikit-learn linear regression model uses a closed-form least-squares fit, so it fits the dataset essentially instantly, with no iterations.
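
For reference, that closed-form fit can be reproduced directly with numpy. The sketch below is not part of the original answer; it reuses the unscaled x_train and y_train arrays from the question's MRE and should recover essentially the same ~0.8975 coefficient that sklearn reports, with no iterations at all.

    import numpy as np

    # Ordinary least squares via lstsq
    # (assumes the unscaled x_train, y_train from the MRE are in scope).
    A = np.stack([x_train, np.ones_like(x_train)], axis=1)   # design matrix [x, 1]
    (slope, intercept), *_ = np.linalg.lstsq(A, y_train, rcond=None)
    print('closed-form slope:', slope)        # about 0.8975
    print('closed-form intercept:', intercept)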

The suggestion for getting this down to a manageable 1000 iterations is to apply a MinMaxScaler so that the x and y datasets both lie between 0 and 1. This improves the gradients and yields a trained model, but you should inverse-transform the results back after training, as shown in the modified code below.

    import numpy as np
    import tensorflow as tf
    from sklearn import linear_model
    from sklearn.preprocessing import MinMaxScaler
    import matplotlib.pyplot as plt
    learning_rate = 0.1
    epochs = 1000
    
    x_train0 = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348,
                    0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182,
                    -0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268,
                    -0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212,
                    -0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534,
                    0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158,
                    0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231,
                    0.00159, -0.00463, 0.00174, 0, -0.0029,
                    -0.00349, 0.01372, -0.00302])
    scaler1 = MinMaxScaler()
    x_train = scaler1.fit_transform(x_train0.reshape(-1,1))
    y_train0 = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441,
                    0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416,
                    -0.00191, -0.00607, 0.00161, 0.00289, -0.00416,
                    0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032,
                    -0.00387, -0.00162, -0.00292, -0.01367, 0.00198,
                    0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164,
                    0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
                    -0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196,
                    -0.00065, -0.00391, -0.0108, 0.01291, -0.00098])
    scaler2 = MinMaxScaler()
    y_train = scaler2.fit_transform(y_train0.reshape(-1,1))
    
    regr = linear_model.LinearRegression()
    regr.fit(x_train.reshape(-1, 1), y_train.reshape(-1, 1))
    print ('Coefficients: ', regr.coef_, ' intercept ',regr.intercept_, )
    
    weight = tf.Variable(0.)
    bias = tf.Variable(0.)
    
    for e in range(epochs):
        # record the forward pass and the MSE loss on the tape
        with tf.GradientTape() as tape:
            y_pred = weight*x_train + bias
            loss = tf.reduce_mean(tf.square(y_pred - y_train))
        # compute gradients outside the recording context and apply a
        # plain gradient-descent update
        gradients = tape.gradient(loss, [weight, bias])
        weight.assign_sub(gradients[0]*learning_rate)
        bias.assign_sub(gradients[1]*learning_rate)
    
    
    print(weight.numpy(), 'weight', bias.numpy(), 'bias')
    
    plt.plot(x_train0,scaler2.inverse_transform(y_pred.numpy()).flatten(),'r',label='model output')
    plt.scatter(x_train0,y_train0,label='training dataset')
    plt.legend()
    plt.show()

Coefficients: [[0.97913471]] intercept [-0.00420121]

0.96772194 weight 0.0018798028 bias
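
If you also want the coefficient expressed on the original (unscaled) axes, you can undo the two MinMaxScaler transforms analytically rather than only inverse-transforming the predictions. The sketch below is not part of the original answer; it reuses scaler1, scaler2, weight and bias from the code above, and with the weight printed above it should give a slope of roughly 0.89, in line with the ~0.8975 coefficient sklearn finds on the unscaled data.

    # For MinMaxScaler with the default (0, 1) range:
    #   x_s = (x - x_min) / x_range,  y_s = (y - y_min) / y_range
    # so y_s = weight*x_s + bias in scaled space corresponds, in original units, to
    #   slope     = weight * y_range / x_range
    #   intercept = y_min + y_range * (bias - weight * x_min / x_range)
    x_min, x_range = scaler1.data_min_[0], scaler1.data_range_[0]
    y_min, y_range = scaler2.data_min_[0], scaler2.data_range_[0]
    slope = weight.numpy() * y_range / x_range
    intercept = y_min + y_range * (bias.numpy() - weight.numpy() * x_min / x_range)
    print('slope on the original scale:', slope)        # roughly 0.89
    print('intercept on the original scale:', intercept)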