Simple linear regression in Tensorflow produces near zero coefficient
I am attempting a simple linear regression in Tensorflow with a single independent variable. A plot of my data suggests the coefficient should be close to 1, and indeed, if I run it with sklearn.linear_model.LinearRegression I get a sensible result of about 0.90.
However, running it in Tensorflow using this tutorial produces a coefficient very near zero. I was able to get a reasonable result out of Tensorflow with random numbers, and I have tried adjusting the learning rate and the number of epochs, with no meaningful effect.
The MRE below contains the actual data and should produce a coefficient of 0.8975 according to sklearn, but 0.00045 according to Tensorflow. I have considered that it might be trapped in a local minimum, but none of the examples of that kind of problem I could find worked for my case.
import numpy as np
import tensorflow as tf
from sklearn import linear_model
learning_rate = 0.1
epochs = 100
x_train = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348,
0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182,
-0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268,
-0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212,
-0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534,
0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158,
0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231,
0.00159, -0.00463, 0.00174, 0, -0.0029,
-0.00349, 0.01372, -0.00302])
y_train = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441,
0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416,
-0.00191, -0.00607, 0.00161, 0.00289, -0.00416,
0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032,
-0.00387, -0.00162, -0.00292, -0.01367, 0.00198,
0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164,
0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
-0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196,
-0.00065, -0.00391, -0.0108, 0.01291, -0.00098])
regr = linear_model.LinearRegression()
regr.fit(x_train.reshape(-1, 1), y_train.reshape(-1, 1))
print ('Coefficients: ', regr.coef_)
weight = tf.Variable(0.)
bias = tf.Variable(0.)
for e in range(epochs):
    with tf.GradientTape() as tape:
        y_pred = weight*x_train + bias
        loss = tf.reduce_mean(tf.square(y_pred - y_train))
    gradients = tape.gradient(loss, [weight, bias])
    weight.assign_sub(gradients[0]*learning_rate)
    bias.assign_sub(gradients[1]*learning_rate)
print(weight.numpy(), 'weight', bias.numpy(), 'bias')
In the posted example the x and y values of the training dataset are very small, which leads to very small gradients. So although the model does train correctly on this data, it could take a few million iterations to converge.
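To see how small those gradients actually are, here is a quick back-of-the-envelope check (a sketch, reusing the raw x_train and y_train arrays from the question):
import numpy as np
# At the starting point weight = 0, bias = 0, the gradient of the MSE loss
# with respect to the weight is d/dw mean((w*x + b - y)^2) = -2*mean(x*y).
grad_w = -2 * np.mean(x_train * y_train)
print(grad_w)  # on the order of 1e-5 in magnitude
# With learning_rate = 0.1 each step therefore moves the weight by only
# about 1e-6, hence the millions of iterations mentioned above.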
The scikit-learn linear regression model, by contrast, uses a least-squares fit, so it fits the dataset essentially instantly.
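For comparison, ordinary least squares has a closed-form solution, which is essentially what LinearRegression computes under the hood. A minimal sketch using numpy's least-squares solver on the raw arrays from the question:
import numpy as np
# Solve y = w*x + b in one shot with a least-squares solver.
X = np.column_stack([x_train, np.ones_like(x_train)])  # design matrix [x, 1]
w, b = np.linalg.lstsq(X, y_train, rcond=None)[0]
print(w, b)  # slope is about 0.8975, matching what sklearn reports on the raw data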
The suggestion for bringing this down to a manageable 1000 iterations is to apply a MinMaxScaler so that the x and y datasets lie between 0 and 1. This improves the gradients and reaches a trained model, but you then have to inverse-transform the trained results back, as in the modified code below.
import numpy as np
import tensorflow as tf
from sklearn import linear_model
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
learning_rate = 0.1
epochs = 1000
x_train0 = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348,
0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182,
-0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268,
-0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212,
-0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534,
0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158,
0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231,
0.00159, -0.00463, 0.00174, 0, -0.0029,
-0.00349, 0.01372, -0.00302])
scaler1 = MinMaxScaler()
x_train = scaler1.fit_transform(x_train0.reshape(-1,1))
y_train0 = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441,
0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416,
-0.00191, -0.00607, 0.00161, 0.00289, -0.00416,
0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032,
-0.00387, -0.00162, -0.00292, -0.01367, 0.00198,
0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164,
0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
-0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196,
-0.00065, -0.00391, -0.0108, 0.01291, -0.00098])
scaler2 = MinMaxScaler()
y_train = scaler2.fit_transform(y_train0.reshape(-1,1))
regr = linear_model.LinearRegression()
regr.fit(x_train.reshape(-1, 1), y_train.reshape(-1, 1))
print('Coefficients: ', regr.coef_, ' intercept ', regr.intercept_)
weight = tf.Variable(0.)
bias = tf.Variable(0.)
for e in range(epochs):
    with tf.GradientTape() as tape:
        y_pred = weight*x_train + bias
        loss = tf.reduce_mean(tf.square(y_pred - y_train))
    gradients = tape.gradient(loss, [weight, bias])
    weight.assign_sub(gradients[0]*learning_rate)
    bias.assign_sub(gradients[1]*learning_rate)
print(weight.numpy(), 'weight', bias.numpy(), 'bias')
plt.plot(x_train0,scaler2.inverse_transform(y_pred.numpy()).flatten(),'r',label='model output')
plt.scatter(x_train0,y_train0,label='training dataset')
plt.legend()
plt.show()
The printed output is:
Coefficients: [[0.97913471]] intercept [-0.00420121]
0.96772194 weight 0.0018798028 bias
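If you want the slope back in the original, unscaled units (to compare with the roughly 0.90 that sklearn reports on the raw data), the min-max scaling can be undone analytically. A sketch, assuming the scaler1 and scaler2 objects from the code above:
# x was scaled by 1/x_range and y by 1/y_range, so the slope on the
# original data is weight * y_range / x_range.
x_range = scaler1.data_range_[0]
y_range = scaler2.data_range_[0]
print(weight.numpy() * y_range / x_range)  # close to the ~0.90 slope sklearn finds on the raw data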