TensorFlow linear regression error blows up
I am trying to fit a very simple linear regression model with TensorFlow. However, the loss (mean squared error) blows up instead of decreasing to zero.
First, I generate my data:
import numpy as np

x_data = np.random.uniform(high=10, low=0, size=100)
y_data = 3.5 * x_data - 4 + np.random.normal(loc=0, scale=2, size=100)
Then, I define the computational graph:
import tensorflow as tf

X = tf.placeholder(dtype=tf.float32, shape=100)
Y = tf.placeholder(dtype=tf.float32, shape=100)
m = tf.Variable(1.0)
c = tf.Variable(1.0)
Ypred = m * X + c
loss = tf.reduce_mean(tf.square(Ypred - Y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=.1)
train = optimizer.minimize(loss)
Finally, I run it for 100 epochs:
init = tf.global_variables_initializer()
session = tf.Session()
session.run(init)

steps = {'m': [], 'c': []}
losses = []
for k in range(100):
    # record the current parameters and loss, then take one gradient step
    _m = session.run(m)
    _c = session.run(c)
    _l = session.run(loss, feed_dict={X: x_data, Y: y_data})
    session.run(train, feed_dict={X: x_data, Y: y_data})
    steps['m'].append(_m)
    steps['c'].append(_c)
    losses.append(_l)
However, when I plot the loss, I get:
The full code can also be found here.
Your learning rate is too large; 0.001 works fine:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# generate noisy data around y = 3.5x - 4
x_data = np.random.uniform(high=10, low=0, size=100)
y_data = 3.5 * x_data - 4 + np.random.normal(loc=0, scale=2, size=100)

# linear model y = m*x + c with mean-squared-error loss
X = tf.placeholder(dtype=tf.float32, shape=100)
Y = tf.placeholder(dtype=tf.float32, shape=100)
m = tf.Variable(1.0)
c = tf.Variable(1.0)
Ypred = m * X + c
loss = tf.reduce_mean(tf.square(Ypred - Y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)
train = optimizer.minimize(loss)
init = tf.global_variables_initializer()

with tf.Session() as session:
    session.run(init)
    steps = {'m': [], 'c': []}
    losses = []
    for k in range(100):
        _m = session.run(m)
        _c = session.run(c)
        _l = session.run(loss, feed_dict={X: x_data, Y: y_data})
        session.run(train, feed_dict={X: x_data, Y: y_data})
        steps['m'].append(_m)
        steps['c'].append(_c)
        losses.append(_l)

plt.plot(losses)
plt.savefig('loss.png')
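As a quick sanity check (my own addition; it assumes these lines run inside the same `with` block, right after the training loop), you can print the fitted parameters. At this small learning rate, 100 epochs recover most of the slope while the intercept is still drifting toward its true value, so expect m near 3.5 but c still well above -4; more epochs tighten both.

    # still inside `with tf.Session() as session:`, after the loop
    m_hat, c_hat = session.run([m, c])
    print('fitted m = %.2f (true 3.5), fitted c = %.2f (true -4)' % (m_hat, c_hat))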
(Possibly useful reference: https://gist.github.com/fuglede/ad04ce38e80887ddcbeb6b81e97bbfbc)
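Why .1 in particular blows up can be estimated with a back-of-envelope calculation (my own addition, assuming x_data from the snippets above is in scope): plain gradient descent on a quadratic loss diverges once the learning rate exceeds 2 divided by the largest eigenvalue of the Hessian.

import numpy as np

# Hessian of the MSE w.r.t. (m, c) is 2 * [[mean(x^2), mean(x)], [mean(x), 1]]
H = 2 * np.array([[np.mean(x_data ** 2), np.mean(x_data)],
                  [np.mean(x_data), 1.0]])
lam_max = np.linalg.eigvalsh(H).max()  # roughly 68 for x ~ U(0, 10)
print('stable only below lr =', 2 / lam_max)  # roughly 0.03: .1 diverges, .001 is safe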
Whenever you see your cost increasing monotonically with the number of epochs, that is a sure sign that your learning rate is too high. Rerun your training repeatedly, multiplying the learning rate by 1/10 each time, until the cost function clearly decreases with the number of epochs.
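That search is easy to script. A minimal sketch, assuming the graph from the answer above is already built (so loss, X, Y, x_data and y_data are in scope); final_loss is a helper made up here for illustration:

def final_loss(lr, epochs=100):
    """Rebuild the train op at rate lr, train from scratch, return the last loss."""
    train = tf.train.GradientDescentOptimizer(learning_rate=lr).minimize(loss)
    with tf.Session() as session:
        session.run(tf.global_variables_initializer())  # resets m and c to 1.0
        for _ in range(epochs):
            session.run(train, feed_dict={X: x_data, Y: y_data})
        return session.run(loss, feed_dict={X: x_data, Y: y_data})

# try successively smaller rates and keep the largest one whose final cost
# is finite and clearly below the starting cost; here .1 diverges while
# .01 and smaller settle
for lr in (.1, .01, .001, .0001):
    print(lr, final_loss(lr))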