How does this TensorFlow sample actually update the weights to find the solution?
New to tensorflow, python and numpy (I guess that's everything in this sample).
In the code below I (almost) understand that the update_weights.run() call in the loop is calculating the loss and developing new weights. What I can't see is how this actually causes the weights to change.
The point I'm stuck on is commented # THIS IS WHAT I DONT UNDERSTAND
What is the relationship between update_weights.run() and the new values in weights? Or maybe: why does weights.eval(), called after the loop, return the changed values?
Thanks for any help.
#@test {"output": "ignore"}
# Import tf
import tensorflow as tf
# Numpy is Num-Pie n dimensional arrays
# https://en.wikipedia.org/wiki/NumPy
import numpy as np
# Plotting library
# http://matplotlib.org/users/pyplot_tutorial.html
import matplotlib.pyplot as plt
# %matplotlib magic
# http://ipython.readthedocs.io/en/stable/interactive/tutorial.html#magics-explained
%matplotlib inline
# Set up the data with a noisy linear relationship between X and Y.
# Variable?
num_examples = 5
noise_factor = 1.5
line_x_range = (-10,10)
#Just variables in Python
# np.linspace - Return evenly spaced numbers over a specified interval.
X = np.array([
np.linspace(line_x_range[0], line_x_range[1], num_examples),
np.linspace(line_x_range[0], line_x_range[1], num_examples)
])
# Plot out the starting data
# plt.figure(figsize=(4,4))
# plt.scatter(X[0], X[1])
# plt.show()
# np.random.randn - Return a sample (or samples) from the “standard normal” distribution.
# Generate noise for x and y (2)
noise = np.random.randn(2, num_examples) * noise_factor
# plt.figure(figsize=(4,4))
# plt.scatter(noise[0],noise[1])
# plt.show()
# += on an np.array
X += noise
# The 'Answer' polyfit to the noisy data
answer_m, answer_b = np.polyfit(X[0], X[1], 1)
# Destructuring Assignment - http://codeschool.org/python-additional-miscellany/
x, y = X
# plt.figure(figsize=(4,4))
# plt.scatter(x, y)
# plt.show()
# np.array
# for a in x
# [(1., a) for a in [1,2,3]] => [(1.0, 1), (1.0, 2), (1.0, 3)]
# numpy.ndarray.astype - http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.astype.html
# Copy of the array, cast to a specified type.
x_with_bias = np.array([(1., a) for a in x]).astype(np.float32)
#Just variables in Python
# The difference between our current outputs and the training outputs over time
# Starts high and decreases
losses = []
history = []
training_steps = 50
learning_rate = 0.002
# Start the session and give it a variable name sess
with tf.Session() as sess:
  # Set up all the tensors, variables, and operations.
  # Creates a constant tensor
  input = tf.constant(x_with_bias)
  # Transpose the ndarray y of random float numbers
  target = tf.constant(np.transpose([y]).astype(np.float32))
  # Start with random weights
  weights = tf.Variable(tf.random_normal([2, 1], 0, 0.1))
  # Initialize variables ...?obscure?
  tf.initialize_all_variables().run()
  print('Initialization complete')
  # tf.matmul - Matrix Multiplication
  # What is yhat? Why this name?
  yhat = tf.matmul(input, weights)
  # tf.sub - Matrix Subtraction
  yerror = tf.sub(yhat, target)
  # tf.nn.l2_loss - Computes half the L2 norm of a tensor without the sqrt
  # loss function?
  loss = tf.nn.l2_loss(yerror)
  # tf.train.GradientDescentOptimizer - Not sure how this is updating the weights tensor?
  # What is it operating on?
  update_weights = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
  # _ in Python is conventionally used for a throwaway variable
  for step in range(training_steps):
    # Repeatedly run the operations, updating the TensorFlow variable.
    # THIS IS WHAT I DONT UNDERSTAND
    update_weights.run()
    losses.append(loss.eval())
    b, m = weights.eval()
    history.append((b, m, step))
  # Training is done, get the final values for the graphs
  betas = weights.eval()
  yhat = yhat.eval()
  # Show the fit and the loss over time.
  # destructuring assignment
  fig, (ax1, ax2, ax3) = plt.subplots(1, 3)
  # Adjust whitespace between plots
  plt.subplots_adjust(wspace=.2)
  # Output size of the figure
  fig.set_size_inches(12, 4)
  ax1.set_title("Final Data Fit")
  ax1.axis('equal')
  ax1.axis([-15, 15, -15, 15])
  # Scatter plot data x and y (pairs?) set with 60% opacity
  ax1.scatter(x, y, alpha=.6)
  # Scatter plot x and np.transpose(yhat)[0] (must be same length), in red, 50% transparency
  # these appear to be the x values mapped onto the predicted line
  ax1.scatter(x, np.transpose(yhat)[0], c="r", alpha=.5)
  # Add the line along the slope defined by betas (whatever that is)
  ax1.plot(line_x_range, [betas[0] + a * betas[1] for a in line_x_range], "g", alpha=0.6)
  # The polyfit coefficients are reversed in order vs the betas
  ax1.plot(line_x_range, [answer_m * a + answer_b for a in line_x_range], "r", alpha=0.3)
  ax2.set_title("Loss over Time")
  # Create a range of integers from 0 to training_steps and plot the losses as a curve
  ax2.plot(range(0, training_steps), losses)
  ax2.set_ylabel("Loss")
  ax2.set_xlabel("Training steps")
  ax3.set_title("Slope over Time")
  ax3.axis('equal')
  ax3.axis([-15, 15, -15, 15])
  for b, m, step in history:
    ax3.plot(line_x_range, [b + a * m for a in line_x_range], "g", alpha=0.2)
  # This line seems to be superfluous; removing it doesn't change the behaviour
  plt.show()
OK, so update_weights() is calling a minimizer on the loss that you defined as the error between your prediction and the target.
What it does is nudge the weights by some small amount (how small is controlled by the learning_rate parameter) in the direction that reduces your loss, and therefore makes your predictions "truer".
That is what happens when you call update_weights(): after the call your weights have changed by a small value and, if everything went according to plan, your loss value has decreased.
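To make that "small amount" concrete, here is a minimal sketch of a single gradient-descent step in plain numpy, for the same loss the sample uses (tf.nn.l2_loss, i.e. half the sum of squared errors). The data points and variable names below are made up for illustration, not taken from the question:
import numpy as np
# Made-up stand-ins shaped like the sample's data: rows of (1, x) and a column of targets
X = np.array([[1., -10.], [1., 0.], [1., 10.]])
y = np.array([[-9.], [1.], [11.]])
w = np.zeros((2, 1))               # current weights (bias, slope)
learning_rate = 0.002
yhat = X.dot(w)                    # predictions with the current weights
gradient = X.T.dot(yhat - y)       # d(loss)/dw for loss = 0.5 * sum((yhat - y)**2)
w = w - learning_rate * gradient   # the "small amount": step downhill, scaled by learning_rate
print(w)                           # the weights have moved slightly toward the best fit
Repeating that step is all the training loop does; minimize() just wires the same computation up as TensorFlow ops so that update_weights.run() performs one such step.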
What you want is to follow the evolution of your loss and of your weights, for example to see whether the loss is really decreasing (and your algorithm works), whether the weights are changing a lot, or maybe to visualize them.
You can gain a lot of insight by visualizing how the loss changes.
That's why you have to see the full history of how the parameters and the loss evolve; that's why you eval them at every step.
The eval or run operation behaves differently when you perform it on the minimizer and on the parameters: run on the minimizer applies the minimizer to the weights (it updates the variable), whereas eval on the weights just reads out their current values.
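If it helps, below is a rough sketch, using the same old Session-style API as the question, of approximately what GradientDescentOptimizer(learning_rate).minimize(loss) builds: a gradient computation plus an assignment back into the weights variable. The data and the names grad / manual_update are my own inventions for illustration. Running the update op writes new values into the weights variable held by the session; weights.eval() afterwards just reads those values back out, which is why the loop (and the code after it) sees them change:
import tensorflow as tf
import numpy as np
# Made-up stand-ins for x_with_bias and the targets from the question
x_with_bias = np.array([[1., -10.], [1., 0.], [1., 10.]], dtype=np.float32)
targets = np.array([[-9.], [1.], [11.]], dtype=np.float32)
learning_rate = 0.002
with tf.Session() as sess:
  input = tf.constant(x_with_bias)
  target = tf.constant(targets)
  weights = tf.Variable(tf.zeros([2, 1]))
  tf.initialize_all_variables().run()
  yhat = tf.matmul(input, weights)
  loss = tf.nn.l2_loss(yhat - target)
  # Roughly what minimize(loss) builds: the gradient of the loss w.r.t. the
  # variable, and an op that subtracts learning_rate * gradient from it.
  grad = tf.gradients(loss, [weights])[0]
  manual_update = weights.assign_sub(learning_rate * grad)
  for step in range(5):
    sess.run(manual_update)                           # running the op WRITES new values into weights
    print(step, loss.eval(), weights.eval().ravel())  # eval just READS the current values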
I strongly recommend that you read this website, where the author explains what is going on much better than I could, and in more detail.