Softmax Returns Identity Matrix
My input to softmax, y = tf.nn.softmax(tf.matmul(x, W) + b), is a matrix of values:
tf.matmul(x, W) + b =
[[ 9.77206726e+02]
[ 5.72391296e+02]
[ 3.53560760e+02]
[ 4.75727379e-01]
[ 6.58911804e+02]]
but when this is fed into softmax, I get:
tf.nn.softmax(tf.matmul(x, W) + b) =
[[ 1.]
[ 1.]
[ 1.]
[ 1.]
[ 1.]]
This makes my training output an array of 1s, which means that neither the weights W nor the bias b is updated on any batch of training data. It also gives an accuracy of 1 on a set of random test data.
Here is my code:
import numpy as np
import tensorflow as tf

## training_inputs, training_outputs and test_inputs are assumed to be
## defined elsewhere (300 examples, 2 features and 1 target each)

x = tf.placeholder(tf.float32, [None, 2])
W = tf.Variable(tf.random_normal([2, 1]))
b = tf.Variable(tf.random_normal([1]))
y = tf.nn.softmax(tf.matmul(x, W) + b)

## placeholder for cross-entropy
y_ = tf.placeholder(tf.float32, [None, 1])

## cross-entropy function
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

## backpropagation & gradient descent
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

## initialize variables
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

ITER_RANGE = 10
EVAL_BATCH_SIZE = len(training_outputs) // ITER_RANGE
training_outputs = np.reshape(training_outputs, (300, 1))

## training
for i in range(ITER_RANGE):
    print('iterator:')
    print(i)

    ## batch out training data
    BEGIN = i * EVAL_BATCH_SIZE
    END = BEGIN + EVAL_BATCH_SIZE
    batch_ys = training_outputs[BEGIN:END]
    batch_xs = training_inputs[BEGIN:END]

    print('batch_xs')
    print(batch_xs)
    print('batch_ys')
    print(batch_ys)

    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

    # y = tf.nn.softmax(tf.matmul(x, W) + b)
    print('y')
    print(sess.run(y, feed_dict={x: batch_xs, y_: batch_ys}))
    #print('x')
    #print(sess.run(x))
    print('W')
    print(sess.run(W))
    print('b')
    print(sess.run(b))
    print('tf.matmul(x, W) + b')
    print(sess.run(tf.matmul(x, W) + b, feed_dict={x: batch_xs, y_: batch_ys}))
    print('tf.nn.softmax(tf.matmul(x, W) + b)')
    print(sess.run(tf.nn.softmax(tf.matmul(x, W) + b), feed_dict={x: batch_xs, y_: batch_ys}))

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

test_outputs = np.random.rand(300, 1)

## the following prints 1
print(sess.run(accuracy, feed_dict={x: test_inputs, y_: test_outputs}))
By the definition of softmax, it "'squashes' a K-dimensional vector of arbitrary real values to a K-dimensional vector of real values in the range (0, 1) that add up to 1". If there is only one output value, then the categorical distribution softmax produces is simply 1, rather than a set of values that add up to 1.
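A quick way to see this outside of TensorFlow is to apply a row-wise softmax to a single-column matrix in plain NumPy (a minimal sketch with made-up values, not the code from the question):

import numpy as np

def softmax(z):
    # Row-wise softmax, with the usual max-subtraction for numerical stability.
    z = z - np.max(z, axis=1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=1, keepdims=True)

logits = np.array([[977.2], [572.4], [353.6], [0.476], [658.9]])

# Each row contains a single value, so each row's distribution is trivially [1.].
print(softmax(logits))   # prints a column of 1s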
Your softmax function appears to be applied to each individual value of the output vector. Try transposing the output, i.e. change tf.nn.softmax(tf.matmul(x, W) + b) to tf.nn.softmax(tf.transpose(tf.matmul(x, W) + b)).
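For reference, here is a small NumPy sketch of what that transpose does: softmax is then taken across the five batch entries, so the outputs sum to 1 over the batch rather than per example (made-up values, as above):

import numpy as np

logits = np.array([[977.2], [572.4], [353.6], [0.476], [658.9]])

t = logits.T                              # shape (1, 5): one row of 5 values
t = t - np.max(t, axis=1, keepdims=True)  # stabilize before exponentiating
probs = np.exp(t) / np.sum(np.exp(t), axis=1, keepdims=True)

print(probs)        # one distribution over the 5 batch entries
print(probs.sum())  # 1.0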
It looks like you only have two classes, {yes, no}, and tf.matmul(x, W) + b represents the probability of {yes}. In that case you should use tf.nn.sigmoid_cross_entropy_with_logits instead of softmax. Something like:
y_pred = tf.matmul(x, W) + b
loss = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(logits=y_pred, labels=y_))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
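To round that out, here is a minimal sketch of how prediction and accuracy could then be computed, assuming the labels are 0/1 values of shape [None, 1] (the variable names beyond those in the answer are illustrative):

# Probability of the {yes} class comes from a sigmoid over the single logit.
y_prob = tf.sigmoid(y_pred)

# Threshold at 0.5 for a hard 0/1 prediction per example.
prediction = tf.cast(y_prob > 0.5, tf.float32)

correct_prediction = tf.equal(prediction, y_)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))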
Your cross-entropy loss is incomplete. Use a cross-entropy op that takes logits.
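A minimal sketch of that suggestion, assuming the model is widened to two output units with one-hot labels (both are assumptions, not part of the original code):

x  = tf.placeholder(tf.float32, [None, 2])
y_ = tf.placeholder(tf.float32, [None, 2])  # one-hot labels, e.g. [1, 0] or [0, 1]

W = tf.Variable(tf.random_normal([2, 2]))
b = tf.Variable(tf.random_normal([2]))

logits = tf.matmul(x, W) + b                # note: no softmax applied here

# The op applies softmax internally and computes the loss in a numerically stable way.
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)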