Tensorflow tf.nn.softmax() function performs much better than hand-written softmax
I am writing a simple logistic regression in TensorFlow. I found that when I use tf.nn.softmax, the algorithm converges faster and the final accuracy is higher. If I switch to my own softmax implementation, the network converges more slowly and the final accuracy is not as good.
Here is the code:
SEED = 1025
W = tf.Variable(tf.truncated_normal([image_size * image_size, num_labels], seed=SEED))
b = tf.Variable(tf.zeros([num_labels]))
logits = tf.matmul(train_dataset, W) + b
# My softmax:
y_ = tf.exp(logits) / tf.reduce_sum(tf.exp(logits), axis=0)
# Tensorflow softmax:
y_ = tf.nn.softmax(logits)
y_clipped = tf.clip_by_value(y_, 1e-10, 0.9999999)
loss = -tf.reduce_mean(tf.reduce_sum(train_labels * tf.log(y_clipped), axis=1))
Using my softmax:
Loss at step 0: 22.213934
Training accuracy: 12.7%
Validation accuracy: 13.2%
Loss at step 100: 12.777291
Training accuracy: 45.3%
Validation accuracy: 45.5%
Loss at step 200: 11.361242
Training accuracy: 48.2%
Validation accuracy: 47.4%
Loss at step 300: 10.658278
Training accuracy: 51.4%
Validation accuracy: 49.7%
Loss at step 400: 9.297832
Training accuracy: 59.2%
Validation accuracy: 56.8%
Loss at step 500: 8.902699
Training accuracy: 62.0%
Validation accuracy: 59.2%
Loss at step 600: 8.681184
Training accuracy: 64.2%
Validation accuracy: 61.0%
Loss at step 700: 8.529438
Training accuracy: 65.8%
Validation accuracy: 62.3%
Loss at step 800: 8.416442
Training accuracy: 66.8%
Validation accuracy: 63.3%
Test accuracy: 70.4%
Using TensorFlow's softmax:
Loss at step 0: 13.555875
Training accuracy: 12.7%
Validation accuracy: 14.5%
Loss at step 100: 2.194562
Training accuracy: 72.5%
Validation accuracy: 72.0%
Loss at step 200: 1.808641
Training accuracy: 75.5%
Validation accuracy: 74.5%
Loss at step 300: 1.593390
Training accuracy: 76.8%
Validation accuracy: 75.0%
Loss at step 400: 1.442661
Training accuracy: 77.7%
Validation accuracy: 75.2%
Loss at step 500: 1.327751
Training accuracy: 78.2%
Validation accuracy: 75.4%
Loss at step 600: 1.236314
Training accuracy: 78.5%
Validation accuracy: 75.6%
Loss at step 700: 1.161479
Training accuracy: 78.9%
Validation accuracy: 75.6%
Loss at step 800: 1.098717
Training accuracy: 79.4%
Validation accuracy: 75.8%
Test accuracy: 83.3%
According to the documentation, TensorFlow's softmax should in theory be exactly the same as my implementation, right?
This function performs the equivalent of
softmax = tf.exp(logits) / tf.reduce_sum(tf.exp(logits), axis)
Edit: I added a seed to the normal-distribution initialization and can now reproduce the accuracy results. When setting the axis value in the "My softmax" line, only axis=0 runs without an error. Setting either axis=1 or axis=-1 results in this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 10 and 10000 for 'truediv' (op: 'RealDiv') with input shapes: [10000,10], [10000].
You are passing axis=0 to "your" softmax. I don't know what your data looks like, but 0 is usually the batch axis, and taking the softmax along it is not correct. See the documentation of tf.nn.softmax: the default value of axis is -1. In general, axis should be the dimension that contains the different classes.
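For instance, here is a minimal sketch (with made-up logits, purely for illustration) of what each axis value normalizes over, assuming logits of shape [batch, num_classes]:
import tensorflow as tf

# Made-up logits: 4 examples (batch), 3 classes.
logits = tf.constant([[1.0, 2.0, 3.0],
                      [1.0, 2.0, 3.0],
                      [2.0, 0.0, 1.0],
                      [0.0, 0.0, 5.0]])

# axis=0 sums the exponentials down each column, i.e. across the batch,
# so the columns end up summing to 1 instead of the rows.
per_column = tf.exp(logits) / tf.reduce_sum(tf.exp(logits), axis=0, keepdims=True)

# axis=-1 (the class axis) normalizes each row over its own classes,
# which is what tf.nn.softmax does by default.
per_row = tf.exp(logits) / tf.reduce_sum(tf.exp(logits), axis=-1, keepdims=True)

with tf.Session() as sess:
    c, r = sess.run([per_column, per_row])
    print(c.sum(axis=0))  # [1. 1. 1.]    -- columns sum to 1 (wrong for classification)
    print(r.sum(axis=1))  # [1. 1. 1. 1.] -- rows sum to 1 (what we want)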
- Assuming your softmax implementation is correct:
- First of all, comparing the TensorFlow softmax with a hand-written softmax is not a fair comparison, because your program contains randomness.
- What I mean is that the line
W = tf.Variable(tf.truncated_normal([image_size * image_size, num_labels]))
introduces randomness into your program: the weights are initialized randomly, so you get different results every time you run it.
- You can only compare the two softmaxes if you have some kind of seed (a fixed starting point).
- Now, if you run the experiment above several times and the TensorFlow softmax beats the hand-written softmax every time, then your question is valid.
- The tf.truncated_normal function does accept a seed parameter... you can use that parameter and see what the results are (see the sketch after this list).
- In any case, if your hand-written softmax is correct, then with a seed the TensorFlow softmax and your softmax should produce the same output.
- I even think that in your case the axis should be 1, which is the last axis, since the softmax should be taken along the axis that holds the classes.
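As a minimal sketch of the seeding point above (sizes are hypothetical, chosen only for illustration), fixing the seed makes every run start from identical weights:
import tensorflow as tf

SEED = 1025
image_size, num_labels = 28, 10  # hypothetical sizes, for illustration only

# With an op-level seed, tf.truncated_normal returns the same initial weights
# on every run, so any remaining difference between the two softmax variants
# cannot be blamed on random initialization.
W = tf.Variable(
    tf.truncated_normal([image_size * image_size, num_labels], seed=SEED))
b = tf.Variable(tf.zeros([num_labels]))

# Alternatively, tf.set_random_seed(SEED) sets a graph-level seed for all
# random ops in the graph at once.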
All in all, the following implementation works. You can run it against the MNIST beginner example and get the same accuracy.
# My softmax:
y1 = tf.exp(logits)
y_ = y1 / tf.reduce_sum(y1, keepdims=True, axis=1)
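As a quick check (with made-up logits), this version should agree with tf.nn.softmax up to floating-point error:
import numpy as np
import tensorflow as tf

# Made-up logits, only to compare the two implementations numerically.
logits = tf.constant(np.random.randn(5, 10).astype(np.float32))

y1 = tf.exp(logits)
mine = y1 / tf.reduce_sum(y1, keepdims=True, axis=1)
builtin = tf.nn.softmax(logits)

with tf.Session() as sess:
    a, b = sess.run([mine, builtin])
    print(np.allclose(a, b))  # True: both normalize over the class axis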