Softmax logistic regression: Different performance by scikit-learn and TensorFlow
I am trying to fit a simple linear softmax model to some data. LogisticRegression in scikit-learn seems to work fine, and I am now trying to port the code to TensorFlow, but I am not getting the same performance; it is considerably worse. I understand that the results will not be exactly equal (scikit-learn has regularization parameters and so on), but this is too far off.
import pandas as pd
import tensorflow as tf
from sklearn import linear_model
from sklearn.metrics import precision_score, recall_score, f1_score
from sklearn.preprocessing import OneHotEncoder

total = pd.read_feather('testfile.feather')
labels = total['labels']
features = total[['f1', 'f2']]
print(labels.shape)
print(features.shape)
classifier = linear_model.LogisticRegression(C=1e5, solver='newton-cg', multi_class='multinomial')
classifier.fit(features, labels)
pred_labels = classifier.predict(features)
print("SCI-KITLEARN RESULTS: ")
print('\tAccuracy:', classifier.score(features, labels))
print('\tPrecision:', precision_score(labels, pred_labels, average='macro'))
print('\tRecall:', recall_score(labels, pred_labels, average='macro'))
print('\tF1:', f1_score(labels, pred_labels, average='macro'))
# now try softmax regression with tensorflow
print("\n\nTENSORFLOW RESULTS: ")
## By default, the OneHotEncoder class will return a more efficient sparse encoding.
## This may not be suitable for some applications, such as use with the Keras deep learning library.
## In this case, we disabled the sparse return type by setting the sparse=False argument.
enc = OneHotEncoder(sparse=False)
enc.fit(labels.values.reshape(len(labels), 1)) # Reshape is required because the encoder expects 2D input
labels_one_hot = enc.transform(labels.values.reshape(len(labels), 1))
# tf Graph Input
x = tf.placeholder(tf.float32, [None, 2]) # 2 input features
y = tf.placeholder(tf.float32, [None, 5]) # 5 output classes
# Set model weights
W = tf.Variable(tf.zeros([2, 5]))
b = tf.Variable(tf.zeros([5]))
# Construct model
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax
clas = tf.argmax(pred, axis=1)
# Minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), axis=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()
# Start training
with tf.Session() as sess:
    # Run the initializer
    sess.run(init)
    # Training cycle
    for epoch in range(1000):
        # Run optimization op (backprop) and cost op (to get loss value)
        _, c = sess.run([optimizer, cost], feed_dict={x: features, y: labels_one_hot})
    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    class_out = clas.eval({x: features})
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("\tAccuracy:", accuracy.eval({x: features, y: labels_one_hot}))
    print('\tPrecision:', precision_score(labels, class_out, average='macro'))
    print('\tRecall:', recall_score(labels, class_out, average='macro'))
    print('\tF1:', f1_score(labels, class_out, average='macro'))
The output of this code is:
(1681,)
(1681, 2)
SCI-KITLEARN RESULTS:
Accuracy: 0.822129684711
Precision: 0.837883361162
Recall: 0.784522522208
F1: 0.806251963817
TENSORFLOW RESULTS:
Accuracy: 0.694825
Precision: 0.735883666192
Recall: 0.649145125846
F1: 0.678045562185
I have checked the result of the one-hot encoding and the data, but I have no idea why the results in TF are so much worse.
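(For illustration, a consistency check on the encoding could look like the sketch below; this exact check is not part of my original code.)

import numpy as np

# Illustrative sanity check: every row of the one-hot matrix should contain
# exactly one 1, and the argmax column, mapped through the sorted unique label
# values, should reproduce the original labels.
assert labels_one_hot.shape == (len(labels), 5)
assert np.all(labels_one_hot.sum(axis=1) == 1)
classes = np.unique(labels.values)
assert np.array_equal(classes[labels_one_hot.argmax(axis=1)], labels.values)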
Any suggestions would be appreciated.
The problem turned out to be a silly one: I simply needed more epochs and a smaller learning rate (for efficiency I switched to AdamOptimizer). The results are now equal, although the TF implementation is much slower.
(1681,)
(1681, 2)
SCI-KITLEARN RESULTS:
Accuracy: 0.822129684711
Precision: 0.837883361162
Recall: 0.784522522208
F1: 0.806251963817
TENSORFLOW RESULTS:
Accuracy: 0.82213
Precision: 0.837883361162
Recall: 0.784522522208
F1: 0.806251963817
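For reference, a minimal sketch of the change described above, reusing the placeholders, variables, and data (x, y, W, b, features, labels_one_hot) defined in the question. The learning rate (0.001) and the iteration count (20000) below are illustrative assumptions, not the exact values I settled on.

# Same graph as in the question; only the optimizer, step size, and number of
# iterations change.
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits)
clas = tf.argmax(pred, axis=1)

# The manual cross entropy from the question is kept here; note that
# tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits) would be
# a numerically more stable alternative.
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred), axis=1))

# Adam instead of plain gradient descent, with a smaller step size and far
# more iterations than the original 1000.
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(20000):
        sess.run(train_op, feed_dict={x: features, y: labels_one_hot})
    class_out = clas.eval({x: features})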