SGDClassifier: save the loss from every iteration to an array

Question

When I train an SGDClassifier in scikit-learn, I can print the loss value for every iteration by setting the verbosity. How can I store those values in an array?

Answer 1

This is adapted from an existing answer.

import sys
import numpy as np
from io import StringIO
import matplotlib.pyplot as plt
from sklearn.linear_model import SGDClassifier
from tensorflow.keras.datasets import mnist

# Load MNIST and flatten each 28x28 image into a 784-element vector
(x_tr, y_tr), (x_te, y_te) = mnist.load_data()
x_tr, x_te = x_tr.reshape(-1, 784), x_te.reshape(-1, 784)

Intercept the printed output of SGDClassifier by temporarily replacing sys.stdout with an in-memory buffer.

old_stdout = sys.stdout
sys.stdout = mystdout = StringIO()

Set verbose to 1 so that the model prints its progress while fitting.

clf = SGDClassifier(verbose=1)
clf.fit(x_tr, y_tr)

Restore the original stdout and retrieve the verbose output of SGDClassifier.

sys.stdout = old_stdout
loss_history = mystdout.getvalue()
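As a possible alternative to swapping sys.stdout by hand, the sketch below uses contextlib.redirect_stdout from the standard library, which restores the original stream even if fitting raises an exception. The function noisy_fit is a hypothetical stand-in for clf.fit(...) with verbose=1, and the lines it prints are made up for illustration. Like the manual approach, this is assumed to capture only text written through Python's sys.stdout.

```python
import contextlib
import io


def noisy_fit(n_epochs):
    # Hypothetical stand-in for clf.fit(...) with verbose=1: it only prints.
    for epoch in range(1, n_epochs + 1):
        print(f"-- Epoch {epoch}")
        print(f"Avg. loss: {1.0 / epoch:.6f}")


buffer = io.StringIO()
# Everything printed inside the with-block goes to buffer instead of the console.
with contextlib.redirect_stdout(buffer):
    noisy_fit(3)

# sys.stdout is back to normal here; captured holds the printed text.
captured = buffer.getvalue()
```

In the answer's code, the call inside the with-block would be clf.fit(x_tr, y_tr), and captured would play the role of loss_history.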

Create a list to store the loss values.

loss_list = []

Append the loss values found in the printed output stored in loss_history.

for line in loss_history.split('\n'):
    # Skip lines that do not report a loss
    if "loss: " not in line:
        continue
    # The loss is the number after the last "loss: " on the line
    loss_list.append(float(line.split("loss: ")[-1]))
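The same extraction can also be sketched with a regular expression, which avoids splitting each line twice. The sample text below is made up for illustration and is not real scikit-learn output; the only assumption taken from the answer is that each loss appears as a number right after "loss: ". The names LOSS_PATTERN and parse_losses are hypothetical.

```python
import re

# Made-up sample text for illustration only; not real scikit-learn output.
sample_output = """-- Epoch 1
Norm: 1.00, Avg. loss: 0.750000
-- Epoch 2
Norm: 0.90, Avg. loss: 0.500000
"""

# A decimal number, optionally signed and in scientific notation, after "loss: ".
LOSS_PATTERN = re.compile(r"loss: ([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)")


def parse_losses(text):
    """Return every number that follows 'loss: ' in text, in order."""
    return [float(m.group(1)) for m in LOSS_PATTERN.finditer(text)]


losses = parse_losses(sample_output)
```

Unlike float(), this pattern does not match values such as nan or inf, so lines with those values would be skipped rather than parsed.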

Plot the loss curve (only needed for visualization).

plt.figure()
plt.plot(np.arange(len(loss_list)), loss_list)
plt.xlabel("Time in epochs")
plt.ylabel("Loss")
plt.show()

To save the loss values to an array, convert the list to a NumPy array:

loss_list = np.array(loss_list)

gradient-descent

scikit-learn