Data prints, but does not write to dataframe
I am trying to calculate the true positive rate, etc., for a binary confusion matrix and write the results to a csv file.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import csv
from sklearn.metrics import confusion_matrix
AllBinary = pd.read_csv('BinaryData.csv')
y_test = AllBinary['Binary_ac']
y_pred = AllBinary['Binary_pred']
cm = confusion_matrix(y_test, y_pred)
stats = pd.DataFrame()
TP = cm[0][0]
FP = cm[0][1]
FN = cm[1][0]
TN = cm[1][1]
stats['TruePositive'] = TP
stats['TrueNegative'] = TN
stats['FalsePositive'] = FP
stats['FalseNegative'] = FN
print(TP)
print(TN)
print(FP)
print(FN)
stats.to_csv('C:/out/' + 'BinaryStats' + '.csv', header = True)
The print output shows that the basic confusion matrix statistics are being calculated:
210483
153902
32845
10788
The csv output contains the headers, but the results are blank. What am I doing wrong?
Update:
print(stats)
Empty DataFrame
Columns: [TruePositive, TrueNegative, FalsePositive, FalseNegative]
Index: []
The problem here is that you can't add data to a df by simply assigning a scalar value to a new column like this:
In [55]:
stats = pd.DataFrame()
stats['TruePositive'] = 210483
stats
Out[55]:
Empty DataFrame
Columns: [TruePositive]
Index: []
You need to construct the df with the desired values in the constructor:
In [62]:
TP = 210483
FP = 153902
FN = 32845
TN = 10788
stats = pd.DataFrame({'TruePositive':[TP], 'TrueNegative':[TN], 'FalsePositive':[FP], 'FalseNegative':[FN]})
stats
Out[62]:
FalseNegative FalsePositive TrueNegative TruePositive
0 32845 153902 10788 210483
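As a minimal sketch of carrying the constructor approach through to the csv output: the block below reuses the values from the transcript above and writes to an in-memory `io.StringIO` buffer, which is only a stand-in for the real output path. Passing `columns=` pins the column order explicitly, since the output above suggests it does not always follow the order in which the dict was written, and `index=False` keeps the row index out of the file.

```python
import io

import pandas as pd

# Values as used in the transcript above.
TP, FP, FN, TN = 210483, 153902, 32845, 10788

# One-element lists give the frame exactly one row.
stats = pd.DataFrame(
    {'TruePositive': [TP], 'TrueNegative': [TN],
     'FalsePositive': [FP], 'FalseNegative': [FN]},
    columns=['TruePositive', 'TrueNegative', 'FalsePositive', 'FalseNegative'],
)

# StringIO stands in for a file path here (an assumption for illustration).
buffer = io.StringIO()
stats.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Replacing `buffer` with a file path writes the same header line and single data row to disk.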
Or add a dummy row, and then your code will work:
In [71]:
# DataFrame.append was removed in pandas 2.0, so create the dummy row directly
stats = pd.DataFrame(['dummy'])
stats['TruePositive'] = TP
stats['TrueNegative'] = TN
stats['FalsePositive'] = FP
stats['FalseNegative'] = FN
stats
Out[71]:
0 TruePositive TrueNegative FalsePositive FalseNegative
0 dummy 210483 10788 153902 32845
Then you can remove the dummy column by calling drop:
In [72]:
stats.drop(0, axis=1)
Out[72]:
TruePositive TrueNegative FalsePositive FalseNegative
0 210483 10788 153902 32845
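So your attempt failed because your initial df was empty and you were assigning a scalar value to a new column. A scalar assignment sets every existing row of the new column to that value. Since your df had no rows, there was nothing to set, which is why your df stayed empty.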
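A small sketch of that behaviour, contrasting an empty frame with one that already has a single index label. The names `empty` and `one_row` are made up for illustration.

```python
import pandas as pd

# No index labels: the scalar has no rows to fill, only the column is created.
empty = pd.DataFrame()
empty['TruePositive'] = 210483
rows_when_empty = len(empty)

# One index label: the same assignment fills that single row.
one_row = pd.DataFrame(index=[0])
one_row['TruePositive'] = 210483
rows_with_index = len(one_row)

print(rows_when_empty, rows_with_index)
```

Creating the frame with `index=[0]` is another way to get the one row that the scalar assignments need, without a dummy column to drop afterwards.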
Another approach is to create the df with a single row (I put NaN in here):
In [77]:
stats = pd.DataFrame([np.nan])  # np.NaN was removed in NumPy 2.0; np.nan works in all versions
stats['TruePositive'] = TP
stats['TrueNegative'] = TN
stats['FalsePositive'] = FP
stats['FalseNegative'] = FN
stats.dropna(axis=1)
Out[77]:
TruePositive TrueNegative FalsePositive FalseNegative
0 210483 10788 153902 32845
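One related point worth checking, offered as a sketch rather than a diagnosis: scikit-learn's `confusion_matrix` puts true labels on the rows and predicted labels on the columns, in sorted label order, so with 0/1 labels and 1 as the positive class `cm[0][0]` counts true negatives rather than true positives. The block below counts the four cells directly with boolean masks so the labelling does not depend on matrix positions. The toy data is invented to stand in for `BinaryData.csv`, and treating 1 as the positive class is an assumption.

```python
import pandas as pd

# Hypothetical toy data standing in for BinaryData.csv; 1 is assumed positive.
AllBinary = pd.DataFrame({
    'Binary_ac':   [1, 1, 1, 0, 0, 0, 0, 1],
    'Binary_pred': [1, 0, 1, 0, 1, 0, 0, 1],
})

actual = AllBinary['Binary_ac'] == 1
predicted = AllBinary['Binary_pred'] == 1

# Count each cell of the confusion matrix by name, not by position.
TP = int((actual & predicted).sum())
TN = int((~actual & ~predicted).sum())
FP = int((~actual & predicted).sum())
FN = int((actual & ~predicted).sum())

# A list holding one dict gives a one-row frame, so nothing is left empty.
stats = pd.DataFrame([{
    'TruePositive': TP, 'TrueNegative': TN,
    'FalsePositive': FP, 'FalseNegative': FN,
}])
print(stats)
```

If the positive class in the real data is coded differently, the two comparisons against 1 are the only lines that need to change.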