Export summary table of statsmodels regression results as csv
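Suppose I have three statsmodels OLS objects that I want to compare side by side. I can use summary_col to create a summary table, which I can print as text or export to LaTeX.
How can I export this table as csv?
Here is a reproducible example of what I want to do: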
# Libraries
import pandas as pd
import statsmodels.api as sm
from statsmodels.iolib.summary2 import summary_col
# Load silly data and add constant
df = sm.datasets.stackloss.load_pandas().data
df['CONSTANT'] = 1
# Train three silly models
m0 = sm.OLS(df['STACKLOSS'], df[['CONSTANT','AIRFLOW']]).fit()
m1 = sm.OLS(df['STACKLOSS'], df[['CONSTANT','AIRFLOW','WATERTEMP']]).fit()
m2 = sm.OLS(df['STACKLOSS'], df[['CONSTANT','AIRFLOW','WATERTEMP','ACIDCONC']]).fit()
# Results table
res = summary_col([m0,m1,m2], regressor_order=m2.params.index.tolist())
print(res)
================================================
          STACKLOSS I STACKLOSS II STACKLOSS III
------------------------------------------------
CONSTANT  -44.1320    -50.3588     -39.9197
          (6.1059)    (5.1383)     (11.8960)
AIRFLOW   1.0203      0.6712       0.7156
          (0.1000)    (0.1267)     (0.1349)
WATERTEMP             1.2954       1.2953
                      (0.3675)     (0.3680)
ACIDCONC                           -0.1521
                                   (0.1563)
================================================
Standard errors in parentheses.
Is there any way to export res to csv?
The results are stored as a list of data frames:
res.tables
[                STACKLOSS I  STACKLOSS II  STACKLOSS III
CONSTANT           -44.1320      -50.3588       -39.9197
                   (6.1059)      (5.1383)      (11.8960)
AIRFLOW              1.0203        0.6712         0.7156
                   (0.1000)      (0.1267)       (0.1349)
WATERTEMP                          1.2954         1.2953
                                 (0.3675)       (0.3680)
ACIDCONC                                         -0.1521
                                                (0.1563)
R-squared            0.8458        0.9088         0.9136
R-squared Adj.       0.8377        0.8986         0.8983]
This should work:
res.tables[0].to_csv("test.csv")
pd.read_csv("test.csv")
       Unnamed: 0  STACKLOSS I  STACKLOSS II  STACKLOSS III
0        CONSTANT     -44.1320      -50.3588       -39.9197
1             NaN     (6.1059)      (5.1383)      (11.8960)
2         AIRFLOW       1.0203        0.6712         0.7156
3             NaN     (0.1000)      (0.1267)       (0.1349)
4       WATERTEMP          NaN        1.2954         1.2953
5             NaN          NaN      (0.3675)       (0.3680)
6        ACIDCONC          NaN           NaN        -0.1521
7             NaN          NaN           NaN       (0.1563)
8       R-squared       0.8458        0.9088         0.9136
9  R-squared Adj.       0.8377        0.8986         0.8983
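In the read-back above, the row labels end up in an "Unnamed: 0" column and the blank cells come back as NaN. If you would rather get the table back in its original shape, something along these lines may help. It is only a sketch: the small `table` frame is a hypothetical stand-in for `res.tables[0]` (assuming its cells are formatted strings, as the output suggests), and an in-memory buffer is used in place of "test.csv".

```python
import io

import pandas as pd

# Hypothetical stand-in for res.tables[0]: formatted strings, blank row
# labels on the standard-error rows, blank cells for missing regressors.
table = pd.DataFrame(
    {
        "STACKLOSS I": ["-44.1320", "(6.1059)", "", ""],
        "STACKLOSS II": ["-50.3588", "(5.1383)", "1.2954", "(0.3675)"],
    },
    index=["CONSTANT", "", "WATERTEMP", ""],
)

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
table.to_csv(buffer)
buffer.seek(0)

# index_col=0 restores the row labels as the index; dtype=str together
# with keep_default_na=False keeps blanks as empty strings, not NaN.
round_trip = pd.read_csv(buffer, index_col=0, dtype=str, keep_default_na=False)
print(round_trip)
```

With the real table, the same `read_csv` arguments can be passed when reading "test.csv".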