python sm.ols: change format of summary to avoid scientific notation

Question

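I am running an OLS model and I need to know all the coefficients so that I can use them in my analysis. How can I display or save the coefficients in a format other than scientific notation?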

import statsmodels.formula.api as sm  # sm.ols (lowercase) is the formula interface

model = sm.ols(formula="sales ~ product_category + quantity_bought + quantity_ordered + quantity_returned + season", data=final_email).fit()
print(model.summary())

OLS Regression Results                            
==============================================================================
Dep. Variable:                sales   R-squared:                       0.974
Model:                            OLS   Adj. R-squared:                  0.938
Method:                 Least Squares   F-statistic:                     27.26
Date:                Tue, 18 Apr 2017   Prob (F-statistic):           5.39e-13
Time:                        11:43:36   Log-Likelihood:                -806.04
No. Observations:                  60   AIC:                             1682.
Df Residuals:                      25   BIC:                             1755.
Df Model:                          34                                         
Covariance Type:            nonrobust                                         
======================================================================================
                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
--------------------------------------------------------------------------------------
Intercept            -2.79e+05   2.883e+05     -0.987      0.333     -8.92e+05  3.14e+05
Product_category[A]   4.343e+04   2.456e+05      0.186      0.854     -4.95e+05  5.93e+05
Product_category[B]   2.784e+05    1.23e+05      1.128      0.270     -1.68e+05  5.75e+05
quantity_bought       -74678      1.754e+05     -0.048      0.962      -3.4e+05  3.24e+05
quantity_ordered      3.543e+05   1.363e+05      1.827      0.080     -4.21e+04  7.05e+05
quantity_returned     1.285e+05   2.154e+05      0.512      0.613     -4.61e+05  7.66e+05
season               -1.983e+04   1.76e+05     -0.133      0.895     -2.69e+05  

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The smallest eigenvalue is 1.19e-29. This might indicate that there are
strong multicollinearity problems or that the design matrix is singular.

This did not help:

pd.set_option('display.float_format', lambda x: '%3.f' % x)

Answer 1

You cannot do this with statsmodels version 0.8.0 at the moment, because the RegressionResultsWrapper.summary() method does not support it. Only xname, yname, alpha and title are available as arguments.

However, there is an experimental function called RegressionResultsWrapper.summary2(), which has a float_format argument that may let you do what you need.

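Since this is an experimental feature, you may run into bugs when using it, and some results may look different from what you expect when you specify float_format.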

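I looked at the source code and found that some of the output formatting is hard-coded. So if float_format does not work, editing the source file may be your last resort. Don't worry, it may not be as hard as you think. Feel free to ask.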
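As a rough sketch of what the summary2() call could look like: the data below is synthetic and the names df, x, y and smf are made up for illustration (the question imports the formula API as sm). float_format is passed as a printf-style string, matching the "%.4f" default in the function signature.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (hypothetical), scaled so the coefficients are large.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(scale=1e3, size=50)})
df["y"] = 2.5e5 + 340.0 * df["x"] + rng.normal(scale=1e4, size=50)

results = smf.ols("y ~ x", data=df).fit()

# float_format is a printf-style format string used for the parameter table.
summary_text = results.summary2(float_format="%.2f").as_text()
print(summary_text)
```

Other parts of the output, such as the model information block at the top, may still use their own formatting, so it is worth checking the printed result.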

Answer 2

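This formatting is hard-coded in the statsmodels source. The best way to get the coefficients is to use model.params. Look at the source for RegressionResults, in particular its attributes, which will show you how to access all the relevant information about your fitted model.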
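A minimal sketch of that approach, assuming a model fitted through the formula interface so that params is a pandas Series. The data and the names df, x, y and coef_table are hypothetical; bse and pvalues are two of the other RegressionResults attributes.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (hypothetical), only to have a fitted model to inspect.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(scale=1e3, size=50)})
df["y"] = 2.5e5 + 340.0 * df["x"] + rng.normal(scale=1e4, size=50)

results = smf.ols("y ~ x", data=df).fit()

# Collect the raw numbers in a DataFrame instead of parsing the summary text.
coef_table = pd.DataFrame({
    "coef": results.params,
    "std_err": results.bse,
    "p_value": results.pvalues,
})

# Display with fixed-point formatting.
print(coef_table.to_string(float_format=lambda v: f"{v:.4f}"))

# With no path given, to_csv returns the CSV text; pass a filename to save it.
csv_text = coef_table.to_csv(float_format="%.4f")
```

Because the values stay as plain floats in coef_table, they can be used directly in later calculations and only formatted when displayed or saved.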

Answer 3

As of version 0.10.2, there is an experimental function summary2() that takes a float_format argument.

Here is the function's docstring from the source code:

    def summary2(self, yname=None, xname=None, title=None, alpha=.05,
                 float_format="%.4f"):
        """
        Experimental summary function to summarize the regression results.
        Parameters
        ----------
        yname : str
            The name of the dependent variable (optional).
        xname : list[str], optional
            Names for the exogenous variables. Default is `var_##` for ## in
            the number of regressors. Must match the number of parameters
            in the model.
        title : str, optional
            Title for the top table. If not None, then this replaces the
            default title.
        alpha : float
            The significance level for the confidence intervals.
        float_format : str
            The format for floats in parameters summary.
        Returns
        -------
        Summary
            Instance holding the summary tables and text, which can be printed
            or converted to various output formats.
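The Summary object returned by summary2() also keeps its tables as pandas DataFrames in a tables list, which gives another way to get at the numbers. This is a sketch under assumptions: the data is synthetic, and the table position (index 1 for the coefficients) and the "Coef." column name are what I see in the versions I have used, so check them against your installed version.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (hypothetical).
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(scale=1e3, size=50)})
df["y"] = 2.5e5 + 340.0 * df["x"] + rng.normal(scale=1e4, size=50)

results = smf.ols("y ~ x", data=df).fit()
summ = results.summary2(float_format="%.3f")

# Assumption: tables[0] holds the model information and tables[1] holds the
# coefficient table as a DataFrame of unformatted floats.
coef_df = summ.tables[1]
print(coef_df.round(3))
```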
