Statsmodels (Patsy) illegal variable name / 'Series' object is not callable error

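Update:

The error may be caused by my dataset also containing a variable named "Q", which conflicts with the Q function. If so, how can I resolve this cleanly?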
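A minimal sketch of how such a clash could produce this error. It assumes that names in a formula are looked up in the data first and in the built-in helpers second; the `Q` function, the `helpers` dict and the plain lists standing in for pandas Series are all hypothetical stand-ins, not patsy's actual code.

```python
# Hypothetical data: plain lists stand in for the DataFrame columns.
data = {"SPY": [1.0, 2.0], "INF": [0.1, 0.2], "Q": [5.0, 6.0]}


def Q(name):
    """Stand-in for a quoting helper: return the column called `name`."""
    return data[name]


helpers = {"Q": Q}

# Assumption: data entries take precedence over helpers of the same name.
namespace = {**helpers, **data}

try:
    eval("Q('INF')", {}, namespace)
    outcome = "ok"
except TypeError as exc:
    # "Q" now refers to the column, not the helper, so calling it fails.
    outcome = str(exc)

# Without a column named "Q", the same expression resolves normally.
data_without_q = {k: v for k, v in data.items() if k != "Q"}
resolved = eval("Q('INF')", {}, {**helpers, **data_without_q})

print(outcome)
print(resolved)
```

With a real DataFrame the object being called would be a Series instead of a list, which would match the "'Series' object is not callable" message.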


Update: you can download my dataset here.


I am running a simple OLS regression using statsmodels and a pandas DataFrame, as follows:

import statsmodels.formula.api as sm
import pandas as pd
df=pd.read_csv("exp.csv")
#df is a dataframe that I have containing many variable names such as AAPL, SPY, INF, etc.
for column in df: 
    result=sm.ols(formula="SPY"+" ~ "+column, data=df).fit()

However, one of the column names in df is INF. I suspect INF might be a reserved word in patsy, because the code gives me the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/statsmodels/base/model.py", line 155, in from_formula
    missing=missing)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/statsmodels/formula/formulatools.py", line 65, in handle_formula_data
    NA_action=na_action)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 310, in dmatrices
    NA_action, return_type)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 165, in _do_highlevel_design
    NA_action)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 62, in _try_incr_builders
    formula_like = ModelDesc.from_formula(formula_like)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 165, in from_formula
    value = Evaluator().eval(tree, require_evalexpr=False)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 400, in eval
    result = self._evaluators[key](self, tree)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 221, in _eval_any_tilde
    exprs = [evaluator.eval(arg) for arg in tree.args]    
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 400, in eval
    result = self._evaluators[key](self, tree)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/desc.py", line 355, in _eval_number
    "only allowed with **", tree)
patsy.PatsyError: numbers besides '0' and '1' are only allowed with **
    SPY ~ INF
          ^^^

I also tried using the Q function:

result=sm.ols(formula="SPY"+" ~ "+"Q('INF')", data=df).fit()

However, that gives me the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/statsmodels/base/model.py", line 155, in from_formula
    missing=missing)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/statsmodels/formula/formulatools.py", line 65, in handle_formula_data
    NA_action=na_action)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 310, in dmatrices
    NA_action, return_type)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 165, in _do_highlevel_design
    NA_action)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/highlevel.py", line 70, in _try_incr_builders
    NA_action)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/build.py", line 696, in design_matrix_builders
    NA_action)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/build.py", line 443, in _examine_factor_types
    value = factor.eval(factor_states[factor], data)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/eval.py", line 566, in eval
    data)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/eval.py", line 551, in _eval
    inner_namespace=inner_namespace)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/compat.py", line 36, in call_and_wrap_exc
    return f(*args, **kwargs)
  File "/home/ap248/.local/easybuild/software/2017/Core/miniconda2/4.3.27/lib/python2.7/site-packages/patsy/eval.py", line 166, in eval
    + self._namespaces))
  File "<string>", line 1, in <module>
TypeError: 'Series' object is not callable

Any idea how to fix this?

According to this link: http://patsy.readthedocs.io/en/latest/builtins-reference.html#patsy.builtins.Q you can use Q("var") in the formula to get rid of the error.

The code below should work.

model = sm.ols('SPY ~ Q("INF")',data=df).fit()

A complete example:

import statsmodels.formula.api as sm
import pandas as pd
import numpy as np

a = np.random.rand(10,3)
df = pd.DataFrame(data=a, columns=['SPY','INF','X'])

model = sm.ols('SPY ~ Q("INF")',data=df).fit()
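For the loop in the question, the same quoting can be applied to every column when the formula string is built. This is only a sketch: `quoted_formula` is a hypothetical helper, the column names are taken from the example above, and it assumes no column name contains a double quote and no column is itself named Q.

```python
def quoted_formula(response, predictor):
    """Build a formula string that wraps both names in Q("...")."""
    return 'Q("{}") ~ Q("{}")'.format(response, predictor)


# Hypothetical column names matching the example DataFrame above.
columns = ["SPY", "INF", "X"]

# One formula per predictor, skipping the response column itself.
formulas = [quoted_formula("SPY", c) for c in columns if c != "SPY"]

for formula in formulas:
    print(formula)
```

Each resulting string could then be passed as `sm.ols(formula=formula, data=df).fit()`.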

I solved this by skipping the formula and using the direct interface instead:

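import statsmodels.api as sm  # add_constant and OLS come from statsmodels.api, not formula.api

for column in df:
    # Mirror the formula SPY ~ column: SPY is the response, column the regressor
    Y, X = df['SPY'], df[column]
    X = sm.add_constant(X)  # add the intercept term that the formula interface adds implicitly
    result = sm.OLS(Y, X).fit()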

The formula interface still feels a bit buggy and not very pleasant to use.

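You need to rename the column. Judging by the error message, patsy reads the name INF as a number (most likely the floating-point value inf) rather than as a variable name, so statsmodels cannot fit the regression. Rename it, for example INF --> Infraction or anything else that cannot be read as a number.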
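A rough way to find the columns that need renaming, as a sketch. It assumes the problematic names are exactly those that Python's `float()` accepts (such as INF or nan), plus Q because it collides with the quoting function. `parses_as_number`, `safe_column_names` and the `col_` prefix are hypothetical names chosen for illustration.

```python
def parses_as_number(name):
    """Return True if float() accepts the value, e.g. 'INF', 'nan' or '1e5'."""
    try:
        float(name)
    except (TypeError, ValueError):
        return False
    return True


def safe_column_names(columns, reserved=("Q",)):
    """Map each risky column name to a prefixed replacement; skip the rest."""
    mapping = {}
    for name in columns:
        if parses_as_number(name) or name in reserved:
            mapping[name] = "col_" + str(name)
    return mapping


# Hypothetical column names based on the question.
columns = ["SPY", "AAPL", "INF", "Q"]
renames = safe_column_names(columns)
print(renames)
```

The mapping can be applied with `df = df.rename(columns=renames)` before building any formulas. The sketch does not check whether a prefixed name already exists in the DataFrame.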