Why does Python say that a value does not exist when it specifically does?

Short explanation:

The main problem is that whenever I run the following code, I get the error shown below:

import statsmodels.api as sm
from statsmodels.formula.api import ols    
def onewayanaova (csv, vars, x="x-axis", y="y-axis"):
        df = pd.read_csv(csv, delimiter=",") 
        df_melt = pd.melt(df.reset_index(), id_vars=['index'], value_vars=vars)
        df_melt.columns = ['index', {x}, {y}]
        model = ols(f'{y} ~ C({x})', data=df_melt).fit()
        anova_table = sm.stats.anova_lm(model, typ=2)
        print("The One-Way Anova Test Values are:\n")
        print(anova_table)
onewayanaova("Book1.csv", ["a","b","c"])

The error is:

Traceback (most recent call last):
  File "pandas\_libs\hashtable_class_helper.pxi", line 5231, in pandas._libs.hashtable.PyObjectHashTable.map_locations
TypeError: unhashable type: 'set'
Exception ignored in: 'pandas._libs.index.IndexEngine._call_map_locations'
Traceback (most recent call last):
  File "pandas\_libs\hashtable_class_helper.pxi", line 5231, in pandas._libs.hashtable.PyObjectHashTable.map_locations
TypeError: unhashable type: 'set'
Traceback (most recent call last):
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\compat.py", line 36, in call_and_wrap_exc
    return f(*args, **kwargs)
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\eval.py", line 165, in eval
    return eval(code, {}, VarLookupDict([inner_namespace]
  File "<string>", line 1, in <module>
NameError: name 'axis' is not defined

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Users\mghaf\Desktop\Python Codes\ReMan Edu\test.py", line 3, in <module>
    mn.onewayanaova("Book1.csv", ["a","b","c"])
  File "c:\Users\mghaf\Desktop\Python Codes\ReMan Edu\maincode.py", line 154, in onewayanaova
    model = ols(f'{y} ~ C({x})', data=df_melt).fit()
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\statsmodels\base\model.py", line 200, in from_formula
    tmp = handle_formula_data(data, None, formula, depth=eval_env,
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\statsmodels\formula\formulatools.py", line 63, in handle_formula_data
    result = dmatrices(formula, Y, depth, return_type='dataframe',
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\highlevel.py", line 309, in dmatrices
    (lhs, rhs) = _do_highlevel_design(formula_like, data, eval_env,
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\highlevel.py", line 164, in _do_highlevel_design
    design_infos = _try_incr_builders(formula_like, data_iter_maker, eval_env,
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\highlevel.py", line 66, in _try_incr_builders
    return design_matrix_builders([formula_like.lhs_termlist,
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\build.py", line 693, in design_matrix_builders
    cat_levels_contrasts) = _examine_factor_types(all_factors,
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\build.py", line 443, in _examine_factor_types
    value = factor.eval(factor_states[factor], data)
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\eval.py", line 564, in eval
    return self._eval(memorize_state["eval_code"],
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\eval.py", line 547, in _eval
    return call_and_wrap_exc("Error evaluating factor",
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\compat.py", line 43, in call_and_wrap_exc
    exec("raise new_exc from e")
  File "<string>", line 1, in <module>
patsy.PatsyError: Error evaluating factor: NameError: name 'axis' is not defined
    y-axis ~ C(x-axis)
             ^^^^^^^^^

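I think the cause is the X and Y variables I set in def onewayanaova (csv, vars, x="x-axis", y="y-axis"):. Maybe I need to change them so that I do not get the error message?

If you need a more detailed explanation, please read below.

Detailed explanation:

I am trying to run a one-way ANOVA test. However, the main problem is that Python keeps raising a NameError and says that one of my values is not defined.

I am running the following code: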

import statsmodels.api as sm
from statsmodels.formula.api import ols    
def onewayanaova (csv, vars, x="x-axis", y="y-axis"):
        df = pd.read_csv(csv, delimiter=",") 
        df_melt = pd.melt(df.reset_index(), id_vars=['index'], value_vars=vars)
        df_melt.columns = ['index', {x}, {y}]
        model = ols(f'{y} ~ C({x})', data=df_melt).fit()
        anova_table = sm.stats.anova_lm(model, typ=2)
        print("The One-Way Anova Test Values are:\n")
        print(anova_table)

And:

import maincode as mn
mn.onewayanaova("Book1.csv", ["a","b","c"])

I get the following error (the first piece of code is saved to a file named maincode.py, and the second to a file named test.py; "Book1.csv" is in the same folder as both). The error is:

Traceback (most recent call last):
  File "pandas\_libs\hashtable_class_helper.pxi", line 5231, in pandas._libs.hashtable.PyObjectHashTable.map_locations
TypeError: unhashable type: 'set'
Exception ignored in: 'pandas._libs.index.IndexEngine._call_map_locations'
Traceback (most recent call last):
  File "pandas\_libs\hashtable_class_helper.pxi", line 5231, in pandas._libs.hashtable.PyObjectHashTable.map_locations
TypeError: unhashable type: 'set'
Traceback (most recent call last):
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\compat.py", line 36, in call_and_wrap_exc
    return f(*args, **kwargs)
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\eval.py", line 165, in eval
    return eval(code, {}, VarLookupDict([inner_namespace]
  File "<string>", line 1, in <module>
NameError: name 'axis' is not defined

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Users\mghaf\Desktop\Python Codes\ReMan Edu\test.py", line 3, in <module>
    mn.onewayanaova("Book1.csv", ["a","b","c"])
  File "c:\Users\mghaf\Desktop\Python Codes\ReMan Edu\maincode.py", line 154, in onewayanaova
    model = ols(f'{y} ~ C({x})', data=df_melt).fit()
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\statsmodels\base\model.py", line 200, in from_formula
    tmp = handle_formula_data(data, None, formula, depth=eval_env,
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\statsmodels\formula\formulatools.py", line 63, in handle_formula_data
    result = dmatrices(formula, Y, depth, return_type='dataframe',
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\highlevel.py", line 309, in dmatrices
    (lhs, rhs) = _do_highlevel_design(formula_like, data, eval_env,
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\highlevel.py", line 164, in _do_highlevel_design
    design_infos = _try_incr_builders(formula_like, data_iter_maker, eval_env,
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\highlevel.py", line 66, in _try_incr_builders
    return design_matrix_builders([formula_like.lhs_termlist,
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\build.py", line 693, in design_matrix_builders
    cat_levels_contrasts) = _examine_factor_types(all_factors,
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\build.py", line 443, in _examine_factor_types
    value = factor.eval(factor_states[factor], data)
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\eval.py", line 564, in eval
    return self._eval(memorize_state["eval_code"],
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\eval.py", line 547, in _eval
    return call_and_wrap_exc("Error evaluating factor",
  File "C:\Users\mghaf\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\patsy\compat.py", line 43, in call_and_wrap_exc
    exec("raise new_exc from e")
  File "<string>", line 1, in <module>
patsy.PatsyError: Error evaluating factor: NameError: name 'axis' is not defined
    y-axis ~ C(x-axis)
             ^^^^^^^^^

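The main error I can see is that I named the X and Y variables x="x-axis", y="y-axis". But I do not understand why this gives me an error, because I made a very neat box plot with the same names (although I know that X and Y are used as axis titles there):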

def boxplot (csv, vars, x="x-axis", y="y-axis"):
    #https://www.reneshbedre.com/blog/anova.html
    df = pd.read_csv(csv, delimiter=",") 
    df_melt = pd.melt(df.reset_index(), id_vars=['index'], value_vars=vars)
    df_melt.columns = ['index', x, y]
    ax = sns.boxplot(x=x, y=y, data=df_melt, color='#99c2a2')
    ax = sns.swarmplot(x=x, y=y, data=df_melt, color='#7d0013')
    plt.show()

However, whenever I run this code, which I took from someone else, it gives me the output I want:

import statsmodels.api as sm
from statsmodels.formula.api import ols
import pandas as pd
df = pd.read_csv("https://reneshbedre.github.io/assets/posts/anova/onewayanova.txt", sep="\t")
df_melt = pd.melt(df.reset_index(), id_vars=['index'], value_vars=['A', 'B', 'C', 'D'])
df_melt.columns = ['index', 'treatments', 'value']
model = ols('value ~ C(treatments)', data=df_melt).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)

The output I get with the code above:

                sum_sq    df         F    PR(>F)
C(treatments)  3010.95   3.0  17.49281  0.000026
Residual        918.00  16.0       NaN       NaN

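The main problem is that I need to change the names used in model = ols('value ~ C(treatments)', data=df_melt).fit() and df_melt.columns = ['index', 'treatments', 'value'], because most datasets do not have 'treatments' and 'value' as their column names. In case you want to know what my .csv file looks like:

  1. The column headers are a, b, and c
  2. Under each header is a list containing the same number of values

My main question is: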

Please try and help me understand why I cannot replace 'value ~ C(treatments)' with X and Y!

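Code source: https://www.reneshbedre.com/blog/anova.html

In statsmodels formulas, when variables (that is, columns in the data frame) contain special characters such as -, you need to quote them. According to the documentation, your term "x-axis" is interpreted as "x" - "axis". Variables can be quoted with the Q() transform. Make sure to quote the variable name with a different kind of quote (single/double) from the one you use for the string itself: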

model = ols(f'Q("{y}") ~ C(Q("{x}"))', data=df_melt).fit()
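
To see why the unquoted name fails, it may help to look at how Python itself parses the term. The eval frame in the traceback suggests that patsy evaluates the contents of C(...) as Python code, so the sketch below uses only the standard ast module to show what each term parses to. The helper describe_formula_term is a hypothetical name introduced for this illustration and is not part of statsmodels or patsy.

```python
import ast


def describe_formula_term(term: str) -> str:
    """Report what kind of Python expression a formula term parses to.

    Hypothetical helper for illustration only; not a statsmodels/patsy API.
    """
    node = ast.parse(term, mode="eval").body
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Sub):
        # "x-axis" lands here: two separate names joined by a minus sign
        left = getattr(node.left, "id", "?")
        right = getattr(node.right, "id", "?")
        return f"subtraction: {left} - {right}"
    if isinstance(node, ast.Name):
        # a valid identifier such as "treatments" is a single name
        return f"name: {node.id}"
    # anything else, e.g. a function call such as Q("x-axis")
    return type(node).__name__


print(describe_formula_term("x-axis"))        # subtraction: x - axis
print(describe_formula_term("treatments"))    # name: treatments
print(describe_formula_term('Q("x-axis")'))   # Call
```

Inside onewayanaova the parameter x exists as a local variable, which would explain why the NameError complains about axis rather than x: the lookup of x succeeds, and only axis is missing.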

It seems that model = ols('value ~ C(treatments)', data=df_melt).fit() cannot take variable substitution (as I tried in model = ols(f'{y} ~ C({x})', data=df_melt).fit()). The same happened when I used model = ols(f'Q("{y}") ~ C(Q("{x}"))', data=df_melt).fit(), as suggested by @Rob.

So, to make it work with my own names, I just need to rename the columns in df_melt.columns = ['index', 'treatments', 'value'] so that they match model = ols('value ~ C(treatments)', data=df_melt).fit() (the names 'treatments' and 'value' must be the same in both lines of code).