Truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I tried to make a pairplot with seaborn from my CSV data (this link), following the code on the seaborn site.

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

freq_data = pd.read_csv('C:\Users\frequency.csv')

freq = sns.load_dataset(freq_data)
df = sns.pairplot(iris, hue="condition", height=2.5)
plt.show()

The result is a traceback saying the truth value of the DataFrame is ambiguous:

Traceback (most recent call last):
  File "\.vscode\test.py", line 8, in <module>
    freq = sns.load_dataset(freq_data)
  File "\site-packages\seaborn\utils.py", line 485, in load_dataset
    if name not in get_dataset_names():
  File "\site-packages\pandas\core\generic.py", line 1441, in __nonzero__   
    raise ValueError(
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I have checked my data, and this is what I get:

        condition    area  sphericity  aspect_ratio
0      20 kHz   0.249       0.287         1.376
1      20 kHz   0.954       0.721         1.421
2      20 kHz   0.118       0.260         1.409
3      20 kHz   0.540       0.552         1.526
4      20 kHz   0.448       0.465         1.160
..        ...     ...         ...           ...
310    30 kHz   6.056       0.955         2.029
311    30 kHz   4.115       1.097         1.398
312    30 kHz  11.055       1.816         1.838
313    30 kHz   4.360       1.183         1.162
314    30 kHz  10.596       0.940         1.715

[315 rows x 4 columns]
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 315 entries, 0 to 314
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   condition     315 non-null    object 
 1   area          315 non-null    float64
 2   sphericity    315 non-null    float64
 3   aspect_ratio  315 non-null    float64
dtypes: float64(3), object(1)
memory usage: 10.0+ KB

I don't know what is wrong with my DataFrame :( Please suggest how to solve this.

Thanks, everyone.

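The first argument of seaborn.load_dataset() is the name of a dataset ({name}.csv on https://github.com/mwaskom/seaborn-data), not a pandas.DataFrame object. The return value of seaborn.load_dataset() is itself a pandas.DataFrame, and pd.read_csv() has already given you one, so this line is not needed: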

freq = sns.load_dataset(freq_data)

Also, you probably want freq_data instead of iris in df = sns.pairplot(iris, hue="condition", height=2.5).
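To see where the ValueError comes from, here is a minimal sketch that reproduces it with pandas alone. The traceback shows seaborn testing `name not in get_dataset_names()`; when `name` is a DataFrame, that membership test compares the DataFrame to each string and then needs a single True/False from the elementwise result. The `dataset_names` list and the two-row DataFrame below are illustrative stand-ins, not seaborn's actual data.

```python
import pandas as pd

# Small stand-in for the DataFrame returned by pd.read_csv (illustrative values).
freq_data = pd.DataFrame({"condition": ["20 kHz", "30 kHz"], "area": [0.249, 6.056]})

# Hypothetical stand-in for the list of names that seaborn checks against.
dataset_names = ["iris", "tips"]

try:
    # `in` evaluates freq_data == "iris", which is a boolean DataFrame;
    # reducing that to one True/False is what raises the ValueError.
    found = freq_data in dataset_names
    message = None
except ValueError as exc:
    message = str(exc)

print(message)

# A plain string name has an unambiguous answer.
print("iris" in dataset_names)
```

So the error says nothing about the contents of the CSV; it only means a DataFrame was passed where a dataset name (a string) was expected.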

Here is the final example code:

from io import StringIO
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

TESTDATA = StringIO("""condition;area;sphericity;aspect_ratio
20 kHz;0.249;0.287;1.376
20 kHz;0.954;0.721;1.421
20 kHz;0.118;0.260;1.409
20 kHz;0.540;0.552;1.526
20 kHz;0.448;0.465;1.160
30 kHz;6.056;0.955;2.029
30 kHz;4.115;1.097;1.398
30 kHz;11.055;1.816;1.838
30 kHz;4.360;1.183;1.162
30 kHz;10.596;0.940;1.715
    """)

freq_data = pd.read_csv(TESTDATA, sep=";")

sns.pairplot(freq_data, hue="condition", height=2.5)
plt.show()
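With a file on disk instead of StringIO, the same idea can be sketched as follows. This assumes a comma-separated file (the default for pd.read_csv); the temporary path and the four sample rows are only there to make the sketch self-contained. On Windows, a raw string such as r'C:\Users\frequency.csv' keeps the backslashes from being read as escape sequences.

```python
import os
import tempfile

import pandas as pd

# Sample rows taken from the question's output; assumed comma-separated.
csv_text = """condition,area,sphericity,aspect_ratio
20 kHz,0.249,0.287,1.376
20 kHz,0.954,0.721,1.421
30 kHz,6.056,0.955,2.029
30 kHz,4.115,1.097,1.398
"""

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical location standing in for the real frequency.csv.
    csv_path = os.path.join(tmp_dir, "frequency.csv")
    with open(csv_path, "w", encoding="utf-8") as handle:
        handle.write(csv_text)
    # read_csv already returns a DataFrame; no load_dataset step is needed.
    freq_data = pd.read_csv(csv_path)

# Quick checks before plotting: right type, hue column present, numeric columns found.
is_dataframe = isinstance(freq_data, pd.DataFrame)
has_hue_column = "condition" in freq_data.columns
numeric_columns = freq_data.select_dtypes(include="number").columns.tolist()

print(is_dataframe, has_hue_column, numeric_columns)
```

If those checks pass, freq_data can go straight into sns.pairplot(freq_data, hue="condition", height=2.5) as in the example above.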