ValueError: The truth value of a DataFrame is ambiguous
I have a DataFrame that looks like this:
        total  downloaded  avg_rating
    id
    1       2           2         5.0
    2      12          12         4.5
    3       1           1         5.0
    4       1           1         4.0
    5       0           0         0.0
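I'm trying to add a new column containing the percentage difference between two of the columns, but only for the rows where 'downloaded' is not 0.
I'm trying to use a function for this, as below: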
    def diff(ratings):
        if ratings[ratings.downloaded > 0]:
            val = (ratings['total'] - ratings['downloaded']) / ratings['downloaded']
        else:
            val = 0
        return val

    ratings['Pct Diff'] = diff(ratings)
I get an error:
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-129-729c09bf14e8> in <module>()
          6     return val
          7
    ----> 8 ratings['Pct Diff'] = diff(ratings)

    <ipython-input-129-729c09bf14e8> in diff(ratings)
          1 def diff(ratings):
    ----> 2     if ratings[ratings.downloaded > 0]:
          3         val = (ratings['total'] - ratings['downloaded']) / ratings['downloaded']
          4     else:
          5         val = 0

    ~\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
        953         raise ValueError("The truth value of a {0} is ambiguous. "
        954                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
    --> 955                          .format(self.__class__.__name__))
        956
        957     __bool__ = __nonzero__

    ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Can someone help me understand what this error means?
Also, is this a good use case for apply? Can I use a condition inside apply? How would I use it in this case?
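The error occurs because you are attempting a row-wise (vectorised) calculation, but inside your function diff() the expression

    ratings[ratings.downloaded > 0]

returns a DataFrame, and an if in front of a DataFrame is ambiguous: pandas cannot tell whether you mean "any row matches" or "all rows match". The error message reflects this.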
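To make the ambiguity concrete, here is a minimal sketch built on the question's data. The names subset, mask and raised are mine, not from the original post. It shows that boolean indexing gives back a DataFrame, that asking for its truth value raises the error, and that the reductions named in the error message each give a single answer.

```python
import pandas as pd

# The question's data, rebuilt by hand for illustration.
ratings = pd.DataFrame(
    {
        "total": [2, 12, 1, 1, 0],
        "downloaded": [2, 12, 1, 1, 0],
        "avg_rating": [5.0, 4.5, 5.0, 4.0, 0.0],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="id"),
)

# Boolean indexing returns another DataFrame (the rows where the mask is True),
# not a single True/False value.
subset = ratings[ratings.downloaded > 0]

# Asking for the truth value of that DataFrame raises the error from the question.
try:
    bool(subset)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Each reduction below collapses the data to one unambiguous boolean.
mask = ratings.downloaded > 0
print(mask.any())    # is at least one row downloaded?
print(mask.all())    # are all rows downloaded?
print(subset.empty)  # did the filter leave no rows at all?
```

None of these reductions is what the question actually needs, though, since the goal is one result per row rather than one decision for the whole frame.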
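You may wish to review Indexing and Selecting Data in the pandas documentation. The solution below sets 0 as the default value at the outset and then overwrites only the rows where 'downloaded' is greater than 0.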
    import pandas as pd

    df = pd.DataFrame([[2, 2, 5.0], [12, 12, 4.5], [10, 5, 3.0],
                       [20, 2, 3.5], [3, 0, 0.0], [0, 0, 0.0]],
                      columns=['total', 'downloaded', 'avg_rating'])

    # Default for every row; a float so the column dtype matches the ratios assigned below
    df['Pct Diff'] = 0.0

    # Overwrite only the rows with downloaded > 0, dividing by 'downloaded' as in the question's formula
    df.loc[df['downloaded'] > 0, 'Pct Diff'] = (df['total'] - df['downloaded']) / df['downloaded']

    #    total  downloaded  avg_rating  Pct Diff
    # 0      2           2         5.0       0.0
    # 1     12          12         4.5       0.0
    # 2     10           5         3.0       1.0
    # 3     20           2         3.5       9.0
    # 4      3           0         0.0       0.0
    # 5      0           0         0.0       0.0
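On the question of whether a condition can be used with apply: it can, provided the function receives one row at a time (axis=1), because the comparison then yields a single boolean. The sketch below is one way to write it, using the question's formula (dividing by 'downloaded'); the function name pct_diff and the two column names are my own, and avg_rating is left out for brevity. A vectorised alternative with np.where is shown alongside and is usually the faster choice on large frames.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"total": [2, 12, 10, 20, 3, 0],
                   "downloaded": [2, 12, 5, 2, 0, 0]})

def pct_diff(row):
    # row is a single row (a Series), so this comparison is one boolean
    if row["downloaded"] > 0:
        return (row["total"] - row["downloaded"]) / row["downloaded"]
    return 0.0

# axis=1 passes each row to pct_diff in turn
df["Pct Diff (apply)"] = df.apply(pct_diff, axis=1)

# Vectorised equivalent: compute the ratio for every row, then keep it only
# where downloaded > 0 and use 0.0 elsewhere
ratio = (df["total"] - df["downloaded"]) / df["downloaded"]
df["Pct Diff (where)"] = np.where(df["downloaded"] > 0, ratio, 0.0)

print(df)
```

Both columns hold the same values here; the apply version reads closer to the original function, while the np.where version avoids a Python-level loop over rows.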
A DataFrame object cannot be converted to a boolean. Change the condition

    if ratings[ratings.downloaded > 0]:

to

    if len(ratings[ratings.downloaded > 0]) > 0:
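It may help to see what this change does on the question's data. The sketch below is my own illustration, not part of the answer: the len(...) > 0 test is a single check for the whole frame, so the error goes away, but the division still runs for every row, including the one where 'downloaded' is 0. The follow-up using Series.where is one possible way to zero out those rows and is my suggestion.

```python
import numpy as np
import pandas as pd

ratings = pd.DataFrame(
    {"total": [2, 12, 1, 1, 0], "downloaded": [2, 12, 1, 1, 0]},
    index=pd.Index([1, 2, 3, 4, 5], name="id"),
)

def diff(ratings):
    # True as soon as ANY row has downloaded > 0: one check for the whole frame
    if len(ratings[ratings.downloaded > 0]) > 0:
        val = (ratings["total"] - ratings["downloaded"]) / ratings["downloaded"]
    else:
        val = 0
    return val

result = diff(ratings)
print(result)  # no exception, but id 5 is 0 / 0, which pandas evaluates to NaN

# Possible follow-up: keep the ratio where downloaded > 0, otherwise use 0.0
fixed = result.where(ratings["downloaded"] > 0, 0.0)
print(fixed)
```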