How to resolve ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
Here is a sample of the dataset column I am currently working with:
print (data)
Credit Days
0 30
1 Cash & Carry
2 Cash & Carry
3 20
4 20
5 30
6 15
7 10
8 15
9 Cash & Carry
10 10
11 10
12 21
13 Cash & Carry
14 20
15 20
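So the column contains both string and integer values. I need to convert these values into integer ratings and save them in a newly created column, say credit_days_rating. To do that, I wrote this code: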
data = pd.read_csv('test.csv', engine='python')
data['Credit Days'].astype(str)
if data['Credit Days']=='Cash & Carry':
    data['credit_days_rating'] = 4
else :
    data['Credit Days'].astype(int)
    if (data['Credit Days']>= 10) & (data['Credit Days']< 19):
        data['credit_days_rating'] = 3
    elif (data['Credit Days']>= 20) & (data['Credit Days']< 29):
        data['credit_days_rating'] = 2
    else :
        data['credit_days_rating'] = 1
Running it produces the following error log:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-65-f6ecf070a2d4> in <module>()
2
3 data['Credit Days'].astype(str)
----> 4 if (data['Credit Days']=='Cash & Carry'):
5 data['credit_days_rating'] = 5
6 else :
~/anaconda3/envs/tensorflow/lib/python3.5/site-packages/pandas/core/generic.py in __nonzero__(self)
1119 raise ValueError("The truth value of a {0} is ambiguous. "
1120 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
-> 1121 .format(self.__class__.__name__))
1122
1123 __bool__ = __nonzero__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
The new column should look like this:
You can use numpy.select to set values from a list of conditions. To compare numeric values, use to_numeric with errors='coerce', which converts non-numeric entries to NaNs:
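import numpy as np
import pandas as pd

# Mask for the non-numeric category
m1 = data['Credit Days'] == 'Cash & Carry'
# Non-numeric entries become NaN, so they fail both range checks below
s = pd.to_numeric(data['Credit Days'], errors='coerce')
m2 = (s >= 10) & (s < 19)
m3 = (s >= 20) & (s < 29)
masks = [m1, m2, m3]
vals = [4, 3, 2]
# The first matching mask wins; rows matching none get the default
data['credit_days_rating'] = np.select(masks, vals, default=1)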
print (data)
Credit Days credit_days_rating
0 30 1
1 Cash & Carry 4
2 Cash & Carry 4
3 20 2
4 20 2
5 30 1
6 15 3
7 10 3
8 15 3
9 Cash & Carry 4
10 10 3
11 10 3
12 21 2
13 Cash & Carry 4
14 20 2
15 20 2
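For readers without test.csv, here is a minimal self-contained sketch of the same numpy.select approach. The inline sample values and the variable names (is_cash, days, conditions, ratings) are assumptions chosen for illustration; the thresholds are the ones from the question.

```python
import numpy as np
import pandas as pd

# Inline stand-in for the 'Credit Days' column from the question (assumed sample)
data = pd.DataFrame({'Credit Days': [30, 'Cash & Carry', 20, 15, 10, 21, 19]})

is_cash = data['Credit Days'] == 'Cash & Carry'
# 'Cash & Carry' cannot be parsed as a number, so it becomes NaN
days = pd.to_numeric(data['Credit Days'], errors='coerce')

# Comparisons against NaN are False, so cash rows never match the numeric masks
conditions = [is_cash, (days >= 10) & (days < 19), (days >= 20) & (days < 29)]
ratings = [4, 3, 2]
data['credit_days_rating'] = np.select(conditions, ratings, default=1)

print(data['credit_days_rating'].tolist())  # [1, 4, 2, 3, 3, 2, 1]
```

Note that with the question's ranges a value of 19 (or 29) matches no condition and falls through to the default rating of 1. If that gap is unintended, the upper bounds would need adjusting.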
Here is one way to do it. Use errors='coerce' to turn the strings into NaN:
import pandas as pd

df = pd.DataFrame({'Credit Days': [21, 'Cash & Carry', 10, 20]})
# Coerce once: non-numeric entries such as 'Cash & Carry' become NaN
days = pd.to_numeric(df['Credit Days'], errors='coerce')
# Use a real missing value rather than the string 'NaN' for unrated rows
df['credit_days_rating'] = float('nan')
df.loc[df['Credit Days'] == 'Cash & Carry', 'credit_days_rating'] = 5
df.loc[(days >= 10) & (days < 19), 'credit_days_rating'] = 3
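The snippet above only assigns two of the ratings. A possible way to extend the same boolean-mask assignment to every rule is sketched below. It assumes the thresholds and the rating of 4 for 'Cash & Carry' from the question's code, plus a starting default of 1; the sample values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Credit Days': [21, 'Cash & Carry', 10, 20, 30]})
days = pd.to_numeric(df['Credit Days'], errors='coerce')  # strings -> NaN

# Start from the default rating, then overwrite rows where a rule matches
df['credit_days_rating'] = 1
df.loc[(days >= 20) & (days < 29), 'credit_days_rating'] = 2
df.loc[(days >= 10) & (days < 19), 'credit_days_rating'] = 3
df.loc[df['Credit Days'] == 'Cash & Carry', 'credit_days_rating'] = 4

print(df['credit_days_rating'].tolist())  # [2, 4, 3, 2, 1]
```

Starting from an integer default keeps the column's dtype as int, which would not be the case if unmatched rows were left as NaN.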
I guess what you actually want is to apply a function to your column to get a new column that contains only integers. That can be done like this:
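import pandas as pd

data = ["some str", 10, 20, "some str", 1, 2, 3]
df = pd.DataFrame(data)

def my_function(value):
    if value == "some str":
        return 5
    elif value >= 10 and value < 19:  # 'and', not 'or': 'or' is true for every number
        return 3
    # Fallback so every row gets an integer (1 is the default rating in the question)
    return 1

df['new_col'] = df[0].apply(my_function)
df

The output is then:

          0  new_col
0  some str        5
1        10        3
2        20        1
3  some str        5
4         1        1
5         2        1
6         3        1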
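One caveat when applying a function like this to the real data: a column read from CSV that mixes numbers with 'Cash & Carry' arrives as strings, so comparing a value such as '20' with 10 would raise a TypeError in Python 3. The sketch below converts each value with int() first. The function name rate_credit_days and the sample values are hypothetical; the thresholds and ratings are taken from the question.

```python
import pandas as pd

def rate_credit_days(value):
    """Map one 'Credit Days' entry to a rating (thresholds from the question)."""
    if value == 'Cash & Carry':
        return 4
    days = int(value)  # also accepts numeric strings such as '20'
    if 10 <= days < 19:
        return 3
    if 20 <= days < 29:
        return 2
    return 1

# Mixed strings and ints, similar to what read_csv may produce for this column
df = pd.DataFrame({'Credit Days': ['30', 'Cash & Carry', '20', '15', 10]})
df['credit_days_rating'] = df['Credit Days'].apply(rate_credit_days)

print(df['credit_days_rating'].tolist())  # [1, 4, 2, 3, 3]
```

apply calls the function once per row, which is easy to read but usually slower than the vectorized mask-based approaches on large frames.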
What (data['Credit Days']=='Cash & Carry') actually does is return a pandas Series of boolean values, one per row, for example:
df[0] == "some str"
0 True
1 False
2 False
3 True
4 False
5 False
6 False
Name: 0, dtype: bool
If you want a single boolean value that can be used in a conditional statement, you need to reduce the Series with one of its methods, all() or any():
(df[0] == "some str").any()
True
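To make the cause of the error concrete, the following small sketch reproduces it on a three-element Series and then shows the two reductions. The sample values and variable names are assumptions for illustration only.

```python
import pandas as pd

s = pd.Series([30, 'Cash & Carry', 20])
mask = s == 'Cash & Carry'  # element-wise comparison -> boolean Series

error_message = None
try:
    if mask:  # Python asks the whole Series for one bool, which pandas refuses
        pass
except ValueError as exc:
    error_message = str(exc)

any_cash = bool(mask.any())  # True: at least one row matches
all_cash = bool(mask.all())  # False: not every row matches

print(error_message)
print(any_cash, all_cash)
```

any() and all() answer a question about the column as a whole. For a per-row decision such as the rating in the question, the mask itself should be used for selection or assignment instead of being placed in an if statement.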