Python Dataframe: Create New Column Conditionally Using If Then Else Logic --> "The truth value of a Series is ambiguous"
I am using the code below to create a new column whose values are derived from the values in two other columns of a Python dataframe.
# Create a list to store the data
MSP = []
for row in df_EVENT5_18['FLT']:
    if df_EVENT5_18['FLT'].str.contains('1234') & df_EVENT5_18['AR'].str.contains('ABC1'):
        MSP.append(29)
    elif (df_EVENT5_18['FLT'].str.contains('1234')) & (df_EVENT5_18['AR'].str.contains('ABC2')):
        MSP.append(25)
    else:
        MSP.append('')
# Create a new column from the list
df_EVENT5_18['MSP'] = MSP
When I run the code above, I get the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
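Whenever you think you need a loop in pandas, check your code again. One clue is that you have for row in df_EVENT5_18['FLT']: but you never use row. Each if therefore tests a whole boolean Series instead of a single value, and that is what raises the error.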
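To see where the error comes from, here is a minimal sketch. The small flt Series is made up for illustration and is not from the question. str.contains returns one boolean per row, while if needs a single True or False, so pandas refuses to guess.

```python
import pandas as pd

# Hypothetical sample data, only for illustration
flt = pd.Series(['1234', '1234', '1235'])

mask = flt.str.contains('1234')   # one boolean per row, not a single bool
print(mask.tolist())

error_message = None
try:
    if mask:                      # same situation as the `if` inside the loop
        pass
except ValueError as err:
    error_message = str(err)
    print(error_message)

# Reducing the Series to a single value removes the ambiguity
print(mask.any(), mask.all())
```

Note that .any() or .all() would silence the error in the original loop but still would not give a per-row result, which is why the masks are used for indexing instead.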
Find indices matching string
In this case, we can simply use boolean evaluation to get the rows we want to set:
has_flt_1234 = df_EVENT5_18['FLT'].str.contains('1234')
want_29 = has_flt_1234 & df_EVENT5_18['AR'].str.contains('ABC1')
want_25 = has_flt_1234 & df_EVENT5_18['AR'].str.contains('ABC2')
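Set values using boolean series
Then set the appropriate rows as needed:
df_EVENT5_18['MSP'] = ''  # start with an empty string in every row
df_EVENT5_18.loc[want_25, 'MSP'] = '25'
df_EVENT5_18.loc[want_29, 'MSP'] = '29'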
Test Code:
import pandas as pd

df_EVENT5_18 = pd.DataFrame(dict(
    FLT=['1234', '1234', '1235'],
    AR=['ABC1', 'ABC2', 'ABC1']
))
print(df_EVENT5_18)

# Boolean masks, one entry per row
has_flt_1234 = df_EVENT5_18['FLT'].str.contains('1234')
want_29 = has_flt_1234 & df_EVENT5_18['AR'].str.contains('ABC1')
want_25 = has_flt_1234 & df_EVENT5_18['AR'].str.contains('ABC2')

# Create the new column, then fill in the matching rows with .loc
# (.loc avoids chained assignment, which may write to a temporary copy)
df_EVENT5_18['MSP'] = ''
df_EVENT5_18.loc[want_25, 'MSP'] = '25'
df_EVENT5_18.loc[want_29, 'MSP'] = '29'
print(df_EVENT5_18)
Results:
     AR   FLT
0  ABC1  1234
1  ABC2  1234
2  ABC1  1235
     AR   FLT MSP
0  ABC1  1234  29
1  ABC2  1234  25
2  ABC1  1235
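As an alternative that mirrors the if / elif / else shape of the original loop more directly, the same masks can be passed to numpy.select. This is only a sketch of a swapped-in technique, not something the answer above proposes; the conditions and choices names are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df_EVENT5_18 = pd.DataFrame({
    'FLT': ['1234', '1234', '1235'],
    'AR': ['ABC1', 'ABC2', 'ABC1'],
})

has_flt_1234 = df_EVENT5_18['FLT'].str.contains('1234')

# Conditions are checked in order; the first one that is True wins,
# just like if / elif. Names below are illustrative assumptions.
conditions = [
    has_flt_1234 & df_EVENT5_18['AR'].str.contains('ABC1'),
    has_flt_1234 & df_EVENT5_18['AR'].str.contains('ABC2'),
]
choices = ['29', '25']

# default plays the role of the final else branch
df_EVENT5_18['MSP'] = np.select(conditions, choices, default='')
print(df_EVENT5_18)
```

Keeping the choices as strings matches the answer above; mixing numbers with an empty-string default would need a different default such as NaN.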
Try something like this:
# your_condition is a placeholder for your own function of the two values
df['new_col'] = df[['a', 'b']].apply(lambda row: your_condition(row['a'], row['b']), axis=1)
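Applied to the data from the question, the row-wise approach might look like the sketch below. The helper pick_msp is hypothetical and simply restates the original if / elif / else for a single row; the sample data is the same as in the test code of the other answer.

```python
import pandas as pd

df = pd.DataFrame({
    'FLT': ['1234', '1234', '1235'],
    'AR': ['ABC1', 'ABC2', 'ABC1'],
})

def pick_msp(row):
    """Hypothetical helper: the original if/elif/else, for one row."""
    # Inside apply(axis=1), row['FLT'] and row['AR'] are plain strings,
    # so ordinary `in` and `and` work without ambiguity.
    if '1234' in row['FLT'] and 'ABC1' in row['AR']:
        return '29'
    elif '1234' in row['FLT'] and 'ABC2' in row['AR']:
        return '25'
    return ''

df['MSP'] = df[['FLT', 'AR']].apply(pick_msp, axis=1)
print(df)
```

This calls a Python function once per row, so on large frames it is usually slower than the boolean-mask approach.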