Pandas Comparison (boolean) of Adjacent Columns to New Column
In the data below, I need to add extra columns based on certain comparisons.
test_file.csv
day v1 v2 v3
mon 38 42 42
tue 45 35 43
wed 36 45 43
thu 41 35 45
fri 37 42 44
sat 40 43 42
sun 43 40 43
I tried the following lines of code, and they raise the error shown below the code.
df["Compare_col_1"] = ""
df["Compare_col_2"] = ""
if ((df.v3 < df.v1) & (df.v2 > df.v1)):
    df["Compare_col_1"] = "Balanced"
else:
    df["Compare_col_1"] = "Out_of_Bounds"
if df.v3 < df.v2:
    df["Compare_col_2"] = "Eligible"
else:
    df["Compare_col_2"] = "Slow"
Error (using only pandas):
Traceback (most recent call last):
  File "C:\Trials\Test.py", line 291, in <module>
    if ((df.v3 < df.v1) & (df.v2 > df.v1)):
  File "C:\Winpy\WPy64-3770\python-3.7.7.amd64\lib\site-packages\pandas\core\generic.py", line 1479, in __nonzero__
    f"The truth value of a {type(self).__name__} is ambiguous. "
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I have since read several posts that explain well how to get the result I need with numpy, but the same error comes back, as shown below.
New code (using numpy):
if (np.logical_and((df.SMA_8d < df.ClosePrice), (df.ClosePrice < df.SMA_3d))):
    df["Mark2"] = "True"
else:
    df["Mark2"] = "False"
Traceback (most recent call last):
  File "C:\Trials\Test.py", line 291, in <module>
    if (np.logical_and((df.v3 < df.v1), (df.v2 > df.v1))):
  File "C:\Winpy\WPy64-3770\python-3.7.7.amd64\lib\site-packages\pandas\core\generic.py", line 1479, in __nonzero__
    f"The truth value of a {type(self).__name__} is ambiguous. "
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Is there a way to generate these new columns by comparing adjacent columns (and, more importantly, a pandas-only solution)?
You can use np.where like this:
import numpy as np

df["Compare_col_1"] = np.where((df.v3 < df.v1) & (df.v2 > df.v1), "Balanced", "Out_of_Bounds")
df["Compare_col_2"] = np.where(df.v3 < df.v2, "Eligible", "Slow")
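Since the question asks for a pandas-only route, here is a minimal sketch of one possible way to do it. The `if` statements fail because comparing two columns returns a whole boolean Series (one value per row), not a single True/False, so Python cannot pick a branch. Instead, the boolean Series can be mapped to labels row by row with `Series.map`. The variable names `balanced` and `eligible` are my own, and the DataFrame is built inline from the sample table rather than read from test_file.csv.

```python
import pandas as pd

# Sample data copied from the table in the question (assumed to match test_file.csv)
df = pd.DataFrame({
    "day": ["mon", "tue", "wed", "thu", "fri", "sat", "sun"],
    "v1": [38, 45, 36, 41, 37, 40, 43],
    "v2": [42, 35, 45, 35, 42, 43, 40],
    "v3": [42, 43, 43, 45, 44, 42, 43],
})

# Element-wise comparisons produce boolean Series, one value per row
balanced = (df["v3"] < df["v1"]) & (df["v2"] > df["v1"])
eligible = df["v3"] < df["v2"]

# Map each boolean to its label; no numpy import is needed
df["Compare_col_1"] = balanced.map({True: "Balanced", False: "Out_of_Bounds"})
df["Compare_col_2"] = eligible.map({True: "Eligible", False: "Slow"})

print(df)
```

With this sample data no row satisfies both v3 < v1 and v2 > v1, so every row of Compare_col_1 comes out as "Out_of_Bounds". Only wed and sat have v3 < v2 and are marked "Eligible".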