Conditional elementwise multiplication of two pandas dataframes
I am struggling with a basic conditional elementwise multiplication between two dataframes.
Suppose I have the following two dataframes:
df1 = pd.DataFrame({'A': [-0.1, 0.3, -0.4, 0.8, -0.5, -0.1, 0.3, -0.4, 0.8, -1.2],
                    'B': [-0.2, 0.5, 0.3, -0.5, 0.1, -0.2, 0.5, 0.3, -0.5, 0.9]},
                   index=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
df2 = pd.DataFrame({'C': [-0.003, 0.03848, -0.04404, 0.018, -0.1515, -0.02181, 0.233, -0.0044, 0.01458, -0.015],
                    'D': [-0.0152, 0.0155, 0.03, -0.0155, 0.0151, -0.012, 0.035, 0.0013, -0.0005, 0.009]},
                   index=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
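The idea is to multiply df1 by df2.shift(-1) elementwise (not as a matrix multiplication), conditional on the values of df1. Where df1 >= 0.50 or df1 <= -0.50, I multiply df1 by df2.shift(-1); otherwise I just put 0.
The expected result for this example is as follows (the column names and the index are those of df1):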
df3 = pd.DataFrame({'A': [0, 0, 0, -0.1212, 0.010905, 0, 0, 0, -0.012, np.nan],
                    'B': [0, 0.015, 0, -0.00755, 0, 0, 0.00065, 0, -0.0045, np.nan]},
                   index=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
I tried the following code:
import numpy as np
import pandas as pd
df3=np.where((df1>=0.50 or df1 <=-0.50),df1*df2.shift(-1),0)
Then I get: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Thanks.
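Python's or tries to reduce each whole DataFrame to a single True/False, which is what raises the error. Use | for an elementwise (bitwise) OR, with each comparison in its own parentheses, and rebuild the result with the DataFrame constructor. Because df1 and df2 have different column names, df2.shift(-1) is converted to a NumPy array with .values so that the multiplication is positional rather than aligned on column labels:
arr = np.where((df1 >= 0.50) | (df1 <= -0.50), df1 * df2.shift(-1).values, 0)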
df3 = pd.DataFrame(arr, index=df1.index, columns=df1.columns)
print(df3)
          A        B
0  0.000000  0.00000
1  0.000000  0.01500
2  0.000000  0.00000
3 -0.121200 -0.00755
4  0.010905  0.00000
5  0.000000  0.00000
6  0.000000  0.00065
7  0.000000  0.00000
8 -0.012000 -0.00450
9       NaN      NaN
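As an alternative sketch that stays in pandas (this is not the method given in the answer above, just another way to express it): the condition is symmetric around zero, so it can be written as df1.abs() >= 0.5, and DataFrame.where can take the place of np.where so the index and columns of df1 are kept without rebuilding the frame. The helper name conditional_product and its threshold parameter are hypothetical, introduced only for illustration. The shifted values are converted with to_numpy() on the assumption that the two frames should be matched by position, since their column labels differ (A, B versus C, D).

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': [-0.1, 0.3, -0.4, 0.8, -0.5, -0.1, 0.3, -0.4, 0.8, -1.2],
                    'B': [-0.2, 0.5, 0.3, -0.5, 0.1, -0.2, 0.5, 0.3, -0.5, 0.9]})
df2 = pd.DataFrame({'C': [-0.003, 0.03848, -0.04404, 0.018, -0.1515,
                          -0.02181, 0.233, -0.0044, 0.01458, -0.015],
                    'D': [-0.0152, 0.0155, 0.03, -0.0155, 0.0151,
                          -0.012, 0.035, 0.0013, -0.0005, 0.009]})


def conditional_product(left, right, threshold=0.5):
    """Hypothetical helper: left * (next row of right) where |left| >= threshold, else 0."""
    # Convert to an array so the product is positional, not aligned on column labels.
    shifted = right.shift(-1).to_numpy()
    product = left * shifted
    # Keep the product where the condition holds; put 0 everywhere else.
    return product.where(left.abs() >= threshold, 0)


df3 = conditional_product(df1, df2)
print(df3)
```

The last row stays NaN wherever the condition holds, because df2 has no following row to multiply by.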
A pure NumPy solution should be faster:
# Shift df2 up by one row by hand: drop the first row, then append a row of NaN
arr2 = np.concatenate([df2.values[1:],
                       np.full((1, df2.shape[1]), np.nan)])
a = df1.values
arr = np.where((a >= 0.50) | (a <= -0.50), a * arr2, 0)
df3 = pd.DataFrame(arr, index=df1.index, columns=df1.columns)
print(df3)
          A        B
0  0.000000  0.00000
1  0.000000  0.01500
2  0.000000  0.00000
3 -0.121200 -0.00755
4  0.010905  0.00000
5  0.000000  0.00000
6  0.000000  0.00065
7  0.000000  0.00000
8 -0.012000 -0.00450
9       NaN      NaN
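To check that the hand-built shift really matches df2.shift(-1), the two results can be compared directly. The sketch below is only a sanity check under the assumption that both frames have the same shape and are matched by position; the variable names (mask, shifted_pd, shifted_np, same) are illustrative. np.allclose needs equal_nan=True here because the last row is NaN in both arrays.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': [-0.1, 0.3, -0.4, 0.8, -0.5, -0.1, 0.3, -0.4, 0.8, -1.2],
                    'B': [-0.2, 0.5, 0.3, -0.5, 0.1, -0.2, 0.5, 0.3, -0.5, 0.9]})
df2 = pd.DataFrame({'C': [-0.003, 0.03848, -0.04404, 0.018, -0.1515,
                          -0.02181, 0.233, -0.0044, 0.01458, -0.015],
                    'D': [-0.0152, 0.0155, 0.03, -0.0155, 0.0151,
                          -0.012, 0.035, 0.0013, -0.0005, 0.009]})

a = df1.to_numpy()
mask = (a >= 0.50) | (a <= -0.50)

# Shift done by pandas, then converted to an array
shifted_pd = df2.shift(-1).to_numpy()
# Shift done by hand: drop the first row, append a row of NaN
shifted_np = np.concatenate([df2.to_numpy()[1:],
                             np.full((1, df2.shape[1]), np.nan)])

res_pd = np.where(mask, a * shifted_pd, 0)
res_np = np.where(mask, a * shifted_np, 0)

# NaN entries compare as equal only with equal_nan=True
same = np.allclose(res_pd, res_np, equal_nan=True)
print(same)
```

Whether the NumPy version is actually faster depends on the size of the frames, so it is worth timing both on real data (for example with timeit) before choosing one.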