How to "multiply" Python pandas DataFrames (as if they were vectors)?
I am learning pandas. I have two DataFrames:
df1 =
quality1 value
A 1
B 2
C 3
df2 =
quality2 value
D 1
E 10
F 100
I would like to multiply them together (as I might do with vectors to get a matrix). The answer should be:
df3 =
quality1 quality2 value
A D 1
E 10
F 100
B D 2
E 20
F 200
C D 3
E 30
F 300
How can I do this?
It's not the prettiest, but it will work:
>>> df1["dummy"] = 1
>>> df2["dummy"] = 1
>>> dfm = df1.merge(df2, on="dummy")
>>> dfm["value"] = dfm.pop("value_x") * dfm.pop("value_y")
>>> del dfm["dummy"]
>>> dfm
quality1 quality2 value
0 A D 1
1 A E 10
2 A F 100
3 B D 2
4 B E 20
5 B F 200
6 C D 3
7 C E 30
8 C F 300
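Until we get native support for a Cartesian join (whistles and looks away..), merging on a dummy column is a simple way to get the same effect. The intermediate frame, right after the merge and before the two value columns are multiplied, looks like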
>>> dfm
quality1 value_x dummy quality2 value_y
0 A 1 1 D 1
1 A 1 1 E 10
2 A 1 1 F 100
3 B 2 1 D 1
4 B 2 1 E 10
5 B 2 1 F 100
6 C 3 1 D 1
7 C 3 1 E 10
8 C 3 1 F 100
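On newer pandas (1.2 and later, as far as I know), `merge` accepts `how="cross"`, which makes the dummy column unnecessary. A minimal sketch, assuming such a version is installed and rebuilding the two frames from the question:

```python
import pandas as pd

# The two frames from the question.
df1 = pd.DataFrame({"quality1": list("ABC"), "value": [1, 2, 3]})
df2 = pd.DataFrame({"quality2": list("DEF"), "value": [1, 10, 100]})

# how="cross" pairs every row of df1 with every row of df2.
# The shared column name "value" gets the default _x / _y suffixes.
dfm = df1.merge(df2, how="cross")

# Replace the two suffixed columns with their product.
dfm["value"] = dfm.pop("value_x") * dfm.pop("value_y")
print(dfm)
```

Unlike the dummy-column version, this leaves `df1` and `df2` unmodified.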
You can also use the cartesian function from scikit-learn:
import numpy as np
import pandas as pd
from sklearn.utils.extmath import cartesian

# Your data:
df1 = pd.DataFrame({'quality1': list('ABC'), 'value': [1, 2, 3]})
df2 = pd.DataFrame({'quality2': list('DEF'), 'value': [1, 10, 100]})
# Make the matrix of labels (one row per (quality1, quality2) pair):
dfm = pd.DataFrame(cartesian((df1.quality1.values, df2.quality2.values)),
                   columns=['quality1', 'quality2'])
# Multiply values: repeat df1's values and tile df2's so they line up with the labels.
# (pd.np is not available in current pandas, so call numpy directly.)
dfm['value'] = df1.value.values.repeat(df2.value.size) * np.tile(df2.value.values, df1.value.size)
print(dfm.set_index(['quality1', 'quality2']))
This yields:
value
quality1 quality2
A D 1
E 10
F 100
B D 2
E 20
F 200
C D 3
E 30
F 300
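Since the question frames this as a vector product, another option is to take the outer product of the two value columns with NumPy and label the result with a MultiIndex. This is only a sketch: `index` and `df3` are names picked here for illustration, and it relies on `pd.MultiIndex.from_product` and the flattened `np.outer` result both running through the pairs in the same row-major order (A-D, A-E, A-F, B-D, ...).

```python
import numpy as np
import pandas as pd

# The two frames from the question.
df1 = pd.DataFrame({"quality1": list("ABC"), "value": [1, 2, 3]})
df2 = pd.DataFrame({"quality2": list("DEF"), "value": [1, 10, 100]})

# All (quality1, quality2) label pairs, first level varying slowest.
index = pd.MultiIndex.from_product(
    [df1["quality1"], df2["quality2"]], names=["quality1", "quality2"]
)

# 3x3 outer product of the values, flattened row by row to match the index.
values = np.outer(df1["value"], df2["value"]).ravel()

df3 = pd.DataFrame({"value": values}, index=index)
print(df3)
```

Individual products can then be looked up by label pair, e.g. `df3.loc[("B", "E"), "value"]`.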