How to multiply or divide two Series in Python?
I have a dataset like this. The actual dataset is much larger, though.
import pandas as pd
data1 = pd.DataFrame({'Name':["Tom","Andy","Joseph","Joe","Mary","Alexa","Chris","Jessica","Jimmy","Andrea","George","Bruce","Will","Eric","Leonard","Ryan","Megan","Michael","Sara"],
"City":["NY","DC","LAX","NY","DC","DC","SF","SD","NY","SF","SD","DC","LAX","SF","LAX","NY","SF","PDX","FL"],
'Car':["Ford","Ford","TOYOTA","GM","GM","Honda","GM","Porsche","Tesla","TOYOTA","Tesla","Tesla","Honda","GM","Nissan","Porsche","Nissan","Ford","Tesla"]})
First, I wanted to count the actual frequency of each "City" and "Car" combination, which I did like this:
df_City_Car_actual=data1.groupby(["City","Car"]).size()
df_City_Car_actual
Next, I want to calculate the expected frequency of each "City" and "Car" combination.
So I started with this:
df_City=data1.groupby("City").size()
df_City
df_Car=data1.groupby("Car").size()
df_Car
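Then I want to multiply df_City by df_Car to get the expected frequency for every City x Car pair.
For example, the frequency of "DC" in df_City is 4 and the frequency of "Ford" in df_Car is 3.
So the expected frequency for DC x Ford would be 4x3=12.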
I tried this, but it did not work:
df_City_Car_expected=df_City*df_Car
df_City_Car_expected
Finally, I want to divide df_City_Car_actual by df_City_Car_expected so that the final data is normalized.
Is there a good way to do this?
Thanks for your help.
The simplest way I can think of is to use numpy's
"outer product" function, for example:
import numpy as np
pd.DataFrame(np.outer(df_City.values, df_Car.values), index=df_City.index, columns=df_Car.index)
which gives:
Car   Ford  GM  Honda  Nissan  Porsche  TOYOTA  Tesla
City
DC      12  16      8       8        8       8     16
FL       3   4      2       2        2       2      4
LAX      9  12      6       6        6       6     12
NY      12  16      8       8        8       8     16
PDX      3   4      2       2        2       2      4
SD       6   8      4       4        4       4      8
SF      12  16      8       8        8       8     16
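As for why the plain df_City*df_Car did not work: multiplying two Series aligns them on their index labels, and the City labels and Car labels have nothing in common, so every entry comes out as NaN. The outer product does not have this problem because it works on the raw values. For the last step (dividing actual by expected), here is a minimal sketch of one way it might be done. The names expected, actual and ratio, and the use of unstack and reindex, are my own choices for illustration and are not part of the question. Only the City and Car columns from the question are used.

```python
import numpy as np
import pandas as pd

# City and Car columns copied from the question (Name is not needed here)
data1 = pd.DataFrame({
    "City": ["NY","DC","LAX","NY","DC","DC","SF","SD","NY","SF","SD","DC","LAX","SF","LAX","NY","SF","PDX","FL"],
    "Car": ["Ford","Ford","TOYOTA","GM","GM","Honda","GM","Porsche","Tesla","TOYOTA","Tesla","Tesla","Honda","GM","Nissan","Porsche","Nissan","Ford","Tesla"],
})

df_City = data1.groupby("City").size()
df_Car = data1.groupby("Car").size()

# Element-wise multiplication aligns on index labels; no City label matches
# a Car label, so the result is all NaN.
naive = df_City * df_Car

# Expected frequencies: outer product of the two marginal counts
expected = pd.DataFrame(
    np.outer(df_City.values, df_Car.values),
    index=df_City.index,
    columns=df_Car.index,
)

# Actual frequencies as a City x Car table; combinations that never occur get 0
actual = data1.groupby(["City", "Car"]).size().unstack(fill_value=0)

# Assumption: make sure both tables carry the same row and column labels
actual = actual.reindex(index=expected.index, columns=expected.columns, fill_value=0)

# Normalized table: actual count divided by expected count, cell by cell
ratio = actual / expected
print(ratio)
```

Because actual and expected are both DataFrames with the same City index and Car columns, the division lines up cell by cell. If you would rather have the result as a Series with a (City, Car) MultiIndex, like df_City_Car_actual, ratio.stack() should give you that.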