Running a loop for frequency calculation in Python

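I have the data shown in the table below and want to process it with Python. For every fruit that appears in 2016 or 2017, I want the per-country frequency of that fruit in 2015.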

Country Fruit Year
Germany Apple 2015
France Apple 2015
France Apple 2015
Spain Apple 2015
Germany Banana 2015
France Banana 2015
France Apple 2016
Spain Apple 2016
Germany Banana 2016
France Banana 2016
France Banana 2017
France Grapes 2017

The final table I want looks like this:

Fruit Germany France Spain
Apple 1 2 1
Banana 1 1 0
Grapes 0 0 0
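
For anyone who wants to reproduce this, the sample data can be loaded into a DataFrame roughly as follows. This is a minimal sketch: the variable name `df` is an assumption chosen to match the answers, and `Year` is stored as an integer.

```python
import pandas as pd

# Sample data from the question, one tuple per row (Country, Fruit, Year).
rows = [
    ("Germany", "Apple", 2015),
    ("France", "Apple", 2015),
    ("France", "Apple", 2015),
    ("Spain", "Apple", 2015),
    ("Germany", "Banana", 2015),
    ("France", "Banana", 2015),
    ("France", "Apple", 2016),
    ("Spain", "Apple", 2016),
    ("Germany", "Banana", 2016),
    ("France", "Banana", 2016),
    ("France", "Banana", 2017),
    ("France", "Grapes", 2017),
]
df = pd.DataFrame(rows, columns=["Country", "Fruit", "Year"])
```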

Filter by year, then pivot with pivot_table, reindexing so that every fruit and country gets a row or column:
(df[df.Year == 2015]
  .pivot_table('Year', 'Fruit', 'Country', aggfunc='count')
  .reindex(
    index=df.Fruit.unique(), 
    columns=df.Country.unique()
  ).fillna(0)
  .reset_index())

Country   Fruit  Germany  France  Spain
0         Apple      1.0     2.0    1.0
1        Banana      1.0     1.0    0.0
2        Grapes      0.0     0.0    0.0
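
The counts above come out as floats because the missing combinations are NaN before `fillna(0)` runs. If integer counts like those in the desired table are preferred, one possible variation is to cast after filling. The sketch below rebuilds the sample data so it runs on its own; the name `counts` is only illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Country": ["Germany", "France", "France", "Spain", "Germany", "France",
                "France", "Spain", "Germany", "France", "France", "France"],
    "Fruit": ["Apple", "Apple", "Apple", "Apple", "Banana", "Banana",
              "Apple", "Apple", "Banana", "Banana", "Banana", "Grapes"],
    "Year": [2015] * 6 + [2016] * 4 + [2017] * 2,
})

counts = (
    df[df["Year"] == 2015]
    .pivot_table(index="Fruit", columns="Country", values="Year", aggfunc="count")
    # Add rows/columns for fruits and countries that have no 2015 data.
    .reindex(index=df["Fruit"].unique(), columns=df["Country"].unique())
    .fillna(0)
    # All NaN values are gone now, so the cast to int is safe.
    .astype(int)
)
print(counts)
```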

Another option is to use crosstab and then select the 2015 columns:

(pd.crosstab(df.Fruit, [df.Country, df.Year])
   .loc[:, pd.IndexSlice[:, 2015]]
   .droplevel(1, axis=1)
   .reset_index())

Country   Fruit  France  Germany  Spain
0         Apple       2        1      1
1        Banana       1        1      0
2        Grapes       0        0      0

Try filtering to 2015 first, then reindexing the crosstab over every fruit:

df_2015 = df[df['Year'] == 2015]
pd.crosstab(df_2015['Fruit'], df_2015['Country']).reindex(df['Fruit'].unique(), fill_value=0)

Output:

Country  France  Germany  Spain
Fruit                          
Apple         2        1      1
Banana        1        1      0
Grapes        0        0      0
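
Note that reindexing on `df['Fruit'].unique()` keeps every fruit in the data, which matches the desired table here only because each fruit also appears in 2016 or 2017. If the row set should be strictly the fruits seen in 2016 or 2017, as the question words it, a sketch along these lines may help. The names `later_fruits` and `result` are illustrative, and sorting the country columns alphabetically is an assumption.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Country": ["Germany", "France", "France", "Spain", "Germany", "France",
                "France", "Spain", "Germany", "France", "France", "France"],
    "Fruit": ["Apple", "Apple", "Apple", "Apple", "Banana", "Banana",
              "Apple", "Apple", "Banana", "Banana", "Banana", "Grapes"],
    "Year": [2015] * 6 + [2016] * 4 + [2017] * 2,
})

# Fruits that occur in 2016 or 2017, in order of first appearance.
later_fruits = df.loc[df["Year"].isin([2016, 2017]), "Fruit"].unique()

# Count 2015 rows per (Fruit, Country), then keep only the later fruits,
# filling fruits or countries without 2015 data with 0.
df_2015 = df[df["Year"] == 2015]
result = pd.crosstab(df_2015["Fruit"], df_2015["Country"]).reindex(
    index=later_fruits,
    columns=sorted(df["Country"].unique()),
    fill_value=0,
)
print(result)
```

With this approach, a fruit that appears only in 2015 would be dropped from the result, while a fruit first seen in 2016 or 2017 gets a row of zeros.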