How to solve issue with groupby on a dataframe

I want a final dataframe indexed by project, with one column listing the exchanges and one column holding the last price.

Here is an example:

import pandas as pd

data = {'Exchange': ['coinbase', 'binance', 'coinbase', 'ftx', 'coinbase'], 'Projects': ['Bitcoin', 'Bitcoin', 'Ethereum', 'Ethereum', 'Doge'], 'Price': [10, 5, 10, 2, 10]}

df = pd.DataFrame(data)

Output : 
   Exchange  Projects  Price
0  coinbase   Bitcoin     10
1   binance   Bitcoin      5
2  coinbase  Ethereum     10
3       ftx  Ethereum      2
4  coinbase      Doge     10

Here is what I tried:

df2 = df.groupby(by=["Projects"]).count()


df2['Price'] = df['Price']

df2['Exchange'] = df['Exchange']

df2

Output:

          Exchange  Price
Projects        
Bitcoin     NaN      NaN
Doge        NaN      NaN
Ethereum    NaN      NaN

What I want:

            Exchange          Price
Projects        
Bitcoin     coinbase,binance  10
Doge        coinbase,ftx      2
Ethereum    ftx               5

Use groupby followed by agg:

>>> df.groupby('Projects').agg({'Exchange': ','.join, 'Price': 'last'})

                  Exchange  Price
Projects                         
Bitcoin   coinbase,binance      5
Doge              coinbase     10
Ethereum      coinbase,ftx      2

You can replace 'last' with another aggregation such as 'max', 'mean', or 'min', or with a custom function.
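As a rough sketch of that substitution, the snippet below swaps 'last' for 'max' and then for a custom function. It rebuilds the dataframe from the question; `price_spread` is a hypothetical helper made up for illustration, not something from pandas.

```python
import pandas as pd

data = {
    'Exchange': ['coinbase', 'binance', 'coinbase', 'ftx', 'coinbase'],
    'Projects': ['Bitcoin', 'Bitcoin', 'Ethereum', 'Ethereum', 'Doge'],
    'Price': [10, 5, 10, 2, 10],
}
df = pd.DataFrame(data)

# Built-in aggregation selected by name: highest price per project.
max_price = df.groupby('Projects').agg({'Exchange': ','.join, 'Price': 'max'})

# Hypothetical custom aggregation: it receives the Price values of one
# group as a Series and must return a single value.
def price_spread(prices):
    return prices.max() - prices.min()

spread = df.groupby('Projects').agg({'Exchange': ','.join, 'Price': price_spread})

print(max_price)
print(spread)
```

Any callable that reduces a Series to one value should work in place of `price_spread`.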

In your case:

out = df.groupby('Projects').agg({'Exchange': ','.join,'Price':'last'})
out
Out[35]: 
                  Exchange  Price
Projects                         
Bitcoin   coinbase,binance      5
Doge              coinbase     10
Ethereum      coinbase,ftx      2
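As for why the attempt in the question produced only NaN: a likely explanation is index alignment. After the groupby, df2 is indexed by project name, while df['Price'] still carries the original 0..4 index, so the assignment finds no matching labels. A minimal sketch reproducing this, using the same data as the question:

```python
import pandas as pd

df = pd.DataFrame({
    'Exchange': ['coinbase', 'binance', 'coinbase', 'ftx', 'coinbase'],
    'Projects': ['Bitcoin', 'Bitcoin', 'Ethereum', 'Ethereum', 'Doge'],
    'Price': [10, 5, 10, 2, 10],
})

# After the groupby, the index holds project names, not 0..4.
df2 = df.groupby(by=['Projects']).count()

# Column assignment aligns on index labels. df['Price'] is labelled 0..4,
# none of which appear in df2's index, so every row is filled with NaN.
df2['Price'] = df['Price']

all_nan = df2['Price'].isna().all()
print(df2)
print(all_nan)
```

Aggregating both columns in a single agg call, as in the answers above, avoids the alignment step entirely.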