Transform dataframe based on three first rows in pandas

Question

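I have a dataframe like this (but much larger), and I'm trying to use transform to get the maximum based only on the first 3 rows of each group.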

    import pandas as pd

    df10 = pd.DataFrame({
        'Price': [1, 2, 3, 4, 5, 10, 20, 30, 40, 50],
        'Stock': ['AAPL', 'AAPL', 'AAPL', 'AAPL', 'AAPL', 'IBM', 'IBM', 'IBM', 'IBM', 'IBM']
    })

This syntax works for the whole column:

df10['max_top_3']=df10.groupby("Stock").Price.transform('max')

But I want the 'max_top_3' column to show 3 and 30 for AAPL and IBM respectively, i.e. the maximum of the first 3 entries of that column for each stock.

I tried something like this, but it raises an error:

df10['max_top_3']=df10.groupby("Stock").Price.head(3).transform('max')

Answer 1

You can use a lambda and chain head inside the transform:

df10.groupby("Stock").Price.transform(lambda x: x.head(3).max())

0     3
1     3
2     3
3     3
4     3
5    30
6    30
7    30
8    30
9    30
Name: Price, dtype: int64
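
For reference, here is a self-contained sketch of the same idea that assigns the result back to the frame. It assumes "first three rows" means the first three rows of each group in the frame's current order; the helper name `max_of_first_n` is made up for illustration.

```python
import pandas as pd

df10 = pd.DataFrame({
    'Price': [1, 2, 3, 4, 5, 10, 20, 30, 40, 50],
    'Stock': ['AAPL'] * 5 + ['IBM'] * 5,
})

def max_of_first_n(prices, n=3):
    # Max over the first n rows of one group, in the group's current row order.
    return prices.head(n).max()

# transform broadcasts the per-group scalar back to every row of that group
df10['max_top_3'] = df10.groupby('Stock').Price.transform(max_of_first_n)
print(df10)
```

Because the function returns a single value per group, transform repeats it for each row, so the new column lines up with the original index.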

Answer 2

I would use merge:

df10.merge(df10.groupby('Stock').head(3).groupby('Stock',as_index=False).Price.max(),on='Stock')
Out[179]: 
   Price_x Stock  Price_y
0        1  AAPL        3
1        2  AAPL        3
2        3  AAPL        3
3        4  AAPL        3
4        5  AAPL        3
5       10   IBM       30
6       20   IBM       30
7       30   IBM       30
8       40   IBM       30
9       50   IBM       30
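
If the Price_x / Price_y suffixes are unwanted, one possible variation (a sketch, not necessarily what the answerer had in mind) is to compute the per-stock value once and look it up with `Series.map`. The names `top3_max` and `max_top_3` are chosen here for illustration.

```python
import pandas as pd

df10 = pd.DataFrame({
    'Price': [1, 2, 3, 4, 5, 10, 20, 30, 40, 50],
    'Stock': ['AAPL'] * 5 + ['IBM'] * 5,
})

# Keep the first three rows of each stock, then reduce to one max per stock.
# The result is a Series indexed by Stock.
top3_max = df10.groupby('Stock').head(3).groupby('Stock').Price.max()

# Look up the per-stock value for every row; Price keeps its original name.
df10['max_top_3'] = df10['Stock'].map(top3_max)
print(df10)
```

This avoids the join entirely, which may be convenient when only one extra column is needed.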

Answer 3

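Sort the dataframe (unnecessary in your case, since the data is already sorted), group by Stock, then take the 3rd row of each group using transform and nth. Because the data is sorted in ascending order, the 3rd row is also the maximum of the first three. Note that after sorting by Price this is the third-smallest price per stock, which matches the max of the first three original rows only when those rows were already in ascending order: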

df10["max_3"] = (df10
                 .sort_values(["Price", "Stock"])
                 .groupby("Stock")
                 .Price
                 .transform("nth", 2)
                 )

df10

   Price Stock  max_3
0      1  AAPL      3
1      2  AAPL      3
2      3  AAPL      3
3      4  AAPL      3
4      5  AAPL      3
5     10   IBM     30
6     20   IBM     30
7     30   IBM     30
8     40   IBM     30
9     50   IBM     30
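
To see when sorting first changes the answer, here is a small sketch on hypothetical unsorted data (not from the question). It compares the max of the first three rows in their original order with the third-smallest price, which is what sorting by price and then taking the 3rd row amounts to. `nsmallest` is used here instead of `nth` purely for the comparison.

```python
import pandas as pd

# Hypothetical unsorted prices for a single stock (illustration only)
df = pd.DataFrame({
    'Price': [5, 1, 2, 4, 3],
    'Stock': ['AAPL'] * 5,
})

grouped = df.groupby('Stock').Price

# Max of the first three rows in their original order: max(5, 1, 2)
by_position = grouped.transform(lambda x: x.head(3).max())

# Largest of the three smallest prices, i.e. the third-smallest price
by_rank = grouped.transform(lambda x: x.nsmallest(3).max())

print(by_position.tolist())  # [5, 5, 5, 5, 5]
print(by_rank.tolist())      # [3, 3, 3, 3, 3]
```

On the question's data both give 3 and 30, since each stock's prices are already ascending.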
