Pandas groupby - Find mean of first 10 items

Question

I have 30 items in each group.

To find the mean of all items in each group, I use this code.

y = df[["Value", "Date"]].groupby("Date").mean()

That returns values like this.

Date                  Value
       
2020-01-01 00:30:00   7172.36
2020-01-01 01:00:00   7171.55
2020-01-01 01:30:00   7205.90
2020-01-01 02:00:00   7210.24
2020-01-01 02:30:00   7221.50

However, I want to find the mean of only the first 10 items in each group, not of all the items.

y1 = df[["Value", "Date"]].groupby("Date").head(10).mean()

That code returns only a single value instead of a pandas Series.

So I get an error like this.

AttributeError: 'numpy.float64' object has no attribute 'shift'

What is the correct way to get a pandas Series instead of a single value?

Answer 1

Try

# slice the first 10 and average
y1 = df.groupby("Date")["Value"].apply(lambda x: x.iloc[:10].mean())
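
For anyone who wants to check this on a small frame first, here is a minimal sketch. The data is made up for illustration (3 items per group, averaging the first 2), and `n` is a hypothetical stand-in for the 10 in the question.

```python
import pandas as pd

# Hypothetical data: two timestamps with three values each (not the asker's data).
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2020-01-01 00:30:00"] * 3 + ["2020-01-01 01:00:00"] * 3
    ),
    "Value": [1.0, 2.0, 30.0, 4.0, 6.0, 50.0],
})

n = 2  # stands in for 10 in the question

# Take the first n rows of each group by position, then average them.
y1 = df.groupby("Date")["Value"].apply(lambda x: x.iloc[:n].mean())

# y1 is a Series indexed by Date, so Series methods such as shift() are available.
shifted = y1.shift(1)

print(y1)
```

Because the result has one entry per `Date`, a later `.shift()` call should no longer hit the `numpy.float64` error from the question.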

Answer 2

You can try

y1 = df[["Value", "Date"]].groupby("Date").apply(lambda g: g['Value'].head(10).mean())

print(y1)

Date
2020-01-01 00:30:00    7172.36
2020-01-01 01:00:00    7171.55
2020-01-01 01:30:00    7205.90
2020-01-01 02:00:00    7210.24
2020-01-01 02:30:00    7221.50
dtype: float64

In .groupby("Date").head(10).mean(), groupby.head() returns a DataFrame, so .mean() operates on the whole DataFrame rather than on each group.
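
A small sketch of that behaviour, again on made-up data (the frame and `n` are assumptions for illustration). It also shows one possible alternative: grouping a second time after `head()`, which gives per-group means without `apply`.

```python
import pandas as pd

# Hypothetical data: two timestamps with three values each.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2020-01-01 00:30:00"] * 3 + ["2020-01-01 01:00:00"] * 3
    ),
    "Value": [1.0, 2.0, 30.0, 4.0, 6.0, 50.0],
})

n = 2  # stands in for 10 in the question

# head(n) keeps the first n rows of every group, but the result is an
# ordinary DataFrame: the grouping is no longer attached to it.
first_rows = df.groupby("Date").head(n)

# Averaging it directly therefore pools the kept rows of all groups into one number.
overall = first_rows["Value"].mean()

# Grouping the kept rows again gives one mean per Date.
per_group = first_rows.groupby("Date")["Value"].mean()

print(overall)
print(per_group)
```

Here `overall` is a single scalar, while `per_group` is a Series with one row per `Date`, which is the shape the question asks for.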

python

pandas

pandas-groupby