How do you calculate the median on a DataFrameGroupBy object?

Question

Here is my DataFrame df:

          1.1  1.2  1.3  2.1  ...  5.1  6.1  6.2  6.3
sample_a    1    1    2    4         2    3    4    2
sample_b    2    3    3    1         1    3    1    2
sample_c    2    4    3    1         1    3    2    2

I want to group df by the first number in each column name (i.e. 1 from 1.1, 2 from 2.1, 6 from 6.1) and take the median of each group.

This is the output I want:

            1    2    ...   5    6
sample_a    1    4          2    3 
sample_b    3    1          1    2 
sample_c    3    1          1    2

So, for example, the first element (sample_a, 1) is the median of columns 1.1, 1.2 and 1.3, which is 1.

This is the code I have so far:

df.columns = df.columns.str.extract('([\d])\.\d+',expand=False)
df.groupby(df.columns, axis=1).median(axis=1)

I'm not sure whether the axis should be 0 or 1, but either way I get KeyError: 'axis'.

When I try the following code instead, it works fine:

df.columns = df.columns.str.extract('([\d])\.\d+',expand=False)
df.groupby(df.columns,axis=1).sum()

Why doesn't median work?

Answer 1

Use groupby on axis=1, and call median() without an axis argument:

df.groupby(df.columns.str[0], axis=1).median()

          1  2  5  6
sample_a  1  4  2  3
sample_b  3  1  1  2
sample_c  3  1  1  2
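
The KeyError appears to come from passing axis=1 to median() itself: the axis is already fixed by the groupby call, and sum() worked in the question only because it was called with no arguments. As a rough sketch, here is a variant that avoids axis=1 in groupby altogether by transposing first, which may be worth considering because recent pandas releases deprecate the axis argument of groupby. The sample frame below is an assumption: it contains only the columns visible in the question, with the elided ones left out, and the names keys and result are just illustrative.

```python
import pandas as pd

# Assumed sample data: only the columns visible in the question;
# the elided ("...") columns are left out.
df = pd.DataFrame(
    [[1, 1, 2, 4, 2, 3, 4, 2],
     [2, 3, 3, 1, 1, 3, 1, 2],
     [2, 4, 3, 1, 1, 3, 2, 2]],
    index=["sample_a", "sample_b", "sample_c"],
    columns=["1.1", "1.2", "1.3", "2.1", "5.1", "6.1", "6.2", "6.3"],
)

# Group key = text before the first dot. Unlike .str[0], this also
# handles multi-digit prefixes such as "10.1".
keys = df.columns.str.split(".").str[0]

# Transpose so the column labels become rows, group those rows,
# take the median (no axis argument), then transpose back.
result = df.T.groupby(keys).median().T
print(result)
```

Note that median() returns floats, so the values print as 1.0, 4.0 and so on; append .astype(int) if integer output is wanted and every median happens to be a whole number.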

Tags: python, median, pandas, pandas-groupby