Pandas groupby(...).mean() lost keys

Question

I have a dataframe rounds with the following structure (it is the result of dropping columns from another dataframe; I can't post images, sorry):

----------------------------
|type|N|D|NATC|K|iters|time|
----------------------------
rows of data
----------------------------

I use groupby so that I can get the mean of each group, like this:

rounds = results.groupby(['type','N','D','NATC','K','iters'])
results_mean = rounds.mean()

I get the means I want, but I have a problem with the keys. The results_mean dataframe has the following structure:

----------------------------
|    | | |    | |     |time|
|type|N|D|NATC|K|iters|    |
----------------------------
rows of data
----------------------------

The only key that is recognized is time (I ran results_mean.keys()).

What am I doing wrong? How can I fix it?

Answer 1

In your aggregated data, time is the only column. The others are index levels.
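A minimal sketch of how to check this, assuming a small made-up results dataframe (the column names follow the question, the values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data: column names come from the question, values are invented.
results = pd.DataFrame({
    "type": ["a", "a", "b", "b"],
    "N": [10, 10, 20, 20],
    "D": [2, 2, 3, 3],
    "NATC": [1, 1, 1, 1],
    "K": [5, 5, 5, 5],
    "iters": [100, 100, 200, 200],
    "time": [1.0, 3.0, 2.0, 4.0],
})

keys = ["type", "N", "D", "NATC", "K", "iters"]
results_mean = results.groupby(keys).mean()

# Only the aggregated column remains a regular column ...
print(list(results_mean.columns))
# ... while the grouping keys have become the levels of a MultiIndex.
print(list(results_mean.index.names))
```

The grouping keys are still present; they are reachable through results_mean.index rather than results_mean.keys().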

groupby has a parameter as_index. From the documentation:

as_index : boolean, default True

For aggregated output, return object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively “SQL-style” grouped output

So you can get the desired output by calling

rounds = results.groupby(['type','N','D','NATC','K','iters'], as_index = False)
results_mean = rounds.mean()

Alternatively, if needed, you can always convert the index back to keys by using reset_index. Using

rounds = results.groupby(['type','N','D','NATC','K','iters'])
results_mean = rounds.mean().reset_index()

should also have the desired effect.
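To see the two approaches side by side, here is a small self-contained sketch. The sample values are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data: column names come from the question, values are invented.
results = pd.DataFrame({
    "type": ["a", "a", "b", "b"],
    "N": [10, 10, 20, 20],
    "D": [2, 2, 3, 3],
    "NATC": [1, 1, 1, 1],
    "K": [5, 5, 5, 5],
    "iters": [100, 100, 200, 200],
    "time": [1.0, 3.0, 2.0, 4.0],
})

keys = ["type", "N", "D", "NATC", "K", "iters"]

# Approach 1: keep the grouping keys as ordinary columns from the start.
via_flag = results.groupby(keys, as_index=False).mean()

# Approach 2: aggregate with the keys in the index, then move them back to columns.
via_reset = results.groupby(keys).mean().reset_index()

# In both results the grouping keys are regular columns next to time.
print(via_flag.columns.tolist())
print(via_reset.columns.tolist())
```

Which one to use is mostly a matter of taste; reset_index is handy when the grouped result already exists and only the final layout needs to change.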

Answer 2

I ran into the same problem of losing the dataframe's keys after using groupby(). The workaround I found was to write the DataFrame to a CSV file and then read that file back in.
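A rough sketch of why this workaround seems to work, assuming made-up sample data and an in-memory buffer in place of a real file: to_csv writes the index levels out as ordinary columns by default, so read_csv returns them as columns.

```python
import io

import pandas as pd

# Hypothetical sample data: column names come from the question, values are invented.
results = pd.DataFrame({
    "type": ["a", "a", "b", "b"],
    "N": [10, 10, 20, 20],
    "D": [2, 2, 3, 3],
    "NATC": [1, 1, 1, 1],
    "K": [5, 5, 5, 5],
    "iters": [100, 100, 200, 200],
    "time": [1.0, 3.0, 2.0, 4.0],
})

keys = ["type", "N", "D", "NATC", "K", "iters"]
results_mean = results.groupby(keys).mean()

# Write to an in-memory buffer instead of a file on disk (an assumption made
# here to keep the example self-contained); index levels are written as columns.
buffer = io.StringIO()
results_mean.to_csv(buffer)
buffer.seek(0)

# Reading the CSV back yields a flat dataframe with a default integer index.
round_tripped = pd.read_csv(buffer)
print(round_tripped.columns.tolist())
```

The round trip reaches the same layout as reset_index, at the cost of serializing the data, so dtypes are re-inferred when the file is read.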

python

pandas

pandas-groupby