Python:具有旋转数据的热图给出重复错误

Python: Heatmap with pivoted data giving duplicate error

我数据中的前 20 个观察值是:

    id                                          day         hour    consumption
0   012af199245dedacf9ea0ba6eedef4e89272c7dc    Saturday    8       0.000000
1   019ebd48fe9c9ab20051e9de1d5ddfc6fd13c55b    Tuesday     16      0.000000
2   0310daaa6368cf0618f341351b8451e509da27d7    Wednesday   17      0.000000
3   04a2ddb034ff774cda02130fd59b280d55f762d7    Tuesday     16     -0.017699
4   04d61391eeea5b957847dbe08b52d88e64909dbf    Thursday    15      0.000000
5   04f1fa8b29c58e19eebf0e26169975a66ec7cbbf    Tuesday     15      0.000000
6   0561aa699b6c91c842b850c6b73ee4b3c8cbb03b    Thursday    12     -0.002597
7   059492a3600ef0b39726af2201a0ad87610a4a02    Thursday    17      0.000000
8   059fb9175372802b43b3fdcebd2a507bc89e71b0    Thursday    12     -0.001541
9   05da142ebe95e15ab30dee30d1a982d8f419dfb2    Tuesday     20     -0.003050
10  0663c2fd03deecf7f52c3e5c7c0be5c94a3292b8    Sunday      13     -0.005613
11  07040b85d9c0c0ff122b3fef3ab73eab6c53ff0e    Saturday    18      0.000000
12  07a33356cb6330b2090152d30413b224ad1c018b    Saturday    20      0.005013
13  07d67b08fab92657c699dbeec931a48c9f1cfbf7    Friday      15     -0.015675
14  07f92e8eb78f9d8ab6446ffd2649990cffce2ead    Friday      16     -0.004035
15  086cfca739da633d89100874a6c91c37e04880af    Friday      0      -0.004068
16  0a64e559b80b819b2a48a939fa96b1f3f3791e54    Monday      12     -0.007687
17  0b477ac123374072c5acf34d1d063d6ae6c4bf0b    Friday      21      0.000000
18  0bf144e77495b06fb319f4a312f09015da7c5afd    Tuesday     4       0.000000
19  0d1263d90f5a5449a1d0eb80c0f217daff646d36    Saturday    8      -0.005963

我正在尝试通过以下方式创建热图:

sns.heatmap(df.pivot("day", "hour", "consumption"))

但我收到错误消息:

ValueError: Index contains duplicate entries, cannot reshape

我尝试使用 pivot_table() 而不是根据文档解释了此错误。但后来我得到:

DataError: No numeric types to aggregate

或者,绘制星期几的方法也可行:

# Convert dates into the number of the day of the week
# 0=Mon; 6=Sunday
df['day_num'] = df['timestamp'].dt.weekday

sns.heatmap(df.pivot_table(index="day_num", columns="hour", values="consumption"))

不过,我想保留图中的名称(或缩写)。

我该如何解决这个问题?

注意pivot and pivot_table的params顺序不同,所以如果你不命名params,顺序需要相应地改变:

  • pivot

    # pivot(index=None, columns=None, values=None)
    df.pivot('day', 'hour', 'consumption')
    
  • pivot_table

    # pivot_table(values=None, index=None, columns=None, ...)
    df.pivot_table('consumption', 'day', 'hour')
    

为避免歧义,我建议使用命名参数:

sns.heatmap(df.pivot_table(index='day', columns='hour', values='consumption'))