Python: Heatmap with pivoted data giving duplicate error
The first 20 observations in my data are:
id day hour consumption
0 012af199245dedacf9ea0ba6eedef4e89272c7dc Saturday 8 0.000000
1 019ebd48fe9c9ab20051e9de1d5ddfc6fd13c55b Tuesday 16 0.000000
2 0310daaa6368cf0618f341351b8451e509da27d7 Wednesday 17 0.000000
3 04a2ddb034ff774cda02130fd59b280d55f762d7 Tuesday 16 -0.017699
4 04d61391eeea5b957847dbe08b52d88e64909dbf Thursday 15 0.000000
5 04f1fa8b29c58e19eebf0e26169975a66ec7cbbf Tuesday 15 0.000000
6 0561aa699b6c91c842b850c6b73ee4b3c8cbb03b Thursday 12 -0.002597
7 059492a3600ef0b39726af2201a0ad87610a4a02 Thursday 17 0.000000
8 059fb9175372802b43b3fdcebd2a507bc89e71b0 Thursday 12 -0.001541
9 05da142ebe95e15ab30dee30d1a982d8f419dfb2 Tuesday 20 -0.003050
10 0663c2fd03deecf7f52c3e5c7c0be5c94a3292b8 Sunday 13 -0.005613
11 07040b85d9c0c0ff122b3fef3ab73eab6c53ff0e Saturday 18 0.000000
12 07a33356cb6330b2090152d30413b224ad1c018b Saturday 20 0.005013
13 07d67b08fab92657c699dbeec931a48c9f1cfbf7 Friday 15 -0.015675
14 07f92e8eb78f9d8ab6446ffd2649990cffce2ead Friday 16 -0.004035
15 086cfca739da633d89100874a6c91c37e04880af Friday 0 -0.004068
16 0a64e559b80b819b2a48a939fa96b1f3f3791e54 Monday 12 -0.007687
17 0b477ac123374072c5acf34d1d063d6ae6c4bf0b Friday 21 0.000000
18 0bf144e77495b06fb319f4a312f09015da7c5afd Tuesday 4 0.000000
19 0d1263d90f5a5449a1d0eb80c0f217daff646d36 Saturday 8 -0.005963
I am trying to create a heatmap with:
sns.heatmap(df.pivot("day", "hour", "consumption"))
But I get the error message:
ValueError: Index contains duplicate entries, cannot reshape
I tried using pivot_table() instead, as the documentation suggests for this error. But then I get:
DataError: No numeric types to aggregate
Alternatively, plotting by the numeric day of the week would also work:
# Convert dates into the number of the day of the week
# 0=Mon; 6=Sunday
df['day_num'] = df['timestamp'].dt.weekday
sns.heatmap(df.pivot_table(index="day_num", columns="hour", values="consumption"))
However, I would like to keep the day names (or abbreviations) in the plot.
How can I solve this?
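Note that pivot and pivot_table take their parameters in a different order, so if you don't name the parameters, the order of the arguments has to change accordingly: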
-
# pivot(index=None, columns=None, values=None)
df.pivot('day', 'hour', 'consumption')
-
# pivot_table(values=None, index=None, columns=None, ...)
df.pivot_table('consumption', 'day', 'hour')
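A small sketch of why the order matters: Python's inspect module can report how pivot_table would interpret three positional arguments given in pivot's order. The None passed first is only a stand-in for the DataFrame itself and is not part of any real call.

```python
import inspect

import pandas as pd

# Ask pandas how pivot_table binds three positional arguments
# written in the order that pivot expects (index, columns, values).
signature = inspect.signature(pd.DataFrame.pivot_table)
bound = signature.bind(None, "day", "hour", "consumption")  # None stands in for self

# Keep only the explicitly bound parameters, minus the self placeholder.
mapping = {name: value for name, value in bound.arguments.items() if name != "self"}
print(mapping)
```

The day names end up bound to values, so pivot_table tries to average a text column, which would explain the "No numeric types to aggregate" error. As far as I know, pandas 2.0 and later accept only keyword arguments for pivot, which is one more reason to name them.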
To avoid ambiguity, I recommend using named arguments:
sns.heatmap(df.pivot_table(index='day', columns='hour', values='consumption'))
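To also keep the day names on the axis in weekday order, one option is to reindex the pivoted table. The following is a minimal sketch, not a definitive solution: the sample rows are made up for illustration, day_order and plot_heatmap are hypothetical names, and the to_numeric step assumes the consumption column might have been read as text.

```python
import pandas as pd

# Illustrative rows only; the values are made up, not taken from the question.
df = pd.DataFrame({
    "day": ["Saturday", "Saturday", "Tuesday", "Tuesday", "Friday"],
    "hour": [8, 8, 16, 16, 15],
    "consumption": [1.0, 3.0, 0.0, 4.0, 5.0],
})

# Assumption: if the column was read as text, coerce it so it can be averaged.
# This is harmless when the column is already numeric.
df["consumption"] = pd.to_numeric(df["consumption"], errors="coerce")

# Hypothetical helper list giving the row order wanted in the plot.
day_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Duplicate (day, hour) pairs are averaged; reindex then puts the rows
# in weekday order and leaves NaN rows for days with no data.
table = df.pivot_table(
    index="day", columns="hour", values="consumption", aggfunc="mean"
).reindex(day_order)


def plot_heatmap(data):
    """Draw the heatmap; seaborn is imported here so the table code runs without it."""
    import seaborn as sns
    return sns.heatmap(data)
```

Calling plot_heatmap(table) draws the figure. For abbreviations, renaming the index first, for example table.rename(index=lambda d: d[:3]), should be enough.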