Resampling a Pandas DataFrame by hour and plotting a stacked bar chart using Plotly

I have a pandas DataFrame that looks like this:

MAC Address  ts                   Parameter1  Parameter2
af3d116c     2021-05-05 21:58:45  20          50
bffe479a     2021-05-05 21:58:48  22          52
c3a8fe37     2021-05-05 21:58:52  21          53
af3d116c     2021-05-05 21:58:58  27          50
bffe479a     2021-05-05 21:59:16  23          51
c3a8fe37     2021-05-05 21:59:50  28          52
af3d116c     2021-05-05 22:08:32  30          49
af3d116c     2021-05-05 22:16:30  27          55
bffe479a     2021-05-05 22:31:37  20          53
c3a8fe37     2021-05-05 22:52:49  32          52
af3d116c     2021-05-05 23:22:02  41          58
bffe479a     2021-05-05 23:44:31  37          62
bffe479a     2021-05-05 23:45:12  29          58
bffe479a     2021-05-05 23:49:28  34          41
c3a8fe37     2021-05-05 23:52:47  47          56

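I want to resample the DataFrame and then plot a stacked bar chart (preferably with Plotly) that shows the total number of rows recorded in each hour, color-coded by MAC address.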

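Below is a sketch of how I would like to visualize it. (Sorry, it does not use the data listed above, but it gives an idea of what I am after. Each bar represents one hour, e.g. 22:00 to 23:00, split into colored segments that represent the MAC addresses.)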

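You can group the DataFrame by hour with pd.Grouper(key='ts', freq='1h') together with the 'MAC Address' column. Calling size() on the grouped result then gives the number of rows for each MAC address in each hour: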

import pandas as pd
import plotly.express as px

data = {'MAC Address': {1: 'af3d116c', 2: 'bffe479a', 3: 'c3a8fe37', 4: 'af3d116c', 5: 'bffe479a', 6: 'c3a8fe37', 7: 'af3d116c', 8: 'af3d116c', 9: 'bffe479a', 10: 'c3a8fe37', 11: 'af3d116c', 12: 'bffe479a', 13: 'bffe479a', 14: 'bffe479a', 15: 'c3a8fe37'}, 'ts': {1: '2021-05-05 21:58:45', 2: '2021-05-05 21:58:48', 3: '2021-05-05 21:58:52', 4: '2021-05-05 21:58:58', 5: '2021-05-05 21:59:16', 6: '2021-05-05 21:59:50', 7: '2021-05-05 22:08:32', 8: '2021-05-05 22:16:30', 9: '2021-05-05 22:31:37', 10: '2021-05-05 22:52:49', 11: '2021-05-05 23:22:02', 12: '2021-05-05 23:44:31', 13: '2021-05-05 23:45:12', 14: '2021-05-05 23:49:28', 15: '2021-05-05 23:52:47'}, 'Parameter1': {1: 20, 2: 22, 3: 21, 4: 27, 5: 23, 6: 28, 7: 30, 8: 27, 9: 20, 10: 32, 11: 41, 12: 37, 13: 29, 14: 34, 15: 47}, 'Parameter2': {1: 50, 2: 52, 3: 53, 4: 50, 5: 51, 6: 52, 7: 49, 8: 55, 9: 53, 10: 52, 11: 58, 12: 62, 13: 58, 14: 41, 15: 56}}
df = pd.DataFrame(data)
df['ts'] = pd.to_datetime(df['ts'])

# Count rows per (hour, MAC address); pd.Grouper bins 'ts' into 1-hour buckets
plot_df = (
    df.groupby([pd.Grouper(key='ts', freq='1h'), 'MAC Address'])
      .size()
      .reset_index(name='count')
)

This results in:

                   ts MAC Address  count
0 2021-05-05 21:00:00    af3d116c      2
1 2021-05-05 21:00:00    bffe479a      2
2 2021-05-05 21:00:00    c3a8fe37      2
3 2021-05-05 22:00:00    af3d116c      2
4 2021-05-05 22:00:00    bffe479a      1
5 2021-05-05 22:00:00    c3a8fe37      1
6 2021-05-05 23:00:00    af3d116c      1
7 2021-05-05 23:00:00    bffe479a      3
8 2021-05-05 23:00:00    c3a8fe37      1
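As a quick cross-check on these counts, the same numbers can be laid out as a wide table with one row per hour and one column per MAC address. This is only a sketch of an alternative route, not part of the approach above: it floors each timestamp to the hour and uses pd.crosstab, and the names `hour` and `wide` are made up for illustration. It rebuilds just the two columns that matter from the sample data.

```python
import pandas as pd

# MAC addresses and times copied from the sample data (Parameter1/2 omitted,
# since they play no role in the row counts).
macs = ["af3d116c", "bffe479a", "c3a8fe37", "af3d116c", "bffe479a",
        "c3a8fe37", "af3d116c", "af3d116c", "bffe479a", "c3a8fe37",
        "af3d116c", "bffe479a", "bffe479a", "bffe479a", "c3a8fe37"]
times = ["21:58:45", "21:58:48", "21:58:52", "21:58:58", "21:59:16",
         "21:59:50", "22:08:32", "22:16:30", "22:31:37", "22:52:49",
         "23:22:02", "23:44:31", "23:45:12", "23:49:28", "23:52:47"]
df = pd.DataFrame({
    "MAC Address": macs,
    "ts": pd.to_datetime(["2021-05-05 " + t for t in times]),
})

# Floor every timestamp to the start of its hour (illustrative name: hour).
hour = df["ts"].dt.floor("h").rename("hour")

# One row per hour, one column per MAC address, each cell a row count.
# Combinations that never occur would show up as 0.
wide = pd.crosstab(hour, df["MAC Address"])
print(wide)
```

Each row of the wide table should sum to the height of the corresponding stacked bar, and the whole table should sum to the number of rows in the original DataFrame.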

You can then plot plot_df however you like. For example:

fig = px.bar(plot_df, x="ts", y="count", color="MAC Address", title="MAC Addresses per hour")
fig.show()