Resampling Pandas DataFrame by hour and plotting a stacked bar chart using Plotly
I have a pandas DataFrame that looks like this:
MAC Address | ts | Parameter1 | Parameter2 |
---|---|---|---|
af3d116c | 2021-05-05 21:58:45 | 20 | 50 |
bffe479a | 2021-05-05 21:58:48 | 22 | 52 |
c3a8fe37 | 2021-05-05 21:58:52 | 21 | 53 |
af3d116c | 2021-05-05 21:58:58 | 27 | 50 |
bffe479a | 2021-05-05 21:59:16 | 23 | 51 |
c3a8fe37 | 2021-05-05 21:59:50 | 28 | 52 |
af3d116c | 2021-05-05 22:08:32 | 30 | 49 |
af3d116c | 2021-05-05 22:16:30 | 27 | 55 |
bffe479a | 2021-05-05 22:31:37 | 20 | 53 |
c3a8fe37 | 2021-05-05 22:52:49 | 32 | 52 |
af3d116c | 2021-05-05 23:22:02 | 41 | 58 |
bffe479a | 2021-05-05 23:44:31 | 37 | 62 |
bffe479a | 2021-05-05 23:45:12 | 29 | 58 |
bffe479a | 2021-05-05 23:49:28 | 34 | 41 |
c3a8fe37 | 2021-05-05 23:52:47 | 47 | 56 |
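I want to resample the DataFrame and then plot a stacked bar chart (preferably with Plotly) showing the total number of rows recorded per hour, color-coded by MAC address.
Below is a representation of how I would like to visualize it. (Sorry, it does not use the data listed above, but it gives an indication of what I want. Each bar represents one hour, e.g. 22:00 to 23:00, divided into colors representing the MAC addresses.)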
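You can group the DataFrame by hour using pd.Grouper(key='ts', freq='1h') together with the 'MAC Address' column, which has the same effect as resampling per address. Calling size() on the groups then gives you the number of rows for each MAC address in each hour: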
import pandas as pd
import plotly.express as px
data = {'MAC Address': {1: 'af3d116c', 2: 'bffe479a', 3: 'c3a8fe37', 4: 'af3d116c', 5: 'bffe479a', 6: 'c3a8fe37', 7: 'af3d116c', 8: 'af3d116c', 9: 'bffe479a', 10: 'c3a8fe37', 11: 'af3d116c', 12: 'bffe479a', 13: 'bffe479a', 14: 'bffe479a', 15: 'c3a8fe37'}, 'ts': {1: '2021-05-05 21:58:45', 2: '2021-05-05 21:58:48', 3: '2021-05-05 21:58:52', 4: '2021-05-05 21:58:58', 5: '2021-05-05 21:59:16', 6: '2021-05-05 21:59:50', 7: '2021-05-05 22:08:32', 8: '2021-05-05 22:16:30', 9: '2021-05-05 22:31:37', 10: '2021-05-05 22:52:49', 11: '2021-05-05 23:22:02', 12: '2021-05-05 23:44:31', 13: '2021-05-05 23:45:12', 14: '2021-05-05 23:49:28', 15: '2021-05-05 23:52:47'}, 'Parameter1': {1: 20, 2: 22, 3: 21, 4: 27, 5: 23, 6: 28, 7: 30, 8: 27, 9: 20, 10: 32, 11: 41, 12: 37, 13: 29, 14: 34, 15: 47}, 'Parameter2': {1: 50, 2: 52, 3: 53, 4: 50, 5: 51, 6: 52, 7: 49, 8: 55, 9: 53, 10: 52, 11: 58, 12: 62, 13: 58, 14: 41, 15: 56}}
df = pd.DataFrame(data)
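# Parse the timestamp strings so they can be grouped into time bins
df['ts'] = pd.to_datetime(df['ts'])
# Count rows per (hour, MAC address) pair; size() counts the rows in each group
plot_df = df.groupby([pd.Grouper(key='ts', freq='1h'), 'MAC Address']).size().reset_index(name="count")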
This results in:
| | ts | MAC Address | count |
|---|---|---|---|
| 0 | 2021-05-05 21:00:00 | af3d116c | 2 |
| 1 | 2021-05-05 21:00:00 | bffe479a | 2 |
| 2 | 2021-05-05 21:00:00 | c3a8fe37 | 2 |
| 3 | 2021-05-05 22:00:00 | af3d116c | 2 |
| 4 | 2021-05-05 22:00:00 | bffe479a | 1 |
| 5 | 2021-05-05 22:00:00 | c3a8fe37 | 1 |
| 6 | 2021-05-05 23:00:00 | af3d116c | 1 |
| 7 | 2021-05-05 23:00:00 | bffe479a | 3 |
| 8 | 2021-05-05 23:00:00 | c3a8fe37 | 1 |
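As a cross-check on these counts, here is a minimal sketch of an alternative route that is not part of the answer above: truncate each timestamp to its hour with `dt.floor` and cross-tabulate against the MAC address with `pd.crosstab`. The names `hour`, `wide` and `full_range` are illustrative only. The final reindex is an assumption about what you may want: it makes hours with no rows at all appear as rows of zeros rather than being absent.

```python
import pandas as pd

# MAC addresses and timestamps copied from the sample data in the question.
df = pd.DataFrame({
    "MAC Address": [
        "af3d116c", "bffe479a", "c3a8fe37", "af3d116c", "bffe479a",
        "c3a8fe37", "af3d116c", "af3d116c", "bffe479a", "c3a8fe37",
        "af3d116c", "bffe479a", "bffe479a", "bffe479a", "c3a8fe37",
    ],
    "ts": pd.to_datetime([
        "2021-05-05 21:58:45", "2021-05-05 21:58:48", "2021-05-05 21:58:52",
        "2021-05-05 21:58:58", "2021-05-05 21:59:16", "2021-05-05 21:59:50",
        "2021-05-05 22:08:32", "2021-05-05 22:16:30", "2021-05-05 22:31:37",
        "2021-05-05 22:52:49", "2021-05-05 23:22:02", "2021-05-05 23:44:31",
        "2021-05-05 23:45:12", "2021-05-05 23:49:28", "2021-05-05 23:52:47",
    ]),
})

# Truncate every timestamp to the start of its hour.
hour = df["ts"].dt.floor("h").rename("hour")

# One row per hour, one column per MAC address, cells hold row counts.
wide = pd.crosstab(hour, df["MAC Address"])

# Reindex onto a continuous hourly range so that an hour without any
# rows shows up as zeros instead of being missing from the table.
full_range = pd.date_range(wide.index.min(), wide.index.max(), freq="h", name="hour")
wide = wide.reindex(full_range, fill_value=0)

print(wide)
```

The wide layout is convenient for eyeballing the totals. For `px.bar` with `color=`, the long layout produced by the `groupby` approach is usually the easier input.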
You can then plot it however you like. For example:
fig = px.bar(plot_df, x="ts", y="count", color="MAC Address", title="MAC Addresses per hour")
fig.show()