Combine similar rows into one row in a Python DataFrame
I have a DataFrame like the one below, and I want to combine the rows that share
the same "yyyymmdd" and "hr" into a single row.
(Several rows have the same "yyyymmdd" and "hr".)
yyyymmdd hr ariel cat kiki mmax vicky gaolie shiu nick ck
10 2015-12-27 9 0 0 0 0 0 0 0 23 0
181 2015-12-27 10 0 0 0 0 0 0 0 2 0
65 2015-12-27 11 0 0 0 0 0 0 0 20 0
4 2015-12-27 12 0 0 0 0 0 0 0 4 0
0 2015-12-27 17 0 0 0 0 0 0 0 2 0
141 2015-12-27 19 1 0 0 0 0 0 0 0 0
160 2015-12-28 8 0 8 0 0 0 0 0 0 0
82 2015-12-28 9 0 0 0 0 0 0 19 0 0
113 2015-12-28 9 11 0 0 0 0 0 0 0 0
180 2015-12-28 9 0 11 0 0 0 0 0 0 0
9 2015-12-28 10 0 13 0 0 0 0 0 0 0
76 2015-12-28 10 85 0 0 0 0 0 0 0 0
107 2015-12-28 10 0 0 0 0 0 0 15 0 0
188 2015-12-28 10 0 0 0 0 2 0 0 0 0
34 2015-12-28 11 0 0 0 0 0 0 14 0 0
69 2015-12-28 11 0 0 0 0 2 0 0 0 0
134 2015-12-28 11 0 11 0 0 0 0 0 0 0
158 2015-12-28 11 2 0 0 0 0 0 0 0 0
Part of the output I want should look like this:
yyyymmdd hr ariel cat kiki mmax vicky gaolie shiu nick ck
2015-12-28 10 85 13 0 0 2 0 15 0 0
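Please share some ideas I could use in Python pandas or SQL. Thanks!
=========================================================================
I now have two more questions:
How can I "fill" the "hr" index of the DataFrame so that every hour is present?
It should look like this: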
yyyymmdd hr ariel cat kiki mmax vicky gaolie shiu nick ck
0 2015-12-27 8 NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 2015-12-27 9 0 0 0 0 0 0 0 23 0
2 2015-12-27 10 0 0 0 0 0 0 0 2 0
3 2015-12-27 11 0 0 0 0 0 0 0 20 0
4 2015-12-27 12 0 0 0 0 0 0 0 4 0
5 2015-12-27 13 NaN NaN NaN NaN NaN NaN NaN NaN NaN
6 2015-12-27 14 NaN NaN NaN NaN NaN NaN NaN NaN NaN
7 2015-12-27 15 NaN NaN NaN NaN NaN NaN NaN NaN NaN
8 2015-12-27 16 NaN NaN NaN NaN NaN NaN NaN NaN NaN
9 2015-12-27 17 0 0 0 0 0 0 0 2 0
10 2015-12-27 18 NaN NaN NaN NaN NaN NaN NaN NaN NaN
11 2015-12-27 19 1 0 0 0 0 0 0 0 0
12 2015-12-27 20 NaN NaN NaN NaN NaN NaN NaN NaN NaN
13 2015-12-28 8 0 8 0 0 0 0 0 0 0
14 2015-12-28 9 11 11 0 0 0 0 19 0 0
15 2015-12-28 10 85 13 0 0 2 0 15 0 0
16 2015-12-28 11 2 11 0 0 2 0 14 0 0
17 2015-12-28 12 2 20 0 4 0 0 10 0 0
18 2015-12-28 13 8 9 0 9 3 0 9 0 0
19 2015-12-28 14 4 10 0 8 0 0 22 0 0
20 2015-12-28 15 3 3 0 2 0 0 16 0 0
21 2015-12-28 16 14 5 1 1 0 0 19 0 0
22 2015-12-28 17 15 1 2 0 0 0 19 0 0
23 2015-12-28 18 0 0 0 6 0 0 0 0 0
24 2015-12-28 19 0 0 0 5 0 0 0 0 0
25 2015-12-28 20 0 0 0 1 0 0 0 0 0
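How can I draw line charts based on the columns and the hours?
(x-axis = columns, i.e. ariel, cat, kiki, ...)
(y-axis = hours, i.e. 8, 9, 10, ..., 20)
Each chart should represent one date (i.e. 2015-12-27, 2015-12-28, ...)
Thanks!!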
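Put your data into a pandas DataFrame, then group by "yyyymmdd" and "hr" and take the maximum of each group.
I copy-pasted your example into a CSV file; the code looks like this: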
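import pandas as pd

# Load the pasted example; the first CSV column holds the original row labels
df = pd.read_csv('df.csv', index_col=0)
# Collapse rows that share the same date and hour, keeping each column's maximum
df_combined = df.groupby(['yyyymmdd', 'hr']).max()
df_combined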
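For readers who do not have the df.csv file, here is a minimal self-contained sketch of the same groupby step. It inlines a handful of rows from the question (the `idx` header name for the row-label column and the `raw` variable are assumptions made for illustration) and checks the result against the row the question asks for.

```python
import io
import pandas as pd

# A few rows copied from the question, inlined so no df.csv file is needed.
# "idx" is a made-up header for the unnamed row-label column.
raw = """idx,yyyymmdd,hr,ariel,cat,kiki,mmax,vicky,gaolie,shiu,nick,ck
160,2015-12-28,8,0,8,0,0,0,0,0,0,0
9,2015-12-28,10,0,13,0,0,0,0,0,0,0
76,2015-12-28,10,85,0,0,0,0,0,0,0,0
107,2015-12-28,10,0,0,0,0,0,0,15,0,0
188,2015-12-28,10,0,0,0,0,2,0,0,0,0
"""
df = pd.read_csv(io.StringIO(raw), index_col=0)

# One row per (date, hour): keep the largest value seen in each column.
df_combined = df.groupby(['yyyymmdd', 'hr']).max()

# Summing instead gives the same result for this sample, because each column
# has at most one non-zero entry per (date, hour) group.
df_sum = df.groupby(['yyyymmdd', 'hr']).sum()

print(df_combined.loc[('2015-12-28', 10)].tolist())
# [85, 13, 0, 0, 2, 0, 15, 0, 0]
```

If several rows in a group could hold non-zero values in the same column and those values should add up, sum() is probably the better aggregation; max() keeps only the largest one.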
The result is indexed by ("yyyymmdd", "hr"). If you don't want a MultiIndex, call reset_index() on it.
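For the follow-up question about "filling" the missing hours, one possible approach (not part of the original answer) is to reindex the grouped result against every (date, hour) pair and then flatten it with reset_index(). The sketch below uses a small hypothetical stand-in for df_combined with only two of the columns, and assumes the hour range 8 to 20 shown in the desired output.

```python
import pandas as pd

# Hypothetical stand-in for the grouped result (two columns kept for brevity).
df_combined = pd.DataFrame(
    {'ariel': [0, 1, 85], 'nick': [23, 0, 0]},
    index=pd.MultiIndex.from_tuples(
        [('2015-12-27', 9), ('2015-12-27', 19), ('2015-12-28', 10)],
        names=['yyyymmdd', 'hr'],
    ),
)

# Every (date, hour) pair for hours 8..20; the range is an assumption
# taken from the desired output in the question.
dates = df_combined.index.get_level_values('yyyymmdd').unique()
full_index = pd.MultiIndex.from_product(
    [dates, range(8, 21)], names=['yyyymmdd', 'hr']
)

# reindex inserts the missing pairs as NaN rows;
# reset_index turns the MultiIndex back into ordinary columns.
df_filled = df_combined.reindex(full_index).reset_index()
print(df_filled.head())
```

Because NaN is a floating-point value, the count columns become floats after reindexing. Passing fill_value=0 to reindex would keep them as integers, at the cost of no longer distinguishing "no data" from a zero count.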