Transpose key-value data from a DataFrame into columns based on keys in Python
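I receive input in XML format from a website, and I have been able to load it into a DataFrame in the format below.
Could you help me write Python code that converts the data into the expected output shown further down?
Data in the DataFrame: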
pDate | pname |meta_key |meta_value
0 Mon, 19 Jul 2021 06:13:05 +0000|2021-07-17-67433-43|access_code |67433
1 Mon, 19 Jul 2021 06:13:05 +0000|2021-07-17-67433-43|email |xxx@dddd.com
2 Mon, 19 Jul 2021 06:13:05 +0000|2021-07-17-67433-43|activity_id |43
3 Mon, 19 Jul 2021 06:13:05 +0000|2021-07-17-67433-43|duration_step|50
4 Mon, 19 Jul 2021 06:13:05 +0000|2021-07-17-67433-43|type |M
5 Mon, 19 Jul 2021 06:13:05 +0000|2021-07-17-67433-43|multiplier |122
6 Mon, 19 Jul 2021 06:13:05 +0000|2021-07-17-67433-43|date |2021-07-17
7 Mon, 19 Jul 2021 06:13:05 +0000|2021-07-17-13254-42|access_code |13254
8 Mon, 19 Jul 2021 06:13:05 +0000|2021-07-17-13254-42|email |xxxx@ccc.com
9 Mon, 19 Jul 2021 06:13:05 +0000|2021-07-17-13254-42|activity_id |42
Expected output, as a DataFrame that can be used for charts:
pDate | name | access_code | email | activity_id | duration_step | type | multiplier | date
Mon, 19 Jul 2021 06:13:05 +0000 | 2021-07-17-67433-43 | 67433 | xxx@dddd.com | 43 | 50 | M | 122 | 2021-07-17
Try .pivot:
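# Assumes df is the long-format DataFrame shown in the question.
# Passing a list as index needs pandas 1.1 or newer.
print(
    # One row per (pDate, pname); each meta_key value becomes its own column.
    df.pivot(index=["pDate", "pname"], columns="meta_key", values="meta_value")
    .reset_index()            # turn pDate and pname back into ordinary columns
    .rename_axis("", axis=1)  # clear the leftover "meta_key" label on the columns
)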
Prints:
pDate pname access_code activity_id date duration_step email multiplier type
0 Mon, 19 Jul 2021 06:13:05 +0000 2021-07-17-13254-42 13254 42 NaN NaN xxxx@ccc.com NaN NaN
1 Mon, 19 Jul 2021 06:13:05 +0000 2021-07-17-67433-43 67433 43 2021-07-17 50 xxx@dddd.com 122 M
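
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample rows from the question, applies the same pivot, then reorders the columns to match the requested layout and converts the numeric fields so they can be charted. The helper name `to_wide` and the lists `wanted` and `numeric_cols` are my own illustrative choices, not part of the question, and the sketch assumes every (pDate, pname, meta_key) combination occurs at most once.

```python
import pandas as pd

# Sample rows copied from the question (long key/value layout).
p_date = "Mon, 19 Jul 2021 06:13:05 +0000"
rows = [
    (p_date, "2021-07-17-67433-43", "access_code", "67433"),
    (p_date, "2021-07-17-67433-43", "email", "xxx@dddd.com"),
    (p_date, "2021-07-17-67433-43", "activity_id", "43"),
    (p_date, "2021-07-17-67433-43", "duration_step", "50"),
    (p_date, "2021-07-17-67433-43", "type", "M"),
    (p_date, "2021-07-17-67433-43", "multiplier", "122"),
    (p_date, "2021-07-17-67433-43", "date", "2021-07-17"),
    (p_date, "2021-07-17-13254-42", "access_code", "13254"),
    (p_date, "2021-07-17-13254-42", "email", "xxxx@ccc.com"),
    (p_date, "2021-07-17-13254-42", "activity_id", "42"),
]
df = pd.DataFrame(rows, columns=["pDate", "pname", "meta_key", "meta_value"])

# Hypothetical column order, taken from the expected output in the question.
wanted = ["pDate", "pname", "access_code", "email", "activity_id",
          "duration_step", "type", "multiplier", "date"]
# Hypothetical choice of which keys hold numbers.
numeric_cols = ["access_code", "activity_id", "duration_step", "multiplier"]


def to_wide(frame):
    """Pivot long key/value rows into one row per (pDate, pname)."""
    wide = (
        frame.pivot(index=["pDate", "pname"], columns="meta_key", values="meta_value")
        .reset_index()
        .rename_axis(None, axis=1)
    )
    # reindex tolerates keys that never appear; they become all-NaN columns.
    wide = wide.reindex(columns=wanted)
    # Values arrive as strings; convert the numeric ones (missing keys stay NaN).
    wide[numeric_cols] = wide[numeric_cols].apply(pd.to_numeric)
    return wide


wide = to_wide(df)
print(wide)
```

Keys that a record does not have (for example duration_step for 2021-07-17-13254-42) come out as NaN. If the same key can repeat within a record, pivot raises a ValueError; in that case pivot_table with aggfunc="first" is one possible alternative, at the cost of silently keeping only the first value.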