Pandas pivot_table: "merge" column values
Suppose I have the following table:
from datetime import datetime
import pandas as pd
d = [[datetime(year=2021, month=1, day=1, minute=5), "A", "new", 3],
[datetime(year=2021, month=1, day=1, minute=5), "B", "new", 6],
[datetime(year=2021, month=1, day=1, minute=5), "C", "new", 7],
[datetime(year=2021, month=1, day=1, minute=15), "A", "old", 6],
[datetime(year=2021, month=1, day=1, minute=15), "B", "old", 2],
[datetime(year=2021, month=1, day=1, minute=15), "C", "old", 14],
]
df = pd.DataFrame(data=d, columns=["Time", "Article", "Status", "Qty"])
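I want to restructure this data so that there is one row per "Time" value, with a "Qty" and a "Status" column for each article.
I can almost achieve this with pivot_table, as follows: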
pd.pivot_table(data=df, index=["Time"], columns=["Article"], values=["Status", "Qty"], aggfunc="last")
This gives me the following output:
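| | Qty | Qty | Qty | Status | Status | Status |
|---|---|---|---|---|---|---|
| Article | A | B | C | A | B | C |
| Time | | | | | | |
| 2021-01-01 00:05:00 | 3 | 6 | 7 | new | new | new |
| 2021-01-01 00:15:00 | 6 | 2 | 14 | old | old | old |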
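However, I would like the columns to be grouped by article rather than by value column, as in the output produced by the following code: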
arrays = [
    ["A", "A", "B", "B", "C", "C"],
    ["Qty", "Status", "Qty", "Status", "Qty", "Status"],
]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=["Article", "Value"])
data_pivot = [
    [3, "new", 6, "new", 7, "new"],
    [6, "old", 2, "old", 14, "old"],
]
pd.DataFrame(data=data_pivot, columns=index, index=[datetime(year=2021, month=1, day=1, minute=5), datetime(year=2021, month=1, day=1, minute=15)])

| Article | A | A | B | B | C | C |
|---|---|---|---|---|---|---|
| Value | Qty | Status | Qty | Status | Qty | Status |
| 2021-01-01 00:05:00 | 3 | new | 6 | new | 7 | new |
| 2021-01-01 00:15:00 | 6 | old | 2 | old | 14 | old |
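Simply swapping the values and columns keywords in the pivot_table call did not give me the expected output either.
Unfortunately, I had trouble putting a name to this problem, so I had a hard time finding existing solutions (which is also why the title may sound odd). I apologize if this has already been asked many times.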
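Use DataFrame.swaplevel with DataFrame.sort_index, where df is the pivoted DataFrame returned by pd.pivot_table (swapping levels on axis=1 requires MultiIndex columns, which the raw table does not have):
df = df.swaplevel(1,0,axis=1).sort_index(axis=1)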
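For completeness, here is a minimal end-to-end sketch of that approach using the data from the question. The variable names pivoted and result are my own and do not appear in the original posts. The final rename_axis call is an optional addition, assuming you want the column levels labelled "Article" and "Value" as in the desired output.

```python
from datetime import datetime

import pandas as pd

# Sample data from the question.
d = [
    [datetime(year=2021, month=1, day=1, minute=5), "A", "new", 3],
    [datetime(year=2021, month=1, day=1, minute=5), "B", "new", 6],
    [datetime(year=2021, month=1, day=1, minute=5), "C", "new", 7],
    [datetime(year=2021, month=1, day=1, minute=15), "A", "old", 6],
    [datetime(year=2021, month=1, day=1, minute=15), "B", "old", 2],
    [datetime(year=2021, month=1, day=1, minute=15), "C", "old", 14],
]
df = pd.DataFrame(data=d, columns=["Time", "Article", "Status", "Qty"])

# Pivot as in the question: column level 0 holds the value names
# (Qty, Status), column level 1 holds the articles (A, B, C).
pivoted = pd.pivot_table(
    data=df,
    index=["Time"],
    columns=["Article"],
    values=["Status", "Qty"],
    aggfunc="last",
)

# Swap the two column levels so the article comes first, then sort the
# columns so that each article's Qty and Status end up next to each other.
result = pivoted.swaplevel(0, 1, axis=1).sort_index(axis=1)

# Optional (my addition): name both column levels.
result = result.rename_axis(columns=["Article", "Value"])

print(result)
```

The sort_index step matters: swaplevel only relabels the levels and leaves the columns in their old order (A/Qty, B/Qty, C/Qty, A/Status, ...), so without the sort the articles would still not be grouped together.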