Pandas: how to explode several items of a list into each new row
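I have a dataframe:
c1  c2  c3  l
1   2   3   [1,2,3,4,5,6,7]
3   4   8   [8,9,0]
I want to explode it so that every 3 elements of each list in column l become a new row, with an extra column holding the index of that triplet within the original list. So I would get:
c1  c2  c3  l        idx
1   2   3   [1,2,3]  0
1   2   3   [4,5,6]  1
3   4   8   [8,9,0]  0
What is the best way to do this?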
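First split each list into chunks of three, then explode. Only complete chunks are kept, so the trailing 7 in the first row is dropped: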
df.l = df.l.apply(lambda lst: [lst[3*i:3*(i+1)] for i in range(len(lst) // 3)])
df
# c1 c2 c3 l
#0 1 2 3 [[1, 2, 3], [4, 5, 6]]
#1 3 4 8 [[8, 9, 0]]
df.explode('l')
# c1 c2 c3 l
#0 1 2 3 [1, 2, 3]
#0 1 2 3 [4, 5, 6]
#1 3 4 8 [8, 9, 0]
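If you want something that runs end to end, here is a minimal sketch of the same idea. The DataFrame construction is reconstructed from the table in the question, and the helper name chunk and its size parameter are my own for illustration, not part of pandas. It uses assign so the original df is left unchanged.

```python
import pandas as pd

# Sample frame reconstructed from the question's table (an assumption about the real data).
df = pd.DataFrame({
    "c1": [1, 3],
    "c2": [2, 4],
    "c3": [3, 8],
    "l": [[1, 2, 3, 4, 5, 6, 7], [8, 9, 0]],
})

def chunk(lst, size=3):
    # Hypothetical helper: keep only complete chunks of `size` elements,
    # so a trailing partial chunk (the 7 in the first row) is dropped.
    return [lst[i:i + size] for i in range(0, len(lst) - size + 1, size)]

# Chunk each list, then give every chunk its own row.
out = df.assign(l=df["l"].apply(chunk)).explode("l")
print(out)
```

Note that explode repeats the original index label for every row it produces, which is why the index in the output above reads 0, 0, 1.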
If you need the index column:
# starting again from the original df, store the chunk index as the second element of each tuple
df.l = df.l.apply(lambda lst: [(lst[3*i:3*(i+1)], i) for i in range(len(lst) // 3)])
df
# c1 c2 c3 l
#0 1 2 3 [([1, 2, 3], 0), ([4, 5, 6], 1)]
#1 3 4 8 [([8, 9, 0], 0)]
df = df.explode('l')
df
# c1 c2 c3 l
#0 1 2 3 ([1, 2, 3], 0)
#0 1 2 3 ([4, 5, 6], 1)
#1 3 4 8 ([8, 9, 0], 0)
# extract list and index from the tuple column
df['l'], df['idx'] = df.l.str[0], df.l.str[1]
df
# c1 c2 c3 l idx
#0 1 2 3 [1, 2, 3] 0
#0 1 2 3 [4, 5, 6] 1
#1 3 4 8 [8, 9, 0] 0
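As a possible alternative to packing the index into tuples, you could derive idx after exploding by counting rows within each original index label. This is only a sketch: it assumes the original index labels are unique (true for the default RangeIndex), and the sample frame is again reconstructed from the question.

```python
import pandas as pd

# Sample frame reconstructed from the question's table (an assumption about the real data).
df = pd.DataFrame({
    "c1": [1, 3],
    "c2": [2, 4],
    "c3": [3, 8],
    "l": [[1, 2, 3, 4, 5, 6, 7], [8, 9, 0]],
})

# Same chunking as above: complete triplets only.
chunks = df["l"].apply(lambda lst: [lst[3*i:3*(i+1)] for i in range(len(lst) // 3)])
result = df.assign(l=chunks).explode("l")

# explode repeats the original index label for each chunk, so numbering the rows
# within each label gives the chunk's position in its original list.
result["idx"] = result.groupby(level=0).cumcount().to_numpy()

# Optional: replace the repeated labels with a fresh 0..n-1 index.
result = result.reset_index(drop=True)
print(result)
```

This avoids the intermediate tuple column, at the cost of relying on the index labels to tell the original rows apart.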