How do I remap a dataframe with a dict containing nested lists using the map function?

Question

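I have a dataframe with a single ID column (the same IDs, in the same order, as in the expected output below).

I want to add a new column with the map function, using the dict d: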

d = {'123': [1, 2, 3, 1, 2, 1, 5], '456': [1, 2, 1]}

Expected output:

    ID  Count
0   123   1
1   123   2
2   123   3
3   123   1
4   123   2
5   123   1
6   456   1
7   456   2
8   456   1
9   123   5

But df.ID.map(d) returns the whole list for every row:

0    [1, 2, 3, 1, 2, 1, 5]
1    [1, 2, 3, 1, 2, 1, 5]
2    [1, 2, 3, 1, 2, 1, 5]
3    [1, 2, 3, 1, 2, 1, 5]
4    [1, 2, 3, 1, 2, 1, 5]
5    [1, 2, 3, 1, 2, 1, 5]
6             [1, 2, 1]
7             [1, 2, 1]
8             [1, 2, 1]
9    [1, 2, 3, 1, 2, 1, 5]
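
For anyone who wants to reproduce this, here is a minimal sketch of the setup. The construction of df is an assumption, since the original frame is not shown: ID is taken to hold strings so that it matches the string keys of d. It shows that map looks up each ID independently, so every row receives the full list rather than one element of it.

```python
import pandas as pd

# Assumption: ID is stored as strings so that it matches the string keys of d.
df = pd.DataFrame({"ID": ["123"] * 6 + ["456"] * 3 + ["123"]})
d = {"123": [1, 2, 3, 1, 2, 1, 5], "456": [1, 2, 1]}

# map performs one dictionary lookup per row, so each row gets the whole list.
mapped = df["ID"].map(d)
print(mapped.iloc[0])  # the full list stored under '123'
print(mapped.iloc[6])  # the full list stored under '456'
```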

Answer 1

You can use groupby + apply:

df.groupby('ID').apply(lambda g: pd.Series(d[g.name]))

Example (this assigns by position, so it assumes the rows are already sorted by ID):

>>> df['Count'] = df.groupby('ID').apply(lambda g: pd.Series(d[g.name])).to_list()
>>> df
    ID  Count
0  123      1
1  123      2
2  123      3
3  123      1
4  123      2
5  123      1
6  456      1
7  456      2
8  456      1

Edit: a variant for unordered input:

(df.join(df.groupby('ID').apply(lambda g: pd.Series(d[g.name],
                                                    name='Count',
                                                    index=g.index))
           .droplevel(0))
)

Output:

    ID  Count
0  123      1
1  123      2
2  123      3
3  123      1
4  123      2
5  123      1
6  456      1
7  456      2
8  456      1
9  123      5

Answer 2

Starting from your dict, you can get what you need with explode:

pd.Series(d).explode().reset_index()
Out[115]: 
  index  0
0   123  1
1   123  2
2   123  3
3   123  1
4   123  2
5   123  1
6   456  1
7   456  2
8   456  1
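
Note that the exploded result follows the order of d, not the order of the rows in df. If the values need to be attached to an existing df whose IDs are not contiguous, one possible sketch (not part of the answer above) is to keep one iterator per key and consume it row by row. The names `iterators` and the string-typed ID column are assumptions made for illustration.

```python
import pandas as pd

# Assumption: ID is stored as strings so that it matches the string keys of d.
df = pd.DataFrame({"ID": ["123"] * 6 + ["456"] * 3 + ["123"]})
d = {"123": [1, 2, 3, 1, 2, 1, 5], "456": [1, 2, 1]}

# One iterator per key; every lookup hands out the next unused value.
iterators = {key: iter(values) for key, values in d.items()}

# Walk the IDs in row order and pull the next value for each one.
df["Count"] = [next(iterators[key]) for key in df["ID"]]
print(df)
```

This raises KeyError for an ID that is missing from d, and StopIteration if an ID occurs in df more often than it has values in d, so it relies on the two matching up.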

Answer 3

Create a cumulative count on df. The assumption here is that the number of rows for each ID in df equals the number of values stored for that ID in d:

df = df.assign(counter = df.groupby('ID').cumcount())
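
The assumption stated above can be checked before going further. This is a minimal sketch, assuming integer IDs in df and string keys in d; the names `rows_per_id`, `values_per_id` and `counts_match` are made up for illustration.

```python
import pandas as pd

# Assumption: ID holds integers, while d is keyed by strings.
df = pd.DataFrame({"ID": [123] * 6 + [456] * 3 + [123]})
d = {"123": [1, 2, 3, 1, 2, 1, 5], "456": [1, 2, 1]}

# Number of rows per ID in df.
rows_per_id = df["ID"].value_counts().to_dict()

# Number of values per ID in d, with the keys converted to integers.
values_per_id = {int(key): len(values) for key, values in d.items()}

# True only if every ID has exactly as many rows as it has values.
counts_match = rows_per_id == values_per_id
print(counts_match)
```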

Build a Series with a two-level index from d, using pd.concat:

# converted the keys to integers
# so that it matches the dtype of ID in df
frame = pd.concat([pd.Series(val) for _, val in d.items()], 
                  keys = map(int, d))
frame.name = 'Count'

Run pd.merge to align frame with df:

df.merge(frame, 
        left_on = ['ID', 'counter'], 
        right_index = True).drop(columns='counter')

    ID  Count
0  123      1
1  123      2
2  123      3
3  123      1
4  123      2
5  123      1
6  456      1
7  456      2
8  456      1
9  123      5
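
The steps above can be put together in one self-contained sketch. It is a variation rather than the exact code of the answer: it assumes integer IDs, passes a dict to pd.concat to build the lookup, and swaps pd.merge for DataFrame.join with `on=`, which performs a left join and so keeps unmatched rows (as NaN) instead of dropping them. The names `lookup` and `out` are made up for illustration.

```python
import pandas as pd

# Assumption: ID holds integers, while d is keyed by strings.
df = pd.DataFrame({"ID": [123] * 6 + [456] * 3 + [123]})
d = {"123": [1, 2, 3, 1, 2, 1, 5], "456": [1, 2, 1]}

# Series indexed by (ID, position within that ID's list).
lookup = pd.concat({int(key): pd.Series(values) for key, values in d.items()})
lookup = lookup.rename("Count")

out = (
    df.assign(counter=df.groupby("ID").cumcount())  # position of each row within its ID
      .join(lookup, on=["ID", "counter"])           # align on (ID, position)
      .drop(columns="counter")
)
print(out)
```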


