Create a dictionary with repeating values

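I am trying to create repeating [1, 2, 3] values in a dictionary, so that I can select the rows belonging to each value and reshape them into columns, one column per number.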

I am using this data as an example:

data = {'id': [],
 'players': ['NEUER, Manuel',
  'Bayern München',
  'Bundesliga',
  'OBLAK, Jan',
  'Atlético Madrid',
  'La Liga',
  'KANE, Harry',
  'Tottenham Hotspur',
  'Premier League',
  'RAMOS, Sergio',
  'Paris Saint-Germain',
  'Ligue 1',
  'TER STEGEN, Marc-André',
  'Barcelona',
  'La Liga',
  'VARANE, Raphaël',
  'Real Madrid',
  'La Liga',
  'STERLING, Raheem',
  'Manchester City',
  'Premier League',
  'MANÉ, Sadio',
  'Liverpool',
  'Premier League',
  'MARQUINHOS, Corrêa',
  'Paris Saint-Germain',
  'Ligue 1',
  'FERNANDES, Bruno',
  'Manchester United',
  'Premier League']}

This is the code I have used:

import itertools

p = []
for i in itertools.repeat([1, 2, 3], sum(map(len, data.values()))):
    p+=i
    data['id'].append(p)

When I create a pandas DataFrame, I get:


                        id                                 players
0   [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ...   NEUER, Manuel
1   [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ...   Bayern München
2   [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ...   Bundesliga
3   [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ...   OBLAK, Jan
4   [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ...   Atlético Madrid
... ... ...
115 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ...   Juventus
116 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ...   Serie A
117 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ...   ALEXANDER-ARNOLD, Trent
118 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ...   Liverpool
119 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ...   Premier League
120 rows × 2 columns

Expected result:

    id    players
0   1   NEUER, Manuel
1   2   Bayern München
2   3   Bundesliga
3   1   OBLAK, Jan
4   2   Atlético Madrid
... ... ...
115 2   Juventus
116 3   Serie A
117 1   ALEXANDER-ARNOLD, Trent
118 2   Liverpool
119 3   Premier League
120 rows × 2 columns

To then get:

players               2                   3
NEUER, Manuel     Bayern München       Bundesliga
OBLAK, Jan        Atlético Madrid        ...

That is much clearer now. You can do it like this (using your data):

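import itertools

out1 = {'id': [], 'players': []}
# cycle() repeats 1, 2, 3 endlessly; zip() stops at the end of data['players']
for n, name in zip(itertools.cycle([1, 2, 3]), data['players']):
    out1['id'].append(n)
    out1['players'].append(name)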

>>> out1
{'id': [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3], 'players': ['NEUER, Manuel', 'Bayern München', 'Bundesliga', 'OBLAK, Jan', 'Atlético Madrid', 'La Liga', 'KANE, Harry', 'Tottenham Hotspur', 'Premier League', 'RAMOS, Sergio', 'Paris Saint-Germain', 'Ligue 1', 'TER STEGEN, Marc-André', 'Barcelona', 'La Liga', 'VARANE, Raphaël', 'Real Madrid', 'La Liga', 'STERLING, Raheem', 'Manchester City', 'Premier League', 'MANÉ, Sadio', 'Liverpool', 'Premier League', 'MARQUINHOS, Corrêa', 'Paris Saint-Germain', 'Ligue 1', 'FERNANDES, Bruno', 'Manchester United', 'Premier League']}
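
For reference, the original attempt probably ends up with a list in every row because itertools.repeat([1, 2, 3], n) yields the whole list n times, while itertools.cycle([1, 2, 3]) yields the single numbers one after another. A minimal sketch of the difference, assuming a shortened sample (the name players is only for illustration and stands in for data['players']):

```python
import itertools

# Hypothetical shortened sample standing in for data['players'].
players = ['NEUER, Manuel', 'Bayern München', 'Bundesliga',
           'OBLAK, Jan', 'Atlético Madrid', 'La Liga']

# repeat() yields the same whole list object on every iteration.
repeated = list(itertools.repeat([1, 2, 3], 2))

# cycle() yields the elements one at a time, endlessly; zip() stops
# as soon as the shorter input (players) is exhausted.
ids = [n for n, _ in zip(itertools.cycle([1, 2, 3]), players)]

print(repeated)  # [[1, 2, 3], [1, 2, 3]]
print(ids)       # [1, 2, 3, 1, 2, 3]
```

In the original loop, p += i also keeps extending one and the same list p, and that same object is appended on every pass, which would explain why every row shows the same long list.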

You can also go straight to the second step:

out2 = {'player': [], 'team': [], 'league': []}
for n, value in zip(itertools.cycle([1, 2, 3]), data['players']):
    if n == 1:
        out2['player'].append(value)
    elif n == 2:
        out2['team'].append(value)
    elif n == 3:
        out2['league'].append(value)

>>> out2
{'player': ['NEUER, Manuel', 'OBLAK, Jan', 'KANE, Harry', 'RAMOS, Sergio', 'TER STEGEN, Marc-André', 'VARANE, Raphaël', 'STERLING, Raheem', 'MANÉ, Sadio', 'MARQUINHOS, Corrêa', 'FERNANDES, Bruno'], 'team': ['Bayern München', 'Atlético Madrid', 'Tottenham Hotspur', 'Paris Saint-Germain', 'Barcelona', 'Real Madrid', 'Manchester City', 'Liverpool', 'Paris Saint-Germain', 'Manchester United'], 'league': ['Bundesliga', 'La Liga', 'Premier League', 'Ligue 1', 'La Liga', 'La Liga', 'Premier League', 'Premier League', 'Ligue 1', 'Premier League']}
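
If the list always comes in player, team, league order and its length is a multiple of 3 (both are assumptions about your data), the same split can probably be done with plain slicing, without itertools. A minimal sketch on a shortened sample; the names players and out3 are only for illustration:

```python
# Hypothetical shortened sample standing in for data['players'].
players = ['NEUER, Manuel', 'Bayern München', 'Bundesliga',
           'OBLAK, Jan', 'Atlético Madrid', 'La Liga']

# Take every third element, starting at offsets 0, 1 and 2.
out3 = {
    'player': players[0::3],
    'team': players[1::3],
    'league': players[2::3],
}

print(out3['player'])  # ['NEUER, Manuel', 'OBLAK, Jan']
print(out3['team'])    # ['Bayern München', 'Atlético Madrid']
print(out3['league'])  # ['Bundesliga', 'La Liga']
```

Since all three lists have the same length, a dict like this can be passed to pd.DataFrame to get one row per player.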