Create a dictionary with repeating values
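I am trying to create a repeating sequence of the values [1, 2, 3] in a dictionary, so that I can select the rows belonging to each value and reshape them into columns.
I am using this data as an example: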
data = {'id': [],
'players': ['NEUER, Manuel',
'Bayern München',
'Bundesliga',
'OBLAK, Jan',
'Atlético Madrid',
'La Liga',
'KANE, Harry',
'Tottenham Hotspur',
'Premier League',
'RAMOS, Sergio',
'Paris Saint-Germain',
'Ligue 1',
'TER STEGEN, Marc-André',
'Barcelona',
'La Liga',
'VARANE, Raphaël',
'Real Madrid',
'La Liga',
'STERLING, Raheem',
'Manchester City',
'Premier League',
'MANÉ, Sadio',
'Liverpool',
'Premier League',
'MARQUINHOS, Corrêa',
'Paris Saint-Germain',
'Ligue 1',
'FERNANDES, Bruno',
'Manchester United',
'Premier League']}
This is the code I used:
import itertools

p = []
for i in itertools.repeat([1, 2, 3], sum(map(len, data.values()))):
    p += i
    data['id'].append(p)
When I create a pandas DataFrame, I get:
id players
0 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ... NEUER, Manuel
1 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ... Bayern München
2 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ... Bundesliga
3 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ... OBLAK, Jan
4 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ... Atlético Madrid
... ... ...
115 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ... Juventus
116 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ... Serie A
117 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ... ALEXANDER-ARNOLD, Trent
118 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ... Liverpool
119 [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, ... Premier League
120 rows × 2 columns
Expected result:
id players
0 1 NEUER, Manuel
1 2 Bayern München
2 3 Bundesliga
3 1 OBLAK, Jan
4 2 Atlético Madrid
... ... ...
115 2 Juventus
116 3 Serie A
117 1 ALEXANDER-ARNOLD, Trent
118 2 Liverpool
119 3 Premier League
120 rows × 2 columns
#to then get:
players 2 3
NEUER, Manuel Bayern München Bundesliga
OBLAK, Jan Atlético Madrid ...
That is much clearer now.
You can do it like this (using your data):
import itertools

out1 = {'id': [], 'players': []}
# cycle() repeats 1, 2, 3 endlessly; zip() stops when data['players'] runs out
for i in zip(itertools.cycle([1, 2, 3]), data['players']):
    out1['id'].append(i[0])
    out1['players'].append(i[1])
>>> out1
{'id': [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3], 'players': ['NEUER, Manuel', 'Bayern München', 'Bundesliga', 'OBLAK, Jan', 'Atlético Madrid', 'La Liga', 'KANE, Harry', 'Tottenham Hotspur', 'Premier League', 'RAMOS, Sergio', 'Paris Saint-Germain', 'Ligue 1', 'TER STEGEN, Marc-André', 'Barcelona', 'La Liga', 'VARANE, Raphaël', 'Real Madrid', 'La Liga', 'STERLING, Raheem', 'Manchester City', 'Premier League', 'MANÉ, Sadio', 'Liverpool', 'Premier League', 'MARQUINHOS, Corrêa', 'Paris Saint-Germain', 'Ligue 1', 'FERNANDES, Bruno', 'Manchester United', 'Premier League']}
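As a variation on the loop above, the id column could also be built in one step by taking as many values from the cycle as there are entries. This is only a sketch: `players` is a shortened stand-in for data['players'], and `build_ids` is a hypothetical helper name that does not appear in the question.

```python
from itertools import cycle, islice

# Shortened stand-in for data['players'] (assumption: entries come in groups of three).
players = ['NEUER, Manuel', 'Bayern München', 'Bundesliga',
           'OBLAK, Jan', 'Atlético Madrid', 'La Liga']

def build_ids(n, pattern=(1, 2, 3)):
    """Return the first n values of the endlessly repeated pattern."""
    return list(islice(cycle(pattern), n))

out1 = {'id': build_ids(len(players)), 'players': players}
print(out1['id'])  # [1, 2, 3, 1, 2, 3]
```

Because both lists have the same length, a dictionary shaped like this can be passed straight to pandas.DataFrame.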
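However, you can also get to the second step directly:
from itertools import cycle

out2 = {'player': [], 'team': [], 'league': []}
# Position 1 in each group of three is the player, 2 the team, 3 the league
for i in zip(cycle([1, 2, 3]), data['players']):
    if i[0] == 1:
        out2['player'].append(i[1])
    elif i[0] == 2:
        out2['team'].append(i[1])
    elif i[0] == 3:
        out2['league'].append(i[1])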
>>> out2
{'player': ['NEUER, Manuel', 'OBLAK, Jan', 'KANE, Harry', 'RAMOS, Sergio', 'TER STEGEN, Marc-André', 'VARANE, Raphaël', 'STERLING, Raheem', 'MANÉ, Sadio', 'MARQUINHOS, Corrêa', 'FERNANDES, Bruno'], 'team': ['Bayern München', 'Atlético Madrid', 'Tottenham Hotspur', 'Paris Saint-Germain', 'Barcelona', 'Real Madrid', 'Manchester City', 'Liverpool', 'Paris Saint-Germain', 'Manchester United'], 'league': ['Bundesliga', 'La Liga', 'Premier League', 'Ligue 1', 'La Liga', 'La Liga', 'Premier League', 'Premier League', 'Ligue 1', 'Premier League']}
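If the list is known to hold complete player/team/league triples, the same split could probably be written with extended slicing instead of a loop. The sketch below assumes exactly that layout and uses a shortened stand-in for data['players']; the `keys` tuple is an illustrative name.

```python
# Shortened stand-in for data['players'] (assumption: complete groups of three).
players = ['NEUER, Manuel', 'Bayern München', 'Bundesliga',
           'OBLAK, Jan', 'Atlético Madrid', 'La Liga']

keys = ('player', 'team', 'league')

# players[offset::3] takes every third entry starting at the given offset,
# so offset 0 yields the players, 1 the teams and 2 the leagues.
out2 = {key: players[offset::3] for offset, key in enumerate(keys)}

print(out2['player'])  # ['NEUER, Manuel', 'OBLAK, Jan']
print(out2['team'])    # ['Bayern München', 'Atlético Madrid']
print(out2['league'])  # ['Bundesliga', 'La Liga']
```

Unlike the zip-based loop, slicing gives lists of unequal length if the input is not a multiple of three, so it is worth checking len(players) % 3 == 0 before building a DataFrame from the result.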