Change a list into a dictionary
How can I change a list like this:
[[0, 'Ealing Broadway', 103.89],
[0, 'Notting Hill Gate', 103.89],
[0, 'Mile End', 103.89],
[1, 'Ealing Broadway', 59.089999999999996],
[2, 'Notting Hill Gate', 40.279999999999994],
[3, 'Mile End', 68.86999999999999]]
into a dictionary like this:
{0:{'length':103.89,'interchange':['Ealing Broadway','Notting Hill Gate','Mile End']},
1:{'length':59.089999999999996,'interchange':['Ealing Broadway']},
2:{'length':40.279999999999994,'interchange':['Notting Hill Gate']},
3:{'length':68.86999999999999,'interchange':['Mile End']}}
Thanks.
I am trying to start with:
d2 = defaultdict(list)
for k, v in all_info:
    d2[k].append(v)
with_length=dict((k,list(v)) for k,v in d2.iteritems())
with_length
But it doesn't work, and I'm struggling with where to start.
Here is a concrete example of how you could do this:
l = [[0, 'Ealing Broadway', 103.89],
     [0, 'Notting Hill Gate', 103.89],
     [0, 'Mile End', 103.89],
     [1, 'Ealing Broadway', 59.089999999999996],
     [2, 'Notting Hill Gate', 40.279999999999994],
     [3, 'Mile End', 68.86999999999999]]
d = {}
for pair in l:
    # Create the inner dict the first time a key is seen.
    if pair[0] not in d:
        d[pair[0]] = {'interchange': []}
    # The length is overwritten on every row with the same key.
    d[pair[0]]['length'] = pair[2]
    d[pair[0]]['interchange'].append(pair[1])
This assumes that you want to overwrite d[0]['length'] each time you add an element to d[0].
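The defaultdict attempt in the question fails because every row has three fields, so `for k, v in all_info` cannot unpack it. Below is a minimal sketch of a working variant, assuming Python 3 and that all rows for a route carry the same length. The name `all_info` comes from the question; the factory lambda and the name `result` are my own choices for illustration.

```python
from collections import defaultdict

all_info = [[0, 'Ealing Broadway', 103.89],
            [0, 'Notting Hill Gate', 103.89],
            [0, 'Mile End', 103.89],
            [1, 'Ealing Broadway', 59.089999999999996],
            [2, 'Notting Hill Gate', 40.279999999999994],
            [3, 'Mile End', 68.86999999999999]]

# Each new route starts with no length and an empty interchange list.
result = defaultdict(lambda: {'length': None, 'interchange': []})
for route, interchange, length in all_info:
    result[route]['length'] = length  # overwritten on every row of the route
    result[route]['interchange'].append(interchange)

result = dict(result)  # convert back to a plain dict
print(result)
```

Unpacking the row into three names also makes the loop body easier to read than indexing with pair[0], pair[1] and pair[2].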
An answer similar to Majora's, but using groupby first. There is no error checking, and the list may need to be sorted beforehand.
from itertools import groupby
lst = [[0, 'Ealing Broadway', 103.89],
       [0, 'Notting Hill Gate', 103.89],
       [0, 'Mile End', 103.89],
       [1, 'Ealing Broadway', 59.089999999999996],
       [2, 'Notting Hill Gate', 40.279999999999994],
       [3, 'Mile End', 68.86999999999999]]
new_list = []
for key, group in groupby(lst, lambda x: x[0]):
    new_list.append(list(group))
main_dict = {}
for item in new_list:
    main_dict[item[0][0]] = {'length': item[0][2], 'interchange': [stn[1] for stn in item]}
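To illustrate the remark about sorting: `itertools.groupby` only merges consecutive rows with equal keys, so unsorted input would produce several groups for the same route. The sketch below uses the same rows in a shuffled order (the shuffled order and the name `by_route` are made up for this example) and sorts them by route first.

```python
from itertools import groupby
from operator import itemgetter

# Same rows as above, deliberately shuffled so route 0 is not contiguous.
rows = [[0, 'Ealing Broadway', 103.89],
        [1, 'Ealing Broadway', 59.089999999999996],
        [0, 'Notting Hill Gate', 103.89],
        [3, 'Mile End', 68.86999999999999],
        [0, 'Mile End', 103.89],
        [2, 'Notting Hill Gate', 40.279999999999994]]

# sorted() is stable, so stations keep their relative order within a route.
by_route = sorted(rows, key=itemgetter(0))

main_dict = {}
for route, group in groupby(by_route, key=itemgetter(0)):
    group = list(group)  # materialise the group; it is a one-shot iterator
    main_dict[route] = {'length': group[0][2],
                        'interchange': [row[1] for row in group]}

print(main_dict)
```

Sorting costs O(n log n), so for large unsorted inputs a plain dictionary loop may be the simpler choice.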
# a is the input list of [key, interchange, length] rows
b = {}
for i in a:
    if i[0] in b:  # dict.has_key() was removed in Python 3
        b[i[0]]['interchange'].append(i[1])
    else:
        b[i[0]] = {'length': i[2], 'interchange': [i[1]]}
Here is an approach that takes two passes. Its advantage is that it is easy to understand.
import pprint
if __name__ == '__main__':
    rows = [
        [0, 'Ealing Broadway', 103.89],
        [0, 'Notting Hill Gate', 103.89],
        [0, 'Mile End', 103.89],
        [1, 'Ealing Broadway', 59.089999999999996],
        [2, 'Notting Hill Gate', 40.279999999999994],
        [3, 'Mile End', 68.86999999999999]]
    print('First Pass')
    d = {}
    for key, interchange, length in rows:
        inner_dict = d.setdefault((key, length), {})
        interchanges = inner_dict.setdefault('interchange', [])
        interchanges.append(interchange)
    pprint.pprint(d)
    print('=' * 72)
    print('Second Pass')
    d2 = {}
    for (key, length), v in d.items():
        v['length'] = length
        d2[key] = v
    pprint.pprint(d2)
Output
First Pass
{(0, 103.89): {'interchange': ['Ealing Broadway',
                               'Notting Hill Gate',
                               'Mile End']},
 (1, 59.089999999999996): {'interchange': ['Ealing Broadway']},
 (2, 40.279999999999994): {'interchange': ['Notting Hill Gate']},
 (3, 68.86999999999999): {'interchange': ['Mile End']}}
========================================================================
Second Pass
{0: {'interchange': ['Ealing Broadway', 'Notting Hill Gate', 'Mile End'],
     'length': 103.89},
 1: {'interchange': ['Ealing Broadway'], 'length': 59.089999999999996},
 2: {'interchange': ['Notting Hill Gate'], 'length': 40.279999999999994},
 3: {'interchange': ['Mile End'], 'length': 68.86999999999999}}
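Discussion
- In the first pass, I use the first and last columns as the dictionary key. The value stored under that key is another dictionary (inner_dict).
- In the second pass, I rearrange the keys and values into their final form.
- This solution may not be the most efficient or elegant, but I hope it is easy to understand.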
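One consequence of keying the first pass on (key, length) is worth seeing in code: if the same route ever appears with two different lengths, the first pass creates two entries and the second pass keeps only the later one. The sketch below wraps the two passes in a function to show this; the name `two_pass` and the conflicting rows are hypothetical, made up purely for illustration.

```python
def two_pass(rows):
    # First pass: group interchanges under a (key, length) tuple.
    d = {}
    for key, interchange, length in rows:
        inner_dict = d.setdefault((key, length), {})
        inner_dict.setdefault('interchange', []).append(interchange)
    # Second pass: move the length into the inner dict and key by route alone.
    d2 = {}
    for (key, length), v in d.items():
        v['length'] = length
        d2[key] = v  # a later (key, length) pair replaces an earlier one
    return d2

# Hypothetical input: route 0 appears with two different lengths.
conflicting = [[0, 'Ealing Broadway', 103.89],
               [0, 'Mile End', 99.0]]
result = two_pass(conflicting)
print(result)
```

On Python 3.7 or later, where dictionaries preserve insertion order, only the 'Mile End' entry survives for route 0. With the data in the question this never happens, because each route has a single length.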
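Please consider my answer a demonstration of the approach using Pandas (a powerful Python data analysis toolkit).
If you want to process large amounts of data, I'm pretty sure pandas is fast enough to be the tool for you...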
import pandas as pd
data = [[0, 'Ealing Broadway', 103.89],
[0, 'Notting Hill Gate', 103.89],
[0, 'Mile End', 103.89],
[1, 'Ealing Broadway', 59.089999999999996],
[2, 'Notting Hill Gate', 40.279999999999994],
[3, 'Mile End', 68.86999999999999]
]
# create pandas DF
df = pd.DataFrame(data, columns=['route','interchange','length'])
Original DF:
In [235]: df
Out[235]:
   route        interchange  length
0      0    Ealing Broadway  103.89
1      0  Notting Hill Gate  103.89
2      0           Mile End  103.89
3      1    Ealing Broadway   59.09
4      2  Notting Hill Gate   40.28
5      3           Mile End   68.87
Let's group the data:
In [239]: df.groupby(['route','length'])['interchange'].apply(lambda x: x.tolist()).reset_index()
Out[239]:
   route  length                                     interchange
0      0  103.89  [Ealing Broadway, Notting Hill Gate, Mile End]
1      1   59.09                               [Ealing Broadway]
2      2   40.28                             [Notting Hill Gate]
3      3   68.87                                      [Mile End]
We can also convert it to a list of dictionaries:
In [240]: df.groupby(['route','length'])['interchange'].apply(lambda x: x.tolist()).reset_index().to_dict('records')
Out[240]:
[{'interchange': ['Ealing Broadway', 'Notting Hill Gate', 'Mile End'],
  'length': 103.89,
  'route': 0},
 {'interchange': ['Ealing Broadway'],
  'length': 59.089999999999996,
  'route': 1},
 {'interchange': ['Notting Hill Gate'],
  'length': 40.279999999999994,
  'route': 2},
 {'interchange': ['Mile End'], 'length': 68.86999999999999, 'route': 3}]
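The list of dictionaries above is not quite the shape requested in the question, which is keyed by route. One possible way to get there, sketched under the assumption that each route has exactly one length, is to index the grouped frame by route and call `to_dict('index')`. The names `grouped` and `result` are my own.

```python
import pandas as pd

data = [[0, 'Ealing Broadway', 103.89],
        [0, 'Notting Hill Gate', 103.89],
        [0, 'Mile End', 103.89],
        [1, 'Ealing Broadway', 59.089999999999996],
        [2, 'Notting Hill Gate', 40.279999999999994],
        [3, 'Mile End', 68.86999999999999]]
df = pd.DataFrame(data, columns=['route', 'interchange', 'length'])

# Collect the interchanges of every (route, length) pair into a list.
grouped = (df.groupby(['route', 'length'])['interchange']
             .apply(list)
             .reset_index())

# Use the route as the index so each row becomes the inner dictionary.
result = grouped.set_index('route').to_dict('index')
print(result)
```

If a route could appear with more than one length, the index would contain duplicates and this conversion would no longer be appropriate.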
Timing against a 600,000-row DataFrame on my home notebook:
Setup:
In [245]: a = pd.concat([df] * 10**5)
Shape of the concatenated DF a:
In [246]: a.shape
Out[246]: (600000, 3)
Timing:
In [251]: %timeit a.groupby(['route','length'])['interchange'].apply(lambda x: x.tolist()).reset_index()
10 loops, best of 3: 130 ms per loop
Non-vectorized approach (for loops, list comprehensions, etc.):
In [262]: %paste
def roganjosh(lst):
    new_list = []
    for key, group in groupby(lst, lambda x: x[0]):
        new_list.append(list(group))
    main_dict = {}
    for item in new_list:
        main_dict[item[0][0]] = {'length': item[0][2], 'interchange': [stn[1] for stn in item]}
    return main_dict
## -- End pasted text --
In [263]: lst = a.values.tolist()
In [264]: len(lst)
Out[264]: 600000
In [265]: %timeit roganjosh(lst)
1 loop, best of 3: 650 ms per loop