List to Dictionary

Basically, I have a list like this:

['  ROOT S . ', '  ROOT S ! ', '  ROOT is it true that S ? ', ' ', '  S   NP VP ', '  VP  Verb NP ', '  NP DT Noun ', '  NP NP PP ', '  PP Prep NP ', '  Noun Adj Noun ', ' ', '  Verb ate ', '  Verb wanted ', '  Verb kissed ', '  Verb understood ', '  Verb pickled ', ' ', '  DT the ', '  DT a ', '  DT  every ', ' ', '  Noun president ', '  Noun sandwich ', '  Noun pickle ', '  Noun chief of staff ', '  Noun floor ', ' ', '  Adj fine ', '  Adj delicious ', '  Adj perplexed ', '  Adj pickled ', ' ', '  Prep    with ', '  Prep on ', '  Prep under ', '  Prep    in '] 

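I want to split the first word of each entry from the rest of the entry and put the two into a dictionary. For example, the first item in the list would produce this dictionary entry: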

Key=ROOT
Value=S .

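If there are several entries with the same key, I'd like their values separated by |. This is what I'd like the dictionary to look like: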

ROOT = 'S . | S ! | is it true that S ?',
S  = 'NP VP',
VP = 'Verb NP',
NP = 'DT Noun | NP PP',
PP = 'Prep NP',
Noun = 'Adj Noun | president | sandwich | pickle | chief of staff | floor',
DT = 'the | a | every',
Verb  = 'ate | wanted | kissed | understood | pickled',
Adj = 'fine | delicious | perplexed | pickled',
Prep = 'with | on | under | in'

Is there a way to do this without using external libraries? Thanks.

One useful approach might be:

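   import collections

   # thelist is the list of strings from the question
   dl = collections.defaultdict(list)
   for s in thelist:
       k, _, v = s.strip().partition(' ')
       if k:  # skip blank entries, which would otherwise land under the key ''
           dl[k].append(v.strip())
   d = dict((k, ' | '.join(v)) for k, v in dl.items())

# Plain-dict version; names chosen so the built-ins dict and list are not shadowed
result = {}

for item in thelist:
    split_item = item.strip().split(" ", 1)
    if len(split_item) < 2:
        continue  # blank entries have no second part and would raise IndexError
    key, value = split_item[0], split_item[1].strip()
    if key in result:
        result[key] = result[key] + " | " + value
    else:
        result[key] = value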

Without using any libraries or modules:

x = ['  ROOT S . ', '  ROOT S ! ', '  ROOT is it true that S ? ', ' ', '  S   NP VP ', '  VP  Verb NP ', '  NP DT Noun ', '  NP NP PP ', '  PP Prep NP ', '  Noun Adj Noun ', ' ', '  Verb ate ', '  Verb wanted ', '  Verb kissed ', '  Verb understood ', '  Verb pickled ', ' ', '  DT the ', '  DT a ', '  DT  every ', ' ', '  Noun president ', '  Noun sandwich ', '  Noun pickle ', '  Noun chief of staff ', '  Noun floor ', ' ', '  Adj fine ', '  Adj delicious ', '  Adj perplexed ', '  Adj pickled ', ' ', '  Prep    with ', '  Prep on ', '  Prep under ', '  Prep    in ']

d = {}
for k, v in (s.lstrip().split(' ',1) for s in x if ' ' in s.lstrip()):
    if k in d:
        d[k]+='|' + v
    else:
        d[k]=v

This produces the dictionary:

{'Adj': 'fine |delicious |perplexed |pickled ',
 'DT': 'the |a | every ',
 'NP': 'DT Noun |NP PP ',
 'Noun': 'Adj Noun |president |sandwich |pickle |chief of staff |floor ',
 'PP': 'Prep NP ',
 'Prep': '   with |on |under |   in ',
 'ROOT': 'S . |S ! |is it true that S ? ',
 'S': '  NP VP ',
 'VP': ' Verb NP ',
 'Verb': 'ate |wanted |kissed |understood |pickled '}
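
Note that the values above still carry the extra spaces of the source strings, and the separator is not quite the ' | ' asked for in the question. Below is a minimal sketch of a variant that normalizes both, assuming it is acceptable to split on any run of whitespace. The function name build_grammar and the shortened sample list are illustrative and not part of the answer.

```python
def build_grammar(lines):
    """Group the remainder of each line under its first word, joined by ' | '."""
    grouped = {}
    for line in lines:
        parts = line.split(None, 1)  # split at the first run of whitespace
        if len(parts) < 2:
            continue  # blank or single-word entry: nothing to store
        key, value = parts[0], parts[1].strip()
        grouped.setdefault(key, []).append(value)
    return {key: ' | '.join(values) for key, values in grouped.items()}


sample = ['  ROOT S . ', '  ROOT S ! ', ' ', '  S   NP VP ', '  Prep    with ', '  Prep on ']
print(build_grammar(sample))
# {'ROOT': 'S . | S !', 'S': 'NP VP', 'Prep': 'with | on'}
```

Collecting the values in lists and joining once at the end avoids special-casing the first value for each key.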

How it works

This initializes an empty dictionary:

d = {}

This starts a loop over all the items in the list x:

for k, v in (s.lstrip().split(' ',1) for s in x if ' ' in s.lstrip()):

This generator expression has the form:

(function(s) for s in x if condition(s))

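So it takes each string s from the list x in turn. A string s is ignored unless it passes the condition, which in our case is ' ' in s.lstrip(). This condition simply ensures that there is at least one space after the first word in s. In other words, it drops malformed or empty entries.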
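
To see the filter in isolation, here is a small sketch that applies the same condition to a handful of entries. The strings '' and 'Lonely' are made up for illustration and do not appear in the original list.

```python
samples = ['  ROOT S . ', ' ', '', 'Lonely', '  Verb ate ']

# Keep only strings that still contain a space once leading whitespace is removed.
kept = [s for s in samples if ' ' in s.lstrip()]
print(kept)  # ['  ROOT S . ', '  Verb ate ']
```

The blank entries and the single word with no trailing space are dropped, so the later unpacking into k, v never sees a one-element list.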

The generator yields the key and value via s.lstrip().split(' ',1). This takes the first word of s as the key, and whatever remains after that word becomes the value.
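
As a quick illustration of what that split returns, here is a sketch using two entries from the list. The variable names are illustrative only.

```python
entry = '  Noun chief of staff '

# maxsplit=1 splits only at the first space, so the value keeps its inner spaces.
key, value = entry.lstrip().split(' ', 1)
print(repr(key))    # 'Noun'
print(repr(value))  # 'chief of staff '

# Extra spaces between the key and the value end up at the start of the value.
key2, value2 = '  Prep    with '.lstrip().split(' ', 1)
print(repr(value2))  # '   with '
```

This is why some values in the resulting dictionary begin with spaces and all of them end with one.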

The following adds the items found to the dictionary:

    if k in d:
        d[k]+='|' + v
    else:
        d[k]=v
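
To trace that accumulation step on its own, here is a small sketch that runs it over a few already-split pairs. The list pairs is a hypothetical stand-in for what the generator yields, and the conditional expression is just a compact way to write the same if/else.

```python
pairs = [('DT', 'the '), ('DT', 'a '), ('S', '  NP VP ')]

d = {}
for k, v in pairs:
    # Append to the existing value if the key is present, otherwise start a new one.
    d[k] = d[k] + '|' + v if k in d else v
print(d)  # {'DT': 'the |a ', 'S': '  NP VP '}
```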