ValueError: dictionary update sequence element #0 has length 3; 2 is required when attempting to coerce generator function into dictionary

This is the CSV file I am working with:

"A","B","C","D","E","F","G","H","I","J"

"88",18,1,"<Req TID=""34"" ReqType=""MS""><IISO /><CID>2</CID><MemID>0000</MemID><MemPass /><RequestData><S>[REMOVED]</S><Na /><La /><Card>[REMOVED]</Card><Address /><HPhone /><Mail /></ReqData></Req>","<Response T=""3"" RequestType=""MS""><MS><Memb><PrivateMembers /><Ob>0-12-af</Ob><Locator /></Memb><S>[REMOVED]</S><CNum>[REMOVED]</CNum><FName /><LaName /><Address /><HPhone /><Email /><IISO /><MemID /><MemPass /><T /><CID /><T /></MS></Response>",0-JAN-10 12.00.02 AM,27-JUN-15 12.00.00 AM,"26",667,0
"22",22,1,"<Req TID=""45"" ReqType=""MS""><IISO /><CID>4</CID><MemID>0000</MemID><MemPass /><RequestData><S>[REMOVED]</S><Na /><La /><Card>[REMOVED]</Card><Address /><HPhone /><Mail /></ReqData></Req>","<Response T=""10"" RequestType=""MS""><MS><Memb><PrivateMembers /><Ob>0-12-af</Ob><Locator /></Memb><S>[REMOVED]</S><CNum>[REMOVED]</CNum><FName /><LaName /><Address /><HPhone /><Email /><IISO /><MemID /><MemPass /><T /><CID /><T /></MS></Response>",0-JAN-22 12.00.02 AM,27-JUN-22 12.00.00 AM,"26",667,0
"32",22,1,"<Req TID=""15"" ReqType=""MS""><IISO /><CID>45</CID><MemID>0000</MemID><MemPass /><RequestData><S>[REMOVED]</S><Na /><La /><Card>[REMOVED]</Card><Address /><HPhone /><Mail /></ReqData></Req>","<Response T=""10"" RequestType=""MS""><MS><Memb><PrivateMembers /><Ob>0-12-af</Ob><Locator /></Memb><S>[REMOVED]</S><CNum>[REMOVED]</CNum><FName /><LaName /><Address /><HPhone /><Email /><IISO /><MemID /><MemPass /><T /><CID /><T /></MS></Response>",0-JAN-20 12.00.02 AM,27-JUN-34 12.00.00 AM,"26",667,0

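So far, I have written two generator functions to parse the XML data in column E, so that the XML tags and their text are converted into a Python dictionary. Specifically, the flatten_dict() function returns an iterable sequence of (key, value) pairs, which can be converted to a list of pairs with list(flatten_dict(root)).

This produces the following list of tuples: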

[('ResponseRequestType', 'MS'),
 ('ResponseT', '10'),
 ('PrivateMembers', None),
 ('Ob', '0-12-af'),
 ('Locator', None),
 ('S', '[REMOVED]'),
 ('CNum', '[REMOVED]'),
 ('FName', None),
 ('LaName', None),
 ('Address', None),
 ('HPhone', None),
 ('Email', None),
 ('IISO', None),
 ('MemID', None),
 ('MemPass', None),
 ('T', None),
 ('CID', None),
 ('T', None)]

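The implementation block, starting at line 75 (cell In[75]), returns an object whose keys are the CSV file's columns and whose values are lists holding every row's entry for that column.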

A list of pairs L (or any iterable of pairs) can also be transposed with zip(*L).
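As a small illustration of that transposition, here is a sketch using a shortened, made-up list of pairs (the name `pairs` and its contents are only for this example, not my real output):

```python
# A shortened, illustrative list of (key, value) pairs.
pairs = [('Ob', '0-12-af'), ('Locator', None), ('S', '[REMOVED]')]

# zip(*pairs) unpacks the pairs as separate arguments, so zip groups
# all first items together and all second items together.
keys, values = zip(*pairs)

print(keys)
print(values)
```

Here `keys` ends up holding the tag names and `values` the corresponding text, each as a tuple.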

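My error occurs on line 79 (cell In[79]) when I try to construct a dictionary from the generator function. I have reviewed several posts, namely this and this.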

I realize that I need to pass in a sequence of tuples, but why I am receiving this error puzzles me. I am experimenting with Python 3.4 and Jupyter notebooks.

I welcome constructive (emphasis on constructive) feedback.

# In[37]:

import xml.etree.ElementTree as ET


def flatten_list(aList, prefix=''):
    for i, element in enumerate(aList, 1):
        eprefix = "{}{}".format(prefix, i)
        if element:
            # treat like dict 
            if len(element) == 1 or element[0].tag != element[1].tag: 
                yield flatten_dict(element, eprefix)
            # treat like list 
            elif element[0].tag == element[1].tag: 
                yield flatten_list(element, eprefix)
        elif element.text: 
            text = element.text.strip() 
            if text: 
                yield eprefix[:].rstrip('.'), element.text

def flatten_dict(parent_element, prefix=''):
    prefix = prefix + parent_element.tag 
    if parent_element.items():
        for k, v in parent_element.items():
            yield prefix + k, v
    for element in parent_element:
        eprefix = element.tag  
        if element:
            # treat like dict - we assume that if the first two tags 
            # in a series are different, then they are all different. 
            if len(element) == 1 or element[0].tag != element[1].tag: 
                yield flatten_dict(element, prefix=prefix)
            # treat like list - we assume that if the first two tags 
            # in a series are the same, then the rest are the same. 
            else: 
                # here, we put the list in dictionary; the key is the 
                # tag name the list elements all share in common, and 
                # the value is the list itself
                yield flatten_list(element, prefix=eprefix)
            # if the tag has attributes, add those to the dict
            if element.items():
                for k, v in element.items():
                    yield eprefix+k
        # this assumes that if you've got an attribute in a tag, 
        # you won't be having any text. This may or may not be a 
        # good idea -- time will tell. It works for the way we are 
        # currently doing XML configuration files... 
        elif element.items(): 
            for k, v in element.items():
                yield eprefix+k
        # finally, if there are no child tags and no attributes, extract 
        # the text 
        else:
            yield eprefix, element.text     


# In[75]:

from glob import iglob
import csv
from collections import OrderedDict
from xml.etree.ElementTree import ParseError
from parsexml2 import flatten_dict, flatten_list
import xml.etree.cElementTree as ElementTree 
import csv

headers = set()
results = []
with open('s.csv', 'rU') as infile:
    reader = csv.DictReader(infile)        
    data = {}
    for item in reader:
        for header, value in item.items():
            try:
                data[header].append(value)
            except KeyError:
                data[header] = [value]


client_responses = data['E'] #returns a list of values
for client_response in client_responses:
    print('\n' + client_response)
    xml_string = (''.join(client_response)) #may be not necessary
    print(xml_string)
    xml_string = xml_string.replace('&amp;', '')
    xml_string = xml_string.replace('&#x0;','')
    print(xml_string)
    try:
        roots = ElementTree.fromstring(xml_string) #parsing step here
    except ParseError:
        print("catastrophic failure")
        continue

# In[79]:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-79-7053ec7639d9> in <module>()
----> 1 dict(flatten_dict(root))

ValueError: dictionary update sequence element #0 has length 3; 2 is required

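Sometimes you yield eprefix + k (i.e. a single element), and sometimes you yield eprefix, element.text (i.e. two elements).

Possibly the first thing the generator yields is a string of length 3.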
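To make that concrete, here is a minimal sketch that is not taken from the question's code: a hypothetical generator, `bare_strings`, that yields one 3-character string instead of a (key, value) pair. dict() tries to unpack each yielded item into exactly two parts, so a string of any length other than 2 fails.

```python
def bare_strings():
    # Hypothetical generator: yields a lone string, not a (key, value) pair.
    yield 'abc'

try:
    dict(bare_strings())
except ValueError as exc:
    # dict() treats 'abc' as a sequence of 3 items and rejects it.
    message = str(exc)
    print(message)

# A 2-character string is silently accepted, which can hide the bug:
two_chars = dict(['ab'])
print(two_chars)
```

The printed message is the one from the question, and the last line shows that a 2-character string is split into a key and a value rather than raising.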

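There are a few issues in your flatten_dict code.

  1. As @PatrickMaupin pointed out, you sometimes yield a single value, as in yield eprefix+k. If the generator does yield that, dict() will not work on it, and this may be what is causing the problem. I believe what you want is yield eprefix+k, v.

  2. You sometimes yield flatten_dict(element, prefix=prefix) (or the flatten_list() counterpart), which will not work either. Let's take a simple example: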

    >>> def a():
    ...     yield a()
    ...
    >>> for i in a():
    ...     print(i)
    ...
    <generator object a at 0x00593B48>
    

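    As you can see, this yielded the generator object itself; it did not iterate over that generator object and yield the actual results. To do that, you need to iterate over it and yield the results manually. Example: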

        if len(element) == 1 or element[0].tag != element[1].tag: 
            for k,v in flatten_dict(element, prefix=prefix):
                yield k,v
    

    Or, starting with Python 3.3, you can use the yield from construct to yield the values from another iterable. Example:

        if len(element) == 1 or element[0].tag != element[1].tag: 
            yield from flatten_dict(element, prefix=prefix)
    

    The same applies to flatten_list().
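
Putting both fixes together, a simplified sketch of flatten_dict might look like the following. This is not a drop-in replacement: it assumes every element with children can be treated like a dict, so the flatten_list() branch and the extra attribute pass after recursion are left out, and the sample XML string is a shortened, made-up version of the data in the question.

```python
import xml.etree.ElementTree as ET


def flatten_dict(parent_element, prefix=''):
    """Yield (key, value) pairs for attributes and leaf text (simplified sketch)."""
    prefix = prefix + parent_element.tag
    # Attributes of the parent become prefixed keys, always as pairs.
    for k, v in parent_element.items():
        yield prefix + k, v
    for element in parent_element:
        if len(element):
            # Delegate to the nested generator so its pairs are yielded here,
            # instead of yielding the generator object itself.
            yield from flatten_dict(element, prefix=prefix)
        elif element.items():
            # Leaf with attributes: yield key AND value, not just the key.
            for k, v in element.items():
                yield element.tag + k, v
        else:
            # Leaf without children or attributes: use its text.
            yield element.tag, element.text


# Shortened, illustrative sample (an assumption, not the real CSV content).
sample = ('<Response T="10" RequestType="MS"><MS><Memb><Ob>0-12-af</Ob>'
          '<Locator /></Memb><S>x</S></MS></Response>')
result = dict(flatten_dict(ET.fromstring(sample)))
print(result)
```

Note that dict() keeps only the last value for a repeated key, so tags that occur more than once (such as T in the question's data) collapse into a single entry; keep the list of pairs if you need every occurrence.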