xml.etree

Question

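I have a file containing raw data in .txt format, and I need to move that data into a more structured .xml document using Python. My source file is about 10,000 lines long, but for convenience I have only included code with three short lists ("ID", "Name" and "Parent ID").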

The raw .txt looks like this:

Ac  Value 1
Ac_05   Value 2 Ac
Ac_05_00    Value 3 Ac_05
Ac_15   Value 4 Ac_05

If an element has a parent ID (the list called pID in my code), it should become a child of the element whose ID equals that parent ID... I hope that makes sense.

This is as far as I have got with the code:

import xml.etree.ElementTree as ET

IDs = ['Ac', 'Ac_05', 'Ac_05_00', 'Ac_15']
Names = ['Value 1', 'Value 2', 'Value 3', 'Value 4']
pID = ['', 'Ac', 'Ac_05', 'Ac']

# make xml file
Items = ET.Element('Items')

for i in range(len(IDs)):

    if pID[i] in IDs:

        # index of the parent ID
        # IDs.index(pID[i])

        # value of the parent ID
        # IDs[IDs.index(pID[i])]
       
        Item = ET.SubElement(Children, 'Item')

        ID = ET.SubElement(Item, 'ID')
        ID.text = IDs[i]

        Name = ET.SubElement(Item, 'Name')
        Name.text = Names[i]

        Children = ET.SubElement(Item, 'Children')

    else:
        Item = ET.SubElement(Items, 'Item')
            
        ID = ET.SubElement(Item, 'ID')
        ID.text = IDs[i]

        Name = ET.SubElement(Item, 'Name')
        Name.text = Names[i]

        Children = ET.SubElement(Item, 'Children')

tree = ET.ElementTree(Items)
ET.indent(tree, space='\t', level=0)
tree.write('filename.xml', encoding='utf-8')

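I don't know how to attach a child to a specific element in the .xml. For example, the last item, with ID "Ac_15", should be a child of "Ac". The correct output in the .xml should look like this: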

<Items>
    <Item>
        <ID>Ac</ID>
        <Name>Value 1</Name>
        <Children>
            <Item>
                <ID>Ac_05</ID>
                <Name>Value 2</Name>
                <Children>
                    <Item>
                        <ID>Ac_05_00</ID>
                        <Name>Value 3</Name>
                        <Children/>
                    </Item>
                </Children>
            </Item>
            <Item>
                <ID>Ac_15</ID>
                <Name>Value 4</Name>
                <Children/>
            </Item>
        </Children>
    </Item>
</Items>

Does anyone have any advice for a Python beginner like me?

Answer 1

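Your if/else has two large branches that are almost identical, so you should focus on what actually differs between them. The only real difference here is the parent element of the item.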

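I used a dictionary that stores the Children tag of every element, keyed by ID, so each item can look up its parent's Children tag by parent ID. That should be fine; at least it works as expected on your example.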

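import xml.etree.ElementTree as ET

# IDs, Names and pID are the lists from the question.
Items = ET.Element('Items')
elems = {}  # maps each ID to that item's <Children> element
for id_, name, pid in zip(IDs, Names, pID):
    el = ET.Element('Item')
    ET.SubElement(el, 'ID').text = id_
    ET.SubElement(el, 'Name').text = name
    elems[id_] = ET.SubElement(el, 'Children')

    # Items without a parent ID go under the root; the rest go under
    # their parent's <Children> element (the parent must already exist).
    if not pid:
        parent = Items
    else:
        parent = elems[pid]
    parent.append(el)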
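
The loop above looks up elems[pid] while it is still filling the dictionary, so it raises a KeyError if a child row comes before its parent. For a 10,000-line file where that order is not guaranteed, a two-pass variant may be safer. The following is only a sketch under assumptions: the columns are assumed to be tab-separated, the sample rows follow the pID list from the question, and the names RAW, parse_rows and build_tree are hypothetical helpers, not part of xml.etree.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample rows. Assumption: columns are tab-separated
# (ID, name, optional parent ID), following the pID list in the question.
RAW = "Ac\tValue 1\nAc_05\tValue 2\tAc\nAc_05_00\tValue 3\tAc_05\nAc_15\tValue 4\tAc\n"


def parse_rows(text):
    """Split each non-empty line into (id, name, parent_id); parent_id is '' when absent."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = [f.strip() for f in line.split("\t")]
        fields += [""] * (3 - len(fields))  # pad a missing parent column
        rows.append(tuple(fields[:3]))
    return rows


def build_tree(rows):
    """Build <Items> in two passes so a child row may appear before its parent."""
    root = ET.Element("Items")
    items = {}     # id -> <Item> element
    children = {}  # id -> that item's <Children> element

    # Pass 1: create every <Item> without attaching it anywhere.
    for id_, name, _ in rows:
        item = ET.Element("Item")
        ET.SubElement(item, "ID").text = id_
        ET.SubElement(item, "Name").text = name
        children[id_] = ET.SubElement(item, "Children")
        items[id_] = item

    # Pass 2: attach each item to its parent's <Children>.
    # Empty or unknown parent IDs fall back to the root (a choice made for this sketch).
    for id_, _, pid in rows:
        children.get(pid, root).append(items[id_])
    return root


root = build_tree(parse_rows(RAW))
ET.indent(root, space="    ")  # ET.indent requires Python 3.9 or later
print(ET.tostring(root, encoding="unicode"))
```

To read the real file instead of the sample string, pass the file's contents to parse_rows, and write the result with ET.ElementTree(root).write('filename.xml', encoding='utf-8') as in the question. Sending rows with an unknown parent ID to the root keeps the sketch short; raising an error there may be preferable if such rows indicate bad data.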

xml.etree - insert element as a child at specific element

python