Get Element Text Using xml.dom.minidom in Python

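I am trying to get the text of an element using minidom. In the code below I also tried the getText() method suggested here, but I cannot get the desired output. My code is below. I am not getting the text value from the elements I am trying to process.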

import xml.dom.minidom

doc = xml.dom.minidom.parse("DL_INVOICE_DETAIL_TCB.xml")
results = doc.getElementsByTagName("G_TRANSACTIONS")
def getText(nodelist):
    rc = []
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            rc.append(node.data)
    return ''.join(rc)
for result in results:
    for element in result.getElementsByTagName("INVOICE_NUMBER"):
        print(element.nodeType)
        print(element.nodeValue)

Here is my sample XML:

<LIST_G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31002</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice</TRANSACTION_CLASS>
    </G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>

I am using the following

If you can use ElementTree, the code looks like this:

import xml.etree.ElementTree as ET

xml = '''<LIST_G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31002</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice1</TRANSACTION_CLASS>
    </G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31006</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice2</TRANSACTION_CLASS>
    </G_TRANSACTIONS>    
</LIST_G_TRANSACTIONS>'''

root = ET.fromstring(xml)
invoice_numbers = [entry.text for entry in root.findall('.//INVOICE_NUMBER')]
print(invoice_numbers)

Output

['31002', '31006']
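
If you need the fields grouped per transaction rather than one flat list, a minimal sketch along the same lines could use findtext() on each G_TRANSACTIONS element. The variable names xml_text and records and the dictionary keys are just illustrative choices, not anything from the question.

```python
import xml.etree.ElementTree as ET

xml_text = '''<LIST_G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31002</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice1</TRANSACTION_CLASS>
    </G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31006</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice2</TRANSACTION_CLASS>
    </G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>'''

root = ET.fromstring(xml_text)

# root is LIST_G_TRANSACTIONS, so findall() with a bare tag name
# returns its direct G_TRANSACTIONS children.
records = [
    {
        # findtext() returns the text of the first matching child, or None.
        "invoice_number": tx.findtext("INVOICE_NUMBER"),
        "transaction_class": tx.findtext("TRANSACTION_CLASS"),
    }
    for tx in root.findall("G_TRANSACTIONS")
]
print(records)
```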

A minidom-based answer

from xml.dom import minidom

xml = """\
<LIST_G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31002</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice1</TRANSACTION_CLASS>
    </G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31006</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice2</TRANSACTION_CLASS>
    </G_TRANSACTIONS>    
</LIST_G_TRANSACTIONS>"""

dom = minidom.parseString(xml)
invoice_numbers = [int(x.firstChild.data) for x in dom.getElementsByTagName("INVOICE_NUMBER")]
print(invoice_numbers)

Output

[31002, 31006]
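
As for why the loop in the question prints nothing useful: in minidom, nodeValue is None for an element node, and the text lives in the element's child text nodes. A minimal sketch of the original loop using the getText()-style helper (renamed get_text here) on element.childNodes might look like this. It parses the sample XML from a string instead of the DL_INVOICE_DETAIL_TCB.xml file, and the names SAMPLE and texts are only for illustration.

```python
from xml.dom import minidom

SAMPLE = """<LIST_G_TRANSACTIONS>
    <G_TRANSACTIONS>
        <INVOICE_NUMBER>31002</INVOICE_NUMBER>
        <TRANSACTION_CLASS>Invoice</TRANSACTION_CLASS>
    </G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>"""


def get_text(nodelist):
    """Concatenate the data of all text nodes in nodelist."""
    return "".join(n.data for n in nodelist if n.nodeType == n.TEXT_NODE)


doc = minidom.parseString(SAMPLE)
texts = []
for result in doc.getElementsByTagName("G_TRANSACTIONS"):
    for element in result.getElementsByTagName("INVOICE_NUMBER"):
        # element.nodeValue is None for element nodes, so pass the
        # element's children (its text nodes) to the helper instead.
        texts.append(get_text(element.childNodes))
print(texts)
```

With a file on disk, swapping minidom.parseString(SAMPLE) for minidom.parse(path) should leave the rest unchanged.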