Get element text using xml.dom.minidom in Python
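I am trying to get the text of an element using minidom. In the code below I also tried the getText() method suggested here, but I cannot get the desired output. My code is below; I am not getting the text value from the elements I am trying to process.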
import xml.dom.minidom
doc = xml.dom.minidom.parse("DL_INVOICE_DETAIL_TCB.xml")
results = doc.getElementsByTagName("G_TRANSACTIONS")
def getText(nodelist):
    rc = []
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            rc.append(node.data)
    return ''.join(rc)
for result in results:
    for element in result.getElementsByTagName("INVOICE_NUMBER"):
        print(element.nodeType)
        print(element.nodeValue)
Below is a sample of my XML:
<LIST_G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31002</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice</TRANSACTION_CLASS>
</G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>
If you can use ElementTree, the code looks like this:
import xml.etree.ElementTree as ET
xml = '''<LIST_G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31002</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice1</TRANSACTION_CLASS>
</G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31006</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice2</TRANSACTION_CLASS>
</G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>'''
root = ET.fromstring(xml)
invoice_numbers = [entry.text for entry in root.findall('.//INVOICE_NUMBER')]
print(invoice_numbers)
Output
['31002', '31006']
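If you need each invoice number together with its sibling fields, a small variation on the same idea is to walk the G_TRANSACTIONS records and call findtext() on each one. This is only a sketch: the names SAMPLE and records are made up for illustration, and for the real file you would presumably start from ET.parse("DL_INVOICE_DETAIL_TCB.xml").getroot() instead of ET.fromstring().

```python
import xml.etree.ElementTree as ET

# Illustrative sample with the same structure as the XML above.
SAMPLE = """<LIST_G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31002</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice1</TRANSACTION_CLASS>
</G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31006</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice2</TRANSACTION_CLASS>
</G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>"""

root = ET.fromstring(SAMPLE)

# findall("G_TRANSACTIONS") matches direct children of the root element;
# findtext() returns the text of the first matching child, or None if absent.
records = [
    {
        "invoice_number": tx.findtext("INVOICE_NUMBER"),
        "transaction_class": tx.findtext("TRANSACTION_CLASS"),
    }
    for tx in root.findall("G_TRANSACTIONS")
]
print(records)
```

Each entry in records is a plain dict of strings, so a missing child shows up as None rather than raising an exception.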
A minidom-based answer:
from xml.dom import minidom
xml = """\
<LIST_G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31002</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice1</TRANSACTION_CLASS>
</G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31006</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice2</TRANSACTION_CLASS>
</G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>"""
dom = minidom.parseString(xml)
invoice_numbers = [int(x.firstChild.data) for x in dom.getElementsByTagName("INVOICE_NUMBER")]
print(invoice_numbers)
Output
[31002, 31006]
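As for why the code in the question prints None: in the DOM, nodeValue is None for element nodes, and the text lives in the element's child text nodes. The getText() helper from the question works if you pass it element.childNodes. Here is a minimal sketch of that, using an inline string instead of the file; SAMPLE, get_text and invoice_texts are names chosen for this example only.

```python
from xml.dom import minidom

# Illustrative sample mirroring the XML in the question.
SAMPLE = """<LIST_G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31002</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice</TRANSACTION_CLASS>
</G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>"""


def get_text(nodelist):
    """Join the data of every text node found directly in nodelist."""
    return "".join(
        node.data for node in nodelist if node.nodeType == node.TEXT_NODE
    )


doc = minidom.parseString(SAMPLE)
invoice_texts = []
for result in doc.getElementsByTagName("G_TRANSACTIONS"):
    for element in result.getElementsByTagName("INVOICE_NUMBER"):
        # element.nodeValue is None here; the text is held by the child nodes.
        invoice_texts.append(get_text(element.childNodes))
print(invoice_texts)
```

Compared with x.firstChild.data, joining all child text nodes should also cope with an empty element (firstChild is None there) or with text split across several nodes.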