Get all elements of an element
I have this XML:
<?xml version="1.0" encoding="ISO-8859-1"?>
<root>
<record ID="#046CE9401D01467B2BDBAF0" NumDoc="1461">
<NAME>
<P>Pedrito De Rosa</P>
<P>NIE X1111222233</P>
<P>tf 2283396922</P>
<P>efael@hostmailer.com</P>
</NAME>
<ADDRESS>
<P>Paseo Jauregizahar 234 - 1. A. Donostia </P>
</ADDRESS>
<SUBJECT>
<P>paisaje y ciudad </P>
</SUBJECT>
<QUERYS>
<P>2014-12-10 Avance Normas Subsidiarias</P>
<P>Otras consultas</P>
</QUERYS>
</record>
</root>
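I am trying to read this XML and insert the values into a MySQL table (name, address, subject, querys). The problem appears when I try to read, for example, the NAME field like this: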
from lxml import etree as ET
tree = ET.parse('data/data.xml')
root = tree.getroot()
records = tree.findall('record')
for i, record in enumerate(records):
    myname = record.find("NAME/P")
    print(myname.text)
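The output of this code is "Pedrito De Rosa" rather than the whole content. I mean, it should get every P element inside the NAME tag, otherwise we lose data...
How can I get all the data in an element? I tried record.findAll("NAME/P"), but there is no findAll method.
Any help or clue?
In case anyone can help, I created a pyfiddle:
https://pyfiddle.io/fiddle/9ed9743d-4d6e-4400-bfb5-19ba2bbf65f7/?i=true
Thanks in advance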
from lxml import etree as ET
tree = ET.parse('data.xml')
root = tree.getroot()
records = tree.findall('record')
for i, record in enumerate(records):
    # find() returns only the first match; findall() returns every matching element
    myname = record.findall("NAME/P")
    for item in myname:
        print(item.text)
Output:
Pedrito De Rosa
NIE X1111222233
tf 2283396922
efael@hostmailer.com
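To make the difference between find and findall concrete, here is a minimal self-contained sketch. It assumes the standard library xml.etree.ElementTree instead of lxml (both offer find and findall on elements) and a shortened inline copy of the XML instead of data.xml.

```python
import xml.etree.ElementTree as ET

# Shortened inline copy of the question's XML (assumption: no file on disk).
XML = """<root>
  <record ID="#046CE9401D01467B2BDBAF0" NumDoc="1461">
    <NAME>
      <P>Pedrito De Rosa</P>
      <P>NIE X1111222233</P>
    </NAME>
  </record>
</root>"""

root = ET.fromstring(XML)
record = root.find("record")

first = record.find("NAME/P")      # a single element: the first match only
all_p = record.findall("NAME/P")   # a list with every matching element

first_text = first.text
all_texts = [p.text for p in all_p]
print(first_text)
print(all_texts)
```

find is convenient when exactly one child is expected; for repeated children such as P, findall is usually the safer choice.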
With the flexible element.xpath function:
...
root = tree.getroot()
records = tree.findall('record')
for i, record in enumerate(records):
    names = record.xpath("NAME/P/text()")
    print(names)
    addresses = record.xpath("ADDRESS/P/text()")
    print(addresses)
    subjects = record.xpath("SUBJECT/P/text()")
    print(subjects)
    querys = record.xpath("QUERYS/P/text()")
    print(querys)
Output:
['Pedrito De Rosa', 'NIE X1111222233', 'tf 2283396922', 'efael@hostmailer.com']
['Paseo Jauregizahar 234 - 1. A. Donostia ']
['paisaje y ciudad ']
['2014-12-10 Avance Normas Subsidiarias', 'Otras consultas']
Try this code. I chose a regular expression to get the name from the XML.
Code:
import re
line = "<NAME><P>Pedrito De Rosa</P></NAME>"
matchObj = re.search( r'.*NAME..P.(.*)..P...NAME', line, re.M|re.I)
if matchObj:
    print("Name : ", matchObj.group(1))
Output:
Name : Pedrito De Rosa
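The pattern above is greedy and is shown on a single-line string with one P element. As a rough sketch of how the regex idea might be stretched to several P elements, the snippet below first isolates the NAME block and then collects each P with a non-greedy group. It assumes simple input (no attributes on the tags, no nested markup); an XML parser remains the more robust choice for real data.

```python
import re

# Multi-line sample with two P elements (shortened from the question's XML).
text = """<NAME>
<P>Pedrito De Rosa</P>
<P>NIE X1111222233</P>
</NAME>"""

# re.S lets '.' match newlines; '.*?' is non-greedy so each match stops at the nearest closing tag.
block = re.search(r"<NAME>(.*?)</NAME>", text, re.S)
names = re.findall(r"<P>(.*?)</P>", block.group(1), re.S) if block else []
print(names)
```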
See below:
import xml.etree.ElementTree as ET
elements = ['NAME','ADDRESS','SUBJECT','QUERYS']
data = {}
xml = '''<?xml version="1.0" encoding="ISO-8859-1"?>
<root>
<record ID="#046CE9401D01467B2BDBAF0" NumDoc="1461">
<NAME>
<P>Pedrito De Rosa</P>
<P>NIE X1111222233</P>
<P>tf 2283396922</P>
<P>efael@hostmailer.com</P>
</NAME>
<ADDRESS>
<P>Paseo Jauregizahar 234 - 1. A. Donostia </P>
</ADDRESS>
<SUBJECT>
<P>paisaje y ciudad </P>
</SUBJECT>
<QUERYS>
<P>2014-12-10 Avance Normas Subsidiarias</P>
<P>Otras consultas</P>
</QUERYS>
</record>
</root>'''
root = ET.fromstring(xml)
for e in elements:
    # Element.getchildren() was removed in Python 3.9; list(element) yields the same children
    lst = list(root.find('.//record/{}'.format(e)))
    data[e] = [x.text for x in lst]
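The loop above fills data from the first record only. Since the question is about loading rows into MySQL, here is a hedged sketch of one way to turn every record into a tuple, joining the P texts of each field with newlines. The helper record_to_row, the newline separator, and the table and column names in the trailing comment are all assumptions for illustration, not part of the original answers.

```python
import xml.etree.ElementTree as ET

FIELDS = ["NAME", "ADDRESS", "SUBJECT", "QUERYS"]

# Shortened inline copy of the question's XML.
XML = """<root>
  <record ID="#046CE9401D01467B2BDBAF0" NumDoc="1461">
    <NAME><P>Pedrito De Rosa</P><P>NIE X1111222233</P></NAME>
    <ADDRESS><P>Paseo Jauregizahar 234 - 1. A. Donostia </P></ADDRESS>
    <SUBJECT><P>paisaje y ciudad </P></SUBJECT>
    <QUERYS><P>2014-12-10 Avance Normas Subsidiarias</P><P>Otras consultas</P></QUERYS>
  </record>
</root>"""


def record_to_row(record):
    """Hypothetical helper: one tuple per record, P texts of each field joined by newlines."""
    row = []
    for field in FIELDS:
        # (p.text or "") guards against empty <P/> elements, whose text is None
        texts = [(p.text or "").strip() for p in record.findall(field + "/P")]
        row.append("\n".join(texts))
    return tuple(row)


root = ET.fromstring(XML)
rows = [record_to_row(r) for r in root.findall("record")]
print(rows)

# Possible next step with a DB-API driver (table and column names are assumptions):
# cursor.executemany(
#     "INSERT INTO records (name, address, subject, querys) VALUES (%s, %s, %s, %s)",
#     rows,
# )
```

Passing the values as parameters rather than formatting them into the SQL string leaves quoting to the driver.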