How to iterate over an XML file to extract some attributes?
I am trying to extract values from an XML file and save them as a DataFrame. For each line element, I want to add the date from its parent chk element.
<?xml version="1.0" encoding="ISO-8859-1"?>
<sales>
<chk no="xxx" date="xxxx" time="xxx" total="xxxx" debtor="xxxx" name="xxx" cardnumber="xxxxxxx" mobil="" >
<line productId="xxxx" product="xxxx" productGroupId="xxx" productGroup="xxx" amount="x" price="xxx" />
<line productId="xxx" product="xxx" productGroupId="xxx" productGroup="xxx" amount="xx" price="xxxx" />
</chk>
<chk no="xxx" date="xxxx" time="xx" total="xxxx" debtor="xxxx" name="xxxx" cardnumber="xxxx" mobil="xxxxx" >
<line productId="xxxx" product="xxxxx" productGroupId="xxxx" productGroup="xxx" amount="xxxx" price="xxxx" />
<line productId="xxxxx" product="xxxxx" productGroupId="xxxx" productGroup="xxxx" amount="xxx" price="xxxxx" />
</chk>
</sales>
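import xml.etree.ElementTree as ET

# response is assumed to be an HTTP response whose body is the XML above
root = ET.fromstring(response.content)
sales = []
for date in root.iter('chk'):
    sales.append(date.attrib)
lines = []
for line in root.iter('line'):
    lines.append(line.attrib)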
I am able to extract the chk and line elements separately. How can I attach the date to the list of lines?
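Iterate over the line elements inside the chk loop, using the current chk (date) instead of root as the object to iterate over. Something like this: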
root = ET.fromstring(resp)
for date in root.iter('chk'):
    # iterate over date (the current chk), not root, so only its own lines are visited
    for line in date.iter('line'):
        print(date.attrib, line.attrib)
import xml.etree.ElementTree as ET
xml = '''<?xml version="1.0" encoding="ISO-8859-1"?>
<sales>
<chk no="xxx" date="xxxx" time="xxx" total="xxxx" debtor="xxxx" name="xxx" cardnumber="xxxxxxx" mobil="" >
<line productId="xxxx" product="xxxx" productGroupId="xxx" productGroup="xxx" amount="x" price="xxx" />
<line productId="xxx" product="xxx" productGroupId="xxx" productGroup="xxx" amount="xx" price="xxxx" />
</chk>
<chk no="xxx" date="zzzz" time="xx" total="xxxx" debtor="xxxx" name="xxxx" cardnumber="xxxx" mobil="xxxxx" >
<line productId="xxxx" product="xxxxx" productGroupId="xxxx" productGroup="xxx" amount="xxxx" price="xxxx" />
<line productId="xxxxx" product="xxxxx" productGroupId="xxxx" productGroup="xxxx" amount="xxx" price="xxxxx" />
</chk>
</sales>'''
root = ET.fromstring(xml)
for chk in root.findall('.//chk'):
    for line in chk.findall('line'):
        # copy the parent chk's date onto each of its line elements
        line.attrib['date'] = chk.attrib['date']
ET.dump(root)
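Since the question asks for a DataFrame rather than a modified tree, the same nested loop can also collect one dict per line element with the parent's date merged in. The following is only a sketch: the helper name `lines_with_date` and the sample values (dates, product ids, prices) are made up for illustration and do not come from the original data.

```python
import xml.etree.ElementTree as ET


def lines_with_date(xml_text):
    """Return one dict per <line>, with the parent <chk>'s date added."""
    root = ET.fromstring(xml_text)
    rows = []
    for chk in root.iter('chk'):
        for line in chk.iter('line'):
            # copy the attributes so the parsed tree itself is left unchanged
            row = dict(line.attrib)
            row['date'] = chk.get('date')
            rows.append(row)
    return rows


# Hypothetical sample data, shaped like the XML in the question
sample = '''<sales>
<chk no="1" date="2020-01-01">
<line productId="a" price="10" />
<line productId="b" price="20" />
</chk>
<chk no="2" date="2020-01-02">
<line productId="c" price="30" />
</chk>
</sales>'''

rows = lines_with_date(sample)
for row in rows:
    print(row)
```

If pandas is installed, passing the resulting list of dicts to `pandas.DataFrame(rows)` should give one row per line element with a `date` column. Note that all values are still strings at that point, so numeric columns such as price would need converting.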