New list each iteration for XML file
Example XML file:
<main>
<data>
<some>111</some>
<other>222</other>
<more>333</more>
</data>
<data>
<some>444</some>
<other>555</other>
<more>666</more>
</data>
<data>
<some>777</some>
<other>888</other>
<more>999</more>
</data>
</main>
I want to create one list for the children of each data element. For example:
1 = [111, 222, 333]
2 = [444, 555, 666]
3 = [777, 888, 999]
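I want a loop that walks the whole XML file and, for each data block, creates a new list (without overwriting the lists created earlier) to hold that block's values.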
tree = et.parse(xml_file)
root = tree.getroot()
num = 0
for child in root:
    num = []
    num += 1
    for element in child:
        num.append(element.text)
I know this code doesn't work, but I hope it shows what I'm trying to achieve. I'm not sure how to approach this and am looking for ideas.
You can use BeautifulSoup to parse the XML and store the children of each data block in a dictionary. enumerate can be used to provide the numeric parent keys:
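from bs4 import BeautifulSoup as soup
import re
# The 'xml' parser requires lxml to be installed.
with open('file.xml') as f:
    d = soup(f.read(), 'xml')
# Key each data block by its 1-based position and convert the child text to int.
result = {i: [int(j.text) for j in a.find_all(re.compile('some|other|more'))] for i, a in enumerate(d.find_all('data'), 1)}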
Output:
{1: [111, 222, 333], 2: [444, 555, 666], 3: [777, 888, 999]}
If you don't want to create a dictionary, you can simply use unpacking:
a, b, c = [[int(i.text) for i in block.find_all(re.compile('some|other|more'))] for block in d.find_all('data')]
print(a, b, c, sep='\n')
Output:
[111, 222, 333]
[444, 555, 666]
[777, 888, 999]
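One caveat on the unpacking form: it only works when the number of names on the left matches the number of data blocks exactly. A small sketch of that constraint with plain lists (the rows values are invented for illustration and include a fourth block on purpose; they are not read from the file):

```python
# Stand-in for the parsed result, with a fourth block added on purpose.
rows = [[111, 222, 333], [444, 555, 666], [777, 888, 999], [10, 11, 12]]

try:
    a, b, c = rows  # three names, four lists
except ValueError as exc:
    unpack_error = str(exc)
    print(unpack_error)

# Starred unpacking keeps the first list and gathers the remainder.
first, *rest = rows
print(first)
print(len(rest))
```

If the number of data blocks is not known in advance, the dictionary (or a plain list of lists) is probably the safer choice.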
Here is a version that uses no external libraries:
import xml.etree.ElementTree as ET
xml = '''<main>
<data>
<some>111</some>
<other>222</other>
<more>333</more>
</data>
<data>
<some>444</some>
<other>555</other>
<more>666</more>
</data>
<data>
<some>777</some>
<other>888</other>
<more>999</more>
</data>
</main>'''
root = ET.fromstring(xml)
collected_data = []
for d in root.findall('.//data'):
    collected_data.append([d.find(x).text for x in ['some', 'other', 'more']])
print(collected_data)
# if the output needs to be a dict
collected_data = {idx + 1: entry for idx, entry in enumerate(collected_data)}
print(collected_data)
Output:
[['111', '222', '333'], ['444', '555', '666'], ['777', '888', '999']]
{1: ['111', '222', '333'], 2: ['444', '555', '666'], 3: ['777', '888', '999']}
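For comparison with the loop in the question, here is a minimal sketch that builds a fresh list on every iteration instead of rebinding one name, and that does not hard-code the child tag names. It assumes every child of a data element holds an integer; the names XML_TEXT and lists_by_index are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Inline copy of the sample document so the sketch runs on its own.
XML_TEXT = """<main>
<data><some>111</some><other>222</other><more>333</more></data>
<data><some>444</some><other>555</other><more>666</more></data>
<data><some>777</some><other>888</other><more>999</more></data>
</main>"""


def lists_by_index(xml_text):
    """Map 1, 2, 3, ... to the integer values of each <data> block."""
    root = ET.fromstring(xml_text)
    result = {}
    # enumerate(..., start=1) supplies the numeric key the question asked for.
    for index, block in enumerate(root.findall("data"), start=1):
        # A fresh list per block, so earlier lists are never overwritten.
        result[index] = [int(element.text) for element in block]
    return result


print(lists_by_index(XML_TEXT))
```

The key difference from the question's loop is that the counter and the list are separate things: the counter becomes the dictionary key, and the list is a new object stored under that key.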