Import XML into Orange3 using a Python script
I have an XML document on my computer that looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<IPDatas xmlns:xsi="http://www.w3.org/...">
  <datas>
    <dna>
      <profile>
        <loci>
          <locus name="one">
            <allele order="1">10</allele>
            <allele order="2">12.3</allele>
          </locus>
          <locus name="two">
            <allele order="1">11.1</allele>
            <allele order="2">17</allele>
          </locus>
          <locus name="three">
            <allele order="1">13.2</allele>
            <allele order="2">12.3</allele>
          </locus>
        </loci>
      </profile>
    </dna>
  </datas>
</IPDatas>
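I want to import the document into Orange without first converting it outside Orange, so I probably need to use the "Python Script" widget. After importing, I want to turn it into a table like this: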
one_1 | one_2 | two_1 | two_2 | three_1 | three_2 |
---|---|---|---|---|---|
10 | 12.3 | 11.1 | 17 | 13.2 | 12.3 |
I don't know much Python, so any suggestions would be greatly appreciated!
Something like the following:
import xml.etree.ElementTree as ET
import pprint

xml = '''
<IPDatas xmlns:xsi="http://www.w3.org/...">
  <datas>
    <dna>
      <profile>
        <loci>
          <locus name="one">
            <allele order="1">10</allele>
            <allele order="2">12.3</allele>
          </locus>
          <locus name="two">
            <allele order="1">11.1</allele>
            <allele order="2">17</allele>
          </locus>
          <locus name="three">
            <allele order="1">13.2</allele>
            <allele order="2">12.3</allele>
          </locus>
        </loci>
      </profile>
    </dna>
  </datas>
</IPDatas> '''

data = {}
root = ET.fromstring(xml)

# Find every <locus> element, at any depth below the root.
locus_lst = root.findall('.//locus')
for locus in locus_lst:
    name = locus.attrib['name']
    allele_lst = locus.findall('allele')
    for allele in allele_lst:
        # Column name is "<locus name>_<allele order>", e.g. "one_1".
        final_name = f"{name}_{allele.attrib['order']}"
        value = float(allele.text)
        data[final_name] = value

pprint.pprint(data)
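Output (a dictionary that you should be able to use with Orange; `pprint` sorts the keys alphabetically, which is why `three_*` is listed before `two_*`):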
{'one_1': 10.0,
'one_2': 12.3,
'three_1': 13.2,
'three_2': 12.3,
'two_1': 11.1,
'two_2': 17.0}
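Since the question asks for a one-row table inside Orange rather than a dictionary, here is a minimal sketch of how the same parsing could feed the Python Script widget. It is untested against your real file and makes a few assumptions: the helper names `parse_loci` and `to_orange_table` are made up for illustration, the file path in the comments is hypothetical, the elements carry no default namespace (as in the sample), and every allele value can be converted to a float. `out_data` is the variable the Python Script widget sends to its data output.

```python
import xml.etree.ElementTree as ET


def parse_loci(source):
    """Return (column_names, values) for every <allele>, in document order.

    `source` is a file path or an open file object, as accepted by ET.parse.
    """
    root = ET.parse(source).getroot()
    names, values = [], []
    for locus in root.iter("locus"):
        for allele in locus.findall("allele"):
            # Column name is "<locus name>_<allele order>", e.g. "one_1".
            names.append(f"{locus.get('name')}_{allele.get('order')}")
            values.append(float(allele.text))
    return names, values


def to_orange_table(names, values):
    """Build a one-row Orange table with one numeric column per allele."""
    # Imported here so the parsing step above also runs without Orange.
    import numpy as np
    from Orange.data import ContinuousVariable, Domain, Table

    domain = Domain([ContinuousVariable(name) for name in names])
    return Table.from_numpy(domain, np.array([values], dtype=float))


# Inside the Python Script widget (hypothetical path, adjust to your file):
# names, values = parse_loci("C:/data/profile.xml")
# out_data = to_orange_table(names, values)
```

Keeping two parallel lists instead of a dictionary preserves the column order from the file (one, two, three), and connecting the widget's output to a Data Table widget should then show the single row.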