Compare XML files using Python
I want to compare these two XML files:
File1.xml:
<ngs_sample id="40332">
  <workflow value="salmonella" version="101_provisional" />
  <results>
    <gastro_prelim_st reason="not novel" success="false">
      <type st="1364" />
      <type st="9999" />
    </gastro_prelim_st>
  </results>
</ngs_sample>
File2.xml:
<ngs_sample id="40332">
  <workflow value="salmonella" version="101_provisional" />
  <results>
    <gastro_prelim_st reason="not novel" success="false">
      <type st="1364" />
    </gastro_prelim_st>
  </results>
</ngs_sample>
I used xmldiff to compare a.xml and b.xml:
from xmldiff import main, formatting

def compare_xmls(observed, expected):
    # DiffFormatter produces a compact, text-based edit script
    formatter = formatting.DiffFormatter()
    diff = main.diff_files(observed, expected, formatter=formatter)
    return diff

# File names must be passed as strings
out = compare_xmls("a.xml", "b.xml")
print(out)
Output:
[delete, /ngs_sample/results/gastro_prelim_st/type[2]]
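Does anyone know how to identify the actual difference between the two XML files, i.e. which content was deleted compared with b.xml? Can anyone recommend another way to compare XML files in Python?
Use xmldiff for this task.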
main.py
from xmldiff import main
diff = main.diff_files("file1.xml", "file2.xml")
print(diff)
Output:
[DeleteNode(node='/ngs_sample/results/gastro_prelim_st/type[2]')]
You can switch to XMLFormatter and filter the results manually:
...
# Change formatter:
formatter = formatting.XMLFormatter(normalize=formatting.WS_BOTH)
...
# after `out` has been retrieved:
import re
for i in out.splitlines():
    # keep only the lines that carry a diff:* annotation
    if re.search(r'\bdiff:\w+', i):
        print(i)
# Result:
# <type st="9999" diff:delete=""/>
Another option is to use xml2 (https://github.com/clone/xml2) together with something like bash process substitution:
$ diff --color <(xml2 < File1.xml) <(xml2 < File2.xml)
7,8d6
< /ngs_sample/results/gastro_prelim_st/type
< /ngs_sample/results/gastro_prelim_st/type/@st=9999
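The same flatten-then-diff idea can be approximated in pure Python with the standard library, which avoids installing xml2. This is only a sketch: the function names `flatten` and `removed_lines` are hypothetical, and the line format is similar in spirit to xml2 output, not identical to it.

```python
import difflib
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Return one line per element, attribute, and non-blank text node."""
    lines = []

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        lines.append(here)
        for name, value in elem.attrib.items():
            lines.append(f"{here}/@{name}={value}")
        text = (elem.text or "").strip()
        if text:
            lines.append(f"{here}={text}")
        for child in elem:
            walk(child, here)

    walk(ET.fromstring(xml_text), "")
    return lines


def removed_lines(left_text, right_text):
    """Return the flattened lines present in left_text but not in right_text."""
    diff = difflib.ndiff(flatten(left_text), flatten(right_text))
    # ndiff prefixes lines that exist only in the first sequence with "- "
    return [line[2:] for line in diff if line.startswith("- ")]
```

Calling `removed_lines` with the contents of File1.xml and File2.xml lists the flattened paths that exist only in the first file, much like the `<` lines in the diff output above.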