Remove element from XML file with lxml in Python
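I am trying to remove specific entries from a large XML file.
I find each entry by its text, which comes from a list of text entries that should be deleted.
I run this code: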
#!/usr/bin/env python
from lxml import etree
g = open("/root/simplexml.xml", "rw")
f = etree.parse(g)
listdown = ["http://aiddp.org/administrator/components/com_attachments/controllers/Global%20Service/86af744091ea22ad5b1372ac7978b51f","http://primepromap.com/es/wp-includes/css/survey/survey/index.php?randInboxLightaspxn.17http://primepromap.com/es/wp-includes/css/survey/survey/index.php?randInboxLightaspxn.1774256418http:/peelrealest.com/property/ihttp://www.nwolb.com.default.aspx.refererident.568265843.puntopatrones.cl/wp-admin/js/upgrade/upgrade1.zip-extracted/upgrade/newp/loading.php="]
for downsite in listdown:
    for found in f.xpath(".//url[text()='"+downsite+"']"):
        print "deleted "+str(found)
        found.getparent().remove(found)
print "over"
It should work, but when I open the XML file afterwards, the entries that should have been deleted are still there...
What is wrong here?
The removal only changes the tree in memory. You need to dump the modified tree back to the XML file:
f.write("/root/simplexml.xml")
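For reference, here is a minimal Python 3 sketch of the whole parse, remove, write cycle. The function name remove_urls and its parameters are hypothetical, not part of the original code. It assumes the url elements are never the document root, and it passes the file path to etree.parse directly instead of opening the file with mode "rw". It also compares each element's text in Python rather than building an XPath string, which avoids quoting problems if a URL contains an apostrophe.

```python
from lxml import etree


def remove_urls(xml_path, urls_to_remove):
    """Remove every <url> element whose text is in urls_to_remove, then save.

    Hypothetical helper for illustration; returns the number of removed elements.
    """
    tree = etree.parse(xml_path)
    targets = set(urls_to_remove)
    removed = 0
    # Collect the matches into a list first so that removing
    # elements does not disturb the iteration over the tree.
    for node in list(tree.iter("url")):
        parent = node.getparent()
        if node.text in targets and parent is not None:
            parent.remove(node)
            removed += 1
    # Without this write, the changes exist only in memory.
    tree.write(xml_path, encoding="utf-8", xml_declaration=True)
    return removed
```

Calling remove_urls("/root/simplexml.xml", listdown) would then rewrite the file in place, so it is worth keeping a backup copy while testing.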