Python / XML: lxml insert not functioning in a loop with deepcopy
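I'm trying to use lxml + itertools to insert deep copies of an existing element. No matter what I do, I can only get the new element inserted once.
In the example you can see I try to insert 5 times (in reality this is a variable, but you know, keeping it simple).
My actual XML isn't overly complex, but it does have a lot of elements, so I'll provide a smaller sample version.
What am I missing here? After calling root.insert(insertPosition, newWx), do I need to somehow write the update back to the root or tree before continuing the loop?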
from lxml import etree
import copy
import itertools
tree = etree.parse('myfile.xml')
root = tree.getroot()
# Find element to copy
originalWx = tree.find("Weather")
insertPosition = int(tree.xpath('count(//Weather[last()]/preceding-sibling::*)')+1)
print("Next position for <Weather> is: " + str(insertPosition))
# Create a copy
newWx = copy.deepcopy(originalWx)
for _ in itertools.repeat(None, 5):
    root.insert(insertPosition, newWx)
    insertPosition = insertPosition + 1
Starting XML:
<ProjectDataSet>
<Project>
<Id>0.1.2</Id>
<Project_Name>Weather Stream Sample</Project_Name>
<End_Time>2021-06-30T13:00:00+10:00</End_Time>
<Comments>Project Comments</Comments>
</Project>
<Thing>
<Id>2</Id>
<Project_Id>492</Project_Id>
<Weather_Id>2</Weather_Id>
<Merged_By>0</Merged_By>
<Merged>0001-01-01T00:00:00+10:00</Merged>
<Comments/>
</Thing>
<Detail>
<Order_Id>1</Order_Id>
<X>1095935</X>
<Y>6999365</Y>
</Detail>
<Weather>
<Id>2</Id>
<Weather_Name>Original Weather</Weather_Name>
<Comments>Original Weather</Comments>
<Latitude>-27</Latitude>
<Longitude>153</Longitude>
</Weather>
<Weather_Entry>
<Weather_Id>2</Weather_Id>
<Weather_Time>2021-06-29T10:00:00+10:00</Weather_Time>
<Temperature>28</Temperature>
<Rel_Humidity>20</Rel_Humidity>
<Wind_Speed>15.4</Wind_Speed>
</Weather_Entry>
<Weather_Entry>
<Weather_Id>2</Weather_Id>
<Weather_Time>2021-06-29T11:00:00+10:00</Weather_Time>
<Temperature>29</Temperature>
<Rel_Humidity>24</Rel_Humidity>
<Wind_Speed>12.4</Wind_Speed>
</Weather_Entry>
<Setting>
<stuff>True</stuff>
</Setting>
<Setting>
<stuff2>False</stuff2>
</Setting>
</ProjectDataSet>
What I get:
<ProjectDataSet>
<Project>
<Id>0.1.2</Id>
<Project_Name>Weather Stream Sample</Project_Name>
<End_Time>2021-06-30T13:00:00+10:00</End_Time>
<Comments>Project Comments</Comments>
</Project>
<Thing>
<Id>2</Id>
<Project_Id>492</Project_Id>
<Weather_Id>2</Weather_Id>
<Merged_By>0</Merged_By>
<Merged>0001-01-01T00:00:00+10:00</Merged>
<Comments/>
</Thing>
<Detail>
<Order_Id>1</Order_Id>
<X>1095935</X>
<Y>6999365</Y>
</Detail>
*** Added break for clarity ***
<Weather>
<Id>2</Id>
<Weather_Name>Original Weather</Weather_Name>
<Comments>Original Weather</Comments>
<Latitude>-27</Latitude>
<Longitude>153</Longitude>
</Weather>
<Weather>
<Id>2</Id>
<Weather_Name>Original Weather</Weather_Name>
<Comments>Original Weather</Comments>
<Latitude>-27</Latitude>
<Longitude>153</Longitude>
</Weather>
<Weather_Entry>
<Weather_Id>2</Weather_Id>
<Weather_Time>2021-06-29T10:00:00+10:00</Weather_Time>
<Temperature>28</Temperature>
<Rel_Humidity>20</Rel_Humidity>
<Wind_Speed>15.4</Wind_Speed>
</Weather_Entry>
<Weather_Entry>
<Weather_Id>2</Weather_Id>
<Weather_Time>2021-06-29T11:00:00+10:00</Weather_Time>
<Temperature>29</Temperature>
<Rel_Humidity>24</Rel_Humidity>
<Wind_Speed>12.4</Wind_Speed>
</Weather_Entry>
<Setting>
<stuff>True</stuff>
</Setting>
<Setting>
<stuff2>False</stuff2>
</Setting>
</ProjectDataSet>
What I expected:
<ProjectDataSet>
<Project>
<Id>0.1.2</Id>
<Project_Name>Weather Stream Sample</Project_Name>
<End_Time>2021-06-30T13:00:00+10:00</End_Time>
<Comments>Project Comments</Comments>
</Project>
<Thing>
<Id>2</Id>
<Project_Id>492</Project_Id>
<Weather_Id>2</Weather_Id>
<Merged_By>0</Merged_By>
<Merged>0001-01-01T00:00:00+10:00</Merged>
<Comments/>
</Thing>
<Detail>
<Order_Id>1</Order_Id>
<X>1095935</X>
<Y>6999365</Y>
</Detail>
*** Added break for clarity ***
<Weather>
<Id>2</Id>
<Weather_Name>Original Weather</Weather_Name>
<Comments>Original Weather</Comments>
<Latitude>-27</Latitude>
<Longitude>153</Longitude>
</Weather>
<Weather>
<Id>2</Id>
<Weather_Name>Original Weather</Weather_Name>
<Comments>Original Weather</Comments>
<Latitude>-27</Latitude>
<Longitude>153</Longitude>
</Weather>
<Weather>
<Id>2</Id>
<Weather_Name>Original Weather</Weather_Name>
<Comments>Original Weather</Comments>
<Latitude>-27</Latitude>
<Longitude>153</Longitude>
</Weather>
<Weather>
<Id>2</Id>
<Weather_Name>Original Weather</Weather_Name>
<Comments>Original Weather</Comments>
<Latitude>-27</Latitude>
<Longitude>153</Longitude>
</Weather>
<Weather>
<Id>2</Id>
<Weather_Name>Original Weather</Weather_Name>
<Comments>Original Weather</Comments>
<Latitude>-27</Latitude>
<Longitude>153</Longitude>
</Weather>
<Weather>
<Id>2</Id>
<Weather_Name>Original Weather</Weather_Name>
<Comments>Original Weather</Comments>
<Latitude>-27</Latitude>
<Longitude>153</Longitude>
</Weather>
<Weather_Entry>
<Weather_Id>2</Weather_Id>
<Weather_Time>2021-06-29T10:00:00+10:00</Weather_Time>
<Temperature>28</Temperature>
<Rel_Humidity>20</Rel_Humidity>
<Wind_Speed>15.4</Wind_Speed>
</Weather_Entry>
<Weather_Entry>
<Weather_Id>2</Weather_Id>
<Weather_Time>2021-06-29T11:00:00+10:00</Weather_Time>
<Temperature>29</Temperature>
<Rel_Humidity>24</Rel_Humidity>
<Wind_Speed>12.4</Wind_Speed>
</Weather_Entry>
<Setting>
<stuff>True</stuff>
</Setting>
<Setting>
<stuff2>False</stuff2>
</Setting>
</ProjectDataSet>
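If you want to add five copies of originalWx, you need to make five copies. You only make one copy, and the same subtree cannot be added to the tree more than once.
So remove
newWx = copy.deepcopy(originalWx)
and change the loop body to
root.insert(insertPosition, copy.deepcopy(originalWx))
so that each iteration inserts a fresh copy.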
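Put together, the corrected loop might look like the sketch below. It is a minimal illustration rather than a drop-in replacement: the inline XML is a trimmed stand-in for the question's file, and `root.index(...) + 1` is used here as an assumed simpler substitute for the XPath `count(...)` expression in the question.

```python
import copy
from lxml import etree

# Trimmed stand-in for the question's document (assumption: shortened for brevity).
XML = b"""<ProjectDataSet>
  <Detail><Order_Id>1</Order_Id></Detail>
  <Weather><Id>2</Id><Weather_Name>Original Weather</Weather_Name></Weather>
  <Weather_Entry><Weather_Id>2</Weather_Id></Weather_Entry>
</ProjectDataSet>"""

root = etree.fromstring(XML)
original_wx = root.find("Weather")

# Position just after the existing <Weather> element.
insert_position = root.index(original_wx) + 1

for _ in range(5):
    # A fresh deep copy per iteration, so every insert adds a new element
    # instead of moving the previous one.
    root.insert(insert_position, copy.deepcopy(original_wx))
    insert_position += 1

tags = [child.tag for child in root]
print(tags)
```

With this input the root ends up with the original `<Weather>` plus five copies, all placed before the first `<Weather_Entry>`.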
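In this respect, the lxml.etree API is very similar to the JavaScript DOM implementation. Every element in the XML tree has exactly one parent (which you can discover with the getparent() method), which means that when you insert an element as a child of some node in the tree, it can no longer be a child of another node. In other words, it gets moved. This is noted in the "compatibility with ElementTree" section of the lxml.etree documentation. (It's the fourth bullet point. I couldn't find a more precise link.)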
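The move behaviour can be observed directly. The following is a small sketch using made-up element names (`root`, `a`, `b`, `item`, `w` are purely illustrative): it first appends an element to a second parent, then appends one copied element three times, mirroring the loop in the question.

```python
import copy
from lxml import etree

root = etree.fromstring("<root><a><item/></a><b/></root>")
a, b = root          # the two children of <root>
item = a[0]

parent_before = item.getparent().tag
b.append(item)       # item already has a parent, so it is moved, not duplicated
parent_after = item.getparent().tag
print(parent_before, parent_after, len(a), len(b))

# Same effect as in the question: one copy appended three times.
doc = etree.fromstring("<root><w/></root>")
single_copy = copy.deepcopy(doc[0])
for _ in range(3):
    doc.append(single_copy)   # the 2nd and 3rd calls just re-place the same element
print(len(doc))
```

After the loop `doc` holds only two `<w>` elements: the original and the single copy, however many times that copy is appended.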