Python / XML: lxml insert not functioning in a loop with deepcopy

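I am trying to use lxml + itertools to insert deep copies of an existing element. No matter what I do, the new element only gets inserted once.

In the example, you can see I try to insert 5 times (in reality this is a variable, but you know, keeping it simple).

My actual XML isn't too complex, but it does have a lot of elements, so I'll provide a smaller sample version.

What am I missing here? After calling root.insert(insertPosition, newWx), do I need to somehow write the update back to the root or tree before continuing the loop?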

from lxml import etree
import copy
import itertools

tree = etree.parse('myfile.xml')
root = tree.getroot()

# Find element to copy 
originalWx = tree.find("Weather")

insertPosition = int(tree.xpath('count(//Weather[last()]/preceding-sibling::*)')+1)
print("Next position for <Weather> is: " + str(insertPosition))

# Create a copy
newWx = copy.deepcopy(originalWx)

for _ in itertools.repeat(None, 5):
    root.insert(insertPosition, newWx)
    insertPosition = insertPosition + 1

Starting XML:

<ProjectDataSet>
  <Project>
    <Id>0.1.2</Id>
    <Project_Name>Weather Stream Sample</Project_Name>
    <End_Time>2021-06-30T13:00:00+10:00</End_Time>
    <Comments>Project Comments</Comments>
  </Project>
  <Thing>
    <Id>2</Id>
    <Project_Id>492</Project_Id>
    <Weather_Id>2</Weather_Id>
    <Merged_By>0</Merged_By>
    <Merged>0001-01-01T00:00:00+10:00</Merged>
    <Comments/>
  </Thing>
  <Detail>
    <Order_Id>1</Order_Id>
    <X>1095935</X>
    <Y>6999365</Y>
  </Detail>
  <Weather>
    <Id>2</Id>
    <Weather_Name>Original Weather</Weather_Name>
    <Comments>Original Weather</Comments>
    <Latitude>-27</Latitude>
    <Longitude>153</Longitude>
  </Weather>
  <Weather_Entry>
    <Weather_Id>2</Weather_Id>
    <Weather_Time>2021-06-29T10:00:00+10:00</Weather_Time>
    <Temperature>28</Temperature>
    <Rel_Humidity>20</Rel_Humidity>
    <Wind_Speed>15.4</Wind_Speed>
  </Weather_Entry>
  <Weather_Entry>
    <Weather_Id>2</Weather_Id>
    <Weather_Time>2021-06-29T11:00:00+10:00</Weather_Time>
    <Temperature>29</Temperature>
    <Rel_Humidity>24</Rel_Humidity>
    <Wind_Speed>12.4</Wind_Speed>
  </Weather_Entry>
  <Setting>
    <stuff>True</stuff>
  </Setting>
  <Setting>
    <stuff2>False</stuff2>
  </Setting>
</ProjectDataSet>

What I get:

<ProjectDataSet>
  <Project>
    <Id>0.1.2</Id>
    <Project_Name>Weather Stream Sample</Project_Name>
    <End_Time>2021-06-30T13:00:00+10:00</End_Time>
    <Comments>Project Comments</Comments>
  </Project>
  <Thing>
    <Id>2</Id>
    <Project_Id>492</Project_Id>
    <Weather_Id>2</Weather_Id>
    <Merged_By>0</Merged_By>
    <Merged>0001-01-01T00:00:00+10:00</Merged>
    <Comments/>
  </Thing>
  <Detail>
    <Order_Id>1</Order_Id>
    <X>1095935</X>
    <Y>6999365</Y>
  </Detail>

*** Added break for clarity ***

  <Weather>
    <Id>2</Id>
    <Weather_Name>Original Weather</Weather_Name>
    <Comments>Original Weather</Comments>
    <Latitude>-27</Latitude>
    <Longitude>153</Longitude>
  </Weather>
  <Weather>
    <Id>2</Id>
    <Weather_Name>Original Weather</Weather_Name>
    <Comments>Original Weather</Comments>
    <Latitude>-27</Latitude>
    <Longitude>153</Longitude>
  </Weather>



  <Weather_Entry>
    <Weather_Id>2</Weather_Id>
    <Weather_Time>2021-06-29T10:00:00+10:00</Weather_Time>
    <Temperature>28</Temperature>
    <Rel_Humidity>20</Rel_Humidity>
    <Wind_Speed>15.4</Wind_Speed>
  </Weather_Entry>
  <Weather_Entry>
    <Weather_Id>2</Weather_Id>
    <Weather_Time>2021-06-29T11:00:00+10:00</Weather_Time>
    <Temperature>29</Temperature>
    <Rel_Humidity>24</Rel_Humidity>
    <Wind_Speed>12.4</Wind_Speed>
  </Weather_Entry>
  <Setting>
    <stuff>True</stuff>
  </Setting>
  <Setting>
    <stuff2>False</stuff2>
  </Setting>
</ProjectDataSet>

What I expected to get:

<ProjectDataSet>
  <Project>
    <Id>0.1.2</Id>
    <Project_Name>Weather Stream Sample</Project_Name>
    <End_Time>2021-06-30T13:00:00+10:00</End_Time>
    <Comments>Project Comments</Comments>
  </Project>
  <Thing>
    <Id>2</Id>
    <Project_Id>492</Project_Id>
    <Weather_Id>2</Weather_Id>
    <Merged_By>0</Merged_By>
    <Merged>0001-01-01T00:00:00+10:00</Merged>
    <Comments/>
  </Thing>
  <Detail>
    <Order_Id>1</Order_Id>
    <X>1095935</X>
    <Y>6999365</Y>
  </Detail>

*** Added break for clarity ***

  <Weather>
    <Id>2</Id>
    <Weather_Name>Original Weather</Weather_Name>
    <Comments>Original Weather</Comments>
    <Latitude>-27</Latitude>
    <Longitude>153</Longitude>
  </Weather>
  <Weather>
    <Id>2</Id>
    <Weather_Name>Original Weather</Weather_Name>
    <Comments>Original Weather</Comments>
    <Latitude>-27</Latitude>
    <Longitude>153</Longitude>
  </Weather>
  <Weather>
    <Id>2</Id>
    <Weather_Name>Original Weather</Weather_Name>
    <Comments>Original Weather</Comments>
    <Latitude>-27</Latitude>
    <Longitude>153</Longitude>
  </Weather>
  <Weather>
    <Id>2</Id>
    <Weather_Name>Original Weather</Weather_Name>
    <Comments>Original Weather</Comments>
    <Latitude>-27</Latitude>
    <Longitude>153</Longitude>
  </Weather>
  <Weather>
    <Id>2</Id>
    <Weather_Name>Original Weather</Weather_Name>
    <Comments>Original Weather</Comments>
    <Latitude>-27</Latitude>
    <Longitude>153</Longitude>
  </Weather>
  <Weather>
    <Id>2</Id>
    <Weather_Name>Original Weather</Weather_Name>
    <Comments>Original Weather</Comments>
    <Latitude>-27</Latitude>
    <Longitude>153</Longitude>
  </Weather>



  <Weather_Entry>
    <Weather_Id>2</Weather_Id>
    <Weather_Time>2021-06-29T10:00:00+10:00</Weather_Time>
    <Temperature>28</Temperature>
    <Rel_Humidity>20</Rel_Humidity>
    <Wind_Speed>15.4</Wind_Speed>
  </Weather_Entry>
  <Weather_Entry>
    <Weather_Id>2</Weather_Id>
    <Weather_Time>2021-06-29T11:00:00+10:00</Weather_Time>
    <Temperature>29</Temperature>
    <Rel_Humidity>24</Rel_Humidity>
    <Wind_Speed>12.4</Wind_Speed>
  </Weather_Entry>
  <Setting>
    <stuff>True</stuff>
  </Setting>
  <Setting>
    <stuff2>False</stuff2>
  </Setting>
</ProjectDataSet>

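If you want to add five copies of originalWx, you need to make five copies. You make only one copy, and the same subtree cannot be added to the tree more than once.

So remove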

newWx = copy.deepcopy(originalWx)

and change the insert inside the loop to

root.insert(insertPosition, copy.deepcopy(originalWx))

That way each iteration of the loop inserts a fresh copy.
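Putting the two changes together, a minimal sketch might look like the following. The inline XML is a trimmed-down, hypothetical stand-in for the real file, and the insert position is computed with root.index() rather than the XPath count from the question; both are assumptions made to keep the example self-contained.

```python
import copy
from lxml import etree

# Hypothetical, trimmed-down stand-in for myfile.xml.
xml = b"""<ProjectDataSet>
  <Detail><Order_Id>1</Order_Id></Detail>
  <Weather><Id>2</Id></Weather>
  <Weather_Entry><Weather_Id>2</Weather_Id></Weather_Entry>
</ProjectDataSet>"""

root = etree.fromstring(xml)
original_wx = root.find("Weather")

# Index just after the last <Weather> child of the root.
insert_position = root.index(root.findall("Weather")[-1]) + 1

for _ in range(5):
    # Make a fresh copy on every pass; re-inserting the same element
    # would only move it to the new position.
    root.insert(insert_position, copy.deepcopy(original_wx))
    insert_position += 1

weather_count = len(root.findall("Weather"))
child_tags = [child.tag for child in root]
print(weather_count)
print(child_tags)
```

With this sample input the root should end up with the original Weather element followed by five copies, all placed before Weather_Entry.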

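In this respect, the lxml.etree API behaves much like the JavaScript DOM. Every element in an XML tree has a single parent (which you can look up with the getparent() method), which means that when you insert an element as a child of some node, it can no longer be a child of any other node. In other words, it gets moved. This is noted in the "Compatibility with ElementTree" section of the lxml.etree documentation. (It is the fourth bullet point; I couldn't find a more precise link.)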
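The move behaviour can be observed directly. This is a small sketch on a made-up three-element tree (the element names are purely illustrative), showing that appending an element that already has a parent relocates it instead of duplicating it:

```python
from lxml import etree

# Illustrative tree: <root> with two empty children, <a/> and <b/>.
root = etree.fromstring("<root><a/><b/></root>")
a, b = root

child = etree.SubElement(a, "child")
parent_before = child.getparent().tag

# Appending an element that already lives in the tree moves it.
b.append(child)
parent_after = child.getparent().tag

a_children = len(a)
b_children = len(b)
print(parent_before, parent_after, a_children, b_children)
```

After the append, child reports b as its parent and a is left with no children, which is the same reason the original loop only ever showed one extra Weather element.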