Need help replacing text in a sub-element with text from multiple sub-elements in an XML
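I am trying to replace the text of a Value element in an XML tree with text taken from other sub-elements in the same tree. I am new to Python and need some help with how to write this.
A sample of my XML, with some elements omitted for length: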
<SalesOrder>
  <SalesOrderLines>
    <SalesOrderLine>
      <Item>
        <LineNo>1</LineNo>
        <Quantity>4.00</Quantity>
      </Item>
      <ConfigurationDetails>
        <ConfigurationDetail>
          <ConfigurationAttribute>
            <Name>ConfigurationModel</Name>
            <Value>HV</Value>
          </ConfigurationAttribute>
          <ConfigurationAttribute>
            <Name>EXWidth</Name>
            <Value>59.5</Value>
          </ConfigurationAttribute>
          <ConfigurationAttribute>
            <Name>EXHeight</Name>
            <Value>59.5</Value>
          </ConfigurationAttribute>
          <ConfigurationAttribute>
            <Name>Handing</Name>
            <Value>XO</Value>
          </ConfigurationAttribute>
          <ConfigurationAttribute>
            <Name>LongDescription</Name>
            <Value>This is a long paragraph of text i want to replace with
            the above text for the Value sub-element</Value>
          </ConfigurationAttribute>
        </ConfigurationDetail>
      </ConfigurationDetails>
    </SalesOrderLine>
  </SalesOrderLines>
</SalesOrder>
Here is my first attempt at the code in Python, using the ElementTree library:
import xml.etree.ElementTree as ET
from tkinter import Tk
from tkinter.filedialog import askopenfilename, asksaveasfilename

Tk().withdraw()
file = askopenfilename()
tree = ET.parse(file)
root = tree.getroot()

def model():
    for ConfigurationAttribute in root.iter('ConfigurationAttribute'):
        descrip = ConfigurationAttribute.find('Name').text
        model = ''
        if descrip == 'ConfigurationModel':
            model = ConfigurationAttribute.find('Value').text

def handing():
    for ConfigurationAttribute in root.iter('ConfigurationAttribute'):
        descrip = ConfigurationAttribute.find('Name').text
        handing = ''
        if descrip == 'Handing' and ConfigurationAttribute.find('Value') is not None:
            handing = ConfigurationAttribute.find('Value').text

def width():
    for ConfigurationAttribute in root.iter('ConfigurationAttribute'):
        descrip = ConfigurationAttribute.find('Name').text
        width = ''
        if descrip == 'EXWidth':
            width = ConfigurationAttribute.find('Value').text

def height():
    for ConfigurationAttribute in root.iter('ConfigurationAttribute'):
        descrip = ConfigurationAttribute.find('Name').text
        height = ''
        if descrip == 'EXHeight':
            height = ConfigurationAttribute.find('Value').text

for ConfigurationAttribute in root.iter('ConfigurationAttribute'):
    descrip = ConfigurationAttribute.find('Name').text
    if descrip == 'LongDescription':
        model()
        handing()
        width()
        height()
        ConfigurationAttribute.find('Value').text = str(model), str(handing), str(width), '" x ', str(height), '"'

tree.write(asksaveasfilename(defaultextension='.xml',))
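This produces an error. What I am looking for is for the paragraph of text in the LongDescription Value sub-element to be replaced with the Value text of the attributes named ConfigurationModel, Handing, EXWidth, and EXHeight, as shown here: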
<ConfigurationAttribute>
  <Name>LongDescription</Name>
  <Value> HV, XO, 59.5" x 59.5"</Value>
</ConfigurationAttribute>
Below is the error I receive when I run the code:
Traceback (most recent call last):
  File "\app\users\Home\natep\Documents\NP\py\PrestoParse.py", line 59, in <module>
    tree.write(asksaveasfilename(defaultextension='.xml',))
  File "C:\Users\natep.RANDK\AppData\Local\Programs\Python\Python37-32\lib\xml\etree\ElementTree.py", line 777, in write
    short_empty_elements=short_empty_elements)
  File "C:\Users\natep.RANDK\AppData\Local\Programs\Python\Python37-32\lib\xml\etree\ElementTree.py", line 942, in _serialize_xml
    short_empty_elements=short_empty_elements)
  File "C:\Users\natep.RANDK\AppData\Local\Programs\Python\Python37-32\lib\xml\etree\ElementTree.py", line 942, in _serialize_xml
    short_empty_elements=short_empty_elements)
  File "C:\Users\natep.RANDK\AppData\Local\Programs\Python\Python37-32\lib\xml\etree\ElementTree.py", line 942, in _serialize_xml
    short_empty_elements=short_empty_elements)
  [Previous line repeated 3 more times]
  File "C:\Users\natep.RANDK\AppData\Local\Programs\Python\Python37-32\lib\xml\etree\ElementTree.py", line 939, in _serialize_xml
    write(_escape_cdata(text))
TypeError: write() argument must be str, not tuple
In the output file, the Value sub-element I am trying to change is empty and has no closing tag, and everything after that tag has been deleted.
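Consider XSLT, the special-purpose language designed to transform XML files. Python's third-party module, lxml, can run XSLT 1.0 scripts (the built-in etree cannot), and no loop is required.
Specifically, the XSLT script below runs the Identity Transform, which copies the entire document as is. The script then adjusts the last Value node: it uses conditional XPath expressions (XPath being the sibling language to XSLT) to pull values from the preceding sibling nodes, and finally concatenates those text values with comma separators and the needed quotes.
XSLT (save as an .xsl file, which is itself a special kind of .xml file, to be loaded in Python below)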
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity Transform: copy every attribute and node unchanged -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rebuild the Value of the LongDescription attribute from its preceding siblings -->
  <xsl:template match="ConfigurationAttribute[Name='LongDescription']">
    <xsl:copy>
      <xsl:apply-templates select="Name"/>
      <Value>
        <xsl:value-of select="concat(preceding-sibling::ConfigurationAttribute[Name='ConfigurationModel']/Value, ', ',
                                     preceding-sibling::ConfigurationAttribute[Name='Handing']/Value, ', ',
                                     preceding-sibling::ConfigurationAttribute[Name='EXWidth']/Value, '&quot;', ' X ',
                                     preceding-sibling::ConfigurationAttribute[Name='EXHeight']/Value, '&quot;')"/>
      </Value>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
Python
import lxml.etree as et
# LOAD XML AND XSL
doc = et.parse('/path/to/Input.xml')
xsl = et.parse('/path/to/XSLT_Script.xsl')
# CONFIGURE TRANSFORMER
transform = et.XSLT(xsl)
# RUN TRANSFORMATION
result = transform(doc)
# PRINT RESULT
print(result)
# SAVE TO FILE
with open('output.xml', 'wb') as f:
f.write(result)
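For comparison, here is a minimal sketch of the same replacement using only the standard library's ElementTree, in case installing lxml is not an option. The TypeError in the question arises because the comma-separated expressions on the right-hand side of the assignment build a tuple, while `.text` must be a single string; in addition, `str(model)` refers to the function object itself, since the helper functions never return a value. The function name `rewrite_long_descriptions`, the helper `value_of`, and the trimmed `SAMPLE` document are illustrative assumptions, not part of the original code.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical sample with the same structure as the question's XML.
SAMPLE = """<SalesOrder><SalesOrderLines><SalesOrderLine>
<ConfigurationDetails><ConfigurationDetail>
<ConfigurationAttribute><Name>ConfigurationModel</Name><Value>HV</Value></ConfigurationAttribute>
<ConfigurationAttribute><Name>EXWidth</Name><Value>59.5</Value></ConfigurationAttribute>
<ConfigurationAttribute><Name>EXHeight</Name><Value>59.5</Value></ConfigurationAttribute>
<ConfigurationAttribute><Name>Handing</Name><Value>XO</Value></ConfigurationAttribute>
<ConfigurationAttribute><Name>LongDescription</Name><Value>long text</Value></ConfigurationAttribute>
</ConfigurationDetail></ConfigurationDetails>
</SalesOrderLine></SalesOrderLines></SalesOrder>"""


def rewrite_long_descriptions(root):
    """Rebuild each LongDescription Value from its sibling attributes."""
    for detail in root.iter('ConfigurationDetail'):

        def value_of(name):
            # ElementTree's limited XPath supports the [child='text'] predicate;
            # a missing attribute yields an empty string instead of None.
            path = "ConfigurationAttribute[Name='{}']/Value".format(name)
            return detail.findtext(path, default='')

        target = detail.find("ConfigurationAttribute[Name='LongDescription']/Value")
        if target is None:
            continue  # this ConfigurationDetail has no LongDescription
        # Build ONE string; assigning a tuple to .text is what broke tree.write().
        target.text = '{}, {}, {}" x {}"'.format(
            value_of('ConfigurationModel'),
            value_of('Handing'),
            value_of('EXWidth'),
            value_of('EXHeight'),
        )
    return root


root = rewrite_long_descriptions(ET.fromstring(SAMPLE))
new_text = root.find(".//ConfigurationAttribute[Name='LongDescription']/Value").text
print(new_text)
```

With a real file, the same function could be applied to `ET.parse(path).getroot()` before calling `tree.write(...)`. Working per ConfigurationDetail keeps values from different order lines from being mixed together.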