How to modify xml using Beautiful Soup?
I am trying to modify the LookupData element in an XML file. A snippet of the XML is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
<Options>
<SampleRate>1000</SampleRate>
<MaxStateSize>1</MaxStateSize>
<MaxOutputSize>1</MaxOutputSize>
</Options>
<CustomDefinitions>
<MyRser class="OhmicResistance">
<Object class="LookupObj2dWithState">
<RowState cacheref="Soc"/>
<ColState cacheref="ThermalState"/>
<LookupData>
0.02597518381655694900, 0.02513715386193249600, 0.02394715132636577100, 0.02325996676357371800, 0.02317075771456176400, 0.02277814077034603900, 0.02267913709322775700, 0.02258569292134297900, 0.02235026503875497600, 0.02222478423822949300, 0.02207606555239715500, 0.02198493491067361700, 0.02188144525929673300, 0.02167985791309091600, 0.02145797158835977700, 0.02137484908165417400, 0.02126561803424023600, 0.02124462299304301700, 0.02123310358079429400, 0.02126287857906075300, 0.02094998489960795500, 0.02073326148328196600, 0.02062489977511897100, 0.02038933084432985300;
</LookupData>
<MeasurementPointsRow desc="StateOfCharge">
-5, 0, 7.100000e+00, 1.120000e+01, 16, 2.080000e+01, 2.560000e+01, 3.040000e+01, 3.520000e+01, 4.010000e+01, 4.490000e+01, 4.970000e+01, 5.450000e+01, 5.930000e+01, 6.420000e+01, 69, 7.380000e+01, 7.860000e+01, 8.350000e+01, 8.830000e+01, 9.310000e+01, 9.770000e+01, 100, 105
</MeasurementPointsRow>
<MeasurementPointsColumn desc="ThermalState">
25
</MeasurementPointsColumn>
</Object>
</MyRser>
I want to modify the lookup data and save a copy of the XML that contains the modification. This is how I do it:
from bs4 import BeautifulSoup

with open('....xml') as fp:
    contents = fp.read()
soup = BeautifulSoup(contents, 'lxml')
tag = soup.find(elem_name).find(elem_path).lookupdata
tag.replace_with(str(values))
# saves the modified data as a new xml version
teslaname = elem_name + key
with open('modified.xml', 'w') as file:
    file.write(str(soup))
    file.close()
However, when I do this, the modification is applied, but the XML structure changes:
<?xml version="1.0" encoding="UTF-8"?><html><body><configuration>
<options>
<samplerate>1000</samplerate>
<maxstatesize>1</maxstatesize>
<maxoutputsize>1</maxoutputsize>
</options>
<customdefinitions>
<myrser class="OhmicResistance">
<object class="LookupObj2dWithState">
<rowstate cacheref="Soc"></rowstate>
<colstate cacheref="ThermalState"></colstate>
0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339, 0.02217779408499339
<measurementpointsrow desc="StateOfCharge">
-5, 0, 7.100000e+00, 1.120000e+01, 16, 2.080000e+01, 2.560000e+01, 3.040000e+01, 3.520000e+01, 4.010000e+01, 4.490000e+01, 4.970000e+01, 5.450000e+01, 5.930000e+01, 6.420000e+01, 69, 7.380000e+01, 7.860000e+01, 8.350000e+01, 8.830000e+01, 9.310000e+01, 9.770000e+01, 100, 105
</measurementpointsrow>
<measurementpointscolumn desc="ThermalState">
25
</measurementpointscolumn>
</object>
</myrser>
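I want to keep the structure and only modify the data. I know this can be done with ElementTree, but given how my code needs to run, Beautiful Soup is simpler to use. So, using only Beautiful Soup, how can I edit the XML and save a copy without losing its original structure?
Any help would be appreciated.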
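To keep the output readable, use the .prettify() method when writing the file:
file.write(soup.prettify())
Notes:
prettify() only re-indents the output. With an HTML parser such as 'lxml', BeautifulSoup converts XML tag names to lowercase and wraps the document in <html><body>; passing 'xml' as the parser keeps the original tag case.
Since you open the file with a context manager, there is no need to call file.close(); the file is closed automatically when the indented block exits.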
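The lowercase tags and the extra <html><body> wrapper in the question come from parsing with an HTML parser, and replace_with() removes the <LookupData> element itself rather than just its text. A minimal sketch of a Beautiful Soup only approach, assuming lxml is installed so that the 'xml' parser is available; the shortened sample document and the new value string are hypothetical stand-ins for the asker's file and data:

```python
from bs4 import BeautifulSoup

# Shortened, well-formed stand-in for the asker's file (hypothetical sample data).
xml_text = """<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
<CustomDefinitions>
<MyRser class="OhmicResistance">
<Object class="LookupObj2dWithState">
<RowState cacheref="Soc"/>
<LookupData>
0.025, 0.024;
</LookupData>
</Object>
</MyRser>
</CustomDefinitions>
</Configuration>"""

# The "xml" parser (backed by lxml) keeps tag case and adds no <html>/<body> wrapper.
soup = BeautifulSoup(xml_text, "xml")

# Tag names are case-sensitive when the XML parser is used.
tag = soup.find("MyRser").find("LookupData")

# Assigning .string replaces the element's text but keeps the element in the tree.
tag.string = "0.1, 0.2, 0.3"

modified = str(soup)
print(modified)
```

To save the copy, write `modified` to a new file inside a `with open(...)` block, as in the question.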
With lxml, you can do the following:
from lxml import etree
config = """[your xml above, corrected - it's not well formed]"""
new_values = "1,2,3,4"
doc = etree.XML(config.encode())
target = doc.xpath('//LookupData')[0]
target.text = new_values
print(etree.tostring(doc).decode())
Output:
<Configuration>
<Options>
<SampleRate>1000</SampleRate>
<MaxStateSize>1</MaxStateSize>
<MaxOutputSize>1</MaxOutputSize>
</Options>
<CustomDefinitions>
<MyRser class="OhmicResistance">
<Object class="LookupObj2dWithState">
<RowState cacheref="Soc"/>
<ColState cacheref="ThermalState"/>
<LookupData>1,2,3,4</LookupData>
<MeasurementPointsRow desc="StateOfCharge">
-5, 0, 7.100000e+00, 1.120000e+01, 16, 2.080000e+01, 2.560000e+01, 3.040000e+01, 3.520000e+01, 4.010000e+01, 4.490000e+01, 4.970000e+01, 5.450000e+01, 5.930000e+01, 6.420000e+01, 69, 7.380000e+01, 7.860000e+01, 8.350000e+01, 8.830000e+01, 9.310000e+01, 9.770000e+01, 100, 105
</MeasurementPointsRow>
<MeasurementPointsColumn desc="ThermalState">
25
</MeasurementPointsColumn>
</Object>
</MyRser>
</CustomDefinitions>
</Configuration>
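The output above no longer carries the XML declaration, and the snippet prints the result instead of saving it. A rough sketch of one way to keep the declaration and write the copy to disk, assuming a shortened, well-formed sample document; the sample data and the temporary output path are hypothetical:

```python
import tempfile
from pathlib import Path

from lxml import etree

# Shortened, well-formed stand-in for the configuration file (hypothetical sample data).
config = b"""<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
<CustomDefinitions>
<LookupData>
0.025, 0.024;
</LookupData>
</CustomDefinitions>
</Configuration>"""

doc = etree.XML(config)
target = doc.xpath('//LookupData')[0]
target.text = "1,2,3,4"

# With xml_declaration=True and an explicit encoding, tostring() returns bytes
# that begin with an XML declaration.
result = etree.tostring(doc, xml_declaration=True, encoding="UTF-8")

# Save the modified copy; a temporary directory is used here only for illustration.
out_path = Path(tempfile.mkdtemp()) / "modified.xml"
out_path.write_bytes(result)
print(result.decode())
```

Because the result is bytes, it is written with write_bytes() (or a file opened in 'wb' mode) rather than in text mode.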