Editing the HTML content of <description> in a KML file using lxml
I want to replace the HTML inside the KML description tags with newly formatted HTML.
My KML has the following structure:
<html>
<body>
<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:gx="http://www.google.com/kml/ext/2.2" xmlns:kml="http://www.opengis.net/kml/2.2">
<document id="WATER_MAINLINE_trim" xsi:schemalocation="http://www.opengis.net/kml/2.2 http://schemas.opengis.net/kml/2.2.0/ogckml22.xsd http://www.google.com/kml/ext/2.2 http://code.google.com/apis/kml/schema/kml22gx.xsd">
<name>
WATER_MAINLINE_trim
</name>
<open>
1
</open>
<snippet maxlines="0">
</snippet>
<style id="LineStyle00">
<LabelStyle>
<color>00000000</color>
<scale>0</scale>
</LabelStyle>
<LineStyle>
<color>ff240087</color>
</LineStyle>
<PolyStyle>
<color>00000000</color>
<outline>0</outline>
</PolyStyle>
</style>
<folder id="FeatureLayer0">
<name>
WATER_MAINLINE_trim
</name>
<open>
1
</open>
<snippet maxlines="0">
</snippet>
<placemark id="ID_00000">
<name>
0100026491
</name>
<snippet maxlines="0">
</snippet>
<description>
<meta content="text/html" http-equiv="Content-Type" />
<meta content="text/html; charset=utf-8" http-equiv="content-type" />
<table style="font-family:Arial,Verdana,Times;font-size:12px;text-align:left;width:100%;border-collapse:collapse;padding:3px 3px 3px 3px">
<tr style="text-align:center;font-weight:bold;background:#9CBCE2">
<td>
0100026491
</td>
</tr>
<tr>
<td>
<table style="font-family:Arial,Verdana,Times;font-size:12px;text-align:left;width:100%;border-spacing:0px; padding:3px 3px 3px 3px">
<tr>
<td>
FID
</td>
<td>
0
</td>
</tr>
<tr bgcolor="#D4E4F3">
<td>
PRIKEY
</td>
<td>
0100026491
</td>
</tr>
<tr>
<td>
YEAR_INST
</td>
<td>
2001
</td>
</tr>
<tr bgcolor="#D4E4F3">
<td>
PIPE_CLASS
</td>
<td>
PRIMARY
</td>
</tr>
<tr>
<td>
DIAMETER
</td>
<td>
1500
</td>
</tr>
<tr bgcolor="#D4E4F3">
<td>
MATERIAL
</td>
<td>
SP
</td>
</tr>
<tr>
<td>
STATUS
</td>
<td>
ACTIVE
</td>
</tr>
<tr bgcolor="#D4E4F3">
<td>
BA
</td>
<td>
FCOM
</td>
</tr>
<tr>
<td>
SUBCLASS
</td>
<td>
WATER MAINLINE
</td>
</tr>
</table>
</td>
</tr>
</table>
</description>
</placemark>
</folder>
</document>
</kml>
</body>
</html>
I have this new HTML:
newhtml="""<![CDATA[ \n<!------------TITLE SUBCLASS---------------->\n <tr>\n <td colspan="2" align="center">\n <b><font color=\'#090259\' size=\'6\' style = \'bold\'>LA MESA BALARA</font><b>\n </td>/n </tr>\n<!------------IMAGE---------------->\n <tr>\n <td colspan="2" align="center">\n <img src= http://static.rappler.com/images/640-lamesadam-20120728.jpg, width=500, height = 223, alt="picture" />\n </td>\n </tr>\n<!------------PRIKEY---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>PRIKEY</p>\n </td>\n \n <td bgcolor = \'#d8d8ff\' align="center">\n <p>0100026491</p>\n </td>\n<!------------YEAR INSTALLED---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>Year Installed</p>\n </td>\n \n <td bgcolor = \'#d8d8ff\' align="center">\n <p>2001</p>\n </td>\n<!------------PIPE CLASS---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>Pipe Class</p>\n </td>\n \n <td bgcolor = \'#d8d8ff\' align="center">\n <p>PRIMARY</p>\n </td>\n<!------------DIAMETER---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>Diameter (mm)</p>\n </td>\n \n <td bgcolor = \'#d8d8ff\' align="center">\n <p>1500.000000</p>\n </td>\n<!------------MATERIAL---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>Material</p>\n </td>\n \n <td bgcolor = \'#d8d8ff\' align="center">\n <p>SP</p>\n </td>\n<!------------STATUS---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>Status</p>\n </td>\n \n <td bgcolor = \'#d8d8ff\' align="center">\n <p>ACTIVE</p>\n </td>\n<!------------BUSINESS ADDRESS---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>Business Address</p>\n </td>\n \n <td bgcolor 
= \'#d8d8ff\' align="center">\n <p>Fairview-Commonwealth</p>\n </td>]]>"""
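How can I use lxml to correctly replace this in the parsed KML and still end up with valid KML? By "valid" I mean a KML that loads in Google Earth. I tried doing the replacement with BeautifulSoup, but my output file produces an error when loaded in Google Earth. It says, "Unexpected element "html"". So I would like to use only lxml for this. Any help would be greatly appreciated. Thanks!
I have this sample KML, which contains 5 LineString placemarks.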
trim.kml = https://sites.google.com/site/kmlhostingmwss/trim.kml
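Since KML is valid XML, consider XSLT, a transformation language designed specifically for modifying XML documents. Python's lxml can run XSLT 1.0 scripts.
Specifically, the dynamic XSLT below is parsed from a string. It first copies the document as-is using the Identity Transform, then replaces the content of every <description> with the newhtml variable.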
import lxml.etree as ET
# READ IN KML FILE
dom = ET.parse('trim.kml')
newhtml = """<![CDATA[\n<!------------TITLE SUBCLASS---------------->\n <tr>\n <td colspan="2" align="center">\n <b><font color=\'#090259\' size=\'6\' style = \'bold\'>LA MESA BALARA</font><b>\n </td>/n </tr>\n<!------------IMAGE---------------->\n <tr>\n <td colspan="2" align="center">\n <img src= http://static.rappler.com/images/640-lamesadam-20120728.jpg, width=500, height = 223, alt="picture" />\n </td>\n </tr>\n<!------------PRIKEY---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>PRIKEY</p>\n </td>\n \n <td bgcolor = \'#d8d8ff\' align="center">\n <p>0100026491</p>\n </td>\n<!------------YEAR INSTALLED---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>Year Installed</p>\n </td>\n \n <td bgcolor = \'#d8d8ff\' align="center">\n <p>2001</p>\n </td>\n<!------------PIPE CLASS---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>Pipe Class</p>\n </td>\n \n <td bgcolor = \'#d8d8ff\' align="center">\n <p>PRIMARY</p>\n </td>\n<!------------DIAMETER---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>Diameter (mm)</p>\n </td>\n \n <td bgcolor = \'#d8d8ff\' align="center">\n <p>1500.000000</p>\n </td>\n<!------------MATERIAL---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>Material</p>\n </td>\n \n <td bgcolor = \'#d8d8ff\' align="center">\n <p>SP</p>\n </td>\n<!------------STATUS---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>Status</p>\n </td>\n \n <td bgcolor = \'#d8d8ff\' align="center">\n <p>ACTIVE</p>\n </td>\n<!------------BUSINESS ADDRESS---------------->\n <tr>\n <td bgcolor = \'#090259\', align="center" >\n <p><font color = \'FFFFFF\', size =\'4\'>Business Address</p>\n </td>\n \n <td 
bgcolor = \'#d8d8ff\' align="center">\n <p>Fairview-Commonwealth</p>\n </td>]]>"""
# PARSE XSL FROM STRING (newhtml, ALREADY WRAPPED IN CDATA, IS FORMATTED INTO THE {} PLACEHOLDER)
xslstr = '''<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:kml="http://www.opengis.net/kml/2.2">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<!-- IDENTITY TRANSFORM: COPY EVERY NODE AND ATTRIBUTE AS-IS -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- THE KML ELEMENTS LIVE IN A DEFAULT NAMESPACE, SO THE MATCH NEEDS THE kml: PREFIX -->
<xsl:template match="kml:description">
<xsl:copy>
<xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
<xsl:text disable-output-escaping="yes">{}</xsl:text>
<xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
</xsl:copy>
</xsl:template>
</xsl:transform>'''.format(newhtml)
xslt = ET.fromstring(xslstr)
# TRANSFORM SOURCE TO NEW TREE
transform = ET.XSLT(xslt)
newdom = transform(dom)
# OUTPUT TO FILE
tree_out = ET.tostring(newdom, encoding='UTF-8', pretty_print=True, xml_declaration=True)
with open('newTrim.kml', 'wb') as xmlfile:
    xmlfile.write(tree_out)
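If XSLT feels like more machinery than you need, the same replacement can probably be done with lxml's element API alone. The following is only a minimal sketch under a few assumptions: the file uses the KML 2.2 namespace shown in the question, every description should receive the same HTML, and the HTML is passed without its own <![CDATA[ ... ]]> wrapper. The helper name replace_descriptions and the inline sample document are made up for illustration; they are not part of lxml or of trim.kml.

```python
import lxml.etree as ET

# Namespace taken from the <kml> root element shown in the question.
KML_NS = "http://www.opengis.net/kml/2.2"


def replace_descriptions(root, html):
    """Hypothetical helper: put `html` into every kml:description as CDATA.

    Returns the number of description elements that were rewritten.
    """
    count = 0
    for desc in root.iter("{%s}description" % KML_NS):
        # Remove any child elements left over from the old content.
        for child in list(desc):
            desc.remove(child)
        # ET.CDATA makes lxml serialize the text as <![CDATA[ ... ]]>.
        desc.text = ET.CDATA(html)
        count += 1
    return count


# Small stand-in document; in practice use ET.parse('trim.kml').getroot().
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark id="ID_00000">
      <name>0100026491</name>
      <description><![CDATA[<table><tr><td>old</td></tr></table>]]></description>
    </Placemark>
  </Document>
</kml>"""

new_html = "<table><tr><td>PRIKEY</td><td>0100026491</td></tr></table>"

root = ET.fromstring(sample)
n = replace_descriptions(root, new_html)
out = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
print(out.decode("utf-8"))
```

Because the HTML sits inside a CDATA section, it does not have to be well-formed XML, which matters for markup like the newhtml in the question. Note that ET.CDATA rejects text that itself contains "]]>", so strip the wrapper from newhtml before passing it in.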