Lxml error when parsing kml using pykml
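I am trying to parse a kml file containing multiple placemarks using pykml. I want to edit the HTML code in the kml descriptions, mainly for visualizing geographic data in Google Earth. I have looked into a number of approaches: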
- Extract Coordinates from KML BatchGeo File with Python
- Read kml file with multiple placemarks in pykml
- Using pyKML to parse KML Document
- KML to string in Python?
However, I always get the lxml error shown below. :(
Traceback (most recent call last):
File "C:\Users\Arellano\Copy\BSGE15-2016 SUMMER\trial7.py", line 5, in <module>
root = parser.fromstring(open('trim_KML.kml', 'r').read())
File "C:\Program Files (x86)\Python2.7.10\lib\site-packages\pykml-0.1.0-py2.7.egg\pykml\parser.py", line 41, in fromstring
return objectify.fromstring(text)
File "src/lxml/lxml.objectify.pyx", line 1801, in lxml.objectify.fromstring (src\lxml\lxml.objectify.c:25171)
File "src/lxml/lxml.etree.pyx", line 3213, in lxml.etree.fromstring (src\lxml\lxml.etree.c:77697)
File "src/lxml/parser.pxi", line 1819, in lxml.etree._parseMemoryDocument (src\lxml\lxml.etree.c:116494)
File "src/lxml/parser.pxi", line 1707, in lxml.etree._parseDoc (src\lxml\lxml.etree.c:115144)
File "src/lxml/parser.pxi", line 1079, in lxml.etree._BaseParser._parseDoc (src\lxml\lxml.etree.c:109543)
File "src/lxml/parser.pxi", line 573, in lxml.etree._ParserContext._handleParseResultDoc (src\lxml\lxml.etree.c:103404)
File "src/lxml/parser.pxi", line 683, in lxml.etree._handleParseResult (src\lxml\lxml.etree.c:105058)
File "src/lxml/parser.pxi", line 613, in lxml.etree._raiseParseError (src\lxml\lxml.etree.c:103967)
XMLSyntaxError: Namespace prefix xsi for schemaLocation on Document is not defined, line 3, column 32
Here is my code snippet (it should work, based on one of my sources):
from pykml import parser
root = parser.fromstring(open('trim_KML.kml', 'r').read())
print etree.tostring(root.Document.Placemark.LineString.Description)
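I have pykml and lxml 3.6.0 installed, and I am currently using Python 2.7.10. The kml file contains lines. (kml link: https://sites.google.com/site/kmlhostingmwss/trim.kml)
My ArcGIS 10.2 also comes with Python 2.7.
I am new to working with kml files. Can someone tell me what I am doing wrong? Or is there an easier way to edit the descriptions in a kml file? Thank you very much. :)))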
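The xml has some problems: the xsi prefix is used on Document but never declared. To get rid of the error, add xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" to the kml tag on the second line: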
<kml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2" xmlns:kml="http://www.opengis.net/kml/2.2" xmlns:atom="http://www.w3.org/2005/Atom">
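If editing the file by hand is inconvenient, the same fix could be applied to the text before it is handed to the parser. The following is only a minimal sketch: the helper name add_xsi_declaration and the sample document (including its schemaLocation value) are made up for illustration and are not taken from trim.kml.

```python
import re

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def add_xsi_declaration(kml_text):
    """Return kml_text with xmlns:xsi declared on the root <kml> tag if missing.

    Hypothetical helper, not part of pykml or lxml.
    """
    if "xmlns:xsi=" in kml_text:
        return kml_text  # prefix already declared, leave the text alone
    # Insert the declaration right after the opening "<kml" of the root tag.
    return re.sub(r"<kml(?=[\s>])", '<kml xmlns:xsi="%s"' % XSI_NS,
                  kml_text, count=1)


# Made-up document that uses xsi: without declaring it, like the one in the error.
sample = (
    '<kml xmlns="http://www.opengis.net/kml/2.2">'
    '<Document xsi:schemaLocation="http://www.opengis.net/kml/2.2 ogckml22.xsd">'
    '</Document></kml>'
)

patched = add_xsi_declaration(sample)
print(patched)
```

The patched string should then be acceptable to parser.fromstring from the question, assuming the undeclared prefix is the only problem in the file.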
Then, using lxml, the following works:
import lxml.etree as et
xml = et.parse("trim.kml").getroot()
print(xml.xpath("//kml:Document//kml:Placemark/kml:description", namespaces={"kml":xml.nsmap["kml"]}))
This gives you:
[<Element {http://www.opengis.net/kml/2.2}description at 0x7f612d0885f0>, <Element {http://www.opengis.net/kml/2.2}description at 0x7f612d088cb0>, <Element {http://www.opengis.net/kml/2.2}description at 0x7f612d088d40>, <Element {http://www.opengis.net/kml/2.2}description at 0x7f612d088d88>, <Element {http://www.opengis.net/kml/2.2}description at 0x7f612d088dd0>, <Element {http://www.opengis.net/kml/2.2}description at 0x7f612d088e18>]
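Since the goal in the question is to edit the descriptions, here is a rough sketch of how the text of each element could be read and changed once the document parses. The inline sample and the bold-to-italic replacement are assumptions chosen only for illustration; the stdlib fallback import is there so the snippet also runs without lxml.

```python
try:
    import lxml.etree as et  # preferred, as in the answer
except ImportError:
    import xml.etree.ElementTree as et  # stdlib fallback, same calls used below

KML_NS = "http://www.opengis.net/kml/2.2"

# Made-up two-placemark document standing in for trim.kml.
sample = (
    '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
    '<Placemark><description>&lt;b&gt;Line A&lt;/b&gt;</description></Placemark>'
    '<Placemark><description>&lt;b&gt;Line B&lt;/b&gt;</description></Placemark>'
    '</Document></kml>'
)

root = et.fromstring(sample)

# Visit every description element in the KML namespace.
for desc in root.iter("{%s}description" % KML_NS):
    # .text is the unescaped HTML string stored in the description.
    desc.text = desc.text.replace("<b>", "<i>").replace("</b>", "</i>")

descriptions = [d.text for d in root.iter("{%s}description" % KML_NS)]
print(descriptions)
```

To save the result, something like et.ElementTree(root).write("edited.kml", encoding="utf-8", xml_declaration=True) should work, where edited.kml is a placeholder file name.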
You could also use lxml.html, which handles broken xml better; the data itself is 99% html anyway.
You can get each description from document.placemark with:
from lxml import html
xml = html.parse("trim.kml")
print(xml.xpath("//placemark/description"))
This gives you:
[<Element description at 0x7f1c757fad08>, <Element description at 0x7f1c757fad60>, <Element description at 0x7f1c757fadb8>, <Element description at 0x7f1c757fae10>, <Element description at 0x7f1c757fae68>, <Element description at 0x7f1c757faec0>]