Python and XML (LandXML)

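OK, I have some XML files that look like the one below. It is actually a LandXML file, but it behaves like any other XML file. I need to parse it with Python (currently I am using the ElementTree library) so that I loop through the children of <CoordGeom></CoordGeom> and build lists depending on the child element: lineList, spiralList, and curveList. Each list should contain the attribute values (the names are not needed), with one nested list per object. Something like: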

lineList=[[10.014571204947,209.285340662374,776.719431311241,-399.949629732524,813.113864060552,-193.853052659974],[287.308329990254,277.363320844698,558.639337133827,380.929458057393,293.835705515579,463.448840215686]]

Here is a sample file:

<?xml version="1.0"?>
<LandXML xmlns="http://www.landxml.org/schema/LandXML-1.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.landxml.org/schema/LandXML-1.2 http://www.landxml.org/schema/LandXML-1.2/LandXML-1.2.xsd" date="2015-05-15" time="09:47:43" version="1.2" language="English" readOnly="false">
    <Units>
        <Metric areaUnit="squareMeter" linearUnit="meter" volumeUnit="cubicMeter" temperatureUnit="celsius" pressureUnit="milliBars" diameterUnit="millimeter" angularUnit="decimal degrees" directionUnit="decimal degrees"></Metric>
    </Units>
    <Project name="C:\Users\Rade\Downloads\Situacija i profili.dwg"></Project>
    <Application name="AutoCAD Civil 3D" desc="Civil 3D" manufacturer="Autodesk, Inc." version="2014" manufacturerURL="www.autodesk.com/civil" timeStamp="2015-05-15T09:47:43"></Application>
    <Alignments name="">
        <Alignment name="Proba" length="1201.057158008475" staStart="0." desc="">
            <CoordGeom>
                <Line dir="10.014571204947" length="209.285340662374">
                    <Start>776.719431311241 -399.949629732524</Start>
                    <End>813.113864060552 -193.853052659974</End>
                </Line>
                <Spiral length="435.309621307305" radiusEnd="300." radiusStart="INF" rot="cw" spiType="clothoid" theta="41.569006803911" totalY="101.382259815422" totalX="412.947724836996" tanLong="298.633648469722" tanShort="152.794210168398">
                    <Start>813.113864060552 -193.853052659974</Start>
                    <PI>865.04584458778 100.230482065888</PI>
                    <End>785.087350093002 230.433054310499</End>
                </Spiral>
                <Curve rot="cw" chord="150.078507004323" crvType="arc" delta="28.970510103309" dirEnd="299.475054297727" dirStart="328.445564401036" external="9.849481983234" length="151.689236185509" midOrd="9.536387074322" radius="300." tangent="77.502912753511">
                    <Start>785.087350093002 230.433054310499</Start>
                    <Center>529.44434090873 73.440532656728</Center>
                    <End>677.05771309169 334.61153478517</End>
                    <PI>744.529424397382 296.476647100012</PI>
                </Curve>
                <Spiral length="127.409639008589" radiusEnd="INF" radiusStart="300." rot="cw" spiType="clothoid" theta="12.166724307463" totalY="8.989447716697" totalX="126.8363181841" tanLong="85.141254974713" tanShort="42.653117896421">
                    <Start>677.05771309169 334.61153478517</Start>
                    <PI>639.925187941987 355.598770007863</PI>
                    <End>558.639337133827 380.929458057393</End>
                </Spiral>
                <Line dir="287.308329990254" length="277.363320844698">
                    <Start>558.639337133827 380.929458057393</Start>
                    <End>293.835705515579 463.448840215686</End>
                </Line>
            </CoordGeom>
            <Profile name="Proba">
                <ProfAlign name="Niveleta">
                    <PVI>0. 329.48636525895</PVI>
                    <CircCurve length="69.993187715052" radius="5000.">512.581836381869 330.511528931714</CircCurve>
                    <CircCurve length="39.994027682446" radius="5000.">948.834372016349 337.491569501865</CircCurve>
                    <PVI>1201.057158008475 339.509351789802</PVI>
                </ProfAlign>
            </Profile>
        </Alignment>
    </Alignments>
</LandXML>

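This example is far from production-ready, but it should contain everything you need to implement a full solution. It only looks for lines, does not know which attributes to look for on the other geometry types, and has no error handling of any kind.

It's also not pretty.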

from lxml import etree

doc = etree.parse('/tmp/stuff.xml')
# Tags are namespace-qualified, so every lookup needs the {namespace}Name form
geometry = doc.find('.//{http://www.landxml.org/schema/LandXML-1.2}CoordGeom')

line_list = []

for child in geometry:
    child_list = []
    if child.tag == '{http://www.landxml.org/schema/LandXML-1.2}Line':
        # Attribute values first: direction, then length
        child_list.append(float(child.attrib['dir']))
        child_list.append(float(child.attrib['length']))

        # Then the Start and End coordinates, flattened into the same list
        start, end = child.find('{http://www.landxml.org/schema/LandXML-1.2}Start'), child.find('{http://www.landxml.org/schema/LandXML-1.2}End')
        child_list.extend([float(coord) for coord in start.text.split(' ')])
        child_list.extend([float(coord) for coord in end.text.split(' ')])

        line_list.append(child_list)

print(line_list)
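
The namespace URI is repeated in every lookup. As a rough sketch of one alternative, the standard library's xml.etree.ElementTree (the library mentioned in the question) accepts a prefix map in find() and findall(). The prefix `lx`, the `NS` and `SAMPLE` names, and the trimmed inline document are assumptions made for illustration, not part of the original file handling.

```python
import xml.etree.ElementTree as ET

# "lx" is an arbitrary prefix chosen for this sketch; only the URI has to match the file.
NS = {"lx": "http://www.landxml.org/schema/LandXML-1.2"}

# A trimmed-down stand-in for the LandXML file from the question.
SAMPLE = """<LandXML xmlns="http://www.landxml.org/schema/LandXML-1.2">
  <Alignments><Alignment name="Proba"><CoordGeom>
    <Line dir="10.014571204947" length="209.285340662374">
      <Start>776.719431311241 -399.949629732524</Start>
      <End>813.113864060552 -193.853052659974</End>
    </Line>
  </CoordGeom></Alignment></Alignments>
</LandXML>"""

root = ET.fromstring(SAMPLE)
geometry = root.find(".//lx:CoordGeom", NS)

line_list = []
for line in geometry.findall("lx:Line", NS):
    # Attribute values first, as in the lxml version
    values = [float(line.get("dir")), float(line.get("length"))]
    for name in ("Start", "End"):
        # split() with no argument tolerates repeated or trailing whitespace
        values.extend(float(c) for c in line.find("lx:" + name, NS).text.split())
    line_list.append(values)

print(line_list)
```

For a real file, ET.parse(path).getroot() would take the place of ET.fromstring(SAMPLE).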

If you want to extend this example, read through the lxml tutorial. Everything I used is in the tutorial.
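
One possible extension, sketched with the standard library's xml.etree.ElementTree, collects every attribute value instead of naming attributes per geometry type, so Line, Spiral, and Curve can share one loop. The helper names (`to_number`, `local_name`, `collect_geometry`) are hypothetical, the inline sample keeps only a few of the attributes from the question's file, and leaving non-numeric values such as rot="cw" as strings is an assumption about what the lists should hold. It still has no real error handling.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the question's file: one element of each kind, fewer attributes.
SAMPLE = """<LandXML xmlns="http://www.landxml.org/schema/LandXML-1.2">
  <Alignments><Alignment name="Proba"><CoordGeom>
    <Line dir="10.014571204947" length="209.285340662374">
      <Start>776.719431311241 -399.949629732524</Start>
      <End>813.113864060552 -193.853052659974</End>
    </Line>
    <Spiral length="435.309621307305" radiusEnd="300." radiusStart="INF" rot="cw">
      <Start>813.113864060552 -193.853052659974</Start>
      <PI>865.04584458778 100.230482065888</PI>
      <End>785.087350093002 230.433054310499</End>
    </Spiral>
    <Curve rot="cw" crvType="arc" radius="300.">
      <Start>785.087350093002 230.433054310499</Start>
      <Center>529.44434090873 73.440532656728</Center>
      <End>677.05771309169 334.61153478517</End>
    </Curve>
  </CoordGeom></Alignment></Alignments>
</LandXML>"""


def to_number(text):
    """Return text as a float when it parses ("INF" becomes inf), else unchanged."""
    try:
        return float(text)
    except ValueError:
        return text


def local_name(tag):
    """Strip the '{namespace}' part from a qualified tag name."""
    return tag.rsplit("}", 1)[-1]


def collect_geometry(root):
    """Group the children of every CoordGeom element by kind."""
    lists = {"Line": [], "Spiral": [], "Curve": []}
    for geometry in root.iter():
        if local_name(geometry.tag) != "CoordGeom":
            continue
        for element in geometry:
            kind = local_name(element.tag)
            if kind not in lists:
                continue  # ignore geometry types this sketch does not know
            # Attribute values in document order, names dropped
            values = [to_number(v) for v in element.attrib.values()]
            # Then the coordinates of each child point (Start, PI, Center, End)
            for point in element:
                values.extend(float(c) for c in point.text.split())
            lists[kind].append(values)
    return lists


geometry_lists = collect_geometry(ET.fromstring(SAMPLE))
line_list = geometry_lists["Line"]
spiral_list = geometry_lists["Spiral"]
curve_list = geometry_lists["Curve"]
print(line_list, spiral_list, curve_list, sep="\n")
```

Because the values are taken in the order the attributes appear in the file, position-based indexing only works if the exporting application writes attributes in a consistent order; looking attributes up by name, as the Line-only example does, avoids that dependency.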