Python loop to iterate through elements in an XML and get sub-element values

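I am working with XML that is read from an API and contains several hotel locations. Each hotel has a "hotel code" element that is unique to that hotel in the XML output, and I would like to get the "latitude" and "longitude" attributes for each hotel. My code can currently parse the XML and record every instance of "latitude" and "longitude", but instead of organizing them as lat/lon pairs per hotel, it records every latitude in the XML and then every longitude in the XML. I can't work out how to say: IF hotel code == the previous hotel code, record latitude/longitude together; ELSE move on to the next hotel and record its lat/lon. Below are a sample section of the XML output, my code, and my code's output: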

XML:

<hotel code="13272" name="Sonesta Fort Lauderdale Beach" categoryCode="4EST" categoryName="4 STARS" destinationCode="FLL" destinationName="Fort Lauderdale - Hollywood Area - FL" zoneCode="1" zoneName="Fort Lauderdale Beach Area" latitude="26.137508" longitude="-80.103438" minRate="1032.10" maxRate="1032.10" currency="USD"><rooms><room code="DBL.DX" name="DOUBLE DELUXE"><rates><rate rateKey="20161215|20161220|W|235|13272|DBL.DX|GC-ALL|RO||1~1~0||N@675BEABED1984D9E8073EB6154B41AEE" rateClass="NOR" rateType="BOOKABLE" net="1032.10" allotment="238" rateCommentsId="235|38788|431" paymentType="AT_WEB" packaging="false" boardCode="RO" boardName="ROOM ONLY" rooms="1" adults="1" children="0"><cancellationPolicies><cancellationPolicy amount="206.42" from="2016-12-11T23:59:00-05:00"/></cancellationPolicies></rate></rates></room></rooms></hotel>

Code:

import time, hashlib
import urllib2
from xml.dom import minidom

# Your API Key and secret
apiKey =
Secret =

# Signature is generated by SHA256 (Api-Key + Secret + Timestamp (in seconds))
sigStr = "%s%s%d" % (apiKey,Secret,int(time.time()))
signature = hashlib.sha256(sigStr).hexdigest()

endpoint = "https://api.test.hotelbeds.com/hotel-api/1.0/hotels"

try:
    # Create http request and add headers
    req = urllib2.Request(url=endpoint)
    req.add_header("X-Signature", signature)
    req.add_header("Api-Key", apiKey)
    req.add_header("Accept", "application/xml")
    req.add_header("Content-Type", "application/xml")
    req.add_data(' <availabilityRQ xmlns="http://www.hotelbeds.com/schemas/messages" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ><stay checkIn="2016-12-15" checkOut="2016-12-20"/><occupancies><occupancy rooms="1" adults="1" children="0"/></occupancies><geolocation longitude="-80.265323" latitude="26.131510" radius="10" unit="km"/></availabilityRQ>')

    # Reading response and print-out
    file = minidom.parse(urllib2.urlopen(req))
    hotels = file.getElementsByTagName("hotel")
    lat = [items.attributes['latitude'].value for items in hotels]
    lon = [items.attributes['longitude'].value for items in hotels]
    print lat + lon

except urllib2.HTTPError, e:
    # Reading body of response
    httpResonponse = e.read()
    print "%s, reason: %s " % (str(e), httpResonponse)
except urllib2.URLError, e:
    print "Client error: %s" % e.reason
except Exception, e:
    print "General exception: %s " % str(e)

My current output:

[u'26.144224', u'26.122569', u'26.11437', u'26.1243414605478', u'26.119195', u'26.1942424979814', u'26.145488', u'26.163201444181914', u'26.1457688280936', u'26.1868547339183', u'26.1037652256159', u'26.090442389015', u'26.187242', u'-80.325579', u'-80.251829', u'-80.25315', u'-80.2564349700697', u'-80.262738', u'-80.2919112076052', u'-80.258274', u'-80.2584545794579', u'-80.261252', u'-80.2272565662588', u'-80.20161000000002']

You can put the results from the XML file into an iterable structure such as a dictionary. I took your sample XML data and put it into a file named hotels.xml:

from xml.dom import minidom

hotels_position = {}

dom = minidom.parse('hotels.xml')
hotels = dom.getElementsByTagName("hotel")

for hotel in hotels:
    hotel_id = hotel.attributes['code'].value
    position = {}
    position['latitude'] = hotel.attributes['latitude'].value
    position['longitude'] = hotel.attributes['longitude'].value
    hotels_position[hotel_id] = position

print(hotels_position)

This code outputs the following structure (I added a second hotel):

{u'13272': {'latitude': u'26.137508', 'longitude': u'-80.103438'}, u'13273': {'latitude': u'26.137508', 'longitude': u'-80.103438'}}

You can now iterate over each hotel in the dictionary:

for hotel in hotels_position:
    print("Hotel {} is located at ({},{})".format(hotel,
                                                  hotels_position[hotel]['latitude'],
                                                  hotels_position[hotel]['longitude']))
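
If you need the coordinates as numbers rather than strings, a variant of the loop above could use dict.items() and float(). This is only a sketch: the dictionary literal copies the sample output shown earlier so that the snippet runs on its own, and coords is a name I made up for illustration.

```python
# Sample data copied from the output above so the snippet is self-contained.
hotels_position = {
    '13272': {'latitude': '26.137508', 'longitude': '-80.103438'},
    '13273': {'latitude': '26.137508', 'longitude': '-80.103438'},
}

# Hypothetical helper structure: hotel code -> (latitude, longitude) as floats.
coords = {}
for code, position in hotels_position.items():
    coords[code] = (float(position['latitude']), float(position['longitude']))

# Sort by hotel code so the printed order is predictable.
for code, (lat, lon) in sorted(coords.items()):
    print("Hotel {} is located at ({}, {})".format(code, lat, lon))
```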

Now that your data is well organized, your 'logic' will be much easier to write.
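
For what it's worth, your original output was unpaired because lat + lon simply concatenates two separate lists. If a dictionary is more than you need, here is a minimal sketch that keeps each hotel's values together in a single pass, assuming the response looks like your sample. The inline XML is a trimmed stand-in: the <hotels> wrapper and the second hotel (code 99999) are invented for illustration, and pairs is just a name I picked.

```python
from xml.dom import minidom

# Trimmed stand-in for the API response. The <hotels> wrapper and the
# second hotel are assumptions made up for this example.
sample = (
    '<hotels>'
    '<hotel code="13272" latitude="26.137508" longitude="-80.103438"/>'
    '<hotel code="99999" latitude="26.144224" longitude="-80.325579"/>'
    '</hotels>'
)

dom = minidom.parseString(sample)

# Build one tuple per <hotel> element, so the latitude and longitude
# stay attached to the hotel code they belong to.
pairs = [
    (hotel.getAttribute('code'),
     hotel.getAttribute('latitude'),
     hotel.getAttribute('longitude'))
    for hotel in dom.getElementsByTagName('hotel')
]

for code, lat, lon in pairs:
    print("{}: {}, {}".format(code, lat, lon))
```

In your script the same list comprehension should work on the hotels list you already get from getElementsByTagName("hotel"), with no need to compare against the previous hotel code.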