Python / XML request

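I'm new to Python. I'm trying to send a request to a website to get public transport information, and I then want to show that information on a small display attached to my Raspberry Pi.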

import requests

xml = """<?xml version="1.0" encoding="UTF-8"?>
<Trias version="1.1" xmlns="http://www.vdv.de/trias" xmlns:siri="http://www.siri.org.uk/siri" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <ServiceRequest>
        <siri:RequestTimestamp>2016-06-27T13:34:00</siri:RequestTimestamp>
        <siri:RequestorRef>EPSa</siri:RequestorRef>
        <RequestPayload>
            <StopEventRequest>
                <Location>
                    <LocationRef>
                        <StopPointRef>8578169</StopPointRef>
                    </LocationRef>
                </Location>
                <Params>
                    <NumberOfResults>5</NumberOfResults>
                    <StopEventType>departure</StopEventType>
                    <IncludePreviousCalls>false</IncludePreviousCalls>
                    <IncludeOnwardCalls>false</IncludeOnwardCalls>
                    <IncludeRealtimeData>true</IncludeRealtimeData>
                </Params>
            </StopEventRequest>
        </RequestPayload>
    </ServiceRequest>
</Trias>"""

headers = {'Authorization': '#MYCODE', 'Content-Type': 'application/xml'}

answer = requests.post('https://api.opentransportdata.swiss/trias', data=xml, headers=headers)

This is the answer I get:

<?xml version="1.0" encoding="UTF-8"?>
<Trias xmlns="http://www.vdv.de/trias" version="1.1">
<ServiceDelivery>
    <ResponseTimestamp xmlns="http://www.siri.org.uk/siri">2018-11-19T14:17:42Z</ResponseTimestamp>
    <ProducerRef xmlns="http://www.siri.org.uk/siri">EFAController10.2.9.62-WIN-G0NJHFUK71P</ProducerRef>
    <Status xmlns="http://www.siri.org.uk/siri">true</Status>
    <MoreData>false</MoreData>
    <Language>de</Language>
    <DeliveryPayload>
        <StopEventResponse>
            <StopEventResult>
                <ResultId>ID-8E6262DF-2FB8-4591-97A3-AC3E94E56635</ResultId>
                <StopEvent>
                    <ThisCall>
                        <CallAtStop>
                            <StopPointRef>8578169</StopPointRef>
                            <StopPointName>
                                <Text>Basel, Thomaskirche</Text>
                                <Language>de</Language>
                            </StopPointName>
                            <ServiceDeparture>
                                <TimetabledTime>2018-11-19T14:16:00Z</TimetabledTime>
                                <EstimatedTime>2018-11-19T14:17:00Z</EstimatedTime>
                            </ServiceDeparture>
                            <StopSeqNumber>31</StopSeqNumber>
                        </CallAtStop>
                    </ThisCall>
                    <Service>
                        <OperatingDayRef>2018-11-19</OperatingDayRef>
                        <JourneyRef>odp:05036::H:j18:36143:36143</JourneyRef>
                        <LineRef>odp:05036::H</LineRef>
                        <DirectionRef>outward</DirectionRef>
                        <Mode>
                            <PtMode>bus</PtMode>
                            <BusSubmode>regionalBus</BusSubmode>
                            <Name>
                                <Text>Bus</Text>
                                <Language>de</Language>
                            </Name>
                        </Mode>
                        <PublishedLineName>
                            <Text>36</Text>
                            <Language>de</Language>
                        </PublishedLineName>
                        <OperatorRef>odp:823</OperatorRef>
                        <OriginStopPointRef>8589334</OriginStopPointRef>
                        <OriginText>
                            <Text>Basel, Kleinhüningen</Text>
                            <Language>de</Language>
                        </OriginText>
                        <DestinationStopPointRef>8588780</DestinationStopPointRef>
                        <DestinationText>
                            <Text>Basel, Schifflände</Text>
                            <Language>de</Language>
                        </DestinationText>
                    </Service>
                </StopEvent>
            </StopEventResult>
        </StopEventResponse>
    </DeliveryPayload>
</ServiceDelivery>
</Trias>

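How can I now go on to extract some information from this? (I'm interested in TimetabledTime and EstimatedTime.)

I tried to use ElementTree, but it did not really work.

Thanks in advance!

Data provider's website: https://opentransportdata.swiss/en/cookbook/departurearrival-display/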

If you want to stick to standard Python, you can use html.parser:

https://docs.python.org/3/library/html.parser.html

There are also plenty of third-party libraries that make life easier (google "html parsing python").
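For what it's worth, here is a minimal sketch of what the html.parser route could look like for the two fields asked about. The class name `DepartureTimeParser` and the trimmed `snippet` are made up for illustration. Keep in mind that html.parser is meant for HTML: it lowercases tag names and knows nothing about XML namespaces, and this sketch only keeps the last value it sees for each tag.

```python
from html.parser import HTMLParser


class DepartureTimeParser(HTMLParser):
    """Hypothetical helper: collects the text of the two time elements."""

    # html.parser reports tag names in lowercase
    WANTED = {"timetabledtime", "estimatedtime"}

    def __init__(self):
        super().__init__()
        self.current = None  # wanted tag we are currently inside, if any
        self.times = {}      # lowercase tag name -> text content

    def handle_starttag(self, tag, attrs):
        if tag in self.WANTED:
            self.current = tag

    def handle_data(self, data):
        # Only keep text that appears inside one of the wanted tags
        if self.current is not None:
            self.times[self.current] = data.strip()

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None


# Trimmed excerpt of the response from the question
snippet = """<ServiceDeparture>
    <TimetabledTime>2018-11-19T14:16:00Z</TimetabledTime>
    <EstimatedTime>2018-11-19T14:17:00Z</EstimatedTime>
</ServiceDeparture>"""

parser = DepartureTimeParser()
parser.feed(snippet)
parser.close()
print(parser.times)
```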

I tried to use the ElementTree but it did not really work.

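As mzjn said, you should give us more information about the difficulties you ran into.

Anyway, if you want to parse XML, I suggest using a third-party library to make the job easier. In my example, I used BeautifulSoup: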

from bs4 import BeautifulSoup
from datetime import datetime

test_answer = """<?xml version="1.0" encoding="UTF-8"?>
<Trias xmlns="http://www.vdv.de/trias" version="1.1">
<ServiceDelivery>
    <ResponseTimestamp xmlns="http://www.siri.org.uk/siri">2018-11-19T14:17:42Z</ResponseTimestamp>
    <ProducerRef xmlns="http://www.siri.org.uk/siri">EFAController10.2.9.62-WIN-G0NJHFUK71P</ProducerRef>
    <Status xmlns="http://www.siri.org.uk/siri">true</Status>
    <MoreData>false</MoreData>
    <Language>de</Language>
    <DeliveryPayload>
        <StopEventResponse>
            <StopEventResult>
                <ResultId>ID-8E6262DF-2FB8-4591-97A3-AC3E94E56635</ResultId>
                <StopEvent>
                    <ThisCall>
                        <CallAtStop>
                            <StopPointRef>8578169</StopPointRef>
                            <StopPointName>
                                <Text>Basel, Thomaskirche</Text>
                                <Language>de</Language>
                            </StopPointName>
                            <ServiceDeparture>
                                <TimetabledTime>2018-11-19T14:16:00Z</TimetabledTime>
                                <EstimatedTime>2018-11-19T14:17:00Z</EstimatedTime>
                            </ServiceDeparture>
                            <StopSeqNumber>31</StopSeqNumber>
                        </CallAtStop>
                    </ThisCall>
                    <Service>
                        <OperatingDayRef>2018-11-19</OperatingDayRef>
                        <JourneyRef>odp:05036::H:j18:36143:36143</JourneyRef>
                        <LineRef>odp:05036::H</LineRef>
                        <DirectionRef>outward</DirectionRef>
                        <Mode>
                            <PtMode>bus</PtMode>
                            <BusSubmode>regionalBus</BusSubmode>
                            <Name>
                                <Text>Bus</Text>
                                <Language>de</Language>
                            </Name>
                        </Mode>
                        <PublishedLineName>
                            <Text>36</Text>
                            <Language>de</Language>
                        </PublishedLineName>
                        <OperatorRef>odp:823</OperatorRef>
                        <OriginStopPointRef>8589334</OriginStopPointRef>
                        <OriginText>
                            <Text>Basel, Kleinhüningen</Text>
                            <Language>de</Language>
                        </OriginText>
                        <DestinationStopPointRef>8588780</DestinationStopPointRef>
                        <DestinationText>
                            <Text>Basel, Schifflände</Text>
                            <Language>de</Language>
                        </DestinationText>
                    </Service>
                </StopEvent>
            </StopEventResult>
        </StopEventResponse>
    </DeliveryPayload>
</ServiceDelivery>
</Trias>"""

soup = BeautifulSoup(test_answer, "html.parser")
service_departure = soup.find("servicedeparture")

# as Tag objects
timetabled_time = service_departure.timetabledtime
estimated_time = service_departure.estimatedtime

# as strings
timetabled_time = timetabled_time.text
estimated_time = estimated_time.text

# as datetime objects
date_format = "%Y-%m-%dT%H:%M:%SZ"
timetabled_time = datetime.strptime(timetabled_time, date_format)
estimated_time = datetime.strptime(estimated_time, date_format)

print("Timetabled time: {} at {}".format(timetabled_time.date(), timetabled_time.time()))
print("Estimated time: {} at {}".format(estimated_time.date(), estimated_time.time()))

This prints:

Timetabled time: 2018-11-19 at 14:16:00
Estimated time: 2018-11-19 at 14:17:00
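One thing to watch if this ends up on a display: the trailing "Z" in these timestamps means UTC, and `strptime` with a literal "Z" in the format gives you naive datetime objects. Below is a small sketch, using only the standard library, that marks the values as UTC, works out the delay, and converts to whatever time zone the machine is set to. The helper name `parse_utc` is my own, and I'm assuming the Pi's system time zone is configured correctly.

```python
from datetime import datetime, timezone

DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_utc(text):
    # strptime matches the "Z" as plain text, so attach UTC explicitly
    return datetime.strptime(text, DATE_FORMAT).replace(tzinfo=timezone.utc)


timetabled = parse_utc("2018-11-19T14:16:00Z")
estimated = parse_utc("2018-11-19T14:17:00Z")

# Difference between the estimated and the scheduled departure, in whole minutes
delay_minutes = int((estimated - timetabled).total_seconds() // 60)
print("Delay: {} min".format(delay_minutes))

# astimezone() without arguments converts to the system's local time zone
print("Local departure: {}".format(estimated.astimezone().strftime("%H:%M")))
```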

I tried to use the ElementTree but it did not really work.

I think @mzjn may be right in pointing out: "Note that XML namespaces are used."
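To illustrate what that comment is getting at, here is a tiny sketch with a cut-down stand-in document (not the real response) that declares the same default namespace. ElementTree stores the namespace URI as part of every tag name, so a search for the bare element name comes back empty.

```python
import xml.etree.ElementTree as ET

# Minimal stand-in document with the same default namespace as the response
doc = (
    '<Trias xmlns="http://www.vdv.de/trias">'
    "<TimetabledTime>2018-11-19T14:16:00Z</TimetabledTime>"
    "</Trias>"
)
root = ET.fromstring(doc)

# The namespace URI is baked into the tag name
print(root.tag)

# Searching for the bare name finds nothing
unqualified = root.find(".//TimetabledTime")
print(unqualified)

# Searching with the fully qualified name finds the element
qualified = root.find(".//{http://www.vdv.de/trias}TimetabledTime")
print(qualified.text)
```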

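In case that is the problem, here is an example that parses the XML with ElementTree while handling the default namespace correctly.

I based it on @AndreaCattaneo's answer. It produces exactly the same output.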

Python

import xml.etree.ElementTree as ET
from datetime import datetime

test_answer = """<?xml version="1.0" encoding="UTF-8"?>
<Trias xmlns="http://www.vdv.de/trias" version="1.1">
<ServiceDelivery>
    <ResponseTimestamp xmlns="http://www.siri.org.uk/siri">2018-11-19T14:17:42Z</ResponseTimestamp>
    <ProducerRef xmlns="http://www.siri.org.uk/siri">EFAController10.2.9.62-WIN-G0NJHFUK71P</ProducerRef>
    <Status xmlns="http://www.siri.org.uk/siri">true</Status>
    <MoreData>false</MoreData>
    <Language>de</Language>
    <DeliveryPayload>
        <StopEventResponse>
            <StopEventResult>
                <ResultId>ID-8E6262DF-2FB8-4591-97A3-AC3E94E56635</ResultId>
                <StopEvent>
                    <ThisCall>
                        <CallAtStop>
                            <StopPointRef>8578169</StopPointRef>
                            <StopPointName>
                                <Text>Basel, Thomaskirche</Text>
                                <Language>de</Language>
                            </StopPointName>
                            <ServiceDeparture>
                                <TimetabledTime>2018-11-19T14:16:00Z</TimetabledTime>
                                <EstimatedTime>2018-11-19T14:17:00Z</EstimatedTime>
                            </ServiceDeparture>
                            <StopSeqNumber>31</StopSeqNumber>
                        </CallAtStop>
                    </ThisCall>
                    <Service>
                        <OperatingDayRef>2018-11-19</OperatingDayRef>
                        <JourneyRef>odp:05036::H:j18:36143:36143</JourneyRef>
                        <LineRef>odp:05036::H</LineRef>
                        <DirectionRef>outward</DirectionRef>
                        <Mode>
                            <PtMode>bus</PtMode>
                            <BusSubmode>regionalBus</BusSubmode>
                            <Name>
                                <Text>Bus</Text>
                                <Language>de</Language>
                            </Name>
                        </Mode>
                        <PublishedLineName>
                            <Text>36</Text>
                            <Language>de</Language>
                        </PublishedLineName>
                        <OperatorRef>odp:823</OperatorRef>
                        <OriginStopPointRef>8589334</OriginStopPointRef>
                        <OriginText>
                            <Text>Basel, Kleinhüningen</Text>
                            <Language>de</Language>
                        </OriginText>
                        <DestinationStopPointRef>8588780</DestinationStopPointRef>
                        <DestinationText>
                            <Text>Basel, Schifflände</Text>
                            <Language>de</Language>
                        </DestinationText>
                    </Service>
                </StopEvent>
            </StopEventResult>
        </StopEventResponse>
    </DeliveryPayload>
</ServiceDelivery>
</Trias>"""

ns = {"t": "http://www.vdv.de/trias"}

tree = ET.fromstring(test_answer)

# as strings
timetabled_time = tree.find(".//t:TimetabledTime", ns).text
estimated_time = tree.find(".//t:EstimatedTime", ns).text

# as datetime objects
date_format = "%Y-%m-%dT%H:%M:%SZ"
timetabled_time = datetime.strptime(timetabled_time, date_format)
estimated_time = datetime.strptime(estimated_time, date_format)

print("Timetabled time: {} at {}".format(timetabled_time.date(), timetabled_time.time()))
print("Estimated time: {} at {}".format(estimated_time.date(), estimated_time.time()))

Output

Timetabled time: 2018-11-19 at 14:16:00
Estimated time: 2018-11-19 at 14:17:00
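Since the request asks for 5 results, the real response will probably contain several StopEvent elements rather than one. Here is a rough sketch of how the same namespace-aware approach could loop over all of them. The function name `parse_departures` is mine, and the sample is trimmed and partly made up: the second StopEvent is hypothetical and is only there to show a departure without an EstimatedTime. I'm assuming TimetabledTime is always present.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {"t": "http://www.vdv.de/trias"}
DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

# Trimmed sample; the second StopEvent is invented for illustration
sample = """<?xml version="1.0" encoding="UTF-8"?>
<Trias xmlns="http://www.vdv.de/trias" version="1.1">
<ServiceDelivery><DeliveryPayload><StopEventResponse>
  <StopEventResult><StopEvent>
    <ThisCall><CallAtStop><ServiceDeparture>
      <TimetabledTime>2018-11-19T14:16:00Z</TimetabledTime>
      <EstimatedTime>2018-11-19T14:17:00Z</EstimatedTime>
    </ServiceDeparture></CallAtStop></ThisCall>
    <Service>
      <PublishedLineName><Text>36</Text></PublishedLineName>
      <DestinationText><Text>Basel, Schifflände</Text></DestinationText>
    </Service>
  </StopEvent></StopEventResult>
  <StopEventResult><StopEvent>
    <ThisCall><CallAtStop><ServiceDeparture>
      <TimetabledTime>2018-11-19T14:26:00Z</TimetabledTime>
    </ServiceDeparture></CallAtStop></ThisCall>
    <Service>
      <PublishedLineName><Text>36</Text></PublishedLineName>
      <DestinationText><Text>Basel, Schifflände</Text></DestinationText>
    </Service>
  </StopEvent></StopEventResult>
</StopEventResponse></DeliveryPayload></ServiceDelivery>
</Trias>"""


def parse_departures(xml_text):
    """Return one dict per StopEvent found in the response text."""
    root = ET.fromstring(xml_text)
    departures = []
    for event in root.iterfind(".//t:StopEvent", NS):
        timetabled = event.findtext(".//t:TimetabledTime", namespaces=NS)
        # findtext returns None when there is no real-time estimate
        estimated = event.findtext(".//t:EstimatedTime", namespaces=NS)
        departures.append({
            "line": event.findtext(
                "t:Service/t:PublishedLineName/t:Text", namespaces=NS),
            "destination": event.findtext(
                "t:Service/t:DestinationText/t:Text", namespaces=NS),
            "timetabled": datetime.strptime(timetabled, DATE_FORMAT),
            "estimated": (datetime.strptime(estimated, DATE_FORMAT)
                          if estimated else None),
        })
    return departures


for dep in parse_departures(sample):
    # Fall back to the scheduled time when no estimate is available
    shown = dep["estimated"] or dep["timetabled"]
    print("{} -> {}: {}".format(dep["line"], dep["destination"], shown.time()))
```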