How to parse this XML response in Python?

This is my XML file:

<?xml version="1.0" ?>
<Items>
    <Item>
        <ASIN>3570102769</ASIN>
        <DetailPageURL>http://www.amazon.de/Inside-IS-Tage-Islamischen-Staat/dp/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D165953%26creativeASIN%3D3570102769</DetailPageURL>
        <ItemLinks>
            <ItemLink>
                <Description>Add To Wishlist</Description>
                <URL>http://www.amazon.de/gp/registry/wishlist/add-item.html%3Fasin.0%3D3570102769%26SubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL>
            </ItemLink>
            <ItemLink>
                <Description>Tell A Friend</Description>
                <URL>http://www.amazon.de/gp/pdp/taf/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL>
            </ItemLink>
            <ItemLink>
                <Description>All Customer Reviews</Description>
                <URL>http://www.amazon.de/review/product/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL>
            </ItemLink>
            <ItemLink>
                <Description>All Offers</Description>
                <URL>http://www.amazon.de/gp/offer-listing/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</URL>
            </ItemLink>
        </ItemLinks>
        <ItemAttributes>
            <Author>Jürgen Todenhöfer</Author>
            <Binding>Gebundene Ausgabe</Binding>
            <EAN>9783570102763</EAN>
            <EANList>
                <EANListElement>9783570102763</EANListElement>
            </EANList>
            <ISBN>3570102769</ISBN>
            <IsEligibleForTradeIn>1</IsEligibleForTradeIn>
            <ItemDimensions>
                <Height Units="hundredths-inches">874</Height>
                <Length Units="hundredths-inches">575</Length>
                <Width Units="hundredths-inches">126</Width>
            </ItemDimensions>
            <Label>C. Bertelsmann Verlag</Label>
            <Languages>
                <Language>
                    <Name>Deutsch</Name>
                    <Type>Published</Type>
                </Language>
                <Language>
                    <Name>Deutsch</Name>
                    <Type>Original</Type>
                </Language>
                <Language>
                    <Name>Deutsch</Name>
                    <Type>Unbekannt</Type>
                </Language>
            </Languages>
            <ListPrice>
                <Amount>1799</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 17,99</FormattedPrice>
            </ListPrice>
            <Manufacturer>C. Bertelsmann Verlag</Manufacturer>
            <ManufacturerMinimumAge Units="months">192</ManufacturerMinimumAge>
            <NumberOfPages>288</NumberOfPages>
            <PackageDimensions>
                <Height Units="hundredths-inches">118</Height>
                <Length Units="hundredths-inches">567</Length>
                <Weight Units="hundredths-pounds">93</Weight>
                <Width Units="hundredths-inches">252</Width>
            </PackageDimensions>
            <PackageQuantity>1</PackageQuantity>
            <ProductGroup>Book</ProductGroup>
            <ProductTypeName>ABIS_BOOK</ProductTypeName>
            <PublicationDate>2015-04-27</PublicationDate>
            <Publisher>C. Bertelsmann Verlag</Publisher>
            <Studio>C. Bertelsmann Verlag</Studio>
            <Title>Inside IS - 10 Tage im 'Islamischen Staat'</Title>
            <TradeInValue>
                <Amount>930</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 9,30</FormattedPrice>
            </TradeInValue>
        </ItemAttributes>
        <OfferSummary>
            <LowestNewPrice>
                <Amount>1799</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 17,99</FormattedPrice>
            </LowestNewPrice>
            <LowestUsedPrice>
                <Amount>1390</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 13,90</FormattedPrice>
            </LowestUsedPrice>
            <LowestCollectiblePrice>
                <Amount>4999</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 49,99</FormattedPrice>
            </LowestCollectiblePrice>
            <TotalNew>56</TotalNew>
            <TotalUsed>8</TotalUsed>
            <TotalCollectible>1</TotalCollectible>
            <TotalRefurbished>0</TotalRefurbished>
        </OfferSummary>
        <Offers>
            <TotalOffers>1</TotalOffers>
            <TotalOfferPages>1</TotalOfferPages>
            <MoreOffersUrl>http://www.amazon.de/gp/offer-listing/3570102769%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3570102769</MoreOffersUrl>
            <Offer>
                <OfferAttributes>
                    <Condition>New</Condition>
                </OfferAttributes>
                <OfferListing>
                    <OfferListingId>9KHCZj9qtL6ucVBPASfXaryQjU8tWbc0n%2F3F4F7GraOKW6Csji2OxpD93%2FkoHwgIGQctlnrtx4RWIeJULAcvvsFhiopFi08JdsZ%2FeO3u6g0%3D</OfferListingId>
                    <Price>
                        <Amount>1799</Amount>
                        <CurrencyCode>EUR</CurrencyCode>
                        <FormattedPrice>EUR 17,99</FormattedPrice>
                    </Price>
                    <Availability>Gewöhnlich versandfertig in 24 Stunden</Availability>
                    <AvailabilityAttributes>
                        <AvailabilityType>now</AvailabilityType>
                        <MinimumHours>0</MinimumHours>
                        <MaximumHours>0</MaximumHours>
                    </AvailabilityAttributes>
                    <IsEligibleForSuperSaverShipping>1</IsEligibleForSuperSaverShipping>
                </OfferListing>
            </Offer>
        </Offers>
    </Item>
    <Item>
        <ASIN>3813506479</ASIN>
        <DetailPageURL>http://www.amazon.de/Altes-Land-Roman-D%C3%B6rte-Hansen/dp/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D165953%26creativeASIN%3D3813506479</DetailPageURL>
        <ItemLinks>
            <ItemLink>
                <Description>Add To Wishlist</Description>
                <URL>http://www.amazon.de/gp/registry/wishlist/add-item.html%3Fasin.0%3D3813506479%26SubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL>
            </ItemLink>
            <ItemLink>
                <Description>Tell A Friend</Description>
                <URL>http://www.amazon.de/gp/pdp/taf/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL>
            </ItemLink>
            <ItemLink>
                <Description>All Customer Reviews</Description>
                <URL>http://www.amazon.de/review/product/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL>
            </ItemLink>
            <ItemLink>
                <Description>All Offers</Description>
                <URL>http://www.amazon.de/gp/offer-listing/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</URL>
            </ItemLink>
        </ItemLinks>
        <ItemAttributes>
            <Author>Dörte Hansen</Author>
            <Binding>Gebundene Ausgabe</Binding>
            <EAN>9783813506471</EAN>
            <EANList>
                <EANListElement>9783813506471</EANListElement>
            </EANList>
            <ISBN>3813506479</ISBN>
            <IsEligibleForTradeIn>1</IsEligibleForTradeIn>
            <ItemDimensions>
                <Height Units="hundredths-inches">870</Height>
                <Length Units="hundredths-inches">567</Length>
                <Width Units="hundredths-inches">114</Width>
            </ItemDimensions>
            <Label>Albrecht Knaus Verlag</Label>
            <Languages>
                <Language>
                    <Name>Deutsch</Name>
                    <Type>Published</Type>
                </Language>
                <Language>
                    <Name>Deutsch</Name>
                    <Type>Original</Type>
                </Language>
            </Languages>
            <ListPrice>
                <Amount>1999</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 19,99</FormattedPrice>
            </ListPrice>
            <Manufacturer>Albrecht Knaus Verlag</Manufacturer>
            <NumberOfPages>288</NumberOfPages>
            <PackageDimensions>
                <Height Units="hundredths-inches">118</Height>
                <Length Units="hundredths-inches">858</Length>
                <Weight Units="hundredths-pounds">101</Weight>
                <Width Units="hundredths-inches">559</Width>
            </PackageDimensions>
            <ProductGroup>Book</ProductGroup>
            <ProductTypeName>ABIS_BOOK</ProductTypeName>
            <PublicationDate>2015-02-16</PublicationDate>
            <Publisher>Albrecht Knaus Verlag</Publisher>
            <Studio>Albrecht Knaus Verlag</Studio>
            <Title>Altes Land: Roman</Title>
            <TradeInValue>
                <Amount>965</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 9,65</FormattedPrice>
            </TradeInValue>
        </ItemAttributes>
        <OfferSummary>
            <LowestNewPrice>
                <Amount>1999</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 19,99</FormattedPrice>
            </LowestNewPrice>
            <LowestUsedPrice>
                <Amount>1599</Amount>
                <CurrencyCode>EUR</CurrencyCode>
                <FormattedPrice>EUR 15,99</FormattedPrice>
            </LowestUsedPrice>
            <TotalNew>72</TotalNew>
            <TotalUsed>8</TotalUsed>
            <TotalCollectible>0</TotalCollectible>
            <TotalRefurbished>0</TotalRefurbished>
        </OfferSummary>
        <Offers>
            <TotalOffers>1</TotalOffers>
            <TotalOfferPages>1</TotalOfferPages>
            <MoreOffersUrl>http://www.amazon.de/gp/offer-listing/3813506479%3FSubscriptionId%3DAKIAI554OLCUMRCYB7ZA%26tag%3DjPp08vuSO4osfgfbCbEdF7TNqnWOm7YtprtqRPB9%26linkCode%3Dxm2%26camp%3D2025%26creative%3D12738%26creativeASIN%3D3813506479</MoreOffersUrl>
            <Offer>
                <OfferAttributes>
                    <Condition>New</Condition>
                </OfferAttributes>
                <OfferListing>
                    <OfferListingId>aeRv5KPt26T8S0hLrgV8Bv9UPYABYOMijGRxffbNJXUZSN4XfeeOZZpCZ28EURzmgMLlcYEBSRlMXS%2F8Z0pN1JbYerndME%2B2VK3RosfdQJA%3D</OfferListingId>
                    <Price>
                        <Amount>1999</Amount>
                        <CurrencyCode>EUR</CurrencyCode>
                        <FormattedPrice>EUR 19,99</FormattedPrice>
                    </Price>
                    <Availability>Gewöhnlich versandfertig in 24 Stunden</Availability>
                    <AvailabilityAttributes>
                        <AvailabilityType>now</AvailabilityType>
                        <MinimumHours>0</MinimumHours>
                        <MaximumHours>0</MaximumHours>
                    </AvailabilityAttributes>
                    <IsEligibleForSuperSaverShipping>1</IsEligibleForSuperSaverShipping>
                </OfferListing>
            </Offer>
        </Offers>
    </Item>
</Items>

I want to get the ASIN of each Item element, so I tried this:

from lxml import etree
doc = etree.fromstring(xmlstring)
items = doc.xpath('//Items/Item')
for a in items:
    asin = a.xpath('//ASIN/text()')
    print(asin)

What I get is this:

['3570102769', '3813506479']
['3570102769', '3813506479']

But I want this:

['3570102769']
['3813506479']

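I don't understand what is wrong here. I thought I was iterating over each Item element, and that each Item contains exactly one ASIN. Why does it return both ASINs on every iteration?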

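When you call a.xpath('//ASIN/text()'), you are searching the complete document tree again, because a path that starts with // is evaluated from the document root. Quoting the XML Path Language specification: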

//para selects all the para descendants of the document root and thus selects all para elements in the same document as the context node

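So what you are doing is iterating over the matched Item nodes and saying "Give me all ASIN nodes in this document please" each time. The context of the call (the current Item node) is ignored.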
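To see this in isolation, here is a minimal sketch on a cut-down stand-in for the response (two Item elements holding only an ASIN; the variable names are illustrative). It assumes lxml is installed and checks that an expression starting with // returns the same result no matter which node it is evaluated from:

```python
from lxml import etree

# Cut-down stand-in for the response above (illustrative only).
xml = b"""<Items>
    <Item><ASIN>3570102769</ASIN></Item>
    <Item><ASIN>3813506479</ASIN></Item>
</Items>"""

doc = etree.fromstring(xml)

# Evaluated from the root element.
from_root = doc.xpath('//ASIN/text()')

# Evaluated from each Item: the leading // restarts at the document root,
# so every Item yields the full list again.
from_items = [item.xpath('//ASIN/text()') for item in doc.xpath('//Items/Item')]

print(from_root)
for result in from_items:
    print(result == from_root)
```

Both comparisons come out true, which is why the loop in the question prints the same two-element list twice.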

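What you should do instead is select the ASIN child nodes directly, using a path relative to the current Item. Following your original implementation, this could look as follows: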

from lxml import etree

doc = etree.fromstring(xmlstring)
items = doc.xpath('//Items/Item')
for a in items:
    # Relative path: only ASIN children of the current Item
    asin = a.xpath('ASIN/text()')
    print(asin)

This gives the output you want:

['3570102769']
['3813506479']

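Alternatively, if you are not sure where the ASIN appears inside the Item node, you can use .//ASIN/text(). The leading dot anchors the search at the current Item, so it matches ASIN elements at any depth below that Item but not elsewhere in the document.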
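As a rough illustration of when the .// form matters, the sketch below uses a hypothetical layout in which ASIN is wrapped in an extra Identifiers element. That wrapper is an assumption made up for this example and does not occur in the response above. It assumes lxml is installed:

```python
from lxml import etree

# Hypothetical layout: ASIN sits inside an extra <Identifiers> wrapper,
# which is NOT how the response above is structured.
xml = b"""<Items>
    <Item><Identifiers><ASIN>3570102769</ASIN></Identifiers></Item>
    <Item><Identifiers><ASIN>3813506479</ASIN></Identifiers></Item>
</Items>"""

doc = etree.fromstring(xml)

results = []
for item in doc.xpath('//Items/Item'):
    direct = item.xpath('ASIN/text()')     # direct children of this Item only
    nested = item.xpath('.//ASIN/text()')  # any depth below this Item
    results.append((direct, nested))
    print(direct, nested)
```

Here the child-only path finds nothing, while the .// path still returns exactly one ASIN per Item.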