IOError passing requests Response.content to lxml.etree.parse()

I have the following XML on a web page:

<entry>
    <id>1750</id>
    <title>variablename</title>
    <source>
      com.tidalsoft.webclient.tes.dsp.db.datatypes.Variable
    </source>
    <tes:variable>
        <tes:ownername>ownergroup</tes:ownername>
        <tes:productiondate>2015-08-17T00:00:00-0400</tes:productiondate>
        <tes:readonly>N</tes:readonly>
        <tes:publish>N</tes:publish>
        <tes:description>
          Decription Here
        </tes:description>
        <tes:startcalendar>0</tes:startcalendar>
        <tes:ownerid>666</tes:ownerid>
        <tes:type>1</tes:type>
        <tes:lastusermodifiedtime>2015-06-15T15:42:27-0400</tes:lastusermodifiedtime>
        <tes:innervalue>\share\location</tes:innervalue>
        <tes:calc>N</tes:calc>
        <tes:name>variablename</tes:name>
        <tes:startdate>1899-12-30T00:00:00-0500</tes:startdate>
        <tes:pub>Y</tes:pub>
        <tes:lastvalue>\share\location</tes:lastvalue>
        <tes:id>1750</tes:id>
        <tes:startdateasstring>18991230000000</tes:startdateasstring>
        <tes:lastchangetime>2015-06-15T15:42:27-0400</tes:lastchangetime>
        <tes:clientcachelastchangetime>2015-08-17T09:56:49-0400</tes:clientcachelastchangetime>
    </tes:variable>
</entry>

I am trying to parse this data. I fetch it with requests:

r = requests.get(url, auth=('username', 'password'))

But when I try to parse the content, I get an error.

>>> xmlObject = etree.parse(r.content)
Traceback (most recent call last):
  File "apiTest.py", line 46, in <module>
    xmlObject = etree.parse(r.content)
  File "lxml.etree.pyx", line 3310, in lxml.etree.parse (src\lxml\lxml.etree.c:72517)
  File "parser.pxi", line 1791, in lxml.etree._parseDocument (src\lxml\lxml.etree.c:105979)
  File "parser.pxi", line 1817, in lxml.etree._parseDocumentFromURL (src\lxml\lxml.etree.c:106278)
  File "parser.pxi", line 1721, in lxml.etree._parseDocFromFile (src\lxml\lxml.etree.c:105277)
  File "parser.pxi", line 1122, in lxml.etree._BaseParser._parseDocFromFile (src\lxml\lxml.etree.c:100227)
  File "parser.pxi", line 580, in lxml.etree._ParserContext._handleParseResultDoc (src\lxml\lxml.etree.c:94350)
  File "parser.pxi", line 690, in lxml.etree._handleParseResult (src\lxml\lxml.etree.c:95786)
  File "parser.pxi", line 618, in lxml.etree._raiseParseError (src\lxml\lxml.etree.c:94818)
IOError: Error reading file ''

Between the quotes on the last line is the XML itself as a string, starting with the declaration:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><entry xmlns="http://purl.org/atom/ns#"><id>1750</id><title>....

The data is served with the content type text/xml.

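etree.parse expects a filename, a file-like object, or a URL as its first argument (see help(etree.parse)). It does not expect an XML string, so the content was treated as a filename, which is why the error is an IOError. To parse an XML string, use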

xmlObject = etree.fromstring(r.content)
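
As a minimal sketch of the two entry points, assuming a short made-up byte string standing in for r.content (no request is made, and the variable names are illustrative only), the same bytes can be parsed both ways. Wrapping the bytes in io.BytesIO is one way to give etree.parse the file-like object it expects:

```python
import io

from lxml import etree

# Hypothetical stand-in for r.content: a shortened Atom-like document as bytes.
content = (
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    b'<entry xmlns="http://purl.org/atom/ns#"><id>1750</id>'
    b'<title>variablename</title></entry>'
)

# fromstring takes the XML bytes directly and returns the root element.
root = etree.fromstring(content)

# parse wants a file-like object, so the bytes are wrapped in BytesIO first.
tree = etree.parse(io.BytesIO(content))

# Both routes reach the same root element; the tag carries the default namespace.
print(root.tag)
print(tree.getroot().tag)
```

Passing bytes (r.content) rather than text (r.text) matters here: lxml refuses a unicode string that carries an encoding declaration, while bytes let the parser honor the declared encoding.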

Note that etree.fromstring returns an lxml.etree._Element. In contrast, etree.parse returns an lxml.etree._ElementTree. Given the _Element, you can get the _ElementTree with the getroottree method:

xmlTree = xmlObject.getroottree()
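
A small sketch to check the two return types, again assuming a made-up, namespace-free document in place of the real response:

```python
from lxml import etree

# Hypothetical stand-in for r.content, without namespaces to keep lookups short.
content = b'<entry><id>1750</id><title>variablename</title></entry>'

root = etree.fromstring(content)   # an _Element
tree = root.getroottree()          # the _ElementTree that owns that element

print(type(root).__name__)
print(type(tree).__name__)

# Either object can be searched; paths on the tree are relative to the root element.
print(root.findtext('id'))
print(tree.findtext('title'))
```

For most lookups the _Element is enough; the _ElementTree is mainly useful for document-level operations such as writing the whole document back out.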