Can this XML be parsed?

I am trying to parse an XML response, without success.

I am using the Python requests library to connect to an API that returns XML.

From response.content I get:

{"GetQuestions":"<Questions><Question><QuestionId>393938<\/QuestionId><QuestionText>Please respond to the following statement:\"The assigned task was easy to complete\"<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393939<\/QuestionId><QuestionText>Did you save your  datafor later? Why\/why not?<\/QuestionText><QuestionType>text<\/QuestionType><QuestionStatus>1<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393940<\/QuestionId><QuestionText>Did you notice how much it cost to find the item? How much was it?<\/QuestionText><QuestionType>text<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393941<\/QuestionId><QuestionText>Did you select ‘signature on form’? Why\/why not?<\/QuestionText><QuestionType>text<\/QuestionType><QuestionStatus>1<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393942<\/QuestionId><QuestionText>Was it easy to find thethe new page? Why\/why not?<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>1<\/QuestionStatus><ExtendedType>4<\/ExtendedType><\/Question><Question><QuestionId>393943<\/QuestionId><QuestionText>Please enter your email. 
So that we can track your responses, we need you to provide this for each task.<\/QuestionText><QuestionShortCode>email<\/QuestionShortCode><QuestionType>text<\/QuestionType><QuestionStatus>1<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393944<\/QuestionId><QuestionText>Why didn't you save your  datafor later?<\/QuestionText><QuestionType>text<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393945<\/QuestionId><QuestionText>Why did you save your  datafor later?<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>4<\/ExtendedType><\/Question><Question><QuestionId>393946<\/QuestionId><QuestionText>Did you save your  datafor later?<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393947<\/QuestionId><QuestionText>Why didn't you select 'signature on form'?<\/QuestionText><QuestionType>text<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393948<\/QuestionId><QuestionText>Why did you select 'signature on form'?<\/QuestionText><QuestionType>text<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>4444449<\/QuestionId><QuestionText>Did you select ‘signature on form’?<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393950<\/QuestionId><QuestionText>Why wasn't it easy to find thethe new page?<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>4<\/ExtendedType><\/Question><Question><QuestionId>393951<\/QuestionId><QuestionText>Was it easy to find thethe new 
page?<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393952<\/QuestionId><QuestionText>Please enter your email addressSo that we can track your responses, we need you to provide this for each task<\/QuestionText><QuestionShortCode>email<\/QuestionShortCode><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>4<\/ExtendedType><\/Question><\/Questions>"}

If I pass it directly to ElementTree:

ElementTree.fromstring(response.content)

it raises:

xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0

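I removed this from the beginning: {"GetQuestions":

and this from the end: "}

It still raises an xml.etree.ElementTree.ParseError.

Is the problem in my approach or in the XML?

Any suggestions would be appreciated.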

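You can parse the XML content with BeautifulSoup. When you create the variable, write it like this: your_variable = BeautifulSoup(response.text, features="xml"). That should work for you. Also try validating your XML and reformatting that single line into properly structured markup. There must be an error somewhere, but it is hard to pinpoint because everything is on one line. You can use a validator website.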
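The validation step suggested above can also be done locally with the standard library instead of a website. This is only a minimal sketch: `check_well_formed` is a hypothetical helper name, and the two inputs are shortened stand-ins for the real payload.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text):
    """Return None if text parses as XML, else the (line, column) of the first error."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple
        return err.position
    return None


# The raw API body starts with "{", so the parser fails on the very first character.
json_wrapped = '{"GetQuestions":"<Questions/>"}'
# The same document without the JSON wrapper is fine.
plain_xml = "<Questions><Question/></Questions>"

print(check_well_formed(json_wrapped))
print(check_well_formed(plain_xml))
```

A reported position at the very start of the input is a hint that the text is not XML at all, which matches the "line 1, column 0" error in the question.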

Use the JSON string in the response:

xml.etree.ElementTree.fromstring(response.json()['GetQuestions'])
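To see why decoding the JSON first works while trimming the wrapper by hand does not, here is a small sketch. The `raw` bytes are a shortened stand-in for `response.content` (an assumption, not the full payload), and `json.loads` is used in place of `response.json()` so that the example runs without a network call.

```python
import json
import xml.etree.ElementTree as ET

# Shortened stand-in for response.content. In JSON, "\/" is an escape for "/".
raw = (
    rb'{"GetQuestions":"<Questions><Question>'
    rb'<QuestionId>393938<\/QuestionId>'
    rb'<QuestionType>single<\/QuestionType>'
    rb'<\/Question><\/Questions>"}'
)

# Attempt 1: strip the JSON wrapper by hand. The "<\/" sequences are still
# present, and "<\" is not valid XML, so the parser rejects the text.
trimmed = raw.decode()[len('{"GetQuestions":"'):-len('"}')]
try:
    ET.fromstring(trimmed)
    trimmed_ok = True
except ET.ParseError:
    trimmed_ok = False

# Attempt 2: let the JSON decoder unescape the string, then parse the XML.
payload = json.loads(raw)
root = ET.fromstring(payload["GetQuestions"])

print(trimmed_ok)
print(root.tag)
```

The JSON decoder turns every `\/` back into `/` and every `\"` back into `"`, so the value under `GetQuestions` is ordinary XML by the time ElementTree sees it.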

You can parse the XML, but you need to do a little cleanup first.

See below:

import xml.etree.ElementTree as ET
import requests 
# response is a dict with 1 entry
response = requests.get('api_url_goes_here').json()
# TODO - remove next line when you actually call the API
response = {"GetQuestions":"<Questions><Question><QuestionId>393938<\/QuestionId><QuestionText>Please respond to the following statement:\"The assigned task was easy to complete\"<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393939<\/QuestionId><QuestionText>Did you save your  datafor later? Why\/why not?<\/QuestionText><QuestionType>text<\/QuestionType><QuestionStatus>1<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393940<\/QuestionId><QuestionText>Did you notice how much it cost to find the item? How much was it?<\/QuestionText><QuestionType>text<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393941<\/QuestionId><QuestionText>Did you select ‘signature on form’? Why\/why not?<\/QuestionText><QuestionType>text<\/QuestionType><QuestionStatus>1<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393942<\/QuestionId><QuestionText>Was it easy to find thethe new page? Why\/why not?<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>1<\/QuestionStatus><ExtendedType>4<\/ExtendedType><\/Question><Question><QuestionId>393943<\/QuestionId><QuestionText>Please enter your email. 
So that we can track your responses, we need you to provide this for each task.<\/QuestionText><QuestionShortCode>email<\/QuestionShortCode><QuestionType>text<\/QuestionType><QuestionStatus>1<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393944<\/QuestionId><QuestionText>Why didn't you save your  datafor later?<\/QuestionText><QuestionType>text<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393945<\/QuestionId><QuestionText>Why did you save your  datafor later?<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>4<\/ExtendedType><\/Question><Question><QuestionId>393946<\/QuestionId><QuestionText>Did you save your  datafor later?<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393947<\/QuestionId><QuestionText>Why didn't you select 'signature on form'?<\/QuestionText><QuestionType>text<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393948<\/QuestionId><QuestionText>Why did you select 'signature on form'?<\/QuestionText><QuestionType>text<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>4444449<\/QuestionId><QuestionText>Did you select ‘signature on form’?<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393950<\/QuestionId><QuestionText>Why wasn't it easy to find thethe new page?<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>4<\/ExtendedType><\/Question><Question><QuestionId>393951<\/QuestionId><QuestionText>Was it easy to find thethe new 
page?<\/QuestionText><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>0<\/ExtendedType><\/Question><Question><QuestionId>393952<\/QuestionId><QuestionText>Please enter your email addressSo that we can track your responses, we need you to provide this for each task<\/QuestionText><QuestionShortCode>email<\/QuestionShortCode><QuestionType>single<\/QuestionType><QuestionStatus>0<\/QuestionStatus><ExtendedType>4<\/ExtendedType><\/Question><\/Questions>"}

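# fetch the xml string and do a quick cleanup
# (response.json() already turns "\/" into "/", so this replace only matters
# for the hard-coded literal above, where the backslashes survive)
xml = response['GetQuestions'].replace('<\\/', '</')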
root = ET.fromstring(xml)
print(root)

Output

<Element 'Questions' at 0x7f35c68919f0>
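Printing the root only confirms that parsing succeeded. As a rough sketch of a next step, the individual questions can be collected into dictionaries. The XML below is a trimmed sample built from two questions in the payload (with QuestionText omitted for brevity), and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed sample: two questions from the payload, already unescaped.
xml_text = (
    "<Questions>"
    "<Question><QuestionId>393938</QuestionId>"
    "<QuestionType>single</QuestionType><QuestionStatus>0</QuestionStatus>"
    "<ExtendedType>0</ExtendedType></Question>"
    "<Question><QuestionId>393952</QuestionId>"
    "<QuestionShortCode>email</QuestionShortCode>"
    "<QuestionType>single</QuestionType><QuestionStatus>0</QuestionStatus>"
    "<ExtendedType>4</ExtendedType></Question>"
    "</Questions>"
)

root = ET.fromstring(xml_text)

# One dict per <Question>, mapping each child tag to its text.
questions = [
    {child.tag: child.text for child in question}
    for question in root.findall("Question")
]

for q in questions:
    # QuestionShortCode is optional in the payload, so use .get() for it.
    print(q["QuestionId"], q["QuestionType"], q.get("QuestionShortCode"))
```

All values come back as strings, so fields such as QuestionStatus would need an explicit `int()` conversion if they are to be compared numerically.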