Python lxml web scraping

from lxml import html
import requests

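# Fetch the problem page and parse it into an element tree
page = requests.get('https://projecteuler.net/problem=1')
tree = html.fromstring(page.content)
# Select the text nodes of the problem description div
text = tree.xpath('//div[@class="problem_content"]/text()')
print(text)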

I have this code, and I want to get the text that describes the problem, which in this case is:

"If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000."

However, what I get is:

['\r\n', '\n', '\n']

It turned out that the text itself is contained in <p> tags inside the div, so the XPath line should look like this:

text = tree.xpath('//div[@role="problem"]/p/text()')
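
To see why the first query returned only whitespace, here is a minimal sketch that runs offline. The SAMPLE markup is a simplified, hypothetical stand-in for the real page (the actual HTML may differ), and the variable names are only for illustration. It compares text() directly under the div, text() under each <p>, and lxml's text_content() method.

```python
from lxml import html

# Hypothetical, simplified stand-in for the problem page's markup.
SAMPLE = """<html><body>
<div class="problem_content" role="problem">
<p>If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.</p>
<p>Find the sum of all the multiples of 3 or 5 below 1000.</p>
</div>
</body></html>"""

tree = html.fromstring(SAMPLE)

# text() directly under the div only matches the div's own text nodes,
# which here are just the line breaks around the <p> elements.
direct = tree.xpath('//div[@class="problem_content"]/text()')

# text() under each <p> matches the actual sentences.
paragraphs = tree.xpath('//div[@class="problem_content"]/p/text()')

# text_content() collects the text of the element and all its descendants,
# so it also covers text nested more deeply than a direct <p> child.
div = tree.xpath('//div[@class="problem_content"]')[0]
full_text = div.text_content().strip()

print(direct)
print(paragraphs)
print(full_text)
```

Note that /p/text() only returns text sitting directly inside each <p>. If a paragraph contains nested inline elements, text_content() is probably the more robust choice.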