Removing BOM characters from <HtmlElement> in Python
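I am loading HTML markup from a URL as shown below and then running some XPath queries, but the page source comes back with BOM characters in it. How can I remove them before I run the XPath queries?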
import requests
import lxml.html

session = requests.Session()
page = session.get(url)
page_data = lxml.html.fromstring(page.text)
Output:
u'Re\ufeffverse \ufeffFleece \ufeffHoo\ufeffded S\ufeffwea\ufefftshi\ufeffrt'
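session = requests.Session()
page = session.get(url)
page_data = lxml.html.fromstring(page.text)
# Serialize the tree back to text and strip every U+FEFF (the BOM character).
markup = lxml.html.tostring(page_data, encoding='unicode').replace(u'\ufeff', u'')
# Re-parse the cleaned markup so the XPath queries see BOM-free text.
page_data = lxml.html.fromstring(markup)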
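As a rough sketch of the same idea, the U+FEFF characters can also be stripped from the decoded text before it is parsed at all, which avoids the serialize-and-reparse round trip. The helper name `strip_bom` is hypothetical (it is not part of requests or lxml), and the sample string is the output shown in the question.

```python
# Hypothetical helper (not part of requests or lxml): removes every U+FEFF
# code point, whether it appears at the start of the text or inside it.
BOM = "\ufeff"


def strip_bom(text):
    """Return text with all U+FEFF characters removed."""
    return text.replace(BOM, "")


# Sample taken from the output shown in the question.
sample = "Re\ufeffverse \ufeffFleece \ufeffHoo\ufeffded S\ufeffwea\ufefftshi\ufeffrt"
cleaned = strip_bom(sample)
print(cleaned)  # Reverse Fleece Hooded Sweatshirt

# Assumed usage with the code from the question:
#   page_data = lxml.html.fromstring(strip_bom(page.text))
```

Cleaning `page.text` first means the tree is only built once, although it also removes any U+FEFF that the page intended as a zero-width no-break space.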