Don't escape Element text when serializing with Python ElementTree
I'm trying to use HTML data inside an element's text node, but it gets escaped as if it were not HTML data.
Here is an MWE:
from xml.etree import ElementTree as ET
data = '<a href="https://example.com">Example data gained from elsewhere.</a>'
p = ET.Element('p')
p.text = data
p = ET.tostring(p, encoding='utf-8', method='html').decode('utf8')
print(p)
The output is...
<p>&lt;a href="https://example.com"&gt;Example data gained from elsewhere.&lt;/a&gt;</p>
What I wanted is...
<p><a href="https://example.com">Example data gained from elsewhere.</a></p>
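You can parse the HTML string into an ElementTree Element and append it to the parent element. Note that ET.fromstring uses an XML parser, so the string must be well-formed XML with a single root element: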
from xml.etree import ElementTree as ET
data = '<a href="https://example.com">Example data gained from elsewhere.</a>'
p = ET.Element('p')
p.append(ET.fromstring(data))
p = ET.tostring(p, encoding='utf-8', method='html').decode('utf8')
print(p)
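Because ET.fromstring is an XML parser, HTML that is not well-formed XML (for example an unclosed <br>) raises ET.ParseError. The following is a minimal sketch of guarding against that; the helper name try_parse and the sample strings are hypothetical and not part of the original answer.

```python
from xml.etree import ElementTree as ET

def try_parse(fragment):
    """Return the parsed Element, or None if the fragment is not well-formed XML."""
    # try_parse is a hypothetical helper used only for illustration.
    try:
        return ET.fromstring(fragment)
    except ET.ParseError:
        return None

# Well-formed: a single root element with matching tags.
ok = try_parse('<a href="https://example.com">Example</a>')
# Not well-formed XML: <br> is never closed, so the parser rejects it.
bad = try_parse('<p>line one<br>line two</p>')

print(ok is not None, bad is None)
```

If the data may be arbitrary real-world HTML, an HTML-aware parser is probably a better fit than ElementTree.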
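Your approach is wrong. You are assigning p.text = data, which makes the markup the text content of the node, so it is escaped on serialization.
You have to add it as a child element instead, as shown below: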
from xml.etree import ElementTree as ET
data = '<a href="https://example.com">Example data gained from elsewhere.</a>'
d = ET.fromstring(data)
p = ET.Element('p')
p.append(d)
p = ET.tostring(p, encoding='utf-8', method='html').decode('utf8')
print(p)
This gives the output
<p><a href="https://example.com">Example data gained from elsewhere.</a></p>
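The approach above expects the string to have exactly one root element. If the fragment mixes text with several top-level tags, one possible workaround is to wrap it in a throwaway root before parsing and then move the pieces over. This is only a sketch under that assumption: append_fragment and the wrapper tag are hypothetical names, and the fragment must still be well-formed XML.

```python
from xml.etree import ElementTree as ET

def append_fragment(parent, fragment):
    """Parse a well-formed markup fragment and attach its nodes to parent."""
    # Hypothetical helper: wrap the fragment in a temporary root so that
    # leading text and multiple top-level elements can be parsed together.
    wrapper = ET.fromstring('<wrapper>' + fragment + '</wrapper>')
    existing = list(parent)
    if wrapper.text:
        if existing:
            # Text that follows an existing child belongs in that child's tail.
            existing[-1].tail = (existing[-1].tail or '') + wrapper.text
        else:
            parent.text = (parent.text or '') + wrapper.text
    # Move the parsed child elements (with their tails) under the parent.
    parent.extend(list(wrapper))
    return parent

p = ET.Element('p')
append_fragment(p, 'See <a href="https://example.com">this</a> and <b>that</b>.')
html = ET.tostring(p, encoding='unicode', method='html')
print(html)
```

Passing encoding='unicode' makes ET.tostring return a str directly, so the separate .decode() step is not needed.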