Remove Comments from HTML Tags
Following "How can I strip comment tags from HTML using BeautifulSoup?", I am trying to remove the comments from the tag below:
>>> h
<h4 class="col-sm-4"><!-- react-text: 124 -->52 Week High/Low:<!-- /react-text --><b><!-- react-text: 126 --> ₹ <!-- /react-text --><!-- react-text: 127 -->394.00<!-- /react-text --><!-- react-text: 128 --> / ₹ <!-- /react-text --><!-- react-text: 129 -->252.10<!-- /react-text --></b></h4>
My code:
comments = h.findAll(text=lambda text:isinstance(text, Comment))
[comment.extract() for comment in comments]
print h
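But the search for comments returns nothing. I want to extract 2 values from the tag above: "52 Week High/Low:" and "₹ 394.00 / ₹ 252.10".
I also tried to remove the comments from the entire HTML, using: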
soup = BeautifulSoup(html)
comments = soup.findAll(text=lambda text:isinstance(text, Comment))
[comment.extract() for comment in comments]
print soup
But the comments are still there. Any suggestions?
Are you using Python 2.7 and BeautifulSoup4? If not the latter, I would install BeautifulSoup4.
pip install beautifulsoup4
The following script works for me. I just copied it from the question above, pasted it, and ran it.
from bs4 import BeautifulSoup, Comment
html = """<h4 class="col-sm-4"><!-- react-text: 124 -->52 Week High/Low:<!-- /react-text --><b><!-- react-text: 126 --> ₹ <!-- /react-text --><!-- react-text: 127 -->394.00<!-- /react-text --><!-- react-text: 128 --> / ₹ <!-- /react-text --><!-- react-text: 129 -->252.10<!-- /react-text --></b></h4>"""
soup = BeautifulSoup(html)
comments = soup.findAll(text=lambda text:isinstance(text, Comment))
# nit: It isn't good practice to use a list comprehension only for its
# side-effects. (Wastes space constructing an unused list)
for comment in comments:
    comment.extract()
print soup
Note: It's a good thing you posted the print statement. I wouldn't have known it was Python 2 otherwise. Posting the Python version helps too.
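For anyone on Python 3, here is a minimal sketch of the same approach that also pulls out the two values the question asks for. It assumes a reasonably recent BeautifulSoup4 (the `string=` argument replaces the older `text=`), and it names the built-in "html.parser" explicitly; the variable names `label` and `value` are only illustrative.

```python
from bs4 import BeautifulSoup, Comment

html = """<h4 class="col-sm-4"><!-- react-text: 124 -->52 Week High/Low:<!-- /react-text --><b><!-- react-text: 126 --> ₹ <!-- /react-text --><!-- react-text: 127 -->394.00<!-- /react-text --><!-- react-text: 128 --> / ₹ <!-- /react-text --><!-- react-text: 129 -->252.10<!-- /react-text --></b></h4>"""

# Naming the parser avoids depending on whichever parser happens to be installed.
soup = BeautifulSoup(html, "html.parser")

# Comment is a NavigableString subclass, so a string filter can select comments.
for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
    comment.extract()

h4 = soup.h4
# First string that is a direct child of <h4> (the comments are gone by now).
label = h4.find(string=True, recursive=False).strip()
# All text inside <b>, joined and trimmed.
value = h4.b.get_text().strip()

print(label)  # 52 Week High/Low:
print(value)  # ₹ 394.00 / ₹ 252.10
```

If the search still finds nothing, one thing worth checking is that `Comment` is imported from `bs4` and not from an older BeautifulSoup 3 install, since the `isinstance` test only matches the class that actually built the tree.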