Remove Comments from HTML Tags

Following How can I strip comment tags from HTML using BeautifulSoup?, I am trying to remove the comments from the tag below:

>>> h
<h4 class="col-sm-4"><!-- react-text: 124 -->52 Week High/Low:<!-- /react-text --><b><!-- react-text: 126 --> ₹ <!-- /react-text --><!-- react-text: 127 -->394.00<!-- /react-text --><!-- react-text: 128 --> / ₹ <!-- /react-text --><!-- react-text: 129 -->252.10<!-- /react-text --></b></h4>

My code:

comments = h.findAll(text=lambda text:isinstance(text, Comment))
[comment.extract() for comment in comments]
print h

But the search for comments returns nothing. I want to extract two values from the tag above: "52 Week High/Low:" and "₹ 394.00 / ₹ 252.10".

I also tried removing the comments from the entire HTML using:
soup = BeautifulSoup(html)
comments = soup.findAll(text=lambda text:isinstance(text, Comment))
[comment.extract() for comment in comments]
print soup

But the comments are still there. Any suggestions?

Are you using Python 2.7 with BeautifulSoup4? If not the latter, I would install BeautifulSoup4:

pip install beautifulsoup4

The following script works for me. I just copied the code from the question above, pasted it, and ran it.

from bs4 import BeautifulSoup, Comment

html = """<h4 class="col-sm-4"><!-- react-text: 124 -->52 Week High/Low:<!-- /react-text --><b><!-- react-text: 126 --> ₹ <!-- /react-text --><!-- react-text: 127 -->394.00<!-- /react-text --><!-- react-text: 128 --> / ₹ <!-- /react-text --><!-- react-text: 129 -->252.10<!-- /react-text --></b></h4>"""
soup = BeautifulSoup(html)
comments = soup.findAll(text=lambda text:isinstance(text, Comment))

# nit: It isn't good practice to use a list comprehension only for its
# side-effects. (Wastes space constructing an unused list)
for comment in comments:
    comment.extract()

print soup

Note: It's a good thing you posted the print statement. Wouldn't have known it was Python 2 otherwise. Posting the Python version helps too.
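
Once the comments are gone, the two values asked for in the question can be read straight from the tag. The following is only a minimal sketch, assuming Python 3 with bs4 installed (the scripts above are Python 2). The function name extract_label_and_value is hypothetical, and passing "html.parser" explicitly is my choice to keep the parse independent of whichever parsers happen to be installed.

```python
from bs4 import BeautifulSoup, Comment, NavigableString

html = """<h4 class="col-sm-4"><!-- react-text: 124 -->52 Week High/Low:<!-- /react-text --><b><!-- react-text: 126 --> ₹ <!-- /react-text --><!-- react-text: 127 -->394.00<!-- /react-text --><!-- react-text: 128 --> / ₹ <!-- /react-text --><!-- react-text: 129 -->252.10<!-- /react-text --></b></h4>"""


def extract_label_and_value(markup):
    """Hypothetical helper: strip comments, then return (label, value)."""
    soup = BeautifulSoup(markup, "html.parser")

    # Collect the comment nodes first, then detach each one from the tree.
    comments = soup.find_all(string=lambda s: isinstance(s, Comment))
    for comment in comments:
        comment.extract()

    h4 = soup.find("h4")

    # The label is the first text node that is a direct child of <h4>.
    label = next(
        child for child in h4.children if isinstance(child, NavigableString)
    ).strip()

    # The value is all text inside <b>, with runs of whitespace collapsed.
    value = " ".join(h4.b.get_text().split())
    return label, value


label, value = extract_label_and_value(html)
print(label)
print(value)
```

The label is taken from the direct children of the h4 tag rather than from get_text(), so the text inside the b tag does not get mixed into it.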