Delete a specific tag from main soup in BeautifulSoup4 (python)
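This is what I have tried: see soup.div.decompose(). I also tried soup.elements.div.decompose(). The content is also used by DataTables, which I am using for the first time, so if there is a better way to achieve what I am doing, please let me know! Thanks in advance!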
import bs4

with open('MapPage.html', 'r', encoding="utf8") as f:
    txt = f.read()
    soup = bs4.BeautifulSoup(txt, "html5lib")

elements = soup.find_all('tr')
elements.pop(0)

def DeleteData(msgID):
    for div in elements:
        ID = div.find('a').contents[0]
        if int(msgID) == int(ID):
            soup.div.decompose()
            return
    print('Failed to delete data from', msgID)
I expect to be able to write the soup back to 'MapPage.html' afterwards. Instead, the code raises the error AttributeError: 'NoneType' object has no attribute 'decompose'.
This is the output when printing div:
(Link to html file)
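If I understand correctly, you want to decompose() the <tr> whose <a> contains a specific value.
The main issue is that you call soup.div.decompose(), which means "decompose() the first <div> in the soup object". It does not refer to your loop variable div. Your document has no <div>, so soup.div is None, and that is where the AttributeError comes from.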
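To make the cause of the error concrete, here is a minimal sketch. The one-row table is a made-up stand-in for the real page, and the variable names first_div, first_tr and error_message are only illustrative. It shows that soup.div is a tag lookup on the whole document and returns None when no <div> exists.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: a table and no <div> anywhere.
html = "<table><tr><td><a href='#'>1</a></td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

# Attribute access on the soup is shorthand for soup.find("<tag name>").
first_div = soup.div   # no <div> in the document, so this is None
first_tr = soup.tr     # the first <tr> in the document

print(first_div)       # None

# Calling decompose() on None reproduces the error from the question.
try:
    soup.div.decompose()
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

The loop variable named div in the question is a separate Python name. soup.div never looks at it.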
Just use:
div.decompose()
Or, better, rename your variable to something that is not a tag name:
e.decompose()
Example
from bs4 import BeautifulSoup
html = '''
<html><body>
<h2>Welcome to our collection of community made maps!</h2>
<table id="example" class="cell-border" style="width:100%">
<thead>
<tr><th>ID</th><th>Author</th><th>Content</th><th>Thumbnail</th><th>Download</th><th>Rating</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://discord.com/channels/932741876174454914/932881912714895390/939257309387980851">939257309387980851</a></td>
<td>Matter</td><td>Cervinia Source</td><td><img src="https://media.discordapp.net/attachments/932881912714895390/939257307290796062/unknown.png" alt="Cervina Thumb" width="300" height="auto"></td><td><a href="https://discord.com/channels/932741876174454914/932881912714895390/939257309387980851">Download</a></td><td>5</td>
</tr>
<tr><td><a href="https://discord.com/channels/932741876174454914/932881912714895390/939257309387980851">939257309387980852</a></td><td>Tea</td><td>Chamonix</td><td><img src="https://media.discordapp.net/attachments/932881912714895390/939257307290796062/unknown.png" alt="Cervina Thumb" width="300" height="auto"></td><td><a href="https://discord.com/channels/932741876174454914/932881912714895390/939257309387980851">Download</a></td><td>5</td></tr>
</tbody>
</table>
</body></html>
'''
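soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly

# Select only rows that contain <td> cells, which skips the header row
elements = soup.select('tr:has(td)')

def DeleteData(msgID):
    for e in elements:
        # The first <a> in each row holds the message ID
        ID = e.find('a').contents[0]
        if int(msgID) == int(ID):
            e.decompose()  # remove this <tr> from the tree
            return
    print('Failed to delete data from', msgID)

DeleteData(939257309387980851)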
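The question also mentions writing the soup back to 'MapPage.html'. The following is one possible sketch of that step, not the only way to do it. The helper names delete_row and delete_row_in_file are hypothetical, the sample table is made up, and the demo writes to a temporary directory so that no real file is touched. It compares the link text as a string, which is an assumption that avoids a ValueError when a row's first link is not numeric.

```python
import os
import tempfile

from bs4 import BeautifulSoup


def delete_row(soup, msg_id):
    """Remove the first data row whose first <a> text equals msg_id."""
    for row in soup.select("tr:has(td)"):      # data rows only, no header row
        link = row.find("a")
        if link is not None and link.get_text(strip=True) == str(msg_id):
            row.decompose()                    # drop the whole <tr>
            return True
    return False


def delete_row_in_file(path, msg_id):
    """Parse the file, delete the matching row, and write the result back."""
    with open(path, "r", encoding="utf8") as f:
        soup = BeautifulSoup(f.read(), "html.parser")
    removed = delete_row(soup, msg_id)
    if removed:
        with open(path, "w", encoding="utf8") as f:
            f.write(str(soup))                 # serialize the modified tree
    return removed


# Demo on a made-up table stored in a temporary directory.
sample = (
    "<table><thead><tr><th>ID</th></tr></thead><tbody>"
    "<tr><td><a href='#'>101</a></td></tr>"
    "<tr><td><a href='#'>102</a></td></tr>"
    "</tbody></table>"
)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "MapPage.html")
    with open(path, "w", encoding="utf8") as f:
        f.write(sample)
    removed = delete_row_in_file(path, 101)    # row exists
    missing = delete_row_in_file(path, 999)    # no such row
    with open(path, "r", encoding="utf8") as f:
        result = f.read()
```

Re-parsing and re-serializing may normalize the markup slightly (for example attribute quoting), so the file written back is not guaranteed to be byte-identical outside the removed row.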