Use Beautiful Soup to find elements by textual contents, not text?
Similar to the .renderContents approach here, I want to search by that value: Beautiful Soup [Python] and the extracting of text in a table
Sample HTML:
<table>
<tr>
<td>
This is garbage
</td>
<td>
<td class="thead" style="font-weight:normal">
<!-- status icon and date -->
<a name="post1"><img class="inlineimg" src="img.gif" alt="Old" border="0" title="Old"></a>
19-11-2010, 04:25 PM
<!-- / status icon and date -->
</td>
<td>
This is garbage
</td>
</tr>
</table>
What I tried:
soup.find_all("td", text = re.compile('(AM|PM)'))[0].get_text().strip()
However, the text argument of find_all does not seem to work in this case: IndexError: list index out of range
What do I need to do?
Don't specify a tag name at all; let it find the desired text node. Works for me:
soup.find(text=re.compile('(AM|PM)')).strip()
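As a rough sketch of why the two calls behave differently: when a tag name is combined with a text filter, Beautiful Soup compares the pattern against each tag's .string, which is None for a td that has several children (comments, an a tag, and the date text). The example below assumes bs4 is installed, uses the built-in html.parser, and works on a trimmed, well-formed version of the sample HTML. It also uses the string argument, which is the newer name for text in recent bs4 releases; the find_parent step is an optional addition for when the enclosing cell is needed.

```python
import re
from bs4 import BeautifulSoup

# Trimmed version of the sample HTML (assumption: the stray <td> is removed for clarity).
HTML = """
<table>
<tr>
<td>This is garbage</td>
<td class="thead" style="font-weight:normal">
<!-- status icon and date -->
<a name="post1"><img class="inlineimg" src="img.gif" alt="Old" border="0" title="Old"></a>
19-11-2010, 04:25 PM
<!-- / status icon and date -->
</td>
<td>This is garbage</td>
</tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")
pattern = re.compile(r"(AM|PM)")

# Tag name + string filter: the pattern is tested against td.string,
# which is None for the multi-child cell, so nothing matches.
tag_matches = soup.find_all("td", string=pattern)
print(len(tag_matches))  # 0

# No tag name: the search runs over text nodes and returns the matching one.
text_node = soup.find(string=pattern)
timestamp = text_node.strip()
print(timestamp)  # 19-11-2010, 04:25 PM

# Optional: walk back up to the enclosing cell if the tag itself is needed.
cell = text_node.find_parent("td")
print(cell["class"])  # ['thead']
```

On older bs4 versions, replacing string= with text= should behave the same way.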