How can I crawl web data that is not in tags

Question

<div id="main-content" class="content">
<div class="metaline">
<span class="article-meta author">jorden</span>
</div>
 "
 1.name:jorden> 
 2.age:28

  --
 "
 <span class="D2"> from 111.111.111.111 </span>
  </div>

I only want:

1.name:jorden
2.age:28

xxx.select('#main-content') returns everything, but I only need part of it. Because that text is not inside any tag, I don't know how to select it.

Answer 1

You want to find the tag that comes just before the text in question (in your case, <div class="metaline">) and then look at its next sibling in the HTML parse tree:

text = soup.find("div", class_='metaline').next_sibling
print(text)
# "
#  1.name:jorden> 
#  2.age:28
#
#   --
#  "
#

Once you have the raw text, you can strip it, split it into lines, and so on.
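As a rough illustration of that clean-up step, here is a minimal sketch in plain Python. The helper name clean_sibling_text is hypothetical, and the rules it applies (dropping the quote-only lines, the "--" separator, and the stray trailing ">") are assumptions based on the output the question asks for, not something BeautifulSoup does for you. The sample string is copied from the printed sibling text above.

```python
def clean_sibling_text(raw):
    """Return the meaningful lines from the raw sibling text.

    Hypothetical helper: the filtering rules below are assumptions
    drawn from the desired output in the question.
    """
    lines = []
    for line in raw.splitlines():
        # Trim whitespace, then surrounding quotes, then a stray trailing ">".
        line = line.strip().strip('"').rstrip(">").strip()
        # Skip empty lines and the "--" separator line.
        if line and line != "--":
            lines.append(line)
    return lines


# Sample text as returned by .next_sibling in the snippet above.
raw = '\n "\n 1.name:jorden> \n 2.age:28\n\n  --\n "\n '

print(clean_sibling_text(raw))  # ['1.name:jorden', '2.age:28']
```

Since the value returned by next_sibling for a text node is a NavigableString, which is a subclass of str, it can be passed to this function directly.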

Tags: html, python, beautifulsoup, web-crawler, python-requests