Scrape text under <h4> using Requests-HTML (Requests-HTML, Python)

Question

I am trying to extract the socket type of a CPU, which you can see in the image below. I have identified that the socket type is under the <h4> Socket heading, as seen in the following image.

So far I have been able to scrape .specs.block and find all the <h4> elements nested inside it. However, I cannot get the text under each heading.

Here is my code:

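from requests_html import HTMLSession

session = HTMLSession()

# The product ID is part of the URL, so it belongs inside the string
r = session.get('https://au.pcpartpicker.com/product/jLF48d')

# find() returns a list of matches; take the first .specs.block
about = r.html.find('.specs.block')[0]
# Collect every <h4> nested inside that block
about = about.find('h4')

print(about)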

This prints:

 [ <Element 'h4' >, <Element 'h4' >, <Element 'h4' >, <Element 'h4' >,
 <Element 'h4' >, <Element 'h4' >, <Element 'h4' >, <Element 'h4' >,
 <Element 'h4' >, <Element 'h4' >, <Element 'h4' >]

But when I change the print statement to:

print(about.text)

I get the following error:

AttributeError: 'list' object has no attribute 'text'

Update:

print(about[0].text)

This code prints:

Manufacturer AMD

which is the first heading and its text; however, I need the 4th.

Any idea what code I can use to get the desired result?

Let me know if you need more information.

Answer 1

Replacing: print(about[0].text)

with

print(about[3].text)

in the code from my question above solved the problem for me! List indices start at 0, so about[3] is the 4th <h4>.
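Picking the heading by position works, but it depends on the order of the headings on the page. Since find('h4') returns a plain list (which is why about.text raised the AttributeError), another option is to loop over that list and pick the entry by its heading name. Below is a minimal sketch of that idea, not a definitive implementation. The helper name text_under_heading is hypothetical, it assumes each element's .text looks like "Manufacturer AMD" (heading first, then the value) as in the output above, and it uses stand-in objects with a made-up socket value so it runs without a network request.

```python
from types import SimpleNamespace


def text_under_heading(headings, name):
    """Return the text that follows the heading called `name`, or None.

    `headings` is any list of objects with a `.text` attribute, such as
    the list returned by about.find('h4') in the question.
    """
    for heading in headings:
        text = heading.text.strip()
        rest = text[len(name):]
        # Match only the whole heading word, so "Socket" does not match "Sockets"
        if text.startswith(name) and (not rest or rest[0].isspace()):
            return rest.strip()
    return None


# Stand-in elements; "Socket AM4" is an invented value for illustration only
sample = [
    SimpleNamespace(text="Manufacturer AMD"),
    SimpleNamespace(text="Socket AM4"),
]

print(text_under_heading(sample, "Socket"))  # AM4
```

With the code from the question, the call would be text_under_heading(about, 'Socket'), where about is the list returned by about.find('h4').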

html

python

python-3.x

python-requests-html