Nokogiri: select paragraph with text match

Question

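I wrote a scraper, and I'm trying to get only the text of the paragraph that contains "On Snow Feel".

I'm trying to pull it out, but I'm not sure how to get Nokogiri to return only the paragraph containing the matching text.

Currently I have boards[:onthesnowfeel] = html.css(".reviewfold p").text, but that captures all of the paragraphs. Don't assume the paragraphs will always be in the same order, so I can't just use [2] or something like that.

What method would you use to grab the paragraph whose text matches "On Snow Feel"?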

<div id="review" class="reviewfold">
<p>The <strong>Salomon A</strong><b>assassin</b>&nbsp;Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. </p>
<p><b>Approximate Weight</b>: Moew mix is pretty normal</p>
<p><strong>On Snow Feel:&nbsp;</strong>At vero eos et accusamus et iusto odio dignissimos ducimus qui blanditiis praesentium voluptatum.</p>
<p><strong>Powder:&nbsp;</strong>It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout. </p>
</div>

Answer 1

You can combine Enumerable#find with the regular expression match operator =~ to get the content of the element you want.

html.css(".reviewfold p").find { |e| e.text =~ /On Snow Feel/ }.text

ruby

open-uri

nokogiri

scraper