Parse specific p values with jsoup
I have the following excerpt from a longer page:
<h2 id="Supportedplatforms-Java">Java</h2>
<section class="layout-section layout-section-two_equal">
<div class="content-section">
<p><strong>Oracle JRE / JDK:</strong></p>
<p><img alt="(tick)" data-emoticon-name="tick" class="emoticon emoticon-tick" src="/s/en_GB/7202/e97769bbf919c0bd667762fc102f557beacb7f94/_/images/icons/emoticons/check.png"> Java 8</p>
<p><img alt="(tick)" data-emoticon-name="tick" class="emoticon emoticon-tick" src="/s/en_GB/7202/e97769bbf919c0bd667762fc102f557beacb7f94/_/images/icons/emoticons/check.png"> Java 11</p>
<p><strong>OpenJDK:</strong></p>
<p><strong><img alt="(tick)" data-emoticon-name="tick" class="emoticon emoticon-tick" src="/s/en_GB/7202/e97769bbf919c0bd667762fc102f557beacb7f94/_/images/icons/emoticons/check.png"> </strong>Java 8</p>
<p><img alt="(tick)" data-emoticon-name="tick" class="emoticon emoticon-tick" src="/s/en_GB/7202/e97769bbf919c0bd667762fc102f557beacb7f94/_/images/icons/emoticons/check.png"> Java 11</p>
</div>
<div class="content-section">
I only want the following result:
Oracle JRE / JDK:
Java 8
Java 11
OpenJDK:
Java 8
Java 11
I am using jsoup in Groovy:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
def url = "https://url";
def document = Jsoup.connect(url).get()
I have been trying for hours without success, for example:
Elements test = document.select("#Supportedplatforms-Java > p")
...and hundreds of variations.
If you have any pointers, I would be glad to hear them!
Thanks
Elements test = document.select(".layout-section .content-section p")
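The selector in the question, "#Supportedplatforms-Java > p", matches nothing because the paragraphs are not children of the h2. They sit inside the section that follows it. The descendant selector above works, but on the full page it may also pick up paragraphs from other layout sections and from the second content-section column. A minimal sketch of a narrower variant is shown here. It assumes the section is the element immediately after the h2, as in the excerpt, and it parses an inline copy of the excerpt (with the img src shortened) instead of fetching the real URL. The jsoup version in @Grab is only an example.

```groovy
@Grab('org.jsoup:jsoup:1.17.2')
import org.jsoup.Jsoup
import org.jsoup.nodes.Document
import org.jsoup.nodes.Element

// Inline copy of the excerpt from the question (img src shortened for brevity).
def html = '''
<h2 id="Supportedplatforms-Java">Java</h2>
<section class="layout-section layout-section-two_equal">
<div class="content-section">
<p><strong>Oracle JRE / JDK:</strong></p>
<p><img alt="(tick)" class="emoticon emoticon-tick" src="check.png"> Java 8</p>
<p><img alt="(tick)" class="emoticon emoticon-tick" src="check.png"> Java 11</p>
<p><strong>OpenJDK:</strong></p>
<p><strong><img alt="(tick)" class="emoticon emoticon-tick" src="check.png"> </strong>Java 8</p>
<p><img alt="(tick)" class="emoticon emoticon-tick" src="check.png"> Java 11</p>
</div>
</section>
'''

Document document = Jsoup.parse(html)

// "+" selects the section directly following the h2 (assumption: no element in between).
// selectFirst then returns only the first content-section inside it.
Element firstColumn = document.selectFirst('#Supportedplatforms-Java + section .content-section')

// text() drops the <img> and <strong> markup and returns the visible text of each paragraph.
List<String> lines = firstColumn.select('p').collect { it.text().trim() }
lines.each { println it }
```

Against the live page, replace Jsoup.parse(html) with Jsoup.connect(url).get() as in the question. If other elements sit between the h2 and the section there, the general sibling combinator "~" may be a better fit than "+".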