Extract text between two (different) HTML tags using jsoup

I have the following HTML snippet:

<td>
    <span class="detailh2" style="margin:0px">This month: </span>2 145        
    <span class="detailh2">Total: </span> 31 704          
    <span class="detailh2">Last: </span> 30.12.2021          
</td>

My goal is to extract the text that follows the Total: span. The output should look like this:

31 704

This is what I have so far:

String total = doc.select("td:contains(Total:)").get(0).ownText();

which returns:

2 145 31 704 30.12.2021

As you can see, all three values are merged into a single, confusing string. Is there a way to return them as an array (or list) instead?

["2 145", "31 704", "30.12.2021"]

(I don't actually need the array; I'm only interested in the Total value.)

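Use the nextSibling() method, which Element inherits from Node. It returns the node that directly follows each span, which here is the text node holding the value. In the sample code below, the desired values are collected into a List of Strings: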

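import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;
import org.jsoup.select.Elements;

String html = "<td>\n"
            + "    <span class=\"detailh2\" style=\"margin:0px\">This month: </span>2 145 \n"
            + "    <span class=\"detailh2\">Total: </span> 31 704                         \n"
            + "    <span class=\"detailh2\">Last: </span> 30.12.2021                      \n"
            + "</td>";

List<String> valuesList = new ArrayList<>();

Document doc = Jsoup.parse(html);
Elements elements = doc.select("span");
for (Element a : elements) {
    // The value is the text node that directly follows each span.
    Node node = a.nextSibling();
    // Skip spans that have no following text node (nextSibling() may return null).
    if (node instanceof TextNode) {
        valuesList.add(((TextNode) node).text().trim());
    }
}

// Display valuesList in the console window:
for (String value : valuesList) {
    System.out.println(value);
}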

This prints the following to the console window:

2 145
31 704
30.12.2021
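As an alternative to walking the spans, the same list can probably be built from the parent element's direct text nodes with Element.textNodes(). The following is only a minimal sketch: the class name TextNodeValues and the helper extractValues are hypothetical, and the sample markup wraps the cell in a table on the assumption that the td should survive parsing as a real table cell.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

class TextNodeValues {

    // Hypothetical helper: collects the non-blank text nodes that are direct
    // children of the element containing the first "detailh2" span.
    static List<String> extractValues(String html) {
        Document doc = Jsoup.parse(html);
        List<String> values = new ArrayList<>();

        Element firstSpan = doc.selectFirst("span.detailh2");
        if (firstSpan == null || firstSpan.parent() == null) {
            return values; // no matching span: return an empty list
        }

        // textNodes() returns only direct child text nodes, so the label
        // text inside the spans is not included.
        for (TextNode textNode : firstSpan.parent().textNodes()) {
            String text = textNode.text().trim();
            if (!text.isEmpty()) {
                values.add(text);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        String html = "<table><tr><td>"
                + "<span class=\"detailh2\">This month: </span>2 145 "
                + "<span class=\"detailh2\">Total: </span> 31 704 "
                + "<span class=\"detailh2\">Last: </span> 30.12.2021 "
                + "</td></tr></table>";

        System.out.println(extractValues(html)); // [2 145, 31 704, 30.12.2021]
    }
}
```

This keeps the values in document order, but it relies on every value being a direct child of the same parent as the spans. If the markup nests the values differently, the nextSibling() approach is likely the safer choice.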

If you only want the value for Total:, you can try this:

String html = "<td>\n"
            + "    <span class=\"detailh2\" style=\"margin:0px\">This month: </span>2 145 \n"
            + "    <span class=\"detailh2\">Total: </span> 31 704                         \n"
            + "    <span class=\"detailh2\">Last: </span> 30.12.2021                      \n"
            + "</td>";
String totalValue = "N/A";
Document doc = Jsoup.parse(html);
Elements elements = doc.select("span");
for (Element a : elements) {
    // Match the label on the span's own text.
    if (a.text().contains("Total:")) {
        // The value is the text node that directly follows the span.
        Node node = a.nextSibling();
        if (node instanceof TextNode) {
            totalValue = "Total: --> " + ((TextNode) node).text().trim();
        }
        break;
    }
}

// Display the value in the console window:
System.out.println(totalValue);

The code above prints the following to the console window:

Total: --> 31 704
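The loop can likely be shortened by letting the selector find the label span directly with the :contains() pseudo-selector. The sketch below assumes the spans carry the detailh2 class as in the question. The class name LabelLookup and the helper valueAfterLabel are hypothetical, and the label is assumed to contain no parentheses, since those would break the selector string.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

class LabelLookup {

    // Hypothetical helper: returns the text that directly follows the first
    // "detailh2" span whose text contains the given label, or null if there
    // is no such span or it is not followed by a text node.
    static String valueAfterLabel(String html, String label) {
        Document doc = Jsoup.parse(html);

        // :contains() matches case-insensitively on the element's text.
        Element span = doc.selectFirst("span.detailh2:contains(" + label + ")");
        if (span == null) {
            return null;
        }

        Node next = span.nextSibling();
        return (next instanceof TextNode)
                ? ((TextNode) next).text().trim()
                : null;
    }

    public static void main(String[] args) {
        String html = "<span class=\"detailh2\">This month: </span>2 145 \n"
                + "<span class=\"detailh2\">Total: </span> 31 704 \n"
                + "<span class=\"detailh2\">Last: </span> 30.12.2021 \n";

        System.out.println(valueAfterLabel(html, "Total:")); // 31 704
    }
}
```

Returning null instead of a placeholder such as "N/A" leaves it to the caller to decide how a missing label should be reported.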