Retrieving values from a <p> with HtmlUnit

Question

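I am using HtmlUnit to navigate a web page and retrieve the text (a code) inside a span. This code is generated each time I visit the page after logging in. Here is a sample of what the HTML looks like: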

<div id="Main" class="" role="main">
    <p>Your code for this session:</p>
    <p style="align: center; text-align: center;">
        <span>XXX-XXX-XXX</span>
    </p>
</div><!--end Main-->

I want to get the code (this thing --> XXX-XXX-XXX).

I have tried the following:

final HtmlPage page = webClient.getPage("http://the_url");
final HtmlDivision div = page.getHtmlElementById("Main");

However, when I print the contents of the div, it prints the text from the <a> tags.

I did not use getByXPath("//div[@class='someclass']//p"); because the div's class is empty. Any suggestions?

Answer 1

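I don't have HtmlUnit at hand, but the XPath query "//*[@id='Main']/p/span" should give you the span element (provided the HTML you are working with looks like your example). You should then be able to get the text from that element to find your XXX-XXX-XXX code.

It has been a long time since I used HtmlUnit, but going by the docs, the full code you want would look something like this: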

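// getFirstByXPath is generic; the explicit <HtmlSpan> lets getTextContent() be called on the result directly
String code = page.<HtmlSpan>getFirstByXPath("//*[@id='Main']/p/span").getTextContent();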
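
If you want to check the XPath expression on its own, without HtmlUnit on the classpath, here is a minimal sketch. It uses the JDK's built-in javax.xml.xpath API, not HtmlUnit. The class name XPathCheck, the SAMPLE constant and the extractCode helper are made up for illustration. It only works because the snippet from the question happens to be well-formed XML, and real pages often are not, which is where HtmlUnit's own parser comes in.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathCheck {
    // The markup from the question (assumed to be representative of the real page).
    static final String SAMPLE =
        "<div id=\"Main\" class=\"\" role=\"main\">"
      + "<p>Your code for this session:</p>"
      + "<p style=\"align: center; text-align: center;\">"
      + "<span>XXX-XXX-XXX</span>"
      + "</p>"
      + "</div><!--end Main-->";

    // Hypothetical helper: parses the markup as XML and returns the text
    // of the first node matched by the XPath expression from the answer.
    static String extractCode(String markup) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(markup)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // evaluate(String, Object) returns the string value of the first match.
        return xpath.evaluate("//*[@id='Main']/p/span", doc).trim();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extractCode(SAMPLE)); // prints XXX-XXX-XXX
    }
}
```

The expression selects on the id attribute, so the empty class on the div does not matter.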

java

xpath

htmlunit