WebClient (HtmlUnit) doesn't see some elements

I'm trying to parse a Steam market page using "page.asText()", but it doesn't work. The likely reason is that the items are loaded about 1 second after the HTML itself.

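import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public static void main(String[] args) throws Exception {
    // Silence HtmlUnit and Apache HttpClient logging
    java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(java.util.logging.Level.OFF);
    java.util.logging.Logger.getLogger("org.apache.http").setLevel(java.util.logging.Level.OFF);
    String link = "http://steamcommunity.com/market/search?appid=730#p6_price_asc";
    WebClient webClient = new WebClient(BrowserVersion.CHROME);
    HtmlPage page = (HtmlPage) webClient.getPage(link);
    // Prints the page text immediately, before the item list has been filled in
    System.out.println(page.asText());
}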

In the console I see:

Show advanced options...






 < 1 2 3 4 5 6 ... 939 >
 Showing 1-10 of 9389 results

It should be:

Show advanced options...
PRICE
QUANTITY
NAME
31,218
 Starting at:
 $0.35 USD
Operation Hydra Case 
 Counter-Strike: Global Offensive
 276,582
 Starting at:
 $0.23 USD
.
.
.

M4A1-S | Decimator (Field-Tested) 
 Counter-Strike: Global Offensive


 232
 Starting at:
 .06 USD

AWP | Asiimov (Battle-Scarred) 
 Counter-Strike: Global Offensive


 28,068
 Starting at:
 $0.75 USD

Krakow 2017 Legends Autograph Capsule 
 Counter-Strike: Global Offensive


 < 1 2 3 4 5 6 ... 940 >
 Showing 1-10 of 9392 results

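First, make sure JavaScript is enabled.

webClient.getOptions().setJavaScriptEnabled(true);

To wait for the remaining elements to load, what I usually do is:

Thread.sleep(3000);

Placed between getPage and asText, this gives the page 3 seconds to load everything else.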
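A fixed sleep can wait too long on a fast connection and not long enough on a slow one. As a rough alternative, here is a minimal sketch of a polling wait that returns as soon as a condition holds or a timeout expires. It uses only the Java standard library. The names PollingWait and waitUntil are made up for this example and are not part of HtmlUnit, and the condition in main is only a stand-in for "the items have appeared on the page".

```java
import java.util.function.BooleanSupplier;

final class PollingWait {

    /**
     * Checks the condition repeatedly until it is true or the timeout elapses.
     * Returns true if the condition became true in time, false on timeout
     * or if the waiting thread was interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag for callers
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for "the items have loaded": becomes true after roughly 300 ms.
        long readyAt = System.currentTimeMillis() + 300;
        boolean loaded = waitUntil(() -> System.currentTimeMillis() >= readyAt, 3000, 50);
        System.out.println("loaded = " + loaded);

        // With HtmlUnit the condition might inspect the page instead, e.g.
        //   () -> page.asText().contains("Starting at:")
        // where "Starting at:" is taken from the expected output in the question.
    }
}
```

With HtmlUnit, the condition would be checked against the page after getPage, and asText would be printed only once waitUntil returns true. HtmlUnit's WebClient also has a built-in waitForBackgroundJavaScript(timeoutMillis) method that may be enough on its own, though whether it catches the Steam item list is something you would have to try.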

You can also try any of the other methods listed by other users here:

HTMLUnit doesn't wait for Javascript