Using HtmlUnit to pre-render a JavaScript website (HTML snapshot)

I am trying to build a prerenderer powered by HtmlUnit, and I am testing it with this URL: https://demo.tutorialzine.com/2009/09/simple-ajax-website-jquery/demo.html#page3

Here is my code:

final WebClient webClient = new WebClient(BrowserVersion.BEST_SUPPORTED);
WebClientOptions options = webClient.getOptions();
options.setCssEnabled(true);
webClient.setCssErrorHandler(new SilentCssErrorHandler());
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
//    webClient.setAjaxController(new AjaxController(){
//        @Override
//        public boolean processSynchron(HtmlPage page, WebRequest request, boolean async) {
//            return true;
//        }
//    });
options.setThrowExceptionOnScriptError(false);
options.setThrowExceptionOnFailingStatusCode(false);
options.setRedirectEnabled(false);
options.setAppletEnabled(false);
options.setJavaScriptEnabled(true);
//options.setUseInsecureSSL(true);
options.setTimeout(50000);
webClient.addRequestHeader("Access-Control-Allow-Origin", "*");

HtmlPage page = webClient.getPage(path);

// important!  Give the headless browser enough time to execute JavaScript
// The exact time to wait may depend on your application.
webClient.setJavaScriptTimeout(10000);
webClient.waitForBackgroundJavaScript(10000);
//just wait
for (int i = 0; i < 20; i++) {
    synchronized (page) {
        page.wait(500);
    }
}
String xml = page.asXml();

The problem is that the output HTML does not contain the content that should be fetched by JavaScript.

What could be wrong here?

Hmm, with 2.28-SNAPSHOT the code below retrieves:

Donec in massa vel lectus aliquam laoreet nec et turpis. ....

try (final WebClient webClient = new WebClient(BrowserVersion.BEST_SUPPORTED)) {
    WebClientOptions options = webClient.getOptions();
    options.setCssEnabled(true);
    webClient.setAjaxController(new NicelyResynchronizingAjaxController());
    options.setTimeout(50000);
    webClient.addRequestHeader("Access-Control-Allow-Origin", "*");

    HtmlPage page = webClient.getPage("https://demo.tutorialzine.com/2009/09/simple-ajax-website-jquery/demo.html#page3");

    // Important: give the headless browser enough time to execute JavaScript.
    // The exact time to wait may depend on your application.
    webClient.setJavaScriptTimeout(10000);
    webClient.waitForBackgroundJavaScript(10000);
    // Additional fixed wait for any remaining asynchronous work
    Thread.sleep(10000);

    String xml = page.asXml();
    System.out.println(xml);
}

What are you missing?
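As a side note, the fixed wait could probably be replaced by polling for the expected content, so the snapshot is taken as soon as the page is ready. The following is only a minimal sketch of that idea: `PollingWait` and `waitUntil` are hypothetical names that are not part of HtmlUnit, and the timeout and interval values are arbitrary.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not part of HtmlUnit): polls a condition instead of sleeping blindly.
public class PollingWait {

    /**
     * Checks the condition every intervalMillis until it returns true
     * or timeoutMillis has elapsed.
     *
     * @return true if the condition was met, false on timeout or interruption
     */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in condition for the demo: becomes true on the third check.
        final int[] calls = {0};
        boolean ready = waitUntil(() -> ++calls[0] >= 3, 1_000, 10);
        System.out.println("ready=" + ready + " after " + calls[0] + " checks");
    }
}
```

With HtmlUnit, the condition might be something like `() -> page.asXml().contains("Donec in massa")`, using the text from the output above as the marker; which marker is appropriate depends on the page being rendered. `webClient.waitForBackgroundJavaScript(...)` also returns the number of background jobs still pending, which could serve as an alternative condition.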