Why does a worker thread for WebEngine never complete?

My goal is to load a document from a web server and then parse its DOM to extract specific content. Loading the DOM is where I am stuck.

I am trying to use javafx.scene.web.WebEngine, since it looks like it should handle all the necessary machinery, including JavaScript execution, which may affect the final DOM.

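When the document is loaded, the load worker appears to get stuck in the RUNNING state and never reaches SUCCEEDED, which I believe is required before the DOM can be accessed.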

This happens whether I load from a URL or from literal content (as in this minimal example).

Can anyone see what I am doing wrong, or what I have misunderstood?

Thanks in advance for any help.

import java.util.concurrent.ExecutionException;
import org.w3c.dom.Document;
import javafx.application.Platform;
import javafx.concurrent.Task;
import javafx.concurrent.Worker;
import javafx.concurrent.Worker.State;
import javafx.embed.swing.JFXPanel;
import javafx.scene.web.WebEngine;

public class WebEngineProblem {
    private static Task<WebEngine> getEngineTask() {
        Task<WebEngine> task = new Task<>() {
            @Override
            protected WebEngine call() throws Exception {
                WebEngine webEngine = new WebEngine();
                final Worker<Void> loadWorker = webEngine.getLoadWorker();
                loadWorker.stateProperty().addListener((obs, oldValue, newValue) -> {
                    System.out.println("state:" + newValue);
                    if (newValue == State.SUCCEEDED) {
                        System.out.println("finished loading");
                    }    
                });
                webEngine.loadContent("<!DOCTYPE html>\r\n" + "<html>\r\n" + "<head>\r\n" + "<meta charset=\"UTF-8\">\r\n"
                    + "<title>Content Title</title>\r\n" + "</head>\r\n" + "<body>\r\n" + "<p>Body</p>\r\n" + "</body>\r\n"
                    + "</html>\r\n");
                State priorState = State.CANCELLED; //should never be CANCELLED
                double priorWork = Double.NaN;
                while (loadWorker.isRunning()) {
                    final double workDone = loadWorker.getWorkDone();
                    if (loadWorker.getState() != priorState || priorWork != workDone) {
                        priorState = loadWorker.stateProperty().getValue();
                        priorWork = workDone;
                        System.out.println(priorState + " " + priorWork + "/" + loadWorker.getTotalWork());
                    }
                    Thread.sleep(1000);
                }
                return webEngine;
            }
        };
        return task;
    }

    public static void main(String[] args) {
        new JFXPanel(); // Initialise the JavaFx Platform
        WebEngine engine = null;
        Task<WebEngine> task = getEngineTask();
        try {
            Platform.runLater(task);
            Thread.sleep(1000); 
            engine = task.get(); // Never completes as always RUNNING
        }
        catch (InterruptedException | ExecutionException e) {
            e.printStackTrace();
        }
        // This code is never reached as the content never completes loading
        // It would fail as it's not on the FX thread.
        Document doc = engine.getDocument();
        String content = doc.getTextContent();
        System.out.println(content);
    }

}

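Changes to a Worker's state property happen on the FX Application Thread, even if the worker itself runs on a background thread. (JavaFX properties are essentially single-threaded.) Somewhere in the implementation of the thread that loads the web engine's content there is a call to Platform.runLater(...) that changes the worker's state.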

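Your task blocks until the worker's state changes, and you run that task on the FX Application Thread, so you have effectively deadlocked the FX Application Thread: the load worker's state change cannot happen until your task completes (because it has to run on the same thread), and your task cannot complete until the state changes (because that is what you programmed it to wait for).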
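
The same kind of deadlock can be reproduced without JavaFX, which may make it easier to see. The following is only a sketch under assumptions: a single-threaded executor stands in for the FX Application Thread, an empty task stands in for the state update that WebEngine posts with Platform.runLater(...), and the class and method names are made up for illustration. A timeout is used so the demo terminates instead of hanging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SingleThreadDeadlockDemo {

    /**
     * Runs a task on a single-threaded executor; that task queues a second
     * task on the same executor and waits for it. Returns true if the second
     * task finished within the timeout, false if the wait timed out.
     */
    public static boolean waitOnSameThread(long timeoutMillis) throws Exception {
        // Stand-in (assumption) for the FX Application Thread: one thread, one queue.
        ExecutorService uiThread = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> outer = uiThread.submit(() -> {
                // Stand-in for the state update posted via Platform.runLater(...).
                Future<?> stateUpdate = uiThread.submit(() -> { });
                try {
                    // The only thread that could run stateUpdate is blocked right here.
                    stateUpdate.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    return false;
                }
            });
            return outer.get();
        } finally {
            uiThread.shutdownNow();
        }
    }

    /** Same wait, but performed by the calling thread, so the executor stays free. */
    public static boolean waitFromOtherThread(long timeoutMillis) throws Exception {
        ExecutorService uiThread = Executors.newSingleThreadExecutor();
        try {
            Future<?> stateUpdate = uiThread.submit(() -> { });
            try {
                stateUpdate.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false;
            }
        } finally {
            uiThread.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("same thread, update ran: " + waitOnSameThread(500));
        System.out.println("other thread, update ran: " + waitFromOtherThread(2000));
    }
}
```

In the first method the queued update can never run while its waiter occupies the only thread, which mirrors the RUNNING state that never changes in your output. Without the timeout it would block forever, as your task.get() does.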

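Blocking the FX Application Thread is basically always a mistake. Instead, block a different thread until the condition you need is true (the web engine has been created and its load worker has finished), and then do whatever comes next once that happens (using Platform.runLater(...) again if it has to execute on the FX Application Thread).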

Here is an example of what I think you are trying to do:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

import org.w3c.dom.Document;

import javafx.application.Platform;
import javafx.concurrent.Worker;
import javafx.concurrent.Worker.State;
import javafx.embed.swing.JFXPanel;
import javafx.scene.web.WebEngine;

public class WebEngineProblem {

    public static void main(String[] args) throws InterruptedException, ExecutionException {
        new JFXPanel(); // Initialise the JavaFx Platform

        CountDownLatch loaded = new CountDownLatch(1);

        FutureTask<WebEngine> createEngineTask = new FutureTask<WebEngine>( () -> {
            WebEngine webEngine = new WebEngine();
            final Worker<Void> loadWorker = webEngine.getLoadWorker();
            loadWorker.stateProperty().addListener((obs, oldValue, newValue) -> {
                System.out.println("state:" + newValue);
                if (newValue == State.SUCCEEDED) {
                    System.out.println("finished loading");
                    loaded.countDown();
                }    
            });
            webEngine.loadContent("<!DOCTYPE html>\r\n" + "<html>\r\n" + "<head>\r\n" + "<meta charset=\"UTF-8\">\r\n"
                + "<title>Content Title</title>\r\n" + "</head>\r\n" + "<body>\r\n" + "<p>Body</p>\r\n" + "</body>\r\n"
                + "</html>\r\n");
            return webEngine ;
        });

        Platform.runLater(createEngineTask);
        WebEngine engine = createEngineTask.get();
        loaded.await();

        Platform.runLater(() -> {
            Document doc = engine.getDocument();
            String content = doc.getDocumentElement().getTextContent();
            System.out.println(content);
        });
    }

}
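
One caveat with waiting on a latch that is only released on SUCCEEDED: if the load ends in FAILED or CANCELLED, the waiting thread never wakes up. A possible variation is to release the waiter on any terminal state and to bound the wait with a timeout. The sketch below shows only that waiting pattern in plain Java; the LoadState enum is a hypothetical stand-in for javafx.concurrent.Worker.State, the single-threaded executor again stands in for the FX Application Thread, and the class and method names are invented for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

public class TerminalStateWaitDemo {

    // Hypothetical stand-in for javafx.concurrent.Worker.State.
    public enum LoadState { READY, SCHEDULED, RUNNING, SUCCEEDED, CANCELLED, FAILED }

    static boolean isTerminal(LoadState state) {
        return state == LoadState.SUCCEEDED
                || state == LoadState.CANCELLED
                || state == LoadState.FAILED;
    }

    /**
     * Replays the given state transitions on a simulated UI thread and waits,
     * on the calling thread, for the first terminal state. Throws
     * TimeoutException if no terminal state arrives within the timeout.
     */
    public static LoadState awaitTerminalState(long timeoutMillis, LoadState... transitions)
            throws Exception {
        // Stand-in (assumption) for the FX Application Thread.
        ExecutorService uiThread = Executors.newSingleThreadExecutor();
        CompletableFuture<LoadState> finished = new CompletableFuture<>();

        // Plays the role of the listener on stateProperty(): it completes the
        // future on any terminal state, not only on SUCCEEDED.
        Consumer<LoadState> listener = state -> {
            if (isTerminal(state)) {
                finished.complete(state);
            }
        };

        try {
            for (LoadState state : transitions) {
                uiThread.execute(() -> listener.accept(state));
            }
            // Bounded wait on the calling thread; the UI thread is never blocked.
            return finished.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            uiThread.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(awaitTerminalState(2000,
                LoadState.SCHEDULED, LoadState.RUNNING, LoadState.FAILED));
        try {
            awaitTerminalState(200, LoadState.SCHEDULED, LoadState.RUNNING);
        } catch (TimeoutException e) {
            System.out.println("no terminal state within the timeout");
        }
    }
}
```

With the real WebEngine, the equivalent change would presumably be to complete the future (or count down the latch) in the state listener for FAILED and CANCELLED as well, and to check which state was reached before calling getDocument().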