HtmlUnit WebClient resets thread interruption status
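I have a bunch of parser classes that subclass a PriceParser class and implement a getAllPrices() method (called by PriceParser.getPrices(), which fetches some data from various websites and does a few other things that are irrelevant to this post). Here is an example of such an implementation: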
    @Override
    public List<Price> getAllPrices() throws ParserException,
            InterruptedException {
        // Silence HtmlUnit and HttpClient logging
        LogFactory.getFactory().setAttribute("org.apache.commons.logging.Log",
                "org.apache.commons.logging.impl.NoOpLog");
        java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit")
                .setLevel(Level.OFF);
        java.util.logging.Logger.getLogger("org.apache.commons.httpclient")
                .setLevel(Level.OFF);
        List<Price> prices = new ArrayList<Price>();
        WebClient webClient = new WebClient(BrowserVersion.FIREFOX_24);
        HtmlPage page;
        try {
            page = webClient.getPage(URL);
            if (Thread.currentThread().isInterrupted()) {
                System.out.println("INTERRUPTED BEFORE CLOSE");
            }
            // my parsing code here that fills prices list. Includes calls to webClient.waitForBackgroundJavaScript in some places
            webClient.closeAllWindows();
            if (Thread.currentThread().isInterrupted()) {
                System.out.println("INTERRUPTED AFTER CLOSE");
            }
        } catch (InterruptedException e) {
            throw e;
        } catch (Exception e) {
            throw new ParserException(e);
        }
        return prices;
    }
These parsers are run concurrently with an ExecutorService:
    public List<Price> getPrices(List<PriceParser> priceParsers) throws InterruptedException {
        ExecutorService executorService = Executors
                .newFixedThreadPool(priceParsers.size());
        Set<Callable<List<Price>>> callables = new HashSet<Callable<List<Price>>>();
        List<Price> allPrices = new ArrayList<Price>();
        // priceParser must be final to be captured by the anonymous Callable
        for (final PriceParser priceParser : priceParsers) {
            callables.add(new Callable<List<Price>>() {
                public List<Price> call() throws Exception {
                    return priceParser.getPrices();
                }
            });
        }
        List<Future<List<Price>>> futures;
        try {
            futures = executorService.invokeAll(callables);
            for (Future<List<Price>> future : futures) {
                allPrices.addAll(future.get());
            }
        } catch (InterruptedException e) {
            throw e;
        } catch (ExecutionException e) {
            logger.error("MULTI-THREADING EXECUTION ERROR ", e);
            throw new RuntimeException("MULTI-THREADING EXECUTION ERROR ", e);
        } finally {
            executorService.shutdownNow();
        }
        return allPrices;
    }
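I added the two if(Thread.currentThread().isInterrupted()){} blocks in the first method to investigate the following problem: when the executor service is interrupted (which can happen when the thread is terminated by pressing a cancel button in the GUI application), the first interruption check successfully prints "INTERRUPTED BEFORE CLOSE", but the second check prints nothing. So it seems that one of the calls I make on webClient (that is, the waitForBackgroundJavaScript calls and the final webClient.closeAllWindows() call) somehow clears the thread's interrupted status. Can someone explain why this happens?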
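For context, here is a minimal sketch of how the interruption reaches a worker thread in the first place. It does not use HtmlUnit; the class name ShutdownNowDemo and the method workerSawInterrupt are hypothetical, and Thread.sleep merely stands in for a long-running parse. It shows that shutdownNow() interrupts a running pool thread, so a blocking call inside the task throws InterruptedException.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ShutdownNowDemo {

    // Returns true if the worker's blocking call was interrupted by shutdownNow().
    static boolean workerSawInterrupt() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CountDownLatch started = new CountDownLatch(1);
        try {
            Future<Boolean> result = pool.submit(() -> {
                started.countDown(); // signal that the task is running
                try {
                    Thread.sleep(10_000); // stands in for a long-running parse
                    return false;
                } catch (InterruptedException e) {
                    return true; // the pool thread was interrupted
                }
            });
            started.await();     // make sure the task has actually started
            pool.shutdownNow();  // interrupts the running worker thread
            return result.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();  // idempotent cleanup
        }
    }

    public static void main(String[] args) {
        System.out.println("worker interrupted: " + workerSawInterrupt());
    }
}
```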
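It looks like the problem lies in my parsing code's calls to webClient.waitForBackgroundJavaScript. Internally, these calls end up in the waitForJobs method of HtmlUnit's JavaScriptJobManagerImpl class. That method contains the following snippet, which basically swallows every InterruptedException, so no caller can tell whether an interruption occurred during the call: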
    try {
        synchronized (this) {
            wait(end - now);
        }
        // maybe a change triggers the wakup; we have to recalculate the
        // wait time
        now = System.currentTimeMillis();
    }
    catch (final InterruptedException e) {
        LOG.error("InterruptedException while in waitForJobs", e);
    }
Instead of catching and logging the exception, the method should rethrow it.
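The effect can be reproduced without HtmlUnit. The following is a minimal sketch, not HtmlUnit's code: the class InterruptFlagDemo and its methods are hypothetical names made up for illustration. It relies only on the documented behavior that Object.wait clears the thread's interrupted status when it throws InterruptedException. One variant swallows the exception, as in the snippet above; the other restores the flag with Thread.currentThread().interrupt(), which is a common alternative when rethrowing is not possible.

```java
public class InterruptFlagDemo {

    // Same pattern as the snippet above: the exception is caught and dropped,
    // so the interrupted status that wait() cleared is lost.
    static void waitSwallowing(Object lock, long millis) {
        try {
            synchronized (lock) {
                lock.wait(millis);
            }
        } catch (InterruptedException e) {
            // swallowed (the original only logs here)
        }
    }

    // Alternative: re-set the flag so callers can still detect the interrupt.
    static void waitRestoring(Object lock, long millis) {
        try {
            synchronized (lock) {
                lock.wait(millis);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Interrupts the current thread, runs one of the waits, and reports
    // whether the interrupted status survived. Thread.interrupted() also
    // clears the flag, so consecutive calls do not affect each other.
    static boolean flagAfter(boolean restore) {
        Thread.currentThread().interrupt();
        Object lock = new Object();
        if (restore) {
            waitRestoring(lock, 50);
        } else {
            waitSwallowing(lock, 50);
        }
        return Thread.interrupted();
    }

    public static void main(String[] args) {
        System.out.println("swallowing: " + flagAfter(false)); // prints "swallowing: false"
        System.out.println("restoring: " + flagAfter(true));   // prints "restoring: true"
    }
}
```

Until the library itself changes, a caller that cannot modify waitForJobs could track cancellation through its own flag rather than relying on the thread's interrupted status after such a call.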