单个 tess4j 实例的并发线程访问期间的 NPE
NPE during concurrent thread access of a single tess4j instance
我正在使用 Tesseract 3.0.2 并使用 1.4.1 tess4j..这不是以线程安全的方式工作,我得到了 NPE。我正在使用 Grizzly/Jesery/Spring.
@Service("textExtractorService")
public class TextExtractorServiceImpl implements TextExtractorService {
Logger LOGGER = Logger.getLogger(TextExtractorServiceImpl.class);
private final Tesseract instance = Tesseract.getInstance(); // JNA Interface
...
..
}
...
...
public ExtractedInfo extract(BufferedImage bufferedImage)
throws IOException {
ExtractedInfo extractedInfo = new ExtractedInfo();
try {
BufferedImage preProcessed = preProcess(bufferedImage);
String result = null;
//the below gives me the NPE, when multiple threads calls this method.
result = instance.doOCR(preProcessed);
String[] r = StringUtils.split(result, "\n");
extractedInfo.setRawText(r);
} catch (TesseractException e) {
throw new IOException(e);
}
return extractedInfo;
}
...
...
Full stack Trace:
SEVERE: service exception:
javax.servlet.ServletException: java.lang.Error: Invalid memory access
at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:420)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at com.sun.grizzly.http.servlet.ServletAdapter$FilterChainImpl.doFilter(ServletAdapter.java:1059)
at com.sun.grizzly.http.servlet.ServletAdapter$FilterChainImpl.invokeFilterChain(ServletAdapter.java:999)
at com.sun.grizzly.http.servlet.ServletAdapter.doService(ServletAdapter.java:434)
at com.sun.grizzly.http.servlet.ServletAdapter.service(ServletAdapter.java:379)
at com.sun.grizzly.tcp.http11.GrizzlyAdapter.service(GrizzlyAdapter.java:179)
at com.sun.grizzly.tcp.http11.GrizzlyAdapterChain.service(GrizzlyAdapterChain.java:196)
at com.sun.grizzly.tcp.http11.GrizzlyAdapter.service(GrizzlyAdapter.java:179)
at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:850)
at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:747)
at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1032)
at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:231)
at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137)
at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104)
at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90)
at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79)
at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54)
at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59)
at com.sun.grizzly.ContextTask.run(ContextTask.java:71)
at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532)
at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.Error: Invalid memory access
at com.sun.jna.Native.invokeVoid(Native Method)
at com.sun.jna.Function.invoke(Function.java:367)
at com.sun.jna.Function.invoke(Function.java:315)
at com.sun.jna.Library$Handler.invoke(Library.java:212)
at com.sun.proxy.$Proxy55.TessBaseAPIDelete(Unknown Source)
at net.sourceforge.tess4j.Tesseract.dispose(Tesseract.java:346)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:242)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:200)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:184)
at com.vanitysoft.thirdeye.service.impl.TextExtractorServiceImpl.extract(TextExtractorServiceImpl.java:69)
at com.vanitysoft.thirdeye.web.TextExtractorResource.extract(TextExtractorResource.java:49)
我不确定这是否是完全相同的问题,但我在类似的问题上找到了这个答案。
简而言之,Tesseract 中的底层引擎似乎不支持多线程。
我正在使用 Tesseract 3.0.2 并使用 1.4.1 tess4j..这不是以线程安全的方式工作,我得到了 NPE。我正在使用 Grizzly/Jesery/Spring.
@Service("textExtractorService")
public class TextExtractorServiceImpl implements TextExtractorService {
Logger LOGGER = Logger.getLogger(TextExtractorServiceImpl.class);
private final Tesseract instance = Tesseract.getInstance(); // JNA Interface
... ..
}
... ...
public ExtractedInfo extract(BufferedImage bufferedImage)
throws IOException {
ExtractedInfo extractedInfo = new ExtractedInfo();
try {
BufferedImage preProcessed = preProcess(bufferedImage);
String result = null;
//the below gives me the NPE, when multiple threads calls this method.
result = instance.doOCR(preProcessed);
String[] r = StringUtils.split(result, "\n");
extractedInfo.setRawText(r);
} catch (TesseractException e) {
throw new IOException(e);
}
return extractedInfo;
}
... ...
Full stack Trace:
SEVERE: service exception:
javax.servlet.ServletException: java.lang.Error: Invalid memory access
at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:420)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at com.sun.grizzly.http.servlet.ServletAdapter$FilterChainImpl.doFilter(ServletAdapter.java:1059)
at com.sun.grizzly.http.servlet.ServletAdapter$FilterChainImpl.invokeFilterChain(ServletAdapter.java:999)
at com.sun.grizzly.http.servlet.ServletAdapter.doService(ServletAdapter.java:434)
at com.sun.grizzly.http.servlet.ServletAdapter.service(ServletAdapter.java:379)
at com.sun.grizzly.tcp.http11.GrizzlyAdapter.service(GrizzlyAdapter.java:179)
at com.sun.grizzly.tcp.http11.GrizzlyAdapterChain.service(GrizzlyAdapterChain.java:196)
at com.sun.grizzly.tcp.http11.GrizzlyAdapter.service(GrizzlyAdapter.java:179)
at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:850)
at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:747)
at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1032)
at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:231)
at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137)
at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104)
at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90)
at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79)
at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54)
at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59)
at com.sun.grizzly.ContextTask.run(ContextTask.java:71)
at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532)
at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.Error: Invalid memory access
at com.sun.jna.Native.invokeVoid(Native Method)
at com.sun.jna.Function.invoke(Function.java:367)
at com.sun.jna.Function.invoke(Function.java:315)
at com.sun.jna.Library$Handler.invoke(Library.java:212)
at com.sun.proxy.$Proxy55.TessBaseAPIDelete(Unknown Source)
at net.sourceforge.tess4j.Tesseract.dispose(Tesseract.java:346)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:242)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:200)
at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:184)
at com.vanitysoft.thirdeye.service.impl.TextExtractorServiceImpl.extract(TextExtractorServiceImpl.java:69)
at com.vanitysoft.thirdeye.web.TextExtractorResource.extract(TextExtractorResource.java:49)
我不确定这是否是完全相同的问题,但我在类似的问题上找到了这个答案。
简而言之,Tesseract 中的底层引擎似乎不支持多线程。