PDF Clown: highlighting multiple search words fails for PDFs containing images, colored text, and complex diagrams
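I am using PDFClown to highlight multiple search words in a PDF document. For many PDF documents that contain colored images, complex diagrams, or colored text, PDFClown throws an exception and fails to highlight the matching words. The code below works for plain or simple PDFs.
Here is the PDF I used for testing: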
https://drive.google.com/file/d/0B-nuOO6Zsa4rXy1DS2JjX1RnYmM/view?usp=sharing
public void searchWordInPdf(DocumentMetadata documentMetadata, String searchWord,
        HttpServletResponse response) throws IOException {
    try {
        byte[] bytes = null;
        org.pdfclown.files.File file = null;
        if (documentMetadata.getProject().getFtpId() != null && documentMetadata.getProject().getFtpId() > 0) {
            FtpServer ftpServer = ftpServerService.getFtpServer(documentMetadata.getProject().getFtpId());
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            retrievePdfFile(ftpServer, bos, documentMetadata.getFilePath());
            bytes = bos.toByteArray();
            file = new org.pdfclown.files.File(bytes);
        } else {
            file = new org.pdfclown.files.File(documentMetadata.getFilePath());
        }
        List<String> matchList = new ArrayList<String>();
        // Split the search string into unquoted words and quoted phrases.
        Pattern regex = Pattern.compile("[^\\s\"']+|\"([^\"]*)\"|'([^']*)'");
        Matcher regexMatcher = regex.matcher(searchWord);
        while (regexMatcher.find()) {
            if (regexMatcher.group(1) != null) {
                // Add double-quoted string without the quotes
                matchList.add(regexMatcher.group(1));
            } else if (regexMatcher.group(2) != null) {
                // Add single-quoted string without the quotes
                matchList.add(regexMatcher.group(2));
            } else {
                // Add unquoted word
                matchList.add(regexMatcher.group());
            }
        }
        for (String key : matchList) {
            Pattern pattern = Pattern.compile(key, Pattern.CASE_INSENSITIVE);
            // 2. Iterating through the document pages...
            TextExtractor textExtractor = new TextExtractor(true, true);
            for (final Page page : file.getDocument().getPages()) {
                System.out.println("\nScanning page " + (page.getIndex() + 1) + "...\n");
                // 2.1. Extract the page text!
                Map<Rectangle2D, List<ITextString>> textStrings = textExtractor.extract(page);
                // 2.2. Find the text pattern matches!
                final Matcher matcher = pattern.matcher(TextExtractor.toString(textStrings));
                // 2.3. Highlight the text pattern matches!
                textExtractor.filter(
                    textStrings,
                    new TextExtractor.IIntervalFilter() {
                        @Override
                        public boolean hasNext() {
                            if (matcher.find()) {
                                //count++;
                                return true;
                            }
                            return false;
                        }

                        @Override
                        public Interval<Integer> next() {
                            return new Interval<Integer>(matcher.start(), matcher.end());
                        }

                        @Override
                        public void process(
                            Interval<Integer> interval,
                            ITextString match
                        ) {
                            Rectangle2D textBox = null;
                            // Defining the highlight box of the text pattern match...
                            List<Quad> highlightQuads = new ArrayList<Quad>();
                            {
                                /*
                                  NOTE: A text pattern match may be split across multiple contiguous lines,
                                  so we have to define a distinct highlight box for each text chunk.
                                */
                                for (TextChar textChar : match.getTextChars()) {
                                    Rectangle2D textCharBox = textChar.getBox();
                                    if (textBox == null) {
                                        textBox = (Rectangle2D) textCharBox.clone();
                                    } else {
                                        if (textCharBox.getY() > textBox.getMaxY()) {
                                            highlightQuads.add(Quad.get(textBox));
                                            textBox = (Rectangle2D) textCharBox.clone();
                                        } else {
                                            textBox.add(textCharBox);
                                        }
                                    }
                                }
                                highlightQuads.add(Quad.get(textBox));
                            }
                            // Highlight the text pattern match!
                            new TextMarkup(page, highlightQuads, null, MarkupTypeEnum.Highlight);
                        }

                        @Override
                        public void remove() {
                            throw new UnsupportedOperationException();
                        }
                    }
                );
            }
        }
        String contentType = getContentType(documentMetadata.getFileName());
        if (contentType == null) {
            contentType = "binary/octet-stream";
        }
        response.setStatus(HttpStatus.OK.value());
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        if (output != null) {
            file.save(output, SerializationModeEnum.Standard);
            bytes = org.springframework.security.crypto.codec.Base64.encode(output.toByteArray());
            response.addHeader("Content-Disposition", "attachment; filename=" + documentMetadata.getFileName());
            response.addHeader("Content-Type", contentType);
            response.getOutputStream().write(bytes);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Here is the stack trace:
java.lang.IllegalArgumentException: Comparison method violates its general contract!
at java.util.TimSort.mergeLo(TimSort.java:777)
at java.util.TimSort.mergeAt(TimSort.java:514)
at java.util.TimSort.mergeCollapse(TimSort.java:439)
at java.util.TimSort.sort(TimSort.java:245)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1454)
at java.util.Collections.sort(Collections.java:175)
at org.pdfclown.tools.TextExtractor.sort(TextExtractor.java:675)
at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:306)
at nu.optimise.projectweb.service.DocumentMetadataService.searchWordInPdf(DocumentMetadataService.java:2669)
at nu.optimise.projectweb.service.DocumentMetadataService$$FastClassBySpringCGLIB$$fc6434c2.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:720)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
at nu.optimise.projectweb.aop.logging.LoggingAspect.logAround(LoggingAspect.java:51)
at sun.reflect.GeneratedMethodAccessor186.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:620)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:609)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:68)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.aspectj.AspectJAfterThrowingAdvice.invoke(AspectJAfterThrowingAdvice.java:59)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.transaction.interceptor.TransactionInterceptor.proceedWithInvocation(TransactionInterceptor.java:99)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:281)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:655)
at nu.optimise.projectweb.service.DocumentMetadataService$$EnhancerBySpringCGLIB$$c3a15a18.searchWordInPdf(<generated>)
at nu.optimise.projectweb.web.rest.DocumentMetadataResource.searchContentPDF(DocumentMetadataResource.java:1026)
at nu.optimise.projectweb.web.rest.DocumentMetadataResource$$FastClassBySpringCGLIB$$bb12eea8.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:720)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
at nu.optimise.projectweb.aop.logging.LoggingAspect.logAround(LoggingAspect.java:51)
at sun.reflect.GeneratedMethodAccessor186.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:620)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:609)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:68)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.aspectj.AspectJAfterThrowingAdvice.invoke(AspectJAfterThrowingAdvice.java:59)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at com.ryantenney.metrics.spring.TimedMethodInterceptor.invoke(TimedMethodInterceptor.java:48)
at com.ryantenney.metrics.spring.TimedMethodInterceptor.invoke(TimedMethodInterceptor.java:34)
at com.ryantenney.metrics.spring.AbstractMetricMethodInterceptor.invoke(AbstractMetricMethodInterceptor.java:59)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:655)
at nu.optimise.projectweb.web.rest.DocumentMetadataResource$$EnhancerBySpringCGLIB$$bfe48b3d.searchContentPDF(<generated>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:221)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:136)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:110)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:832)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:743)
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:961)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:895)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:967)
at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:858)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:622)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:843)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:292)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at com.codahale.metrics.servlet.AbstractInstrumentedFilter.doFilter(AbstractInstrumentedFilter.java:104)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.boot.actuate.autoconfigure.EndpointWebMvcAutoConfiguration$ApplicationContextHeaderFilter.doFilterInternal(EndpointWebMvcAutoConfiguration.java:281)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.boot.actuate.trace.WebRequestTraceFilter.doFilterInternal(WebRequestTraceFilter.java:115)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:115)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:137)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:112)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:169)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:63)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at nu.optimise.projectweb.security.jwt.JWTFilter.doFilter(JWTFilter.java:43)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:121)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:66)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:106)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:56)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:214)
at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:177)
at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:346)
at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:262)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.web.filter.HttpPutFormContentFilter.doFilterInternal(HttpPutFormContentFilter.java:87)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:77)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:121)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:212)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:141)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:528)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1099)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:670)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1520)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1476)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
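This is a bug in PDF Clown: during text extraction it uses a custom Comparator implementation which does not entirely obey the Comparator contract. Older Java versions ignored this, but the default sort algorithm used since Java 7 (and therefore by the Java 8 runtime in your stack trace) detects the violation, which leads to the exception at hand. If you instruct Java to use the old sort algorithm, the program runs without the exception.
The comparator
This is the problematic comparator: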
/**
  Text string position comparator.
*/
private static class TextStringPositionComparator
  implements Comparator<ITextString>
{
  /**
    Gets whether the specified boxes lay on the same text line.
  */
  public static boolean isOnTheSameLine(
    Rectangle2D box1,
    Rectangle2D box2
    )
  {
    /*
      NOTE: In order to consider the two boxes being on the same line,
      we apply a simple rule of thumb: at least 25% of a box's height MUST
      lay on the horizontal projection of the other one.
    */
    double minHeight = Math.min(box1.getHeight(), box2.getHeight());
    double yThreshold = minHeight * .75;
    return ((box1.getY() > box2.getY() - yThreshold
        && box1.getY() < box2.getMaxY() + yThreshold - minHeight)
      || (box2.getY() > box1.getY() - yThreshold
        && box2.getY() < box1.getMaxY() + yThreshold - minHeight));
  }

  @Override
  public int compare(
    ITextString textString1,
    ITextString textString2
    )
  {
    Rectangle2D box1 = textString1.getBox();
    Rectangle2D box2 = textString2.getBox();
    if(isOnTheSameLine(box1,box2))
    {
      /*
        [FIX:55:0.1.3] In order not to violate the transitive condition, equivalence on x-axis
        MUST fall back on y-axis comparison.
      */
      int xCompare = Double.compare(box1.getX(), box2.getX());
      if(xCompare != 0)
        return xCompare;
    }
    return Double.compare(box1.getY(), box2.getY());
  }
}
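As the comment [FIX:55:0.1.3] ... indicates, the author had already run into sorting problems. Unfortunately, he resolved only one troublesome situation.
The isOnTheSameLine test used in compare is obviously non-transitive in general. Consider three ITextString instances A, B, and C, arranged from left to right with each one sitting somewhat higher on the page than the one before it.
(This may happen in regular text, e.g. in a line with subscript text first, then normal text, then superscript text.)
A and B as well as B and C would be considered to be on the same line, but A and C would not. Thus, the first two pairs are each compared by their x coordinates while the last pair is compared by its y coordinates, resulting in non-transitivity:
- A < B and
- B < C, but
- A > C (PDF Clown uses y coordinates increasing downwards).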
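The three comparisons above can be reproduced without PDF Clown. The following is a minimal sketch that copies the same-line rule and the compare logic from the comparator quoted earlier and applies them to plain Rectangle2D boxes; the class name TransitivityDemo and the box coordinates are made up for illustration.

```java
import java.awt.geom.Rectangle2D;

class TransitivityDemo {
    /** Same rule of thumb as TextStringPositionComparator.isOnTheSameLine. */
    public static boolean isOnTheSameLine(Rectangle2D box1, Rectangle2D box2) {
        double minHeight = Math.min(box1.getHeight(), box2.getHeight());
        double yThreshold = minHeight * .75;
        return (box1.getY() > box2.getY() - yThreshold
                && box1.getY() < box2.getMaxY() + yThreshold - minHeight)
            || (box2.getY() > box1.getY() - yThreshold
                && box2.getY() < box1.getMaxY() + yThreshold - minHeight);
    }

    /** Same ordering logic as TextStringPositionComparator.compare, applied to boxes directly. */
    public static int compare(Rectangle2D box1, Rectangle2D box2) {
        if (isOnTheSameLine(box1, box2)) {
            int xCompare = Double.compare(box1.getX(), box2.getX());
            if (xCompare != 0) {
                return xCompare;
            }
        }
        return Double.compare(box1.getY(), box2.getY());
    }

    public static void main(String[] args) {
        // Hypothetical boxes (x, y, width, height); y grows downwards,
        // so A is the lowest box and C the highest.
        Rectangle2D a = new Rectangle2D.Double(0, 10, 10, 10);
        Rectangle2D b = new Rectangle2D.Double(20, 5, 10, 10);
        Rectangle2D c = new Rectangle2D.Double(40, 0, 10, 10);

        System.out.println("compare(A, B) = " + compare(a, b)); // negative: same line, A is left of B
        System.out.println("compare(B, C) = " + compare(b, c)); // negative: same line, B is left of C
        System.out.println("compare(A, C) = " + compare(a, c)); // positive: not on the same line, A is below C
    }
}
```

With equal heights of 10, two boxes count as being on the same line only if their y values differ by less than 7.5, which holds for A/B and B/C (difference 5) but not for A/C (difference 10).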
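The identity condition may be violated as well. Consider two ITextString instances A and B which both have the same box, i.e. the same dimensions and the same position (e.g. when a symbol is built from overlapping letters). compare returns 0 for them, which should happen only when an object is compared with one equal to it ("should" because this is merely a recommendation, not a strict requirement).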
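A small sketch of that situation, assuming a hypothetical Fragment class as a stand-in for ITextString. For two identical boxes the comparator quoted above finds a tie on x and then a tie on y, so the simplified comparator below (x first, then y) returns the same result for this case.

```java
import java.awt.geom.Rectangle2D;
import java.util.Comparator;

class OverlappingBoxesDemo {
    /** Hypothetical stand-in for an ITextString: a piece of text and its bounding box. */
    public static final class Fragment {
        public final String text;
        public final Rectangle2D box;

        public Fragment(String text, Rectangle2D box) {
            this.text = text;
            this.box = box;
        }
    }

    /** Simplified position comparison: x first, y as the fallback on an x tie. */
    public static final Comparator<Fragment> BY_POSITION = (f1, f2) -> {
        int xCompare = Double.compare(f1.box.getX(), f2.box.getX());
        return xCompare != 0 ? xCompare : Double.compare(f1.box.getY(), f2.box.getY());
    };

    public static void main(String[] args) {
        // Two different letters printed at exactly the same place (made-up coordinates).
        Fragment a = new Fragment("O", new Rectangle2D.Double(100, 200, 8, 12));
        Fragment b = new Fragment("/", new Rectangle2D.Double(100, 200, 8, 12));

        System.out.println("compare = " + BY_POSITION.compare(a, b)); // 0, the comparator cannot order them
        System.out.println("equals  = " + a.equals(b));               // false, they are distinct objects
    }
}
```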
In most situations, though, the comparator does sort the text pieces in the way one would consider correct.
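Work-around
The sort algorithm built into Java up to version 6 did not test whether the Comparator implementation fulfilled the contract. The result might not have been properly sorted, but sorting did not throw an exception. (Routines called later which assume the array to be sorted might fail badly, though.)
Since Java 7, however, a different default sort algorithm is used (TimSort, visible in your stack trace). It does some sanity checks which recognize certain effects of an unfulfilled Comparator contract on the sorting process.
By using the command line JRE argument
-Djava.util.Arrays.useLegacyMergeSort=true
you can tell Java 8 to use the old sort method, which does not fail with an exception.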
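In a servlet container it is easy to set such a flag in a place the JVM never sees, so it may be worth checking at startup that it actually arrived. The following is only a sketch; the class name LegacySortCheck is made up, and the check merely reports the property, it does not switch the sort algorithm. As far as I know the JDK reads this property only once, so setting it with System.setProperty after startup may come too late; passing it on the command line is the safer route.

```java
class LegacySortCheck {
    /** Name of the system property from the JRE argument above. */
    public static final String PROPERTY = "java.util.Arrays.useLegacyMergeSort";

    /** Returns true if the property is present and equal to "true" (case-insensitive). */
    public static boolean isLegacyMergeSortRequested() {
        return Boolean.getBoolean(PROPERTY);
    }

    public static void main(String[] args) {
        if (isLegacyMergeSortRequested()) {
            System.out.println(PROPERTY + " is set; the legacy merge sort was requested.");
        } else {
            System.out.println(PROPERTY + " is NOT set; PDF Clown text extraction may throw.");
        }
    }
}
```

Running it once with and once without -Djava.util.Arrays.useLegacyMergeSort=true shows whether the argument reaches the JVM.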
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127)
at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:115)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.session.SessionManagementFilter.doFilter(SessionManagementFilter.java:137)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.authentication.AnonymousAuthenticationFilter.doFilter(AnonymousAuthenticationFilter.java:112)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter.doFilter(SecurityContextHolderAwareRequestFilter.java:169)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.savedrequest.RequestCacheAwareFilter.doFilter(RequestCacheAwareFilter.java:63)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at nu.optimise.projectweb.security.jwt.JWTFilter.doFilter(JWTFilter.java:43)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.authentication.logout.LogoutFilter.doFilter(LogoutFilter.java:121)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.header.HeaderWriterFilter.doFilterInternal(HeaderWriterFilter.java:66)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.context.SecurityContextPersistenceFilter.doFilter(SecurityContextPersistenceFilter.java:106)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter.doFilterInternal(WebAsyncManagerIntegrationFilter.java:56)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)
at org.springframework.security.web.FilterChainProxy.doFilterInternal(FilterChainProxy.java:214)
at org.springframework.security.web.FilterChainProxy.doFilter(FilterChainProxy.java:177)
at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(DelegatingFilterProxy.java:346)
at org.springframework.web.filter.DelegatingFilterProxy.doFilter(DelegatingFilterProxy.java:262)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:99)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.web.filter.HttpPutFormContentFilter.doFilterInternal(HttpPutFormContentFilter.java:87)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.web.filter.HiddenHttpMethodFilter.doFilterInternal(HiddenHttpMethodFilter.java:77)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:121)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:212)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:141)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:528)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1099)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:670)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1520)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1476)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
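This is a bug in PDF Clown: during text extraction it uses a custom Comparator implementation that does not completely honor the Comparator contract. In Java 7 and below this was ignored, but in Java 8 it results in the exception at hand. If you instruct Java to use the older sorting algorithm, the program runs without an exception.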
The comparator
This is the comparator in question:
/**
  Text string position comparator.
*/
private static class TextStringPositionComparator
  implements Comparator<ITextString>
{
  /**
    Gets whether the specified boxes lay on the same text line.
  */
  public static boolean isOnTheSameLine(
    Rectangle2D box1,
    Rectangle2D box2
    )
  {
    /*
      NOTE: In order to consider the two boxes being on the same line,
      we apply a simple rule of thumb: at least 25% of a box's height MUST
      lay on the horizontal projection of the other one.
    */
    double minHeight = Math.min(box1.getHeight(), box2.getHeight());
    double yThreshold = minHeight * .75;
    return ((box1.getY() > box2.getY() - yThreshold
        && box1.getY() < box2.getMaxY() + yThreshold - minHeight)
      || (box2.getY() > box1.getY() - yThreshold
        && box2.getY() < box1.getMaxY() + yThreshold - minHeight));
  }

  @Override
  public int compare(
    ITextString textString1,
    ITextString textString2
    )
  {
    Rectangle2D box1 = textString1.getBox();
    Rectangle2D box2 = textString2.getBox();
    if(isOnTheSameLine(box1,box2))
    {
      /*
        [FIX:55:0.1.3] In order not to violate the transitive condition, equivalence on x-axis
        MUST fall back on y-axis comparison.
      */
      int xCompare = Double.compare(box1.getX(), box2.getX());
      if(xCompare != 0)
        return xCompare;
    }
    return Double.compare(box1.getY(), box2.getY());
  }
}
As the comment [FIX:55:0.1.3] ... indicates, the author had already run into sorting problems. Unfortunately, he fixed only one of the troublesome cases.
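The reason is that the isOnTheSameLine test used in compare is obviously not transitive in general. Consider three ITextString instances A, B, and C, placed left to right with each box somewhat higher on the page than the previous one.
(This can happen in regular text, e.g. in a line that starts with subscript text, continues with normally written text, and ends with superscript text.)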
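A and B, as well as B and C, would be considered to be on the same line, but A and C would not. Thus the former two pairs would each be compared by their x coordinates while the latter pair would be compared by its y coordinates, resulting in intransitivity:
- A < B and
- B < C, but
- A > C (PDF Clown uses y coordinates that increase downwards).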
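The intransitivity can be reproduced with three concrete boxes. The following is a minimal sketch, not PDF Clown code: it copies the decision logic of the comparator quoted above but applies it to plain Rectangle2D boxes, and the class name TransitivityDemo and the coordinates of A, B, and C are assumptions chosen for illustration.

```java
import java.awt.geom.Rectangle2D;

final class TransitivityDemo {
  // Hypothetical boxes (x, y, width, height); y grows downwards, so A sits lowest.
  static final Rectangle2D A = new Rectangle2D.Double(0, 10, 15, 10);  // subscript
  static final Rectangle2D B = new Rectangle2D.Double(20, 5, 15, 10);  // normal text
  static final Rectangle2D C = new Rectangle2D.Double(40, 0, 15, 10);  // superscript

  // Same rule of thumb as the quoted isOnTheSameLine.
  static boolean isOnTheSameLine(Rectangle2D box1, Rectangle2D box2) {
    double minHeight = Math.min(box1.getHeight(), box2.getHeight());
    double yThreshold = minHeight * .75;
    return ((box1.getY() > box2.getY() - yThreshold
        && box1.getY() < box2.getMaxY() + yThreshold - minHeight)
      || (box2.getY() > box1.getY() - yThreshold
        && box2.getY() < box1.getMaxY() + yThreshold - minHeight));
  }

  // Same decision logic as the quoted compare, applied to the boxes directly.
  static int compare(Rectangle2D box1, Rectangle2D box2) {
    if (isOnTheSameLine(box1, box2)) {
      int xCompare = Double.compare(box1.getX(), box2.getX());
      if (xCompare != 0)
        return xCompare;
    }
    return Double.compare(box1.getY(), box2.getY());
  }

  public static void main(String[] args) {
    // A/B and B/C overlap enough vertically, so they are ordered by x.
    System.out.println("A vs B: " + compare(A, B)); // negative, A < B
    System.out.println("B vs C: " + compare(B, C)); // negative, B < C
    // A/C are not on the same line, so they are ordered by y instead.
    System.out.println("A vs C: " + compare(A, C)); // positive, A > C
  }
}
```

With boxes of height 10, two boxes count as being on the same line when their y values differ by less than 7.5, which holds for A/B and B/C (difference 5) but not for A/C (difference 10).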
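The identity condition may also be violated. Consider two ITextString instances A and B that both have the same box, i.e. the same dimensions and printed at the same position (e.g. to build a symbol from overlapping letters). compare would return 0, which should happen only when an object is compared with one that is equal to it ("should" because this is only recommended, not strictly required).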
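The same-box case can be illustrated in the same way. This is only a sketch under stated assumptions: Chunk is a hypothetical stand-in for ITextString, BY_POSITION repeats the decision logic of the quoted comparator, and the TreeSet is used purely to show a visible consequence of a comparator that returns 0 for unequal objects; the text extraction discussed here sorts the strings and is not claimed to use a TreeSet.

```java
import java.awt.geom.Rectangle2D;
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

final class SameBoxDemo {
  /** Hypothetical stand-in for ITextString: a text plus its bounding box. */
  static final class Chunk {
    final String text;
    final Rectangle2D box;
    Chunk(String text, Rectangle2D box) { this.text = text; this.box = box; }
  }

  /** Position-only ordering with the same decision logic as the quoted compare. */
  static final Comparator<Chunk> BY_POSITION = (c1, c2) -> {
    Rectangle2D box1 = c1.box, box2 = c2.box;
    double minHeight = Math.min(box1.getHeight(), box2.getHeight());
    double yThreshold = minHeight * .75;
    boolean sameLine =
      (box1.getY() > box2.getY() - yThreshold
        && box1.getY() < box2.getMaxY() + yThreshold - minHeight)
      || (box2.getY() > box1.getY() - yThreshold
        && box2.getY() < box1.getMaxY() + yThreshold - minHeight);
    if (sameLine) {
      int xCompare = Double.compare(box1.getX(), box2.getX());
      if (xCompare != 0)
        return xCompare;
    }
    return Double.compare(box1.getY(), box2.getY());
  };

  // Two different glyphs printed at exactly the same place (made-up coordinates).
  static final Chunk O = new Chunk("O", new Rectangle2D.Double(100, 50, 8, 10));
  static final Chunk SLASH = new Chunk("/", new Rectangle2D.Double(100, 50, 8, 10));

  /** Number of chunks a TreeSet ordered by BY_POSITION keeps out of O and SLASH. */
  static int keptInTreeSet() {
    Set<Chunk> set = new TreeSet<>(BY_POSITION);
    set.add(O);
    set.add(SLASH); // treated as a duplicate because compare returns 0
    return set.size();
  }

  public static void main(String[] args) {
    System.out.println("compare: " + BY_POSITION.compare(O, SLASH));
    System.out.println("equals:  " + O.equals(SLASH));
    System.out.println("kept:    " + keptInTreeSet());
  }
}
```

The two chunks are different objects with different text, yet the position-only ordering cannot tell them apart, so any collection that relies on the comparator for equality treats them as one element.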
In most cases, though, the comparator does sort the text chunks in the order one would consider correct.
A work-around
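The Java sort algorithms built in before Java 8 did not test whether a Comparator implementation fulfilled its contract. The result might not be correctly ordered, but the sort did not throw an exception. (Some routines called later that assume the array to be sorted might fail badly, though.)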
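Java 8, though, uses a different default sorting algorithm, which performs some sanity checks that recognize certain effects of an unfulfilled Comparator contract on the sorting process.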
By using the command-line JRE argument
-Djava.util.Arrays.useLegacyMergeSort=true
you can tell Java 8 to use the old sorting method, which does not fail with exceptions.
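Because the stack trace above comes from a web application running inside Tomcat, the argument has to reach the JVM that hosts the server, not only a local test run. A small check like the following can confirm that the property is actually visible to the running JVM. It is a sketch only: the class name LegacySortFlag is made up for illustration, and it merely reads the system property; it does not change which sort the JDK uses.

```java
final class LegacySortFlag {
  // Name of the system property from the command-line argument above.
  static final String PROPERTY = "java.util.Arrays.useLegacyMergeSort";

  /** True if the JVM was started with (or otherwise has) the property set to "true". */
  static boolean isRequested() {
    return Boolean.getBoolean(PROPERTY);
  }

  public static void main(String[] args) {
    System.out.println(PROPERTY + " = " + isRequested());
  }
}
```

Running it as java -Djava.util.Arrays.useLegacyMergeSort=true LegacySortFlag should report true. Setting the property from application code after startup may come too late, since the JDK may already have read it, so passing it on the command line is the safer route.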