Extracting vector graphics (lines and points) with pdfclown

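I want to use PDF Clown to extract vector graphics (lines and points) from a PDF. I have tried to wrap my head around the graphics samples, but I can't figure out how the object model works for this. Can anyone explain the relationships?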

You are right: up to the PDF Clown 0.1 series, high-level path modelling was not implemented (it would have been derived from ContentScanner.GraphicsWrapper).

The next release (0.2 series, due next month) will support the high-level representation of all the graphics contents, including path objects (PathElement), through the new ContentModeller. Here is an example:

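import org.pdfclown.documents.contents.elements.ContentModeller;
import org.pdfclown.documents.contents.elements.GraphicsElement;
import org.pdfclown.documents.contents.elements.PathElement;
import org.pdfclown.documents.contents.objects.Path;

import java.awt.geom.GeneralPath;
import java.util.List;

// `page` is the page whose contents are being modelled.
// The import for ContentMarker is not shown here.
for(GraphicsElement<?> element : ContentModeller.model(page, Path.class))
{
  // Only Path objects were requested, so each element is a PathElement.
  PathElement pathElement = (PathElement)element;
  List<ContentMarker> markers = pathElement.getMarkers();
  pathElement.getBox();                     // box enclosing the path
  GeneralPath path = pathElement.getPath(); // geometry as a Java2D path
  pathElement.isFilled();                   // whether the path is filled
  pathElement.isStroked();                  // whether the path is stroked
}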
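
Since getPath() hands back a standard java.awt.geom.GeneralPath, the individual lines can then be read out with plain Java2D, independently of PDF Clown. The following is only a minimal sketch of that step: the class name PathSegments and the method lineSegments are made up for illustration, and curves are simply skipped rather than flattened.

```java
import java.awt.geom.GeneralPath;
import java.awt.geom.PathIterator;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of PDF Clown): walks a Java2D path
// and collects its straight segments.
class PathSegments {
  /** Returns each straight segment of the path as {x1, y1, x2, y2}. */
  static List<double[]> lineSegments(GeneralPath path) {
    List<double[]> segments = new ArrayList<>();
    double[] c = new double[6];      // coordinates of the current segment
    double startX = 0, startY = 0;   // start of the current subpath
    double curX = 0, curY = 0;       // current point
    for (PathIterator it = path.getPathIterator(null); !it.isDone(); it.next()) {
      switch (it.currentSegment(c)) {
        case PathIterator.SEG_MOVETO:
          startX = curX = c[0];
          startY = curY = c[1];
          break;
        case PathIterator.SEG_LINETO:
          segments.add(new double[] {curX, curY, c[0], c[1]});
          curX = c[0];
          curY = c[1];
          break;
        case PathIterator.SEG_QUADTO:   // curve: only track its end point
          curX = c[2];
          curY = c[3];
          break;
        case PathIterator.SEG_CUBICTO:  // curve: only track its end point
          curX = c[4];
          curY = c[5];
          break;
        case PathIterator.SEG_CLOSE:    // implicit line back to the subpath start
          if (curX != startX || curY != startY) {
            segments.add(new double[] {curX, curY, startX, startY});
          }
          curX = startX;
          curY = startY;
          break;
      }
    }
    return segments;
  }

  public static void main(String[] args) {
    GeneralPath triangle = new GeneralPath();
    triangle.moveTo(0, 0);
    triangle.lineTo(10, 0);
    triangle.lineTo(10, 5);
    triangle.closePath();
    for (double[] s : lineSegments(triangle)) {
      System.out.printf("(%.1f, %.1f) -> (%.1f, %.1f)%n", s[0], s[1], s[2], s[3]);
    }
  }
}
```

If curves matter as well, passing a flatness value to getPathIterator(null, flatness) makes Java2D approximate them with line segments, so the same loop would pick them up.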

In the meantime, you can extract the low-level representation of the vector graphics by iterating the content stream through ContentScanner, as suggested in ContentScanningSample (available in the downloadable distribution), looking for path-related operations (BeginSubpath, DrawLine, DrawRectangle, DrawCurve, ...).
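
To make "looking for path-related operations" a bit more concrete: in the raw content stream, paths are built from operators such as m (begin subpath), l (line), re (rectangle), c (curve) and h (close), each preceded by its numeric operands, and the operation classes named above presumably correspond to these. The sketch below does not use PDF Clown at all. It is a toy scanner over a whitespace-separated string, with the hypothetical names PathOperatorScan and scan, and it only understands numeric operands, so it is not a substitute for ContentScanner on a real stream.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy illustration only: a real content stream also contains names, strings,
// arrays and inline images, which this scanner does not handle.
class PathOperatorScan {
  /** Collects path-construction operators with their operands, e.g. "m [10.0, 20.0]". */
  static List<String> scan(String contentStream) {
    List<String> found = new ArrayList<>();
    Deque<Double> operands = new ArrayDeque<>();
    for (String token : contentStream.trim().split("\\s+")) {
      try {
        // Numbers are operands: queue them until an operator arrives.
        operands.addLast(Double.parseDouble(token));
        continue;
      } catch (NumberFormatException notANumber) {
        // Not a number, so treat the token as an operator.
      }
      switch (token) {
        case "m":   // begin subpath
        case "l":   // straight line
        case "re":  // rectangle
        case "c":   // cubic Bezier curve
        case "h":   // close subpath
          found.add(token + " " + operands);
          break;
        default:
          break;    // any other operator is ignored
      }
      operands.clear(); // operands belong to the operator just seen
    }
    return found;
  }

  public static void main(String[] args) {
    for (String op : scan("10 20 m 30 40 l 0 0 5 5 re S")) {
      System.out.println(op);
    }
  }
}
```

With ContentScanner the idea is the same, except that the library does the tokenizing and hands over typed operation objects instead of raw tokens.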