Multithreaded Geometry loading with GeoTools

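Hey Stack Overflow community, I am currently trying to write a small tool that reads shapefile geometries (MultiPolygons/Polygons) and writes the WKT representation of those geometries to a text file. For that I am using GeoTools, and I managed to get it running fine. However, I am converting files that contain about 5,000,000 polygons/multipolygons, so it takes very long to finish.

So my question is:

Is it possible to speed up the file loading/writing? Since I am using a SimpleFeatureIterator, I could not find out how to implement multithreading.

Is there a way to do so? Or does anyone know how to get the shapefile geometries without using an iterator?

Here is my code:

This method just declares the file chooser and starts a thread for each selected file.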

protected static void printGeometriesToFile() {
    JFileChooser chooser = new JFileChooser();
    FileNameExtensionFilter filter = new FileNameExtensionFilter(
            "shape-files", "shp");
    chooser.setFileFilter(filter);
    chooser.setDialogTitle("Choose the file to be converted.");
    chooser.setMultiSelectionEnabled(true);
    int returnVal = chooser.showOpenDialog(null);
    if (returnVal != JFileChooser.APPROVE_OPTION) {
        // Dialog was cancelled: nothing to convert, and this avoids a NullPointerException below.
        return;
    }
    File[] files = chooser.getSelectedFiles();

    for (int i = 0; i < files.length; i++) {
        MultiThreadWriter writer = new MultiThreadWriter(files[i]);
        writer.start();
    }
}

The multithreading class:

class MultiThreadWriter extends Thread {
    private File threadFile;

    MultiThreadWriter(File file) {
        threadFile = file;
        System.out.println("Starting Thread for " + file.getName());
    }

    public void run() {
        try {
            File outputFolder = new File(threadFile.getAbsolutePath() + ".txt");
            FileOutputStream fos = new FileOutputStream(outputFolder);
            System.out.println("Now writing data to file: " + outputFolder.getName());

            FileDataStore store = FileDataStoreFinder.getDataStore(threadFile);
            SimpleFeatureSource featureSource = store.getFeatureSource();

            SimpleFeatureCollection featureCollection = featureSource.getFeatures();
            SimpleFeatureIterator featureIterator = featureCollection.features();

            int pos = 0;

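            try {
                while (featureIterator.hasNext()) {
                    fos.write((geometryToByteArray((Polygonal) featureIterator.next().getAttribute("the_geom"))));

                    pos++;
                    System.out.println("The file " + threadFile.getName() + "'s current position is: " + pos);
                }
            } finally {
                // Release the shapefile and the output file even if a write fails.
                featureIterator.close();
                store.dispose();
                fos.close();
            }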

            System.out.println("Finished writing.");

        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This is just a helper function that converts a MultiPolygon into Polygons and returns their WKT representation with "|" as the delimiter.

private byte[] geometryToByteArray(Polygonal polygonal) {

    List<Polygon> polygonList;

    if (polygonal instanceof MultiPolygon) {
        polygonList = GeometrieUtils.convertMultiPolygonToPolygonList((MultiPolygon) polygonal);
        // The method above just converts a MultiPolygon into a list of Polygons
    } else {
        polygonList = new ArrayList<>(1);
        polygonList.add((Polygon) polygonal);
    }

    // StringBuilder avoids re-copying the whole string on every append.
    StringBuilder polygonString = new StringBuilder();
    for (Polygon polygon : polygonList) {
        polygonString.append(polygon.toString()).append("|");
    }

    return polygonString.toString().getBytes();
}

}

I know my code isn't pretty or good. I have just started learning Java and hope it will get better soon.

Sincerely,

I have no clue :)

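I would prefer to read the file content as a list of objects, then split the list into sublists and create a thread for each sublist. For example: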

int nbrThreads = 10;

ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(nbrThreads);

int count = myObjectsList != null ? myObjectsList.size() / nbrThreads : 0;

List<List<MyObject>> resultlists = choppeList(myObjectsList, count > 0 ? count : 1);

try
{
    for (List<MyObject> list : resultlists)
    {
        // TODO: create your task and pass it the list of objects
    }

    executor.shutdown();

    executor.awaitTermination(30, TimeUnit.MINUTES); // choose the termination timeout
}
catch (Exception e)
{
    LOG.error("Problem launching threads", e);
}

The choppeList method could look like this:

public <T> List<List<T>> choppeList(final List<T> list, final int L)
{
    final List<List<T>> parts = new ArrayList<List<T>>();
    final int N = list.size();
    for (int i = 0; i < N; i += L)
    {
        parts.add(new ArrayList<T>(list.subList(i, Math.min(N, i + L))));
    }
    return parts;
}
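
Putting the two snippets above together, a minimal sketch of the chunk-and-submit idea might look like the following. It is not GeoTools-specific: plain strings stand in for the objects read from the shapefile, and the names ChunkedConversion, chop, convert and convertInParallel are placeholders made up for illustration. It also assumes the whole list fits in memory, which may not hold for 5,000,000 polygons.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class; the names are illustrative and not part of GeoTools.
class ChunkedConversion {

    // Same splitting logic as choppeList: consecutive parts of at most chunkSize items.
    static <T> List<List<T>> chop(List<T> list, int chunkSize) {
        List<List<T>> parts = new ArrayList<>();
        for (int i = 0; i < list.size(); i += chunkSize) {
            parts.add(new ArrayList<>(list.subList(i, Math.min(list.size(), i + chunkSize))));
        }
        return parts;
    }

    // Placeholder for the real per-object work (e.g. turning a geometry into WKT).
    static String convert(String item) {
        return item.toUpperCase() + "|";
    }

    // Converts each chunk on a pool thread and returns the results in input order.
    static List<String> convertInParallel(List<String> items, int nbrThreads) {
        ExecutorService executor = Executors.newFixedThreadPool(nbrThreads);
        try {
            int chunkSize = Math.max(1, items.size() / nbrThreads);
            List<Future<List<String>>> futures = new ArrayList<>();
            for (List<String> chunk : chop(items, chunkSize)) {
                Callable<List<String>> task = () -> {
                    List<String> out = new ArrayList<>(chunk.size());
                    for (String item : chunk) {
                        out.add(convert(item));
                    }
                    return out;
                };
                futures.add(executor.submit(task));
            }
            List<String> result = new ArrayList<>(items.size());
            for (Future<List<String>> future : futures) {
                result.addAll(future.get()); // blocks until that chunk is done
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for chunks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A chunk failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(convertInParallel(List.of("a", "b", "c", "d", "e"), 2));
    }
}
```

Collecting the futures in submission order keeps the output in the same order as the input, which matters if the text file should list the geometries in shapefile order.
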
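
  1. You do not need to create a new thread for every file, because creating a new thread is an expensive operation. Instead, you can have MultiThreadWriter implement Runnable and use a ThreadPoolExecutor to manage all the threads.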

    MultiThreadWriter

    public class MultiThreadWriter implements Runnable {
        @Override
        public void run() {
            //
        }
    }
    

    Create a thread pool that matches the number of processors available at runtime.

    ExecutorService service = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    
    for (int i = 0; i < files.length; i++) {
        MultiThreadWriter writer = new MultiThreadWriter(files[i]);
        service.submit(writer);
    }
    // Stop accepting new tasks; already submitted ones still run to completion.
    service.shutdown();
    
  2. You can use a BufferedWriter instead of writing to the OutputStream directly; it is more efficient when you repeatedly write small chunks.

    File outputFolder = new File(threadFile.getAbsolutePath() + ".txt");
    FileOutputStream fos = new FileOutputStream(outputFolder);
    BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(fos));
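
Combining both points above, a rough sketch of a Runnable that writes through a BufferedWriter and is run on a fixed thread pool could look like this. The class BufferedWktTask and the method writeAll are hypothetical names, and the task writes ready-made strings rather than reading a shapefile, so treat it as an outline and not as a drop-in replacement for MultiThreadWriter.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for MultiThreadWriter: writes pre-computed strings instead of geometries.
class BufferedWktTask implements Runnable {
    private final List<String> wktStrings;
    private final Path output;

    BufferedWktTask(List<String> wktStrings, Path output) {
        this.wktStrings = wktStrings;
        this.output = output;
    }

    @Override
    public void run() {
        // try-with-resources flushes and closes the writer even if a write fails.
        try (BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (String wkt : wktStrings) {
                writer.write(wkt);
                writer.write('|'); // same delimiter as in the question
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs one task per output file on a fixed pool and waits for all of them.
    static void writeAll(List<BufferedWktTask> tasks) {
        ExecutorService service =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        for (BufferedWktTask task : tasks) {
            // Exceptions thrown by run() are captured in the returned Future, not printed.
            service.submit(task);
        }
        service.shutdown(); // no new tasks; already submitted ones still run
        try {
            if (!service.awaitTermination(30, TimeUnit.MINUTES)) {
                service.shutdownNow();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("polygons", ".txt");
        writeAll(List.of(new BufferedWktTask(List.of("POLYGON EMPTY"), out)));
        System.out.println(Files.readString(out));
    }
}
```

Each task owns its own output file, so the threads never share a writer and no locking is needed. Note that this only parallelizes across files; a single large shapefile would still be read by one thread.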