Selecting a particular Column in a CSV-file Dynamically

Question

I have this CSV file:

id,name,mark
20203923380,Lisa Hatfield,62
20200705173,Jessica Johnson,59
20205415333,Adam Harper,41
20203326467,Logan Nolan,77

I'm trying to process it with the following code:

try (Stream<String> stream = Files.lines(Paths.get(String.valueOf(csvPath)))) {
    DoubleSummaryStatistics statistics = stream
            .map(s -> s.split(",")[index])
            .skip(1)
            .mapToDouble(Double::valueOf)
            .summaryStatistics();
} catch (IOException e) // more code

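What I want is to get the column by its name.

I figured I need to validate the index to be the index of the column the user enters as an integer, like this: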

int index = Arrays.stream(stream).indexOf(columnNS);

But it doesn't work.

The stream is supposed to have the following values, for example:

Column: "mark"

62, 59, 41, 77

Answer 1

I need to validate the index to be the index of the column the user enters as an integer ... But it doesn't work.

Arrays.stream(stream).indexOf(columnNS)

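There is no method indexOf in the Stream API. I'm not sure what stream(stream) is supposed to mean, but this approach is wrong.

To obtain a valid index, you need the name of the column. Based on that name, you have to analyze the first line retrieved from the file. As in your example with the column named "mark", you need to find out whether this name is present in the first line and what its index is.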
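
A minimal sketch of that lookup, assuming the header is a plain comma-separated line without quoted fields. The class name HeaderIndex and the method indexOfColumn are made up for illustration and do not come from the question:

```java
import java.util.Arrays;

public class HeaderIndex {

    // Hypothetical helper: returns the position of columnName in the
    // comma-separated header line, or -1 if the header has no such column.
    public static int indexOfColumn(String headerLine, String columnName) {
        return Arrays.asList(headerLine.split(",")).indexOf(columnName);
    }

    public static void main(String[] args) {
        String header = "id,name,mark"; // first line of the file from the question
        System.out.println(indexOfColumn(header, "mark"));  // prints 2
        System.out.println(indexOfColumn(header, "grade")); // prints -1
    }
}
```

List.indexOf does the search here, which is why the array is wrapped with Arrays.asList first.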

What I want is to get the column by its name ... The stream is supposed ...

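Streams are intended to be stateless. They were introduced in Java to provide a way of expressing the structure of the code clearly. Even if you manage to cram stateful conditional logic into a stream, you will lose this advantage and end up with convoluted code that is less clear than a plain loop and performs worse (reminder: an iterative solution almost always performs better).

So if you want to keep your code clean, you have a choice: either solve this problem with an iterative approach, or give up the requirement of determining the column index dynamically inside the stream.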
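
For the second option, one possible arrangement is to resolve the index from the header line before the stream starts, and then stream only the remaining lines. The sketch below assumes a plain comma-separated file without quoted fields and a Java 11+ runtime. The class name ColumnStats and the method summarize are illustrative names, not part of the question:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;

public class ColumnStats {

    // Hypothetical helper: computes statistics for the numeric column called columnName.
    public static DoubleSummaryStatistics summarize(Path csv, String columnName) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(csv)) {
            String header = reader.readLine(); // consume the header outside the stream
            if (header == null) {
                throw new IllegalArgumentException("File is empty");
            }
            int index = Arrays.asList(header.split(",")).indexOf(columnName);
            if (index == -1) {
                throw new IllegalArgumentException("Column '" + columnName + "' wasn't found");
            }
            // reader.lines() only yields the lines that have not been read yet,
            // so the stream stays stateless and needs no skip(1)
            return reader.lines()
                    .filter(line -> !line.isBlank())
                    .mapToDouble(line -> Double.parseDouble(line.split(",")[index].trim()))
                    .summaryStatistics();
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample data from the question, written to a temporary file
        Path csv = Files.createTempFile("marks", ".csv");
        Files.writeString(csv, String.join("\n",
                "id,name,mark",
                "20203923380,Lisa Hatfield,62",
                "20200705173,Jessica Johnson,59",
                "20205415333,Adam Harper,41",
                "20203326467,Logan Nolan,77"));
        DoubleSummaryStatistics stats = summarize(csv, "mark");
        System.out.println(stats.getCount() + " " + stats.getMin() + " " + stats.getMax()); // prints 4 41.0 77.0
        Files.delete(csv);
    }
}
```

The index is computed once and is effectively final, so the lambda can capture it without any mutable state.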

This is how you can read the file data dynamically, based on the column name, using a loop:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;

public static List<String> readFile(Path path, String columnName) {
    List<String> result = new ArrayList<>();
    try (var reader = Files.newBufferedReader(path)) {
        int index = -1;
        String line;
        while ((line = reader.readLine()) != null) {
            // split on the CSV delimiter only, so values like "62.5" stay intact
            String[] arr = line.split(",");
            if (index == -1) {
                index = getIndex(arr, columnName);
                continue; // skipping the first line
            }
            result.add(arr[index]);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return result;
}
// validation logic resides here
public static int getIndex(String[] arr, String columnName) {
    int index = Arrays.asList(arr).indexOf(columnName);
    if (index == -1) {
        throw new IllegalArgumentException("Given column name '" + columnName + "' wasn't found");
    }
    return index;
}
// extracting statistics from the file data
public static DoubleSummaryStatistics getStat(List<String> list) {
    return list.stream()
        .mapToDouble(Double::parseDouble)
        .summaryStatistics();
}

public static void main(String[] args) {
    DoubleSummaryStatistics stat = getStat(readFile(Path.of("test.txt"), "mark"));
}


java

csv

io

nio

stream