MalformedInputException: Input length = 1 when run more than once

Question

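This code works fine the first time it processes a file, but on the second run (on the same file) Files.readAllLines throws the exception above.

All the code does (for each file, although in this case there is only one) is read all lines from the file, delete the file, and then refill it with the same content.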

for (File file : content) {
    List<String> fillLines = new ArrayList<>();
    try {
        fillLines = Files.readAllLines(file.toPath());
    } catch (IOException e) {
        e.printStackTrace();
    }

    if (fillLines.size() > 0) {
        file.delete();
        FileWriter fileWriter = new FileWriter(file, false);

        for (String line : fillLines) {
            fileWriter.write(line);
            if (fillLines.indexOf(line) < fillLines.size() - 1)
                fileWriter.append(System.lineSeparator());
        }
        fileWriter.close();
    }
}

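Any ideas? Could it be because of fileWriter.append(System.lineSeparator());?

All the other askers failed on the very first run because they read the file with the wrong charset. Since I was able to run it once, I assume the problem is not in how I read the file but in how I write it, so changing the charset looks like a workaround I would rather avoid.

Stack trace: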

java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(Unknown Source)
    at sun.nio.cs.StreamDecoder.implRead(Unknown Source)
    at sun.nio.cs.StreamDecoder.read(Unknown Source)
    at java.io.InputStreamReader.read(Unknown Source)
    at java.io.BufferedReader.fill(Unknown Source)
    at java.io.BufferedReader.readLine(Unknown Source)
    at java.io.BufferedReader.readLine(Unknown Source)
    at java.nio.file.Files.readAllLines(Unknown Source)
    at java.nio.file.Files.readAllLines(Unknown Source)

Everything here points to

    fillLines = Files.readAllLines(file.toPath());

Answer 1

From the documentation of Files.readAllLines():

Bytes from the file are decoded into characters using the UTF-8 charset

From the documentation of FileWriter:

The constructors of this class assume that the default character encoding and the default byte-buffer size are acceptable. To specify these values yourself, construct an OutputStreamWriter on a FileOutputStream.

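So you are writing with the platform default encoding (which in your case is not UTF-8) and reading with UTF-8. That is the cause of the exception. Use the same encoding for writing and for reading. The documentation quoted above explains how to specify UTF-8 when writing.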
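As a minimal sketch of what the FileWriter documentation suggests, the following writes the lines through an OutputStreamWriter wrapped around a FileOutputStream with an explicit UTF-8 charset, so that Files.readAllLines (which decodes as UTF-8) can read the result back on every later run. The class name Utf8Rewrite, the helper writeLinesUtf8 and the sample lines are made up for this illustration.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class Utf8Rewrite {

    // Hypothetical helper: overwrites the file with the given lines, encoded as UTF-8.
    // Lines are separated (not terminated) by the platform line separator.
    static void writeLinesUtf8(Path path, List<String> lines) throws IOException {
        try (Writer writer = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(path.toFile(), false), StandardCharsets.UTF_8))) {
            for (int i = 0; i < lines.size(); i++) {
                writer.write(lines.get(i));
                if (i < lines.size() - 1) {
                    writer.write(System.lineSeparator());
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("utf8-rewrite", ".txt");
        List<String> original = Arrays.asList("caf\u00e9", "na\u00efve", "plain");
        writeLinesUtf8(path, original);

        // Read and rewrite the same file several times, as in the question.
        for (int run = 0; run < 3; run++) {
            List<String> lines = Files.readAllLines(path); // decodes as UTF-8
            writeLinesUtf8(path, lines);
        }

        System.out.println(Files.readAllLines(path).equals(original));
        Files.delete(path);
    }
}
```

The loop uses an index instead of fillLines.indexOf(line), because indexOf returns the first matching line and so misjudges the last line when the same text appears earlier in the file.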

Answer 2

for (File file : content) {
    Path path = file.toPath();
    List<String> fillLines;
    try {
        fillLines = Files.readAllLines(path);
    } catch (IOException e) {
        System.err.println("Error while reading " + path);
        e.printStackTrace();
        fillLines = new ArrayList<>();
    }

    if (!fillLines.isEmpty()) {
        //Files.delete(path);
        // See -A-
        Files.write(path, fillLines, StandardOpenOptions.TRUNCATE_EXISTING);
    }
}

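Why do it this way, apart from it being shorter, safer and more consistent?

The error

You did not specify the charset used for reading and for writing.

Before Files, that meant the platform encoding was used, and there were overloaded constructors taking a Charset or a String encoding name.

For the very old FileReader/FileWriter not even such overloads existed: they always used the platform encoding - System.getProperty("file.encoding"). (Constructors taking a Charset were only added in Java 11.)

With Files, Unicode's UTF-8 became the default: since Java Strings hold Unicode, the conversion is lossless. Great!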
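To make the charset explicit on both sides rather than relying on any default, one option is to pass the same Charset to Files.readAllLines and Files.write. This is only a sketch: the class name ExplicitCharsetRoundTrip, the rewrite helper and the sample text are assumptions for the example. Note that Files.write terminates every line, including the last one, with the platform line separator.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

public class ExplicitCharsetRoundTrip {

    // One charset shared by the reading and the writing side.
    static final Charset CHARSET = StandardCharsets.UTF_8;

    // Hypothetical helper: reads all lines and writes them back with the same charset.
    static List<String> rewrite(Path path) throws IOException {
        List<String> lines = Files.readAllLines(path, CHARSET);
        Files.write(path, lines, CHARSET,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("round-trip", ".txt");
        Files.write(path, Arrays.asList("gr\u00fc\u00dfe", "second line"), CHARSET);

        // Running the rewrite repeatedly keeps the file decodable.
        List<String> first = rewrite(path);
        List<String> second = rewrite(path);
        System.out.println(first.equals(second));

        Files.delete(path);
    }
}
```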

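But after writing with something like Windows-1252, reading as UTF-8 is likely to fail, because in UTF-8 bytes with the 8th bit set are reserved for multi-byte sequences and must form a valid one.

Note: originally the file was UTF-8, but once it has been rewritten it no longer is (it is not valid UTF-8).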
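The failure can be reproduced without touching any file. The sketch below encodes the same text once as UTF-8 and once as ISO-8859-1, then tries to decode both byte arrays as UTF-8. ISO-8859-1 is used here as a stand-in for Windows-1252 because it is guaranteed to be available and encodes the accented letter to the same single byte (0xE9); the class name MalformedDemo, the helper isValidUtf8 and the sample text are assumptions for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class MalformedDemo {

    // Returns true if the bytes decode cleanly as UTF-8, false if the decoder
    // reports malformed input (the condition behind MalformedInputException).
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "caf\u00e9 au lait";

        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);         // accented letter becomes 2 bytes
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);  // accented letter becomes 1 byte

        System.out.println("UTF-8 bytes valid as UTF-8:      " + isValidUtf8(utf8));
        System.out.println("ISO-8859-1 bytes valid as UTF-8: " + isValidUtf8(latin1));
    }
}
```

In the single-byte encoding the accented letter is one byte with the high bit set, followed by an ordinary ASCII byte. A UTF-8 decoder treats that byte as the start of a multi-byte sequence, finds no continuation byte, and rejects it, which matches the "Input length = 1" in the stack trace.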

// -A-
// Possibly add a BOM (byte order mark) at the start of the file to tell Windows
// that this file is in UTF-8 (works for UTF-16 too).
// This is primarily for Notepad. A BOM is redundant, invisible (U+FEFF, zero-width
// no-break space) and generally inadvisable if not needed.
// fillLines is known to be non-empty at this point.
if (!fillLines.get(0).startsWith("\uFEFF")) {
    fillLines.set(0, "\uFEFF" + fillLines.get(0));
}
