Set DataInputStream to String Value

Question

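I am trying to write a JUnit test for a method that depads a word, that is, strips the `*` padding characters from it. The problem is that the method returns symbols rather than the depadded word.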

My test method is

    @Test
    public void testReadString() throws IOException
    {
        String testString = "******test";

        InputStream stream = new ByteArrayInputStream(testString.getBytes(StandardCharsets.UTF_8));
        DataInputStream dis = new DataInputStream(stream);

        String word = readString(dis, 10);

        assertEquals("test", word);
    }

The method it is testing is

    public static String readString(DataInputStream dis, int size) throws IOException
    {
        byte[] makeBytes = new byte[size * 2]; // 2 bytes per char
        dis.read(makeBytes);  // read size characters (including padding)
        return depad(makeBytes);
    }

    public static String depad(byte[] read)
    {
        //word = word.replace("*", "");
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < read.length; i += 2)
        {
            char c = (char) (((read[i] & 0x00FF) << 8) + (read[i + 1] & 0x00FF));

            if (c != '*')
            {
                word.append(c);
            }
        }
        return word.toString();
    }

The error I get when running the test is that it fails: expected [test] but was [⨪⨪⨪瑥獴]

Answer 1

    InputStream stream = new ByteArrayInputStream(testString.getBytes(StandardCharsets.UTF_8));

    ...

    char c = (char) (((read[i] & 0x00FF) << 8) + (read[i + 1] & 0x00FF));

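Your code expects a UCS-2 encoded string, but you are feeding it a UTF-8 encoded string. In UCS-2, every character is exactly two bytes. UTF-8 is a variable-length encoding in which ASCII characters take one byte and all other characters take two or more.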
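To make the mismatch concrete, here is a small sketch that pairs bytes into chars the same way the question's `depad` loop does. The class name `EncodingMismatchDemo` and the helper `pairBytes` are made up for illustration; only the byte-pairing expression comes from the question. It shows that the UTF-8 form of the test string is 10 bytes rather than 20, so each pair of ASCII bytes gets fused into one unrelated character.

```java
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {

    // Hypothetical helper: combines each pair of bytes (high byte first)
    // into one char, mirroring the expression used in the question's depad().
    static String pairBytes(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i + 1 < bytes.length; i += 2) {
            sb.append((char) (((bytes[i] & 0xFF) << 8) | (bytes[i + 1] & 0xFF)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "******test";
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = s.getBytes(StandardCharsets.UTF_16BE);

        System.out.println(utf8.length);   // 10: one byte per ASCII character
        System.out.println(utf16.length);  // 20: two bytes per character

        // 0x2A 0x2A becomes U+2A2A, 0x74 0x65 becomes U+7465, 0x73 0x74 becomes U+7374,
        // which are the symbols reported in the failing assertion.
        System.out.println(pairBytes(utf8));

        // With two bytes per character, the pairing reproduces the original text.
        System.out.println(pairBytes(utf16));
    }
}
```

In the question's own code the buffer is 20 bytes long but only 10 bytes are available, so the remaining pairs would presumably decode as NUL characters appended after the visible symbols.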

See: Comparison of Unicode encodings (Wikipedia)

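Note that UCS-2 is a very simple and outdated encoding. It can only encode the first 64K Unicode characters (the Basic Multilingual Plane). In modern Unicode applications it has been superseded by UTF-16. According to the Unicode Consortium: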

UCS-2 should now be considered obsolete. It no longer refers to an encoding form in either 10646 or the Unicode Standard.
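If you do want to keep the byte-based `readString`, one way to get the test passing is to build the input with a two-byte-per-character encoding. The sketch below assumes `StandardCharsets.UTF_16BE`, which produces the same bytes as UCS-2 for characters in the Basic Multilingual Plane and writes no byte order mark. The class name `PaddedStringDemo` and the helper `decodePadded` are hypothetical, and swapping `read` for `readFully` is my own change so that a short input fails with an `EOFException` instead of being silently zero-filled.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class PaddedStringDemo {

    // Same idea as the question's readString, but readFully insists on
    // getting all size * 2 bytes (assumption: short input should be an error).
    static String readString(DataInputStream dis, int size) throws IOException {
        byte[] raw = new byte[size * 2]; // 2 bytes per char
        dis.readFully(raw);
        return depad(raw);
    }

    // Decodes big-endian 2-byte chars and drops the '*' padding.
    static String depad(byte[] raw) {
        StringBuilder word = new StringBuilder();
        for (int i = 0; i + 1 < raw.length; i += 2) {
            char c = (char) (((raw[i] & 0xFF) << 8) | (raw[i + 1] & 0xFF));
            if (c != '*') {
                word.append(c);
            }
        }
        return word.toString();
    }

    // Hypothetical helper: encodes the padded string as UTF-16BE and
    // runs it through readString, the way the corrected test would.
    static String decodePadded(String padded) {
        byte[] encoded = padded.getBytes(StandardCharsets.UTF_16BE);
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(encoded))) {
            return readString(dis, padded.length());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decodePadded("******test")); // prints "test"
    }
}
```

In the original test, the equivalent one-line change would be to call `testString.getBytes(StandardCharsets.UTF_16BE)` in place of the UTF-8 call.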

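In any case, what is the reason for using byte arrays? If you want to manipulate character data, you should be using strings, not bytes. Strings save you from having to worry about encoding issues.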
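As a rough illustration of the string-based route, the depadding step collapses to the `replace` call that is already commented out in the question. The class name `StringDepadDemo` is invented for this sketch, and it assumes the padded value is already available as a `String`.

```java
class StringDepadDemo {

    // Removes every '*' padding character; no byte arithmetic or encoding involved.
    static String depad(String padded) {
        return padded.replace("*", "");
    }

    public static void main(String[] args) {
        System.out.println(depad("******test")); // prints "test"
    }
}
```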

Answer 2

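There are two kinds of I/O classes:

Byte streams: these are used for reading bytes.

You can find many such classes, for example ByteArrayInputStream and DataInputStream.

Character streams: these are used for reading human-readable text.

You can find many such classes, for example StringReader and InputStreamReader. They are easy to spot because their names end in Writer or Reader.

I suggest using a StringReader like this: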

    new StringReader("******test");
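A minimal sketch of how `readString` might look if it took a `Reader` instead of a `DataInputStream`. This changes the method's signature, so it only applies if the production code can also be given a character stream. The class name `ReaderDepadDemo` and the helper `depadFromString` are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReaderDepadDemo {

    // Reads up to size characters (including padding) and strips the '*' padding.
    static String readString(Reader reader, int size) throws IOException {
        char[] buf = new char[size];
        int total = 0;
        while (total < size) {
            int n = reader.read(buf, total, size - total);
            if (n < 0) {
                break; // end of stream before size characters were available
            }
            total += n;
        }
        return new String(buf, 0, total).replace("*", "");
    }

    // Hypothetical convenience wrapper for trying the method on a literal.
    static String depadFromString(String padded) {
        try (Reader reader = new StringReader(padded)) {
            return readString(reader, padded.length());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(depadFromString("******test")); // prints "test"
    }
}
```

Because a `Reader` hands back `char` values directly, the test no longer has to pick a byte encoding at all.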

java

junit

datainputstream