Reading a single character from a file returns special characters?

Question

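I'm trying to use fstreams to read single characters from a specified position in a file and append them to a string. For some reason, reading these characters in returns special characters. I've tried a lot of things, but the stranger thing I found while debugging is that changing the initial value of char temp; causes the whole string to change to that value.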

int Class::numbers(int number, string& buffer) {
    char temp;

    if (number < 0 || buffer.length() > size) {
        exit(0);
    }

    string fname = name + ".txt";
    int start = number * size;

    ifstream readin(fname.c_str());
    readin.open(fname.c_str(), ios::in);
    readin.seekg(start);

    for (int i = 0; i < size; ++i) {
        readin.get(temp);
        buffer += temp;
    }

    cout << buffer << endl;
    readin.close();
    return 0;
}

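Here is a sample screenshot of the special characters being output: http://i.imgur.com/6HCI7TT.png

Could the problem be where I start with seekg? It seems to start at the right position. Another thing I considered is that I may be reading some invalid location into the stream, and it is just giving me garbage characters from memory.

Any ideas?

Working solution: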

int Class::numbers(int number, string& buffer) {
    char temp;

    if (number < 0 || buffer.length() > size) {
        exit(0);
    }

    string fname = name + ".txt";
    int start = number * size;

    // The constructor already opens the file. Calling open() again on an
    // open stream sets failbit, after which every get() fails and leaves
    // temp unchanged.
    ifstream readin(fname.c_str());
    readin.seekg(start);

    // Stop early if a read fails (for example, at end of file).
    for (int i = 0; i < size && readin.get(temp); ++i) {
        buffer += temp;
    }

    cout << buffer << endl;
    readin.close();
    return 0;
}

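This is the working solution. Elsewhere in my program I had already opened this file name, so I suspect that opening it twice was causing the problem. I'll do some further testing on this in my own time.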
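
One explanation that fits the symptom described in the question: constructing a `std::ifstream` with a file name already opens the file, and calling `open()` on a stream that is already open fails and sets `failbit`. Every later `get(temp)` then fails without touching `temp`, so the loop appends whatever value `temp` started with. The sketch below reproduces that under stated assumptions; the helper names `write_sample`, `read_with_double_open` and `read_with_single_open` are made up for illustration and are not part of the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: create a small file to read back.
void write_sample(const std::string& path, const std::string& text) {
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    assert(out.good());
    out << text;
}

// Mimics the pattern in the question: the constructor opens the file,
// then open() is called a second time on the same stream.
// failed_after_open reports whether the stream was in a failed state
// right after the second open().
std::string read_with_double_open(const std::string& path, std::size_t count,
                                  char initial, bool& failed_after_open) {
    char temp = initial;
    std::string result;

    std::ifstream in(path.c_str());       // opens the file
    in.open(path.c_str(), std::ios::in);  // already open: sets failbit
    failed_after_open = in.fail();

    for (std::size_t i = 0; i < count; ++i) {
        in.get(temp);    // fails on a failed stream; temp is left unchanged
        result += temp;  // so the initial value is appended every time
    }
    return result;
}

// Same read with a single open and a check on every get().
std::string read_with_single_open(const std::string& path, std::size_t count) {
    std::string result;
    std::ifstream in(path.c_str());
    char temp;
    for (std::size_t i = 0; i < count && in.get(temp); ++i) {
        result += temp;
    }
    return result;
}
```

Calling `read_with_double_open` with an initial value of `'#'` yields a string made only of `'#'`, which matches the observation that changing the initial value of `temp` changes the whole string.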

Answer 1

For character values above 127 (that is, outside 7-bit ASCII), the glyph actually rendered on screen depends on the code page your system is currently using.

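What may be happening is that you aren't getting a "character" in the sense you think you are.

First, to debug this, use your existing code to open and print out the whole text file. Is your program able to do that? If not, the "text" file you are opening may not be ASCII at all, and may instead use UTF-16 or some other encoding. That means that when you read a "character" (most likely 8 bits), you are reading only half of a 16-bit "wide character", and the result is meaningless to you.

For example, the gedit application will automatically render "Hello World" on screen as I expect, regardless of the character encoding. In a hex editor, however, a UTF-8 encoded file looks like this:

UTF-8 raw text: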

0000000: 4865 6c6c 6f20 776f 726c 642e 0a         Hello world..

whereas UTF-16 looks like:

0000000: fffe 4800 6500 6c00 6c00 6f00 2000 7700  ..H.e.l.l.o. .w.
0000010: 6f00 7200 6c00 6400 2e00 0a00            o.r.l.d.....

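That is what your program sees. Narrow-character streams in C/C++ hand you the raw bytes and do no decoding, so in effect they assume a single-byte encoding such as ASCII. If you want to handle other encodings, it is up to your program to accommodate them, either by hand or with a third-party library.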
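
To check for this case from code, one option is to look for a byte-order mark (BOM) at the start of the file, like the `ff fe` that opens the UTF-16 dump above. This is only a rough sketch: `detect_bom` is a made-up helper name, a file without a BOM can still be UTF-8 or UTF-16, and a UTF-32 little-endian BOM also begins with `ff fe`, so treat the result as a hint rather than proof.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Returns a label for the byte-order mark found at the start of the file,
// or "none" if there is no recognised BOM or the file cannot be read.
std::string detect_bom(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    unsigned char b[3] = {0, 0, 0};
    in.read(reinterpret_cast<char*>(b), 3);
    std::streamsize n = in.gcount();  // bytes actually read (0 to 3)

    if (n >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF) return "UTF-8";
    if (n >= 2 && b[0] == 0xFF && b[1] == 0xFE) return "UTF-16LE";
    if (n >= 2 && b[0] == 0xFE && b[1] == 0xFF) return "UTF-16BE";
    return "none";
}
```

If this reports a UTF-16 BOM, reading the file one `char` at a time will split each 16-bit code unit in half, as described above.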

Also, you aren't testing whether you've run past the end of the file. You may just be grabbing garbage.
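
One way to guard against that, sketched here under assumptions: seek to the start of the block, read with `read()`, and keep only as many bytes as `gcount()` reports. `read_block` and its parameters are illustrative names, not something from the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Reads up to `count` bytes starting at byte offset `start`.
// Returns only the bytes that were actually read; the result is empty if
// the file cannot be opened or the offset lies at or past the end of the file.
std::string read_block(const std::string& path, std::streamoff start,
                       std::size_t count) {
    assert(start >= 0);
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in || count == 0) return std::string();

    in.seekg(start, std::ios::beg);
    if (!in) return std::string();

    std::vector<char> buf(count);
    in.read(&buf[0], static_cast<std::streamsize>(count));
    // gcount() is the number of bytes the last read() really delivered.
    return std::string(buf.begin(), buf.begin() + in.gcount());
}
```

A short result tells the caller the file ended early, so nothing uninitialised ever reaches the output string.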

Using a simple text file that contains only the string "Hello World", can your program do this:

Code listing

// read a file into memory
#include <cstring>      // std::memset
#include <fstream>      // std::ifstream
#include <iostream>     // std::cout

int main () {
    std::ifstream is ("test.txt", std::ifstream::binary);
    if (is) {
        // get length of file:
        is.seekg (0, is.end);
        int length = static_cast<int>(is.tellg());
        is.seekg (0, is.beg);
        if (length < 0) {
            return 1;   // tellg() failed
        }

        // allocate memory:
        char * buffer = new char [length];

        // read data as a block:
        is.read (buffer,length);
        // print content:
        std::cout.write (buffer,length);
        std::cout << std::endl;

        // repeat at arbitrary locations:
        for (int i = 0; i < length; i++ )
        {
            std::memset(buffer, 0x00, length);
            is.seekg (i, is.beg);
            is.read(buffer, length-i);
            // print only the bytes that were just read:
            std::cout.write (buffer,length-i);
            std::cout << std::endl;
        }

        is.close();
        delete[] buffer;
    }

    return 0;
}

Sample output

Hello World

Hello World

ello World

llo World

lo World

o World

 World

World

orld

rld

ld

d

c++

fstream