How to read a Spanish encoded file and store it character by character?

Question

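I can't read a file and store it in memory because it is written in Spanish, and I think it may be an encoding problem. I would like to know a way to print or store each character separately. I have tried many things, but the most accurate approach I have found is the function wstring readFile(const char* filename), shown in the code below: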

#include <sstream>
#include <fstream>
#include <iostream>
#include <string>

std::wstring readFile(const char* filename) // Read a whole file using wifstream.
{
    std::wifstream wif(filename);

    std::wstringstream wss;

    wss << wif.rdbuf();
    return wss.str();
}

int main()
{
    std::wstring fileContent = readFile("read.txt"); //Read file to wstring.

    std::wcout << fileContent ; //Print the wstring. This works fine.
    std::cout << " " << std::endl;//Give spacing.

    wchar_t a; //create variable wchar_t.
    int fs = fileContent.size();
    std::cout << "Number of chars: " << fs; //Check content size.

    for (int i = 0; i < fs; i++){ //I want to print each letter.

        a = fileContent.at(i);  //Assign to "a" content of specified index.

        std::wcout << " " << a ; //Print character stored in variable a.
    }
}

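There seems to be a problem when storing or printing the value of fileContent.at(i) or fileContent[i] in the wchar_t variable a. Do you know what could be improved in the code, or can you give me some guidance on solving this problem?

I am using Macintosh and Linux, in case that helps. Thanks!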

Answer 1

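You are using std::wifstream, which returns Unicode characters as wchar_t (UTF-16 or UTF-32, depending on the platform), but you are not telling std::wifstream what the encoding of the source file is, so it cannot decode the file's bytes into Unicode. Before you start reading the file data, you need to imbue() the std::wifstream with a locale that matches the file's encoding, such as an appropriate Spanish locale.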
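As a rough sketch of that advice, the reader can take the locale as a parameter and imbue it before the file is opened. The helper name pickLocale and the locale name "es_ES.UTF-8" used in the note below are assumptions for illustration, not something from the question: locale names are platform-dependent (run locale -a to see what is installed), and a file saved as Latin-1 would need a matching locale instead.

```cpp
#include <fstream>
#include <locale>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: try the named locale first; if it is not installed,
// fall back to the user's environment locale, then to the classic "C" locale.
std::locale pickLocale(const char* name)
{
    try {
        return std::locale(name);
    } catch (const std::runtime_error&) {
        try {
            return std::locale("");
        } catch (const std::runtime_error&) {
            return std::locale::classic();
        }
    }
}

// Read a whole file into a wide string, decoding its bytes with `loc`.
std::wstring readFile(const char* filename, const std::locale& loc)
{
    std::wifstream wif;
    wif.imbue(loc);      // set the locale before any data is read
    wif.open(filename);

    std::wstringstream wss;
    wss << wif.rdbuf();  // copy the decoded characters into the string stream
    return wss.str();
}
```

A possible way to use it, assuming the file is UTF-8: build the locale with std::locale loc = pickLocale("es_ES.UTF-8"), call std::locale::global(loc) and std::wcout.imbue(loc) so that output is encoded the same way, then read with readFile("read.txt", loc) and loop with for (wchar_t c : fileContent) std::wcout << L' ' << c;. It is probably also safer to write everything through std::wcout rather than mixing it with std::cout on the same program output.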

Tags: c++, wchar-t, utf-8, wstring, c++11