Reading a file with a space and a semicolon as separators

Question

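I wrote this to parse a file of numbers in which the only separator was a space. My goal is to read each number from the file and store it at the corresponding index of the matrix A. So the first number read should go to A[0][0], the second to A[0][1], and so on.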

#include <iostream>
#include <string>
#include <fstream>

using namespace std;

int main() {
    const int N = 5, M = 5;
    double A[N*M];
    string fname("test_problem.txt");
    ifstream file(fname.c_str());
    for (int r = 0; r < N; ++r) {
        for (int c = 0; c < M; ++c) {
            file >> *(A + N*c + r);
        }
    }

    for (int r = 0; r < N; ++r) {
        for (int c = 0; c < M; ++c) {
            cout << *(A + N*c + r) << " ";
        }
        cout << "\n";
    }
    cout << endl;

    return 0;
}

Now I am trying to parse a file like this:

1 ;2 ;3 ;4 ;5
10 ;20 ;30 ;40 ;50
0.1 ;0.2 ;0.3 ;0.4 ;0.5
11 ;21 ;31 ;41 ;5
1 ;2 ;3 ;4 ;534

But it prints (and therefore reads) garbage. What should I do?

Edit

Here is my attempt in C, which also fails:

FILE* fp = fopen("test_problem.txt", "r");
double v = -1.0;
while (fscanf(fp, "%f ;", &v) == 1) {
    std::cout << v << std::endl;
}

-1 is always printed.

Answer 1

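You should remove the semicolon before converting:

// needs #include <algorithm> for std::replace
std::string temp;
file >> temp;                                      // e.g. ";2"
std::replace(temp.begin(), temp.end(), ';', ' ');  // ";2" becomes " 2"
*(A + N*c + r) = std::stod(temp);                  // stod skips the leading space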
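The same idea can be applied to the whole input at once instead of token by token. The following is only a sketch under assumptions: `parse_numbers` is a made-up helper name, and it returns the values in reading order rather than writing them into `A`.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): returns every number
// found in `in`, treating ';' exactly like whitespace.
std::vector<double> parse_numbers(std::istream& in) {
    // Read the whole stream into one string.
    std::string text((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    // Turn every ';' into a space so operator>> can skip over it.
    std::replace(text.begin(), text.end(), ';', ' ');

    std::istringstream cleaned(text);
    std::vector<double> values;
    double v = 0.0;
    while (cleaned >> v) {
        values.push_back(v);
    }
    return values;
}
```

With an M-column file, value number k in the returned vector would belong to row k / M and column k % M. Note that this version cannot tell a missing semicolon from a present one, so it suits trusted input best.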

Answer 2

The problem with your C example:

warning: format ‘%f’ expects argument of type ‘float*’, but
         argument 3 has type ‘double*’ [-Wformat=]

Always turn on warnings (-Wall -Wextra) and do more error checking.

In any case, to fscanf into a double you need %lf, not %f.
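To make that concrete, here is a small sketch of the corrected loop. Passing a `double*` for `%f` is undefined behaviour, which is consistent with `v` never appearing to change. The helper name `read_semicolon_values` is an assumption for illustration, and the caller is assumed to have opened the stream.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Hypothetical helper: reads every number from fp, consuming an optional
// " ;" after each one.
std::vector<double> read_semicolon_values(std::FILE* fp) {
    assert(fp != nullptr);  // the caller must pass an open stream
    std::vector<double> values;
    double v = 0.0;
    // %lf (not %f) is the conversion that stores into a double.
    // The " ;" part skips whitespace and then matches one ';' if present;
    // a failed match there does not change the return value of 1.
    while (std::fscanf(fp, "%lf ;", &v) == 1) {
        values.push_back(v);
    }
    return values;
}
```

The loop stops at end of file or at the first token that is not a number, so checking how many values came back is still worthwhile.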

Answer 3

Given your input format...

1 ;2 ;3 ;4 ;5

...your code...

for (int c = 0; c < M; ++c) {
    file >> *(A + N*c + r);
}

...will "eat" the first numeric value and then choke on the first ; separator. The simplest correction is...

char expected_semicolon;

for (int c = 0; c < M; ++c) {
    if (c) {
        file >> expected_semicolon;
        assert(expected_semicolon == ';'); // if care + #include <cassert>
    }
    file >> *(A + N*c + r);
}
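Put together as a self-contained function, the correction above might look like the sketch below. The name `read_matrix_stream`, the `std::vector` return type, and the row-major index `r * cols + c` are assumptions made for the example; the question's code indexes `A` as `N*c + r`.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reads rows*cols numbers where values on a row are
// separated by ';' and rows are separated by whitespace (e.g. a newline).
std::vector<double> read_matrix_stream(std::istream& in, int rows, int cols) {
    std::vector<double> a(rows * cols);
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            if (c > 0) {
                char sep = '\0';
                // operator>> on a char skips whitespace, then reads one char.
                if (!(in >> sep) || sep != ';') {
                    throw std::runtime_error("expected ';' between values");
                }
            }
            if (!(in >> a[r * cols + c])) {
                throw std::runtime_error("expected a numeric value");
            }
        }
    }
    return a;
}
```

Throwing instead of using `assert` keeps the check active in release builds, which seems preferable when the input comes from a file.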

For what it's worth, I'd recommend adding better error checking...

if (std::ifstream file{fname})
{
    // ...use file stream...
}
else
{
    std::cerr << "oops\n";
    return 1;  // or throw, or exit(1)
}

...as a general practice when opening file streams.
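If the open check is needed in several places, it could be wrapped in a small function. This is a sketch only; `open_or_throw` is a hypothetical name, and returning the stream by value relies on streams being movable (C++11 and later).

```cpp
#include <cassert>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: opens fname for reading or throws if that fails.
std::ifstream open_or_throw(const std::string& fname) {
    std::ifstream file(fname);
    if (!file) {
        throw std::runtime_error("cannot open " + fname);
    }
    return file;  // moved out to the caller
}
```

A caller would then write `std::ifstream file = open_or_throw(fname);` and could assume the stream is usable afterwards.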

For the loop that reads the data, a support macro that gives an assertion-like style works well with streams:

// requires #include <sstream> and #include <stdexcept>
#define CHECK(CONDITION, MESSAGE) \
    do { \
        if (!(CONDITION)) { \
            std::ostringstream oss; \
            oss << __FILE__ << ':' << __LINE__ \
                << " CHECK FAILED: " << #CONDITION \
                << "; " << MESSAGE; \
            throw std::runtime_error(oss.str()); \
        } \
    } while (false)

...

for (int c = 0; c < M; ++c) {
    if (c)
        CHECK(file >> expected_semicolon &&
              expected_semicolon == ';',
              "values should be separated by semicolons");
    CHECK(file >> *(A + N*c + r), "expected a numeric value");
}

For this particular input format, in a production system you would probably want to use getline so that you know where you are in the input...

size_t lineNum = 0;
std::string my_string;
for (int r = 0; r < N; ++r) {
    CHECK(getline(file, my_string), "unexpected EOF in input");
    ++lineNum;
    std::istringstream iss(my_string);
    for (int c = 0; c < M; ++c) {
        if (c)
            CHECK(iss >> expected_semicolon &&
                  expected_semicolon == ';',
                  "unexpected char '" << expected_semicolon
                  << "' when semicolon separator needed on line "
                  << lineNum);
        CHECK(iss >> *(A + N*c + r),
              "non numeric value encountered on line " << lineNum);
    }
}
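For reference, the macro and the line-based loop can be assembled into one compilable unit roughly as follows. This is a sketch under assumptions: the function name `read_matrix_by_line`, the `std::vector` return type, and the row-major index `r * cols + c` are choices made for the example, not part of the code above.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Throws std::runtime_error with file, line, condition and message on failure.
#define CHECK(CONDITION, MESSAGE) \
    do { \
        if (!(CONDITION)) { \
            std::ostringstream oss; \
            oss << __FILE__ << ':' << __LINE__ \
                << " CHECK FAILED: " << #CONDITION \
                << "; " << MESSAGE; \
            throw std::runtime_error(oss.str()); \
        } \
    } while (false)

// Hypothetical helper: parses one text line per matrix row so that error
// messages can name the offending input line.
std::vector<double> read_matrix_by_line(std::istream& in, int rows, int cols) {
    std::vector<double> a(rows * cols);
    std::string line;
    for (int r = 0; r < rows; ++r) {
        CHECK(std::getline(in, line), "unexpected EOF after line " << r);
        const int lineNum = r + 1;
        std::istringstream iss(line);
        for (int c = 0; c < cols; ++c) {
            if (c > 0) {
                char sep = '\0';
                CHECK(iss >> sep && sep == ';',
                      "semicolon separator needed on line " << lineNum);
            }
            CHECK(iss >> a[r * cols + c],
                  "non numeric value encountered on line " << lineNum);
        }
    }
    return a;
}
```

Because each row is parsed from its own `std::istringstream`, a malformed row cannot silently consume data from the next one.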

Answer 4

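Why not try getline(), which accepts a delimiter as its third argument?

string line, buffer;
getline(file, line);                 // read one whole row first
istringstream row(line);
for (int c = 0; c < M; ++c) {
    getline(row, buffer, ';');       // then split the row on ';'
    stringstream tmp(buffer);
    tmp >> *(A + N*c + r);
}

With a delimiter argument, getline() reads up to the next ';' or the end of the stream. A newline is not treated as a delimiter in that case, which is why each row is first read with the two-argument getline() and then split.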
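The splitting step can be isolated into a small function, sketched below. `split_row` is a hypothetical name, and using `std::stod` (which throws `std::invalid_argument` on a field with no number) instead of a `stringstream` is a substitution made for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line on `delim` and converts each field.
std::vector<double> split_row(const std::string& line, char delim = ';') {
    std::vector<double> values;
    std::istringstream row(line);
    std::string field;
    // Three-argument getline stops only at `delim` or at end of stream.
    while (std::getline(row, field, delim)) {
        // stod skips leading whitespace and ignores trailing whitespace.
        values.push_back(std::stod(field));
    }
    return values;
}
```

Calling it once per line read with `getline(file, line)` would give the values of that row in order.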

c++

fstream

file

text-parsing