Selectively read a formatted data file in C++

I have a data file that starts like this:

/*--------------------------------------------------------------------------*\
Some useless commented information
\*---------------------------------------------------------------------------*/

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //


882
(
(0 0 0)
(1 1 1)
...more vectors
)

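How can I read through the file and store the number 882 and the list of vectors in a data structure?

Basically, I want to use the data inside the parentheses, i.e. map (1 2 3) to vec.x = 1, vec.y = 2, vec.z = 3.

Here is my attempt to at least print the number 882: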

#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <vector>
#include <iterator>

int main()
{

  std::string line;
  std::ifstream file ("points");
  if (file.is_open())
  {
    while ( getline (file,line) )
    {
            std::stringstream ss(line);
            int n;
            std::vector<int> v;

                while (ss >> n)
                {
                    v.push_back(n);
                }
                std::copy(v.begin(), v.end(), std::ostream_iterator<int>(std::cout, " "));

    }
    file.close();
  }

  else std::cout << "Unable to open file";

  return 0;

}

It all depends on which kinds of lines your formatted data file can contain.

Suppose your data file follows this pattern:

/* THIS IS BEGIN OF COMMENT BLOCK */
STILL MORE USELESS COMMENTS
812
that 812 is still useless
\* END OF COMMENT BLOCK *\

// **** Single line comment *** //

// **** its fine to have blank lines ***** //

812
(
(1 2 3)
// **** Comments can come anywhere **** //
(4 5 6)
.... MORE VECTORS ...
(7 8 9)
/***** EVEN BLOCK COMMENTS ****/
\***** ******\

// **** Blank lines allowed anywhere **** //
)

You can set up a simple state machine to process your data file.

You would have a few states:

1. Looking for initial number
   a. Inside Comment Block
   b. Not inside Comment Block
2. Looking for start of list of vectors
   a. Inside Comment Block
   b. Not inside Comment Block
3. Reading list of vectors / Looking for end of list of vectors
   a. Inside Comment Block
   b. Not inside Comment block

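You are basically looking for 3 things: the initial number, the start of the list of vectors, and the end of the list. In each case, there are two basic situations that affect how you handle a line: whether or not you are inside a block comment.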
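As a rough sketch of how these states might be represented, the three "looking for" stages can be an enumeration and the block-comment case a separate flag. The names `Stage`, `ParserState`, and `next_stage` are hypothetical, chosen only for illustration.

```cpp
// Hypothetical names for the three things being searched for, plus a final state.
enum class Stage { FindNumber, FindListStart, ReadPoints, Done };

// The block-comment flag is tracked separately, so every stage gets its
// "inside comment" and "not inside comment" variants without extra enum values.
struct ParserState {
    Stage stage = Stage::FindNumber;
    bool in_block_comment = false;
};

// Move to the stage that follows `s`; Done stays Done.
Stage next_stage(Stage s) {
    switch (s) {
        case Stage::FindNumber:    return Stage::FindListStart;
        case Stage::FindListStart: return Stage::ReadPoints;
        case Stage::ReadPoints:    return Stage::Done;
        case Stage::Done:          return Stage::Done;
    }
    return Stage::Done;  // not reached; silences missing-return warnings
}
```

Keeping the comment flag outside the enumeration means 4 values plus one bool, rather than a separate value for every stage/comment combination.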

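If you are inside a block comment, ignore everything until you find the end of the block comment.

Otherwise, process the line to determine whether it is a blank line, the start of a comment block, a comment line, or the thing you are currently looking for.

Code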
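The comment handling on its own could look something like the sketch below, which filters a stream down to the lines that are neither blank nor commented out. It assumes a block comment opens on a line beginning with `/*` and closes on a later line beginning with `\*` (as in both sample files), and that `//` begins a single-line comment. `starts_with` and `meaningful_lines` are hypothetical helper names.

```cpp
#include <istream>
#include <sstream>  // only needed by callers that feed in a std::istringstream
#include <string>
#include <vector>

// Hypothetical helper: true if `s` begins with `prefix`.
bool starts_with(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Returns the lines of `in` that are neither blank nor part of a comment.
// Assumptions: a line beginning with "/*" opens a block comment, a later
// line beginning with "\*" closes it, and "//" begins a one-line comment.
std::vector<std::string> meaningful_lines(std::istream& in) {
    std::vector<std::string> kept;
    std::string line;
    bool in_block_comment = false;
    while (std::getline(in, line)) {
        // Trim surrounding whitespace so indentation and '\r' do not matter.
        const std::size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) continue;  // blank line
        const std::size_t last = line.find_last_not_of(" \t\r");
        const std::string text = line.substr(first, last - first + 1);

        if (in_block_comment) {
            // Ignore everything until the closing line is seen.
            if (starts_with(text, "\\*")) in_block_comment = false;
        } else if (starts_with(text, "/*")) {
            in_block_comment = true;
        } else if (!starts_with(text, "//")) {
            kept.push_back(text);  // real content: number, "(", point, ")"
        }
    }
    return kept;
}
```

With the comments filtered out this way, the remaining lines of the question's file would be just the number, the opening parenthesis, the points, and the closing parenthesis. Note that under these assumptions a one-line `/* ... */` comment with no closing `\*` line would swallow the rest of the file.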

#include <iostream>
#include <vector>
#include <fstream>
#include <string>
#include <sstream>
using namespace std;

struct vec{
  int x;
  int y;
  int z;
};

/* I'll leave these for you to try out yourself. You know best how each of these is defined. */
bool block_comment_start(const string& line);
bool block_comment_end(const string& line);
bool is_number(const string& line);
bool is_point(const string& line);
bool is_start_of_point_list(const string& line);
bool is_end_of_point_list(const string& line);
int parse_num(const string& line){
  int tmp;
  istringstream ss(line);
  ss >> tmp;
  return tmp;
}
vec parsePoint(const string& line){
  vec tmp{}; /* zero-initialize so a failed extraction does not leave garbage */
  char lp; /* ignore left parenthesis at beginning of point*/
  istringstream ss(line);
  ss >> lp >> tmp.x >> tmp.y >> tmp.z;
  return tmp;
}

int main(){
    string line;
    int state(0);        /* we're initially looking for a number */
    bool comment(false); /* We're initially not inside a comment block */

    int val = 0;         /* the initial number (882 in the question's file) */
    vector<vec> points;

    ifstream file("points");
    if (file.is_open()){
      while (getline(file, line)){
        if (comment){
          if (block_comment_end(line))
            comment = false;
        } else if (state == 0){ // Looking for initial number
          if (block_comment_start(line))
            comment = true;
          else if (is_number(line)){
            val = parse_num(line);
            ++state;
          } /* ignore anything that isn't a number or begin of comment line */
        } else if (state == 1){
          if (block_comment_start(line))
            comment = true;
          else if (is_start_of_point_list(line)){
            ++state;
          }
        } else if (state == 2){
          if (block_comment_start(line))
            comment = true;
          else if (is_end_of_point_list(line)){
            ++state;
          } else if (is_point(line)){
            points.push_back(parsePoint(line));
          }
        } /* Ignore everything after end of list of vectors */
      }
    } else {
      cout << "Error opening file: \"points\"";
    }
    return 0;
}

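bool is_point(const string& line){
  /* returns true if the first character of the line is '(' and the last character is ')';
     this will match anything between parentheses */
  return !line.empty() && line[0] == '(' && line[line.length() - 1] == ')';
}

This is more of an overview of how to parse the file. What you need to do is write the functions that determine what is a comment line, the start of a comment block, the end of a comment block, and so on.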
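For illustration, here is one possible way to fill in the helpers that are declared but not defined in the code above (`is_point` is already defined there, so it is left out). This is only a sketch under assumptions about the file format: a block comment opens on a line beginning with `/*` and closes on a line beginning with `\*`, the initial number is a line made up solely of digits, and the list delimiters are lines containing only `(` or `)`. The `trim` helper is hypothetical and not part of the original code.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: strip leading/trailing whitespace (including '\r').
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Assumption: a block comment opens on a line that begins with "/*".
bool block_comment_start(const std::string& line) {
    return trim(line).compare(0, 2, "/*") == 0;
}

// Assumption: a block comment closes on a line that begins with "\*".
bool block_comment_end(const std::string& line) {
    return trim(line).compare(0, 2, "\\*") == 0;
}

// Assumption: the initial number is a line consisting only of digits.
bool is_number(const std::string& line) {
    const std::string t = trim(line);
    return !t.empty() &&
           std::all_of(t.begin(), t.end(),
                       [](unsigned char c) { return std::isdigit(c) != 0; });
}

// Assumption: the list opens with a line containing only "(".
bool is_start_of_point_list(const std::string& line) {
    return trim(line) == "(";
}

// Assumption: the list closes with a line containing only ")".
bool is_end_of_point_list(const std::string& line) {
    return trim(line) == ")";
}
```

Single-line `//` comments need no predicate of their own here, because the main loop ignores any line that none of these helpers recognize. If your files can contain negative or floating-point values, `is_number` and the `vec` members would need to be adapted accordingly.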