Sort CSV file by column and compare with another column in C++

Question

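I have some CSV files that I can import into C++, split into columns, and print, but I can't perform the analysis I need. I want to be able to sort by each column (ascending or descending) and then find groupings in a separate column of 1s and 0s. Here is my code so far, but it seems I'm replacing the variables each time a new row is created.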

#include "stdafx.h"
#include <iostream>
#include <fstream>
#include <vector>

using namespace std;

struct sampleData { // One row of the file: the four values stored in each vector element.
    float first, second, third, fourth;
};

void printSample(sampleData sample) // Function that prints one row.
{
    cout << sample.first << " " << sample.second << " " << sample.third << " " << sample.fourth << endl;
}

int main()
{
ifstream myFile("file.csv"); //Open file with ifstream constructor.

if (myFile.is_open())
{
    vector<sampleData> sample; //Create vector that stores the variables declared in the struct.
    float first, second, third, fourth; 
    char delim;

    while (myFile >> first >> delim >> second >> delim >> third >> delim >> fourth) { //Places each value of csv file into individual variables in the vector.
        sample.push_back({ first, second, third, fourth });
    }

    cout << "First value " << " Second value: " << " Third Value: " << " Fourth Value: " << endl; //Create column headers.
    for (int x(0); x < sample.size(); ++x)
    {
        printSample(sample.at(x));
    }

}


else
{
    cout << "The file did not open."; //Lets me know if the file has not been opened.
}

system("pause");

return 0;
}

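Below is an example of what I need. I want to sort each column (1-3) and compare it with the 1s and 0s in the fourth column to find groupings with at least seven 1s and an average of at least 0.70. Would it be best to create a 2D array or a 2D vector? If so, how would I sort and compare it?

Thanks for your help.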

> -40.31945 -20.71259   4.024558    1
> -8.428544 -1.173988   13.55221    1
> -9.99227  -1.964128   22.35553    1
> -6.227934 -0.6318588  11.28533    0
> -7.350101 -4.340335   9.932037    1
> -11.32407 -3.242851   15.07184    1
> -15.81499 -5.500328   15.33309    0
> -6.112404 -1.504377   24.17496    1
> -7.5483   -3.147136   17.5016     1
> -9.895069 -6.141642   17.70264    1
> -6.691729 -5.821645   41.11068    1
> -9.520897 -4.83869    12.83501    0
> -6.09901  -1.291806   22.62663    1
> -2.136172 -0.7562032  34.48225    1
> -5.813394 -2.087043   26.70455    0
> -2.359689 -0.04058313 68.30959    0
> -4.093154 -2.890539   32.40205    0
> -7.326787 -8.31641    23.47626    0
> -5.842336 -4.699064   32.14418    0
> -1.26901  -1.150853   54.72232    1
> -4.532993 -1.921023   27.54052    0
> -13.04364 -12.8271    17.78159    1
> -22.29973 -18.63197   10.62449    1
> -13.097   -11.09199   9.261793    0
> -6.73371  -4.044      24.63213    1
> -8.487038 -5.855842   20.65492    1
> -1.271804 -0.1592398  73.54436    0
> -5.903441 -2.511718   2.906148    0
> -6.569601 -3.63947    14.92872    0
> -2.671139 -1.596091   61.78936    1
> -0.67129  -0.1758051  35.63146    0
> -10.33999 -10.25158   19.83222    0
> -5.900752 -4.774312   22.25315    0
> -3.473342 -2.116564   60.31918    0
> -5.51118  -8.684725   45.30108    1
> -4.393883 -3.597137   21.0572     0
> -3.671957 -3.355143   51.05236    1
> -7.700621 -7.257176   29.59876    1
> -6.959113 -5.834087   21.52065    1
> -6.978306 -6.291922   26.17615    0
> -3.525233 -0.2435265  39.66356    0
> -8.017325 -7.190228   16.78984    1
> -9.686805 -6.356866   24.96812    1
> -5.841892 -4.090017   12.90826    1
> -4.101501 -0.8392091  29.49425    1
> -0.50966  -0.6248183  72.55316    0
> -2.747329 -3.107922   70.82893    1
> -3.682684 -5.461088   7.237332    0
> -1.726765 -1.030436   51.13756    0
> -5.065511 -5.105534   48.8038     1
> -3.490172 -0.8473139  54.89489    1
> -14.56848 -13.29985   8.508147    1
> -5.511615 -2.257046   26.53605    1
> -0.80373  -1.259443   54.58532    1
> -11.76727 -10.51294   19.43544    0
> -4.924498 -5.660692   64.22583    1
> -1.662102 -1.329681   68.50871    0
> -2.225776 -1.191363   46.14959    1
> -11.97834 -1.471152   18.86225    0
> -9.986734 -8.210676   15.11784    1
> -0.78368  -0.2543859  64.04224    1
> -11.41681 -13.24663   9.016961    1
> -10.73357 -13.46118   31.8038     1
> -2.443766 -0.841536   35.3982     1
> -3.112007 -1.327887   32.61596    1
> -1.647414 -0.9874625  65.37144    0
> -3.771582 -2.685039   42.65498    0
> -5.503803 -6.65314    15.60404    1
> -6.844056 -10.59976   22.71807    1
> -3.977231 -6.444871   47.65485    1
> -0.43918  -1.813655   35.90933    1
> -4.520459 -3.337119   17.47536    1
> -3.102405 -2.276846   15.49771    1
> -3.173711 -4.548148   54.85541    1
> -4.157713 -2.368944   36.82358    1
> -6.671762 -6.863191   33.18528    1
> -5.806525 -8.300102   38.04575    1
> -9.137906 -10.43044   20.62558    1
> -4.830114 -5.035967   80.04454    1
> -6.717423 -7.807728   18.62613    1
> -1.654782 -2.814744   69.35754    1
> -5.718936 -5.041555   19.44518    1
> -1.139612 -1.246455   31.46728    1
> -5.193422 -4.141603   49.06763    0
> -0.72360  -1.519114   68.06107    1
> -3.45456  -2.324488   24.8586     1
> -3.946017 -1.809939   26.39728    1
> -1.373865 -1.385224   59.31034    0
> -12.91463 -16.81217   21.9325     1
> -7.101114 -4.463167   24.6039     1
> -11.19178 -7.923832   11.70692    1
> -6.337176 -3.290151   46.2829     1
> -6.034304 -6.688771   12.98928    1
> -10.72616 -16.16286   27.24244    1
> -10.01076 -11.90333   16.67032    1
> -2.85405  -1.064295   18.82794    1
> -3.582814 -3.041154   34.58895    0
> -0.88143  -2.513154   72.57123    0
> -2.936312 -2.92483    32.65664    0
> -2.859565 -7.337652   31.87842    1
> -4.467122 -6.427214   56.81916    0
> -6.340817 -6.706052   9.87694     1
> -1.40155  -2.738037   35.32452    1
> -10.92032 -11.05833   30.82691    1
> -7.330603 -6.257256   22.16484    1
> -2.714168 -2.258151   36.30459    0
> -2.793682 -2.935043   56.51117    1
> -6.706202 -11.04426   11.10245    0
> -6.113976 -7.36745    11.36128    1
> -9.845764 -10.35044   37.52305    0
> -7.786937 -10.70406   21.68431    1
> -0.54450  -3.818708   64.34981    1
> -1.402748 -4.612042   52.94871    0
> -1.771809 -3.918717   41.45876    1
> -4.142132 -7.088901   45.44987    1
> -1.640578 -4.787658   40.82234    1
> -1.050637 -2.535334   42.87785    1
> -0.32151  -3.315413   40.40543    1

Answer 1

You need to provide a comparator function that compares individual rows during sorting. Hopefully you have a modern version of C++ and can use lambdas:

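// Sort by first column (std::sort needs #include <algorithm>;
// `samples` is the vector<sampleData>, named `sample` in the question)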
std::sort( samples.begin(), samples.end(),
  []( const sampleData& a, const sampleData& b )
  {
    return a.first < b.first;
  }
);

After sorting by a particular column, you can iterate over the sequence and count consecutive 1s in the fourth column.
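The counting step could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: `findRunsOfOnes` and `runsAfterSortingByFirst` are hypothetical helper names, and a "grouping" is assumed here to mean a maximal run of consecutive rows whose fourth column is 1. The question's 0.70 average criterion would need a window that tolerates some 0s, which this sketch does not attempt.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

struct sampleData {
    float first, second, third, fourth;
};

// Hypothetical helper: returns [begin, end) index pairs for every maximal run
// of consecutive rows whose fourth column is 1, keeping only runs that are at
// least minLength rows long. Assumes `rows` is already sorted as desired.
std::vector<std::pair<std::size_t, std::size_t>>
findRunsOfOnes(const std::vector<sampleData>& rows, std::size_t minLength)
{
    std::vector<std::pair<std::size_t, std::size_t>> runs;
    std::size_t i = 0;
    while (i < rows.size()) {
        if (rows[i].fourth != 1.0f) {   // skip rows flagged 0
            ++i;
            continue;
        }
        const std::size_t start = i;    // first row of a run of 1s
        while (i < rows.size() && rows[i].fourth == 1.0f) {
            ++i;
        }
        if (i - start >= minLength) {
            runs.emplace_back(start, i);
        }
    }
    return runs;
}

// Hypothetical helper: sorts a copy of the rows by the first column
// (ascending) and returns the qualifying runs in that sorted order.
std::vector<std::pair<std::size_t, std::size_t>>
runsAfterSortingByFirst(std::vector<sampleData> rows, std::size_t minLength)
{
    std::sort(rows.begin(), rows.end(),
        [](const sampleData& a, const sampleData& b)
        {
            return a.first < b.first;
        });
    return findRunsOfOnes(rows, minLength);
}
```

With the question's threshold you would call `runsAfterSortingByFirst(samples, 7)`; each returned pair indexes into the sorted order, not the original file order.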

Edit
A lambda is a way to create a function object on the fly. The above is equivalent* to:

struct unnamed_function
{
  bool operator () ( const sampleData& a, const sampleData& b ) const
  {
    return a.first < b.first;
  }
};

...

std::sort( samples.begin(), samples.end(), unnamed_function() );

The [] is the "capture list" that introduces the lambda.
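To show what the capture list is for, here is a small sketch in which the lambda captures two local values. It assumes you want a single routine that sorts by any column in either direction; `sortByColumn` is a hypothetical name, and selecting the column with a pointer-to-member is just one possible approach.

```cpp
#include <algorithm>
#include <vector>

struct sampleData {
    float first, second, third, fourth;
};

// Hypothetical helper: sorts rows by the member that `column` points to.
// The lambda captures `column` and `descending` by value, so the comparator
// can use them even though std::sort only passes it the two rows.
void sortByColumn(std::vector<sampleData>& rows,
                  float sampleData::* column,
                  bool descending)
{
    std::sort(rows.begin(), rows.end(),
        [column, descending](const sampleData& a, const sampleData& b)
        {
            return descending ? a.*column > b.*column
                              : a.*column < b.*column;
        });
}
```

For example, `sortByColumn(samples, &sampleData::second, true)` would sort by the second column in descending order.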

Read more about lambdas. (Sorry, I don't have a better FAQ to point to yet...)

* Roughly. Behind the scenes it is a little more complicated than that.
