Consolidating rows in a 2d vector with respect to same value in one column

Question

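I have a 2D vector with 6 columns and 500 rows. I want to combine rows by comparing the value in a single column (PDG_ID): if several rows have the same PDG_ID, I take the mean of the other five columns and store those rows as a single row. Any idea how to do this in C++?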

Answer 1

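You need to understand the requirements and then select a fitting design.

In your case, you want to group several rows that share the same ID and calculate the mean of their data entries.

So there is a relation between one ID and one or many data entries. Or, in C++ terms, an ID has one or many associated entries.

In C++ we have so-called associative containers, such as std::map or std::unordered_map. Here we can store a key (the ID) together with many associated data.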
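
As a minimal sketch of how an associative container groups values by key: `operator[]` on a `std::map` creates an empty vector the first time a key is seen, so grouping only needs a `push_back`. The function name `groupByKey` and the pair-based input are illustrative assumptions, not part of the original answer.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical helper: collect all values that share the same integer key.
std::map<int, std::vector<double>> groupByKey(
    const std::vector<std::pair<int, double>>& rows) {

    std::map<int, std::vector<double>> groups;
    for (const auto& [key, value] : rows) {
        // operator[] inserts an empty vector on first use of a key
        groups[key].push_back(value);
    }
    return groups;
}
```

Because std::map keeps its keys sorted, iterating over the result visits the IDs in ascending order. std::unordered_map would work the same way, but with no guaranteed order.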

If we put all the data of one row into a struct, we can write:

struct PDG {
    int ID{};
    int status{};
    double Px{};
    double Py{};
    double Pz{};
    double E{};
};

And if we want to store an ID together with its many associated PDGs, we can define a map like this:

std::map<int, std::vector<PDG>> groupedPDGs{};

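In this map, we can store an ID as the key and, as the associated data, one or many PDGs.

Then we can add some very small, simple helper functions, for example for IO or for calculating the mean. In this way we break the big, more complex problem down into simpler parts.

The overall implementation could then look like this: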

#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <iterator>
#include <map>
#include <iomanip>

// Simple data struct with all necessary values and IO functions
struct PDG {

    // Data part
    int ID{};
    int status{};
    double Px{};
    double Py{};
    double Pz{};
    double E{};

    // Input of one row
    friend std::istream& operator >> (std::istream& is, PDG& pdg) {
        char c{};
        return is >> pdg.ID >> c >> pdg.status >> c >> pdg.Px >> c >> pdg.Py >> c >> pdg.Pz >> c >> pdg.E;
    }
    // Output of one row
    friend std::ostream& operator << (std::ostream& os, const PDG& pdg) {
        return os << "ID: " << std::setw(5) << pdg.ID << "\tStatus: " << pdg.status << "\tPx: " << std::setw(9) << pdg.Px 
            << "\tPy: " << std::setw(9) << pdg.Py << "\tPz: " << std::setw(9) << pdg.Pz << "\tE: " << std::setw(9) << pdg.E;
    }
};

// Alias/Abbreviation for a vector of PDGs
using PDGS = std::vector<PDG>;

// Calculate a mean value for vector of PDG data
PDG calculateMean(const PDGS& pdgs) {

    // Here we store the result. Initialize with ID and status from the first row and zeroes
    PDG result{ pdgs.front().ID, pdgs.front().status, 0.0, 0.0, 0.0, 0.0};

    // Add up data fields according to type
    for (size_t i{}; i < pdgs.size(); ++i) {
        result.Px += pdgs[i].Px;
        result.Py += pdgs[i].Py;
        result.Pz += pdgs[i].Pz;
        result.E += pdgs[i].E;
    }
    // Get mean value
    result.Px /= pdgs.size();
    result.Py /= pdgs.size();
    result.Pz /= pdgs.size();
    result.E /= pdgs.size();

    // And return result to calling function
    return result;
}

int main() {

    // Open the source file containing the data, and check, if the file could be opened
    if (std::ifstream ifs{ "pdg.txt" }; ifs) {

        // Read header line and throw away
        std::string header{}; std::getline(ifs, header);
        
        // Here we will store the PDGs grouped by their ID
        std::map<int, PDGS> groupedPDGs{};

        // Read all source lines
        PDG pdg{};
        while (ifs >> pdg)

            // Store read values grouped by their ID
            groupedPDGs[pdg.ID].push_back(pdg);

        // Result with mean values
        PDGS result{};

        // Calculate mean values and store in additional vector
        for (const auto& [id, pdgs] : groupedPDGs)
            result.push_back(calculateMean(pdgs));

        // Debug: Show output to user
        for (const PDG& p : result)
            std::cout << p << '\n';
    }
    else {
        // Only report an error if the file could not be opened
        std::cerr << "\nError: Could not open source datafile\n\n";
    }
}

With an input file like:

PDG ID, Status, Px, Py, Pz, E
22, 1, 0.00658, 0.0131, -0.00395, 0.0152
13, 1, -43.2, -44.7, -49.6, 79.6
14, 1, 3.5, 21.4, 0.499, 21.7
16, 1, 41.1, -18, 27.8, 52.8
211, 1, 0.483, -0.312, 1.52, 1.63
211, 1, -0.247, -1.75, 45.2, 45.2
321, 1, 0.717, 0.982, 52.6, 52.6
321, 1, 0.112, 0.423, 33.2, 33.2
211, 1, 0.191, -0.68, -178, 178
2212, 1, 1.08, -0.428, -1.78E+03, 1.78E+03
2212, 1, 7.61, 4.28, 76.3, 76.8
211, 1, 0.176, 0.247, 8.9, 8.9
211, 1, 0.456, -0.73, 0.342, 0.937
2112, 1, 0.633, -0.904, 0.423, 1.51
2112, 1, 1, -0.645, 0.366, 1.56
211, 1, -0.0722, 0.147, -0.153, 0.264
211, 1, 0.339, 0.402, 0.304, 0.623
211, 1, 3.64, 2.58, -2.84, 5.29
211, 1, 0.307, 0.208, -5.69, 5.71
2212, 1, 0.118, 0.359, -3.29, 3.45

We get the following output:

ID:    13       Status: 1       Px:     -43.2   Py:     -44.7   Pz:     -49.6   E:      79.6
ID:    14       Status: 1       Px:       3.5   Py:      21.4   Pz:     0.499   E:      21.7
ID:    16       Status: 1       Px:      41.1   Py:       -18   Pz:      27.8   E:      52.8
ID:    22       Status: 1       Px:   0.00658   Py:    0.0131   Pz:  -0.00395   E:    0.0152
ID:   211       Status: 1       Px:  0.585867   Py: 0.0124444   Pz:  -14.4908   E:   27.3949
ID:   321       Status: 1       Px:    0.4145   Py:    0.7025   Pz:      42.9   E:      42.9
ID:  2112       Status: 1       Px:    0.8165   Py:   -0.7745   Pz:    0.3945   E:     1.535
ID:  2212       Status: 1       Px:     2.936   Py:   1.40367   Pz:  -568.997   E:   620.083
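
The question starts from a 2D vector rather than a file. The same grouping idea can be sketched directly on a `std::vector<std::vector<double>>`, under the assumption that column 0 holds the PDG ID and that every other column is averaged. Unlike the code above, this also averages the status column, which is what the question asks for. The names `Row`, `Table`, `Accumulator` and `consolidateRows` are illustrative and not part of the original answer.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Row = std::vector<double>;
using Table = std::vector<Row>;

// Hypothetical helper type: running totals for all rows of one ID
struct Accumulator {
    Row sums;             // per-column sums (index 0 is unused)
    std::size_t count{};  // number of rows seen for this ID
};

// Merge rows that share the same value in column 0 and replace the
// remaining columns by their mean. Assumes all rows have equal length
// and that the IDs in column 0 are whole numbers.
Table consolidateRows(const Table& table) {
    std::map<long long, Accumulator> groups;

    for (const Row& row : table) {
        if (row.empty()) continue;  // skip rows without an ID
        Accumulator& acc = groups[static_cast<long long>(row[0])];
        if (acc.sums.empty()) acc.sums.assign(row.size(), 0.0);
        for (std::size_t c = 1; c < row.size() && c < acc.sums.size(); ++c)
            acc.sums[c] += row[c];
        ++acc.count;
    }

    Table result;
    result.reserve(groups.size());
    for (const auto& [id, acc] : groups) {
        Row merged(acc.sums.size());
        merged[0] = static_cast<double>(id);
        for (std::size_t c = 1; c < merged.size(); ++c)
            merged[c] = acc.sums[c] / static_cast<double>(acc.count);
        result.push_back(std::move(merged));
    }
    return result;
}
```

Keeping only a sum and a count per ID avoids storing every row of a group. The result is ordered by ID because std::map keeps its keys sorted.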
