Avoid redundant mappings for C++ program

Question

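This is my C++ code, which does the following:

1. Compare a set of XML files to see what differs between them
2. If a node is new (present in B.xml but not in A.xml), output that node
3. Scan that node and use a map to relate tags to the type of information
4. Process the data according to its type

I'm quite happy with how steps 1-2 work, but I feel I may have implemented steps 3-4 poorly. My main concern is that I have to map a tag even when the names already match, e.g. id, when really it would be better to define a mapping only where the tag name differs from the content type, as with description.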

My code:

#include "pugi/pugixml.hpp"

#include <iostream>
#include <string>
#include <map>

int main()
{
    // This map relates the type of content to the tag name in the XML file
    const std::map<std::string, std::string> tagMap {
        {"id", "id"}, {"description", "content"}, {"url", "web_address"}, {"location", "location"}
    };

    pugi::xml_document doca, docb;
    std::map<std::string, pugi::xml_node> mapa, mapb;

    // Load both documents (the A.xml and B.xml mentioned above)
    if (!doca.load_file("A.xml") || !docb.load_file("B.xml")) {
        std::cerr << "Can't load input files\n";
        return 1;
    }

    // Index every entry of the first document by its id
    for (auto& node : doca.child("data").children("entry")) {
        const char* id = node.child_value("id");
        mapa[id] = node;
    }

    // Keep only the entries of the second document whose id is not in the first
    for (auto& node : docb.child("data").children("entry")) {
        const char* idcs = node.child_value("id");
        if (!mapa.erase(idcs)) {
            mapb[idcs] = node;
        }
    }

    // For added nodes
    for (auto& eb : mapb) {
        // Loop through the tag map to see if we can find tags named
        // "id", "content", "web_address" or "location" in the node returned
        for (auto& kv : tagMap) {
            // kv.first is the type of content, kv.second the tag name.
            // Read the tag's text into a local; the map key itself is const.
            // For example: description -> "Testing!"
            const std::string value = eb.second.child_value(kv.second.c_str());
            // If it's an ID...
            if (kv.first == "id") {
                // Do work on ID value (e.g. check if it's unique)
            }
            else if (kv.first == "description") {
                // Do work on description data (e.g. trim it)
            }
            else if (kv.first == "url") {
                // Do work on URL data (e.g. validate it)
            }
            else if (kv.first == "location") {
                // Do work on location data
            }
        }
    }
}

Example input file:

<data>
    <entry>
        <id>1</id>
        <content>Description</content>
        <web_address>www.google.com</web_address>
        <location>England</location>
        <unrelated>Test</unrelated>
        <not_needed>Test</not_needed>
    </entry>
..
</data>

Answer 1

I have two separate improvements for your points 3 and 4:

Simple improvement:

As the key of your tagMap, use an enum, e.g.

enum Tags { Tag_ID, Tag_Description, ... }

This avoids the string comparisons.

A more dynamic approach is to use polymorphism.

Define an abstract base class Tag

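class Tag {
public:
    // Virtual destructor so subclasses are destroyed correctly through a Tag pointer
    virtual ~Tag() = default;
    virtual const char* getTagname() const = 0;
    virtual void processNode(const std::string& value) = 0;
};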

Then implement a subclass for each tag you have.

class IdTag : public Tag {
public:
    // Must match the tag name in the XML file exactly (XML is case-sensitive)
    const char* getTagname() const override { return "id"; }
    void processNode(const std::string& value) override { /* Do something */ }
};

Now you can use a list of tags:

std::list<std::unique_ptr<Tag>> tagMap;
tagMap.emplace_back(new IdTag());
tagMap.emplace_back(new DescriptionTag());
// ...

Your new loop:

// For added nodes
for (auto& eb : mapb) {
    // Let each tag handler look up its own tag in the node and process the value
    for (auto& tag : tagMap) {
        tag->processNode(eb.second.child_value(tag->getTagname()));
    }
}

Tags: c++, xml, xml-parsing, c++11