Merging and aggregating data held in JSON objects

I have two JSON objects, both created with JSON.parse, that I would like to merge and aggregate.

I don't have the option of storing the data in a Mongo database, and I'm unclear on how to proceed.

The first JSON file contains the raw data:

[
    {
        "sector": {
            "url": "http://TestUrl/api/sectors/11110",
            "code": "11110",
            "name": "Education policy and administrative management"
        },
        "budget": 5742
    },
    {
        "sector": {
            "url": "http://TestUrl/api/sectors/11110",
            "code": "11110",
            "name": "Education policy and administrative management"
        },
        "budget": 5620
    },  
    {
        "sector": {
            "url": "http://TestUrl/api/sectors/12110",
            "code": "12110",
            "name": "Health policy and administrative management"
        },
        "budget": 5524
    }
]

The second JSON file contains the mappings required for the merge operation:

{
    "Code (L3)":11110,
    "High Level Code (L1)":1,
    "High Level Sector Description":"Education",
    "Name":"Education policy and administrative management",
    "Description":"Education sector policy, planning and programmes; aid to education ministries, administration and management systems; institution capacity building and advice; school management and governance; curriculum and materials development; unspecified education activities.",
    "Category (L2)":111,
    "Category Name":"Education, level unspecified",
    "Category Description":"The codes in this category are to be used only when level of education is unspecified or unknown (e.g. training of primary school teachers should be coded under 11220)."
  },
{
    "Code (L3)":12110,
    "High Level Code (L1)":2,
    "High Level Sector Description":"Health",
    "Name":"Health policy and administrative management",
    "Description":"Health sector policy, planning and programmes; aid to health ministries, public health administration; institution capacity building and advice; medical insurance programmes; unspecified health activities.",
    "Category (L2)":121,
    "Category Name":"Health, general",
    "Category Description":""
  },
{
    "Code (L3)":99999,
    "High Level Code (L1)":9,
    "High Level Sector Description":"Unused Code",
    "Name":"Extra Code",
    "Description":"Shows Data Issue",
    "Category (L2)":998,
    "Category Name":"Extra, Code",
    "Category Description":""
  }

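I want to join the data in the two files using the "code" value from the first file and the "Code (L3)" value from the second file. In SQL terms, I want to perform an "inner join" on the files using these values as the join point.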

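I then want to sum all of the budget values from the first file, grouped by the "High Level Code (L1)" value from the second file, to produce the following JSON objects: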

    {
      "High Level Code (L1)":1,
      "High Level Sector Description":"Education",
      "Budget": 11362
    },

    {
      "High Level Code (L1)":2,
      "High Level Sector Description":"Health",
      "Budget": 5524
    }

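With a database this would be a very simple task, but I'm afraid that option isn't available. We are running our site on Sinatra, so I can't use any Rails-specific helper methods.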

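Update: I am now using real data as input, and I have found that the mapping file contains multiple JSON objects whose "Code (L3)" value does not map to any [sector][code] value in the raw data file.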

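I have tried a few workarounds (breaking the data down into 2-D arrays and then trying to return the resulting arrays as a hash table), but I have been unable to get any of them to work.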

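I have come back to this question, whose answer I accepted, because it is a very elegant solution and I don't want to ask the same question twice. I just don't know how to make it ignore items in the mapping file that don't match anything in the raw data file.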

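How about simply iterating over the first dataset and indexing it into a hash using the code as the key, then iterating over the second dataset and looking up the matching data in the hash for each key? A bit brute force, but...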
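The two passes described above could be sketched roughly as follows. This is only an illustration under some assumptions: the data is trimmed down from the question and written as Ruby literals shaped like JSON.parse output, and the names sources, values, budget_by_code and joined are made up for the example.

```ruby
# Trimmed sample data, shaped like the output of JSON.parse (assumed names).
sources = [
  { "sector" => { "code" => "11110" }, "budget" => 5742 },
  { "sector" => { "code" => "11110" }, "budget" => 5620 },
  { "sector" => { "code" => "12110" }, "budget" => 5524 }
]

values = [
  { "Code (L3)" => 11110, "High Level Code (L1)" => 1, "High Level Sector Description" => "Education" },
  { "Code (L3)" => 12110, "High Level Code (L1)" => 2, "High Level Sector Description" => "Health" },
  { "Code (L3)" => 99999, "High Level Code (L1)" => 9, "High Level Sector Description" => "Unused Code" }
]

# Pass 1: index the raw data by sector code, summing budgets as we go.
budget_by_code = Hash.new(0)
sources.each do |source|
  budget_by_code[source["sector"]["code"]] += source["budget"]
end

# Pass 2: walk the mapping and look each code up in the index.
# The mapping stores codes as Integers, the raw data as Strings, hence to_s.
joined = values.each_with_object([]) do |value, rows|
  code = value["Code (L3)"].to_s
  next unless budget_by_code.key?(code) # no match in the raw data: skip it

  rows << value.merge("Budget" => budget_by_code[code])
end
```

Each raw row is visited once and each lookup is a hash access, so the cost grows with the size of the two lists added together rather than multiplied. Mapping entries with no counterpart in the raw data (99999 here) are simply left out.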

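This is quite simple. Call the first list you showed "sources" and the second one "values", or whatever you like. We iterate over "values", extract the fields we need, and for one of them look up the required values in "sources":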

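values.map do |value|
  {
    "High Level Code (L1)"          => value["High Level Code (L1)"],
    "High Level Sector Description" => value["High Level Sector Description"],
    # Sum the budgets of every source row whose sector code matches this mapping row.
    # "Code (L3)" is an Integer in the mapping but a String in the raw data, hence to_s.
    "Budget" => sources
                  .select { |source| source["sector"]["code"] == value["Code (L3)"].to_s }
                  .sum { |source| source["budget"] }
  }
end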

The equivalent of a database "join" here is a "find" operation. We loop over the sources array to find the entries whose sector/code value equals "Code (L3)", then extract their "budget" values and sum them.
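Regarding the update to the question (mapping entries with no counterpart in the raw data), one possible variation is to drop the unmatched entries before summing, and then group by the high-level code so that several "Code (L3)" entries can roll up into one row. The following is a rough sketch, not a tested drop-in: it assumes data shaped like the question's samples, and the names matched and result are made up for the example.

```ruby
# Sample data trimmed from the question, shaped like JSON.parse output.
sources = [
  { "sector" => { "code" => "11110" }, "budget" => 5742 },
  { "sector" => { "code" => "11110" }, "budget" => 5620 },
  { "sector" => { "code" => "12110" }, "budget" => 5524 }
]

values = [
  { "Code (L3)" => 11110, "High Level Code (L1)" => 1, "High Level Sector Description" => "Education" },
  { "Code (L3)" => 12110, "High Level Code (L1)" => 2, "High Level Sector Description" => "Health" },
  { "Code (L3)" => 99999, "High Level Code (L1)" => 9, "High Level Sector Description" => "Unused Code" }
]

# Inner join: pair each mapping entry with its matching raw rows,
# then discard the entries that matched nothing.
matched = values.map do |value|
  rows = sources.select { |source| source["sector"]["code"] == value["Code (L3)"].to_s }
  [value, rows]
end.reject { |_value, rows| rows.empty? }

# Aggregate: group the surviving pairs by high-level code and description,
# then sum every budget that falls into each group.
result = matched
  .group_by { |value, _rows| [value["High Level Code (L1)"], value["High Level Sector Description"]] }
  .map do |(code, description), pairs|
    {
      "High Level Code (L1)"          => code,
      "High Level Sector Description" => description,
      "Budget"                        => pairs.sum { |_value, rows| rows.sum { |row| row["budget"] } }
    }
  end
```

With the sample data only the Education and Health rows survive, and the 99999 entry is ignored because no raw row carries that code.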

The result is as follows:

[{"High Level Code (L1)"=>1,
  "High Level Sector Description"=>"Education",
  "Budget"=>11362},
 {"High Level Code (L1)"=>2,
  "High Level Sector Description"=>"Health",
  "Budget"=>5524}]