Design collection, how to denormalize

Question

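I have services that are offered at many locations for different prices. In traditional SQL I would have a price_location table containing service_id and location_id, and I would join and group when I want to find the service(s) in a certain region, showing the highest and lowest price (a region selects multiple locations).

Since there are a lot of services and locations, I am considering the following: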

service_location_price = [
  {
    serviceName: 's1',
    price: 10,
    location: 'location1'
  }, {
    // To keep it simple only serviceName is here, but
    // there will be multiple providers for the same
    // serviceName at the same location with different prices.
    serviceName: 's1',
    price: 12,
    location: 'location1'
  }, {
    serviceName: 's1',
    price: 15,
    location: 'location2'
  }
];

Basically this is flat-file data that breaks second normal form (it has repeated values across rows).

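Now aggregate and/or map reduce should work fine to get the services in a particular region with their lowest and highest prices, or to show the locations where certain services are available.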
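As a rough sketch of the aggregate idea, the pipeline below matches the documents of a region and groups them by service name to get the price range. The helper name `buildPriceRangePipeline` and the `regionLocations` parameter are illustrative assumptions, not part of the original schema; only the field names come from the sample documents.

```javascript
// Builds an aggregation pipeline that reports the lowest and highest price
// per service across the locations that make up a region.
// `regionLocations` is a hypothetical array of location values.
function buildPriceRangePipeline(regionLocations) {
  return [
    // Keep only the documents whose location belongs to the region.
    { $match: { location: { $in: regionLocations } } },
    // One output document per service name, with its price range
    // and the set of locations where it is offered.
    {
      $group: {
        _id: "$serviceName",
        minPrice: { $min: "$price" },
        maxPrice: { $max: "$price" },
        locations: { $addToSet: "$location" },
      },
    },
    // Stable, readable ordering by service name.
    { $sort: { _id: 1 } },
  ];
}

// Usage in the mongo shell (not executed here):
// db.service_location_price.aggregate(
//   buildPriceRangePipeline(["location1", "location2"])
// )
```

Because the `$match` stage comes first, an index on `location` should let the server narrow the input before grouping.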

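Both services and locations have their own collections, and the service_location_price collection duplicates some of the service and location values for the sake of this query.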

Some people are worried about the duplicated data and would like to implement this differently (Mongoose populate with match??).

I'm not sure what my options are, so I'm hoping for some input from someone with more experience. Is there a better way to do this search?

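Services and locations will not be updated much, but the relations between them may be changed, added, or removed. Searching for the services in a region, however, will be done very often.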

Answer 1

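Populate is one big $in query to resolve the references, followed by swapping the references in the array for the corresponding documents. It's not that bad if the referenced field is indexed, but it's an extra query, and it's a crutch for bad schema design, since it makes it easier to imitate a relational database when you aren't using one and should be approaching the problem differently. I think it should be removed from Mongoose, but sadly it's a bit late for that :(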
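To make the "one big $in query plus a swap" description concrete, here is a conceptual, in-memory sketch. It is not Mongoose's actual implementation (which is asynchronous and far more general); `populateManually` and `findByIds` are hypothetical names, with `findByIds` standing in for a query such as `Model.find({ _id: { $in: ids } })`.

```javascript
// Conceptual sketch only; this is not Mongoose's actual implementation.
// `findByIds` is a hypothetical stand-in for an "$in" query on _id.
function populateManually(docs, refField, findByIds) {
  // 1. Collect every referenced id from every document (deduplicated).
  const ids = [...new Set(docs.flatMap((doc) => doc[refField]))];
  // 2. Resolve them all with one extra "$in"-style lookup.
  const byId = new Map(findByIds(ids).map((ref) => [ref._id, ref]));
  // 3. Swap each reference for the document it points to,
  //    returning copies so the input documents stay untouched.
  return docs.map((doc) => ({
    ...doc,
    [refField]: doc[refField].map((id) => byId.get(id)),
  }));
}

// In-memory stand-in for a locations collection (sample data).
const locations = [
  { _id: 1, name: "location1" },
  { _id: 2, name: "location2" },
];
const findLocations = (ids) => locations.filter((l) => ids.includes(l._id));

// A service document that references locations by id.
const services = [{ serviceName: "s1", locations: [1, 2] }];
const populated = populateManually(services, "locations", findLocations);
```

The second step is the extra round trip to the database that the denormalized collection avoids.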

I'm not sure how you're modeling regions. You said a region can be multiple locations, so I'm going to model a region as an array of location values.

Total services for a given region:

db.service_location_price.distinct("serviceName", { "location" : { "$in" : region_array } })

This gives you an array of service names, so .length gives the number of services.

Min/max price for a service in a region:

db.service_location_price.find({ "location" : { "$in" : region_array }, "serviceName" : "service1" }).sort({ "price" : 1 }).limit(1)
db.service_location_price.find({ "location" : { "$in" : region_array }, "serviceName" : "service1" }).sort({ "price" : -1 }).limit(1)

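There's no information about service suppliers in the example documents, so I don't know how to find the number of suppliers of a service in a region. Maybe you want to include a supplier field in the documents?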
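If such a field were added, counting suppliers could look something like the sketch below. The `supplier` field, the sample document, and the helper name `buildSupplierCountPipeline` are all assumptions made for illustration; nothing in the question confirms this shape.

```javascript
// Hypothetical document shape with an added `supplier` field.
const exampleDoc = {
  serviceName: "s1",
  supplier: "supplierA", // assumed field, not in the original documents
  price: 10,
  location: "location1",
};

// Builds a pipeline that counts the distinct suppliers of one service
// across the locations of a region.
function buildSupplierCountPipeline(regionLocations, serviceName) {
  return [
    // Restrict to the region and to the service of interest.
    { $match: { location: { $in: regionLocations }, serviceName: serviceName } },
    // Collapse to one document per distinct supplier.
    { $group: { _id: "$supplier" } },
    // Count those documents; yields { supplierCount: <n> }.
    { $count: "supplierCount" },
  ];
}

// Usage in the mongo shell (not executed here):
// db.service_location_price.aggregate(
//   buildSupplierCountPipeline(["location1", "location2"], "s1")
// )
// A distinct("supplier", { ... }).length call with the same filter would
// also work for small result sets.
```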


Tags: javascript, denormalization, mongodb