SQL where joined set must contain all values but may contain more

I have three tables: offers, sports, and the join table offers_sports.

class Offer < ActiveRecord::Base
  has_and_belongs_to_many :sports
end

class Sport < ActiveRecord::Base
  has_and_belongs_to_many :offers
end

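I want to select the offers that include a given set of sport names. They must contain all of those sports, but may contain more.

Let's say I have these three offers: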

light:
  - "Yoga"
  - "Bodyboarding"
medium:
  - "Yoga"
  - "Bodyboarding"
  - "Surfing"
all:
  - "Yoga"
  - "Bodyboarding"
  - "Surfing"
  - "Parasailing"
  - "Skydiving"

Given the array ["Bodyboarding", "Surfing"], I want to get medium and all, but not light.

I tried something along the lines of this answer, but I get zero rows in the result:

Offer.joins(:sports)
     .where(sports: { name: ["Bodyboarding", "Surfing"] })
     .group("sports.name")
     .having("COUNT(distinct sports.name) = 2")

Translated to SQL:

SELECT "offers".* 
FROM "offers" 
INNER JOIN "offers_sports" ON "offers_sports"."offer_id" = "offers"."id"     
INNER JOIN "sports" ON "sports"."id" = "offers_sports"."sport_id" 
  WHERE "sports"."name" IN ('Bodyboarding', 'Surfing') 
GROUP BY sports.name 
HAVING COUNT(distinct sports.name) = 2;

An ActiveRecord answer would be nice, but I'll settle for plain SQL, preferably Postgres-compatible.

Data:

offers
======================
id | name
----------------------
1  | light
2  | medium
3  | all
4  | extreme

sports
======================
id | name
----------------------
1  | "Yoga"
2  | "Bodyboarding"
3  | "Surfing"
4  | "Parasailing"
5  | "Skydiving"

offers_sports
======================
offer_id | sport_id
----------------------
1        | 1
1        | 2
2        | 1
2        | 2
2        | 3
3        | 1
3        | 2
3        | 3
3        | 4
3        | 5
4        | 3
4        | 4
4        | 5

Group by offer.id, not by sports.name (or sports.id):

SELECT o.*
FROM   sports        s
JOIN   offers_sports os ON os.sport_id = s.id
JOIN   offers        o  ON os.offer_id = o.id
WHERE  s.name IN ('Bodyboarding', 'Surfing') 
GROUP  BY o.id  -- !!
HAVING count(*) = 2;

Assuming the typical implementation:

  • offer.id and sports.id are defined as primary keys.
  • sports.name is defined unique.
  • (sport_id, offer_id) in offers_sports is defined unique (or PK).

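Under those assumptions each matching sport shows up at most once per offer, so you don't need DISTINCT in the count. count(*) is even a bit cheaper.

Related answer with an arsenal of possible techniques: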
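To see why the grouping column matters, here is a minimal plain-Ruby sketch that replays the WHERE / GROUP BY / HAVING steps over the sample data from the question. It is only an in-memory illustration, not how Postgres executes the query; ROWS and the helper groups_matching are hypothetical names made up for this example.

```ruby
# Joined rows as [offer name, sport name], copied from the sample data.
ROWS = [
  ["light", "Yoga"], ["light", "Bodyboarding"],
  ["medium", "Yoga"], ["medium", "Bodyboarding"], ["medium", "Surfing"],
  ["all", "Yoga"], ["all", "Bodyboarding"], ["all", "Surfing"],
  ["all", "Parasailing"], ["all", "Skydiving"],
  ["extreme", "Surfing"], ["extreme", "Parasailing"], ["extreme", "Skydiving"]
].freeze

# Hypothetical helper: key_index 0 groups by offer, 1 groups by sport name.
def groups_matching(rows, wanted, key_index)
  rows.select { |_, sport| wanted.include?(sport) }  # WHERE name IN (...)
      .group_by { |row| row[key_index] }             # GROUP BY
      .select { |_, grp| grp.map(&:last).uniq.size == wanted.size } # HAVING COUNT(DISTINCT name) = n
      .keys
end

wanted = ["Bodyboarding", "Surfing"]
p groups_matching(ROWS, wanted, 1) # grouped by sport name: every group holds one name
p groups_matching(ROWS, wanted, 0) # grouped by offer: offers holding both names survive
```

Grouped by sport name, each group can only ever contain a single distinct name, so the HAVING condition is never met. Grouped by offer, the count is taken per offer, which is what the question asks for.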

  • How to filter SQL results in a has-many-through relation

Added by @max (the OP) - this is the above query rolled into ActiveRecord:

class Offer < ActiveRecord::Base
  has_and_belongs_to_many :sports
  def self.includes_sports(*sport_names)
    joins(:sports)
      .where(sports: { name: sport_names })
      .group('offers.id')
      .having("count(*) = ?", sport_names.size)
  end
end

One way to do this is with arrays and the array_agg aggregate function.

SELECT "offers".*, array_agg("sports"."name") AS spnames
FROM "offers"
INNER JOIN "offers_sports" ON "offers_sports"."offer_id" = "offers"."id"
INNER JOIN "sports" ON "sports"."id" = "offers_sports"."sport_id"
GROUP BY "offers"."id"
HAVING array_agg("sports"."name")::text[] @> ARRAY['Bodyboarding','Surfing']::text[];

returns:

 id |  name  |                      spnames                      
----+--------+---------------------------------------------------
  2 | medium | {Yoga,Bodyboarding,Surfing}
  3 | all    | {Yoga,Bodyboarding,Surfing,Parasailing,Skydiving}
(2 rows)

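The @> operator means that the array on the left must contain all the elements of the array on the right, but may contain more. The spnames column is just for show; you can safely remove it.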
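For readers who think in Ruby rather than SQL, the containment test can be sketched roughly like this. array_contains? is a hypothetical helper written for this illustration, not something Postgres or Rails provides.

```ruby
# Hypothetical helper mirroring "left @> right" for arrays:
# true when every element of `right` also appears in `left`.
def array_contains?(left, right)
  (right - left).empty?
end

wanted = ["Bodyboarding", "Surfing"]
p array_contains?(["Yoga", "Bodyboarding", "Surfing"], wanted) # the medium offer
p array_contains?(["Yoga", "Bodyboarding"], wanted)            # the light offer
```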

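There are two things you must be very mindful of with this.

  1. Even with Postgres 9.4 (I haven't tried 9.5 yet), type conversion for comparing arrays is sloppy and often errors out, telling you it can't find a way to convert them to comparable values. That is why, as you can see in the example, I manually cast both sides using ::text[].

  2. I have no idea what the level of support for array parameters is in Ruby or in the RoR framework, so you may end up having to manually escape the strings (if they are input by a user) and form the array using the ARRAY[] syntax.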
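If it does come down to building the array by hand, a minimal sketch of the escaping could look like the following. pg_text_array_literal is a hypothetical helper, and it assumes standard_conforming_strings is on (the Postgres default since 9.1), so that doubling single quotes is all a string literal needs. Where the driver or framework offers bound parameters, those are the safer choice.

```ruby
# Hypothetical helper: builds an ARRAY[...]::text[] literal from Ruby strings.
# Assumption: standard_conforming_strings = on, so only single quotes need
# escaping, and they are escaped by doubling them.
def pg_text_array_literal(values)
  quoted = values.map { |v| "'" + v.to_s.gsub("'", "''") + "'" }
  "ARRAY[#{quoted.join(',')}]::text[]"
end

puts pg_text_array_literal(["Bodyboarding", "Surfing"])
```

The resulting string can then be spliced into the right-hand side of the @> comparison in the HAVING clause.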