Rails - DISTINCT ON after a join
I'm using Rails 4.2 with PostgreSQL. I have a Product model and a Purchase model; Product has many Purchases. I want to find the distinct, most recently purchased products. Initially I tried:
Product.joins(:purchases)
.select("DISTINCT products.*, purchases.updated_at") #postgresql requires order column in select
.order("purchases.updated_at DESC")
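However, this returns duplicates, because DISTINCT keeps every row whose (products.id, purchases.updated_at) tuple is unique. I only want products with distinct ids after the join: if a product id appears more than once in the join, select only the first row. So I also tried: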
Product.joins(:purchases)
.select("DISTINCT ON (products.id) purchases.updated_at, products.*")
.order("products.id, purchases.updated_at") #postgres requires that DISTINCT ON must match the leftmost order by clause
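This doesn't work, because that constraint forces me to put products.id first in the ORDER BY clause, so the results come out ordered by id rather than by purchase date.
What is the Rails way to achieve this?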
Use a subquery and add a different ORDER BY clause in the outer SELECT:
SELECT *
FROM (
SELECT DISTINCT ON (pr.id)
pu.updated_at, pr.*
FROM Product pr
JOIN Purchases pu ON pu.product_id = pr.id -- guessing
ORDER BY pr.id, pu.updated_at DESC NULLS LAST
) sub
ORDER BY updated_at DESC NULLS LAST;
Details for DISTINCT ON:
- Select first row in each GROUP BY group?
Or some other query technique:
- Optimize GROUP BY query to retrieve latest record per user
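To make the two levels of ordering concrete, here is a plain-Ruby sketch that mimics the subquery approach on in-memory rows. The `Row` struct and the sample data are made up for illustration, and the code only emulates the semantics of DISTINCT ON (sort, keep the first row per id, re-sort); it says nothing about how PostgreSQL executes the query. NULL dates are not covered.

```ruby
# Hypothetical rows as they would come out of the join: one per purchase.
Row = Struct.new(:id, :name, :updated_at)

joined = [
  Row.new(1, "apple",  Time.utc(2015, 1, 10)),
  Row.new(2, "banana", Time.utc(2015, 3, 1)),
  Row.new(1, "apple",  Time.utc(2015, 2, 20)),
  Row.new(3, "cherry", Time.utc(2015, 1, 5)),
  Row.new(2, "banana", Time.utc(2015, 1, 15)),
]

# Inner query: ORDER BY id, updated_at DESC, then DISTINCT ON (id)
# keeps the first row of each id group, i.e. the latest purchase.
inner = joined
  .sort_by { |r| [r.id, -r.updated_at.to_f] }
  .uniq(&:id)

# Outer query: re-sort the surviving rows by purchase date, newest first.
result = inner.sort_by { |r| -r.updated_at.to_f }

result.each { |r| puts "#{r.id} #{r.name} #{r.updated_at.strftime('%Y-%m-%d')}" }
```

The inner sort exists only to decide which row survives per product; the outer sort decides the final order, which is why the two ORDER BY clauses differ.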
But if all you need from Purchases is updated_at, you can get it more cheaply with a simple aggregate in a subquery before joining:
SELECT *
FROM Product pr
JOIN (
SELECT product_id, max(updated_at) AS updated_at
FROM Purchases
GROUP BY 1
) pu ON pu.product_id = pr.id -- guessing
ORDER BY pu.updated_at DESC NULLS LAST;
About NULLS LAST:
- PostgreSQL sort by datetime asc, null first?
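The aggregate-then-join shape can be mimicked in plain Ruby as well. The `products` and `purchases` arrays below are invented sample data standing in for the two tables; this is a sketch of the logic under those assumptions, not of the query plan.

```ruby
# Hypothetical sample data standing in for the two tables.
products = [
  { id: 1, name: "apple" },
  { id: 2, name: "banana" },
  { id: 3, name: "cherry" } # never purchased
]
purchases = [
  { product_id: 1, updated_at: Time.utc(2015, 1, 10) },
  { product_id: 2, updated_at: Time.utc(2015, 3, 1) },
  { product_id: 1, updated_at: Time.utc(2015, 2, 20) }
]

# Subquery: SELECT product_id, max(updated_at) ... GROUP BY 1
latest = purchases
  .group_by { |p| p[:product_id] }
  .map { |product_id, rows| [product_id, rows.map { |p| p[:updated_at] }.max] }
  .to_h

# Inner join: products without any purchase drop out.
joined = products
  .select { |pr| latest.key?(pr[:id]) }
  .map { |pr| pr.merge(updated_at: latest[pr[:id]]) }

# ORDER BY pu.updated_at DESC
result = joined.sort_by { |row| row[:updated_at] }.reverse

result.each { |row| puts "#{row[:id]} #{row[:name]} #{row[:updated_at]}" }
```

Because the purchases are collapsed to one row per product before the join, the join never produces duplicates in the first place.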
Or simpler, but not as fast when retrieving all rows:
SELECT pr.*, max(pu.updated_at) AS updated_at
FROM Product pr
JOIN Purchases pu ON pu.product_id = pr.id
GROUP BY pr.id -- must be primary key
ORDER BY max(pu.updated_at) DESC NULLS LAST; -- a positional "2" would point at the second column of pr.*
Product.id must be defined as the primary key for this to work. Details:
- PostgreSQL - GROUP BY clause
- Return a grouped list with occurrences using Rails and PostgreSQL
This is faster if you only fetch a small selection (for example, with a WHERE clause restricting the query to one or a few pr.id).
Try this:
Product.joins(:purchases)
.select("DISTINCT ON (purchases.product_id) purchases.product_id, purchases.updated_at, products.*")
.order("purchases.product_id, purchases.updated_at DESC") #postgres requires that DISTINCT ON must match the leftmost order by clause
I ended up with this -
Product.joins(:purchases)
.select("DISTINCT ON (products.id) products.*, purchases.updated_at as date")
.sort_by(&:date)
.reverse
Still looking for a better way.
Building on erwin-brandstetter's answer, here is how you could do it with ActiveRecord (or at least this should be close):
Product
.select('*')
.joins('INNER JOIN (SELECT product_id, max(updated_at) AS updated_at FROM purchases GROUP BY 1) pu ON pu.product_id = products.id')
.order('pu.updated_at DESC NULLS LAST')
Based on @ErwinBrandstetter's answer, I finally found the right way to do this. The query to find the distinct most recent purchases is
SELECT *
FROM (
SELECT DISTINCT ON (pr.id)
pu.updated_at, pr.*
FROM Product pr
JOIN Purchases pu ON pu.product_id = pr.id
) sub
ORDER BY updated_at DESC NULLS LAST;
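I left out the ORDER BY in the subquery, since the outer query sorts anyway. One caveat: without it, DISTINCT ON may keep any purchase row for each product, so put ORDER BY pr.id, pu.updated_at DESC NULLS LAST back into the subquery if the date has to be the latest one.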
The Rails way to do this is -
inner_query = Product.joins(:purchases)
.select("DISTINCT ON (products.id) products.*, purchases.updated_at as date") #This selects all the unique purchased products.
result = Product.from("(#{inner_query.to_sql}) as unique_purchases")
.select("unique_purchases.*").order("unique_purchases.date DESC")
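Whether the inner ORDER BY can be dropped is worth checking. PostgreSQL documents that, without ORDER BY, the row DISTINCT ON keeps for each group is unpredictable. The plain-Ruby sketch below uses made-up rows, and `uniq` (which keeps the first occurrence) stands in for "whichever row happens to arrive first"; it is an illustration of the semantics, not of PostgreSQL's actual row order.

```ruby
# Hypothetical joined rows for a single product.
older = { id: 1, updated_at: Time.utc(2015, 1, 10) }
newer = { id: 1, updated_at: Time.utc(2015, 2, 20) }

# "DISTINCT ON (id)" without ORDER BY: keep whichever row comes first.
pick = ->(rows) { rows.uniq { |r| r[:id] }.first[:updated_at] }

unordered_a = pick.call([older, newer])
unordered_b = pick.call([newer, older])

# With "ORDER BY id, updated_at DESC" the latest row always comes first.
ordered = ->(rows) { rows.sort_by { |r| [r[:id], -r[:updated_at].to_f] } }
ordered_a = pick.call(ordered.call([older, newer]))
ordered_b = pick.call(ordered.call([newer, older]))

puts "same date without inner sort: #{unordered_a == unordered_b}"
puts "same date with inner sort:    #{ordered_a == ordered_b}"
```

In the ActiveRecord version this would correspond to adding an `.order("products.id, purchases.updated_at DESC")` to `inner_query`.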
The second (and better) approach suggested by @ErwinBrandstetter is
SELECT *
FROM Product pr
JOIN (
SELECT product_id, max(updated_at) AS updated_at
FROM Purchases
GROUP BY 1
) pu ON pu.product_id = pr.id
ORDER BY pu.updated_at DESC NULLS LAST;
In Rails this can be written as
join_query = Purchase.select("product_id, max(updated_at) as date")
.group(:product_id) #This selects most recent date for all purchased products
result = Product.joins("INNER JOIN (#{join_query.to_sql}) as unique_purchases ON products.id = unique_purchases.product_id")
.order("unique_purchases.date DESC NULLS LAST")