MySQL: Bayesian ranking and sorting by star ratings
Suppose I have two tables: businesses, and reviews of those businesses.
businesses table:
+----+-------+
| id | title |
+----+-------+
reviews table:
+----+-------------+---------+------+
| id | business_id | message | rate |
+----+-------------+---------+------+
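Each review has a rate (1 to 5 stars).
I want to sort the businesses by their review ratings using Bayesian ranking, on the condition that a business has at least 2 reviews.
Here is my query: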
SELECT b.id,
  (SELECT COUNT(r.rate) AS rr FROM reviews r WHERE r.business_id = b.id) AS rr,
  (SELECT
    ((COUNT(r.rate) / (COUNT(r.rate) + 2)) * AVG(r.rate) +
    (2 / (COUNT(r.rate) + 2)) * 4)
   FROM reviews r WHERE r.business_id = b.id AND rr > 2
  ) AS score
FROM businesses b
ORDER BY score DESC
LIMIT 4
This gives me:
+------+----+------------+
| id   | rr | score      |
+------+----+------------+
| 992  | 14 | 4.31250000 |
| 237  | 3  | 4.2000000  |
| 19   | 5  | 4.0000000  |
| 1009 | 12 | 3.9285142  |
+------+----+------------+
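I have two questions:
1. As you can see in ((COUNT(r.rate) / (COUNT(r.rate) + 2)) * AVG(r.rate) + (2 / (COUNT(r.rate) + 2)) * 4) FROM reviews r WHERE r.business_id = b.id AND rr > 2), some functions such as COUNT and AVG appear more than once. Does MySQL run them once behind the scenes and cache the result, or does it run them on every call?
2. Is there an equivalent query that is better optimized?
Thanks in advance.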
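I would expect MySQL to optimize the repeated COUNT calls, but I am not certain that it does.
However, you can rearrange the query to join against a subquery. That way you do not execute 2 correlated subqueries for every row of businesses.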
SELECT b.id,
       sub0.rr,
       sub0.score
FROM businesses b
INNER JOIN
(
    SELECT r.business_id,
           COUNT(r.rate) AS rr,
           ((COUNT(r.rate) / (COUNT(r.rate) + 2)) * AVG(r.rate) + (2 / (COUNT(r.rate) + 2)) * 4) AS score
    FROM reviews r
    GROUP BY r.business_id
    HAVING rr > 2
) sub0
ON sub0.business_id = b.id
ORDER BY score DESC
LIMIT 4
Note that the results differ slightly: this version excludes businesses with 2 or fewer reviews entirely, whereas your query still returns them, but with a NULL score. The formula multiplies each weight by its term, so the * operators before AVG(r.rate) and before 4 (just ahead of AS score) are written out explicitly.
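To make the score expression easier to read, here is a minimal Python sketch of the same arithmetic. The function name and parameter names are hypothetical; the constants 2 and 4 are taken from the query, where they appear to act as the weight and the mean of a prior rating. The inputs are made-up values, not data from the question.

```python
def bayesian_score(review_count, average_rate, prior_weight=2, prior_mean=4):
    """Blend a business's own average with a prior mean (names are illustrative)."""
    total = review_count + prior_weight
    own_share = review_count / total      # COUNT / (COUNT + 2) in the query
    prior_share = prior_weight / total    # 2 / (COUNT + 2) in the query
    return own_share * average_rate + prior_share * prior_mean


# Hypothetical inputs: the same 5.0 average backed by different review counts.
few = bayesian_score(3, 5.0)     # 3/5 * 5 + 2/5 * 4
many = bayesian_score(18, 5.0)   # 18/20 * 5 + 2/20 * 4
print(few, many)
```

With few reviews the score is pulled toward the prior mean of 4; as the count grows, the business's own average dominates.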
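Taking the join approach one step further, you could have the subquery return only the count and the average rating, and then do the calculation using just those returned column values, so each aggregate is written once.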
SELECT b.id,
       sub0.rr,
       ((sub0.rr / (sub0.rr + 2)) * sub0.arr + (2 / (sub0.rr + 2)) * 4) AS score
FROM businesses b
INNER JOIN
(
    SELECT r.business_id,
           COUNT(r.rate) AS rr,
           AVG(r.rate) AS arr
    FROM reviews r
    GROUP BY r.business_id
    HAVING rr > 2
) sub0
ON sub0.business_id = b.id
ORDER BY score DESC
LIMIT 4
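If you want to try the join-based query without a MySQL server, the following is a rough, self-contained sketch using Python's built-in sqlite3 module as a stand-in. The sample rows are hypothetical. Two things are adapted for SQLite and are assumptions of this sketch rather than part of the MySQL query: SQLite performs integer division on integers (MySQL's / returns a decimal), hence the 1.0 and 2.0 factors, and HAVING repeats COUNT(r.rate) instead of using the alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE businesses (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE reviews (
        id INTEGER PRIMARY KEY,
        business_id INTEGER,
        message TEXT,
        rate INTEGER
    );
    INSERT INTO businesses (id, title) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- Hypothetical data: business 1 has 3 reviews, 2 has 2, 3 has 4.
    INSERT INTO reviews (business_id, message, rate) VALUES
        (1, '', 5), (1, '', 5), (1, '', 4),
        (2, '', 5), (2, '', 5),
        (3, '', 4), (3, '', 3), (3, '', 3), (3, '', 2);
""")

QUERY = """
SELECT b.id,
       sub0.rr,
       (sub0.rr * 1.0 / (sub0.rr + 2)) * sub0.arr
         + (2.0 / (sub0.rr + 2)) * 4 AS score
FROM businesses b
INNER JOIN (
    SELECT r.business_id,
           COUNT(r.rate) AS rr,
           AVG(r.rate)   AS arr
    FROM reviews r
    GROUP BY r.business_id
    HAVING COUNT(r.rate) > 2
) sub0 ON sub0.business_id = b.id
ORDER BY score DESC
LIMIT 4
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Business 2 has only two reviews, so the HAVING clause drops it and it does not appear in the result at all, which is the behavioural difference from the original correlated-subquery version noted earlier.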