Selecting a weighted random distribution from a MySQL table
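I'm trying to write a query that randomly selects several articles from a table, where each article has a weighted chance of being picked. I've come up with a solution, but it seems clunky to me, and I'd like to know whether anyone has ideas on how to do it better. I need at least 1 article, but it would be helpful if the query could return several at once.
Here is my approach:
The table: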
mysql> describe randomiser;
+---------+----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+----------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| article | varchar(30) | YES | | NULL | |
| chance | smallint(5) unsigned | NO | MUL | 1 | |
| low | int(10) unsigned | NO | MUL | 0 | |
| high | int(10) unsigned | NO | | 0 | |
+---------+----------------------+------+-----+---------+----------------+
My test population:
mysql> select * from randomiser;
+----+-------------+--------+-----+------+
| id | article | chance | low | high |
+----+-------------+--------+-----+------+
| 1 | common | 128 | 1 | 128 |
| 2 | uncommon | 64 | 129 | 192 |
| 3 | infrequent1 | 32 | 193 | 224 |
| 4 | infrequent2 | 32 | 225 | 256 |
| 5 | infrequent3 | 32 | 257 | 288 |
+----+-------------+--------+-----+------+
The low and high values are updated on insert, whenever someone adds a new article to the table.
My selection method:
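The question does not show how low and high are maintained. One possible way, as a minimal sketch: append each new article's range directly after the current maximum of high. The article name 'rare' and the weight 16 are made-up illustration values, and the statement assumes no other session inserts at the same moment.

```sql
-- Hypothetical new article 'rare' with weight 16 (illustration values only).
-- Its range starts right after the current largest high and spans `chance` values.
-- COALESCE covers the empty-table case, where MAX(high) is NULL.
INSERT INTO randomiser (article, chance, low, high)
SELECT 'rare',
       16,
       COALESCE(MAX(high), 0) + 1,
       COALESCE(MAX(high), 0) + 16
FROM randomiser;
```

With the test population above, this would give the new row low = 289 and high = 304, so that MAX(high) still equals SUM(chance).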
SET @t:=(SELECT FLOOR( SUM(chance) * RAND() + 1) FROM randomiser);
SELECT article FROM randomiser WHERE @t >= low AND @t <= high;
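- Can the two selects be combined into a single efficient statement?
- Is it possible to write a select that pulls out several random values instead of just one?
Note: I'm not at all attached to the table as I've defined it; if a different layout would be more efficient, I'd like to know!
You can use the following query: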
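select t.article from
(SELECT article,
-- total weight comes from an ungrouped subquery; with GROUP BY article,
-- SUM(chance) would only be the weight of that one article
case when FLOOR( (SELECT SUM(chance) FROM randomiser) * RAND() + 1) between low and high
then 1 else 0 end as picked
FROM randomiser) t
where t.picked = 1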
The above will use multiple random values.
For a single query, you can do this:
SELECT article
FROM randomiser
WHERE (SELECT FLOOR( SUM(chance) * RAND() + 1) FROM randomiser) BETWEEN low AND high;
Or use an INNER JOIN:
SELECT article, `range`
FROM randomiser
INNER JOIN (
SELECT
FLOOR( SUM(chance) * RAND() + 1) AS `range`
FROM randomiser
) t
WHERE `range` >= low AND `range` <= high;
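On the question of pulling several articles at once, and of a layout without the low and high columns: one alternative, which is not what the answers above propose, is to sort by a weighted random key and take the first few rows. This is a sketch of weighted sampling without replacement (the exponential-key form of the Efraimidis-Spirakis method). It assumes every chance is greater than 0, and the LIMIT of 3 is an arbitrary illustration value.

```sql
-- Each row gets the key -LOG(u) / chance, with u = 1 - RAND() so that u is never 0.
-- The key is exponentially distributed with rate `chance`, so rows with a larger
-- chance tend to get smaller keys and sort first.
-- The first row returned is a given article with probability chance / SUM(chance).
SELECT article
FROM randomiser
ORDER BY -LOG(1 - RAND()) / chance
LIMIT 3;
```

This only needs the chance column, so nothing has to be recomputed when articles are added or removed. The trade-off is that it computes a key for every row and sorts the whole table, much like ORDER BY RAND(), which may be slow on large tables.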