Sample observations per group without replacement in SQL
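Using the table provided, I would like to sample, say, 2 users per day, such that the users assigned to the two days are different. Of course, the problem I actually have is more complicated, but this simple example conveys the idea.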
drop table if exists test;
create table test (
user_id int,
day_of_week int);
insert into test values (1, 1);
insert into test values (1, 2);
insert into test values (2, 1);
insert into test values (2, 2);
insert into test values (3, 1);
insert into test values (3, 2);
insert into test values (4, 1);
insert into test values (4, 2);
insert into test values (5, 1);
insert into test values (5, 2);
insert into test values (6, 1);
insert into test values (6, 2);
The expected results would look like this:
create table results (
user_id int,
day_of_week int);
insert into results values (1, 1);
insert into results values (2, 1);
insert into results values (3, 2);
insert into results values (6, 2);
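You can use window functions. Here is an example, although the details do depend on your database, because the random-number function varies by database: random() in PostgreSQL and SQLite, rand() in MySQL, newid() in SQL Server.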
select t.*
from (select t.*,
             row_number() over (partition by day_of_week order by random()) as seqnum
      from test t
     ) t
where seqnum <= 2;