Select one random row by group (Oracle 10g)
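This post is similar to another question in that I have multiple observations per group. However, I only want to select one of them at random. I am also working on Oracle 10g.
Table df has multiple rows for each person_id. I want to order the rows within each person_id group by dbms_random.value() and select the first observation from each group. To do so, I tried: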
select
person_id, purchase_date
from
df
where
row_number() over (partition by person_id order by dbms_random.value()) = 1
The query returns:
ORA-30483: window functions are not allowed here
30483. 00000 - "window functions are not allowed here"
*Cause: Window functions are allowed only in the SELECT list of a query. And, window function cannot be an argument to another window or group function.
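Use a subquery, so that the window function is computed in a SELECT list and the filter is applied to its result: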
select person_id, purchase_date
from (select df.*,
row_number() over (partition by person_id order by dbms_random.value()) as seqnum
from df
) df
where seqnum = 1;
Another option is to use a WITH ... AS clause:
WITH t AS
(
SELECT df.*,
ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY dbms_random.value()) AS rn
FROM df
)
SELECT person_id, purchase_date
FROM t
WHERE rn = 1
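Aggregate queries (using GROUP BY and aggregate functions) are typically much faster than the equivalent analytic functions doing the same job. So if you have a lot of data to process, or if the data is not particularly large but you have to run this query often, you may want a more efficient query that uses aggregation instead of an analytic function.
Here is one possible approach: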
select person_id,
max(purchase_date) keep (dense_rank first order by dbms_random.value())
as random_purchase_date
from df
group by person_id
;