是否可以通过查询获取组内组内的随机值?

Is it possible to get a random value within a group in a group by query?

我想在分组查询中获得随机,像这样:

SELECT city
  , SOME_RANDOMIZER_FUNC(latitude) as "RANDOM_LATITUDE_IN_CITY"
  , SOME_RANDOMIZER_FUNC(longitude) AS "RANDOM_LONGITUDE_IN_CITY"
FROM some_table
GROUP BY city

输入:

city    LATITUDE    LONGITUDE
STO     -31.3935    -57.9413
STO     -31.0274    -57.8081
STO     -30.7852    -57.7765
STO     -30.4274    -56.4718
NDU     -30.2747    -57.6023
NDU     -32.2909    -58.0737
NDU     -32.0286    -57.8468
NDU     -32.3600    -57.2021
NDU     -32.6816    -57.6541
MBO     -31.7085    -55.9873
MBO     -30.9359    -55.5457
MBO     -31.1972    -55.7574
MBO     -31.7711    -54.6904

期望的输出:

city    RANDOM_LATITUDE_IN_CITY     RANDOM_LONGITUDE_IN_CITY
STO     -31.0274                    -57.9413
NDU     -32.3600                    -57.6541
MBO     -30.9359                    -55.5457

其中函数SOME_RANDOMIZER_FUNC returns一个组内的随机值.

在随机函数的输入参数中,您呈现所有数据,然后对其应用分组,这是不允许的。 执行此操作的最佳方法是使用以下代码:

with s_table as (

    select
        city,
        latitude,
        longitude,
        row_number() over(partition by city order by DBMS_RANDOM.VALUE) as random_sort
    from
        some_table
    )
    select
    city,
    latitude,
    longitude
from
    s_table
where
    random_sort =1

为每个城市的行随机编号,然后保留编号为 1 的行。

SELECT city, latitude, longitude
FROM
(
  SELECT 
    city, latitude, longitude,
    ROW_NUMBER() OVER (PARTITION BY city ORDER BY DBMS_RANDOM.VALUE) AS rn
  FROM some_table
)
WHERE rn = 1
ORDER BY city;

如果你想要一些不确定的行,那么你可以尝试 any_value 函数,自 21c 起记录:

create table t
as
select
  trunc((level - 1)/10) as city
  , level as lat
  , 100 - level as lon
from dual
connect by level < 101
select /*+gather_plan_statistics*/
  city
  , any_value(lat) as lat
  , any_value(lon) as lon
from t
group by
  city
CITY | LAT | LON
---: | --: | --:
   0 |   1 |  99
   1 |  11 |  89
   2 |  21 |  79
   3 |  31 |  69
   4 |  41 |  59
   5 |  51 |  49
   6 |  61 |  39
   7 |  71 |  29
   8 |  81 |  19
   9 |  91 |   9

db<>fiddle here

您可以将 FIRST_VALUE 与 DBMS_RANDOM.VALUE

结合使用

例如:

SELECT city, latitude, longitude
FROM
(
    SELECT city
    , FIRST_VALUE(latitude) OVER (PARTITION BY city ORDER BY DBMS_RANDOM.VALUE) as latitude
    , FIRST_VALUE(longitude) OVER (PARTITION BY city ORDER BY DBMS_RANDOM.VALUE) as longitude
    FROM some_table
) 
GROUP BY city, latitude, longitude
ORDER BY city;
CITY | LATITUDE | LONGITUDE
:--- | -------: | --------:
MBO  | -30.9359 |  -55.5457
NDU  | -32.6816 |  -57.6541
STO  | -30.4274 |  -56.4718

演示 db<>fiddle here