Rank() function over multiple columns in Oracle
I need to rank the rows of a table that has two columns, transID and travel_date.
Here is my data:
transID travel_date
2341 2018-04-04 10:00:00
2341 2018-04-04 11:30:00
2891 2018-04-04 12:30:00
2891 2018-04-04 18:30:00
2341 2018-04-05 11:30:00
2891 2018-04-05 22:30:00
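To reproduce the results, the sample data can be loaded with a script along these lines. This is only a sketch: the table name travel_log_info is the one used in the query in this question, but the column types (NUMBER and DATE) are assumptions, since the table definition is not shown.

```sql
-- Assumed schema: the real column types are not given in the question.
CREATE TABLE travel_log_info (
  transID     NUMBER,
  travel_date DATE
);

-- Sample rows copied from the data listed above.
INSERT INTO travel_log_info VALUES (2341, TO_DATE('2018-04-04 10:00:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO travel_log_info VALUES (2341, TO_DATE('2018-04-04 11:30:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO travel_log_info VALUES (2891, TO_DATE('2018-04-04 12:30:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO travel_log_info VALUES (2891, TO_DATE('2018-04-04 18:30:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO travel_log_info VALUES (2341, TO_DATE('2018-04-05 11:30:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO travel_log_info VALUES (2891, TO_DATE('2018-04-05 22:30:00', 'YYYY-MM-DD HH24:MI:SS'));

COMMIT;
```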
Here is the query I tried:
select transID,travel_date,rn,
dense_rank () over (partition by transID order by EarliestDate,transID) as rn2
from
(SELECT transID,travel_date,
ROW_NUMBER() OVER (PARTITION BY transID ORDER BY travel_date) AS rn,
max(travel_date) OVER (partition by travel_date) as EarliestDate
FROM travel_log_info
) t
order by transID;
Current output of the above query:
transID travel_date rn2
2341 2018-04-04 10:00:00 1
2341 2018-04-04 11:30:00 2
2341 2018-04-05 11:30:00 3
2891 2018-04-04 12:30:00 1
2891 2018-04-04 18:30:00 2
2891 2018-04-05 22:30:00 3
Expected output:
transID travel_date rn2
2341 2018-04-04 10:00:00 1
2341 2018-04-04 11:30:00 2
2341 2018-04-05 11:30:00 1
2891 2018-04-04 12:30:00 1
2891 2018-04-04 18:30:00 2
2891 2018-04-05 22:30:00 1
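With this output I can then filter with the condition rn2 = 1 to get the rows I need, based on travel date and transID.
I am not getting the desired output shown above. Please suggest how I can achieve the correct output.
Thank you for your time.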
Your main issue at the moment is:
max(travel_date) OVER (partition by travel_date)
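which includes the time part of each date in the partition, so you are really getting the maximum of each individual date/time, which is just that date/time. You seem to want the maximum date/time for each day, so you can partition by day by using trunc() in the partition-by clause: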
max(travel_date) OVER (partition by trunc(travel_date))
That change alone gives you:
   TRANSID TRAVEL_DATE                 RN        RN2
---------- ------------------- ---------- ----------
      2341 2018-04-04 10:00:00          1          1
      2341 2018-04-04 11:30:00          2          1
      2341 2018-04-05 11:30:00          3          2
      2891 2018-04-04 12:30:00          1          1
      2891 2018-04-04 18:30:00          2          1
      2891 2018-04-05 22:30:00          3          2
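To see why the truncation matters, it can help to look at both partition expressions side by side. The following is a minimal sketch against the same travel_log_info table; the column aliases day_only, max_per_datetime and max_per_day are made up for illustration.

```sql
SELECT transID,
       travel_date,
       -- TRUNC() on a DATE with no format argument resets the time to midnight
       TRUNC(travel_date) AS day_only,
       -- the partition key is the full date/time, so the maximum is
       -- always the row's own travel_date
       MAX(travel_date) OVER (PARTITION BY travel_date) AS max_per_datetime,
       -- one partition per calendar day, shared by every transID
       MAX(travel_date) OVER (PARTITION BY TRUNC(travel_date)) AS max_per_day
FROM travel_log_info
ORDER BY transID, travel_date;
```

With the sample data, max_per_day is 2018-04-04 18:30:00 for every row on 4 April and 2018-04-05 22:30:00 for every row on 5 April, regardless of transID.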
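The partition in the outer query is also wrong, though; you need to partition by the 'earliest' date (which is really the latest, but that doesn't matter here):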
select transID,travel_date,rn,
dense_rank () over (partition by transID,EarliestDate order by travel_date) as rn2
from
(SELECT transID,travel_date,
ROW_NUMBER() OVER (PARTITION BY transID ORDER BY travel_date) AS rn,
max(travel_date) OVER (partition by trunc(travel_date)) as EarliestDate
FROM travel_log_info
) t
order by transID;
   TRANSID TRAVEL_DATE                 RN        RN2
---------- ------------------- ---------- ----------
      2341 2018-04-04 10:00:00          1          1
      2341 2018-04-04 11:30:00          2          2
      2341 2018-04-05 11:30:00          3          1
      2891 2018-04-04 12:30:00          1          1
      2891 2018-04-04 18:30:00          2          2
      2891 2018-04-05 22:30:00          3          1
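But you don't really need that maximum, or the outer query you currently have; if you include the truncated day in the row_number() partition (which you aren't really using at the moment) you get: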
SELECT transID,travel_date,
ROW_NUMBER() OVER (PARTITION BY transID, trunc(travel_date) ORDER BY travel_date) AS rn
FROM travel_log_info;
   TRANSID TRAVEL_DATE                 RN
---------- ------------------- ----------
      2341 2018-04-04 10:00:00          1
      2341 2018-04-04 11:30:00          2
      2341 2018-04-05 11:30:00          1
      2891 2018-04-04 12:30:00          1
      2891 2018-04-04 18:30:00          2
      2891 2018-04-05 22:30:00          1
You can then wrap that in an outer query to filter on rn:
SELECT transID,travel_date
FROM (
SELECT transID,travel_date,
ROW_NUMBER() OVER (PARTITION BY transID, trunc(travel_date) ORDER BY travel_date) AS rn
FROM travel_log_info
)
WHERE rn = 1
ORDER BY transID,travel_date;
   TRANSID TRAVEL_DATE
---------- -------------------
      2341 2018-04-04 10:00:00
      2341 2018-04-05 11:30:00
      2891 2018-04-04 12:30:00
      2891 2018-04-05 22:30:00
You can also do this without a subquery; this gets the same result using FIRST:
SELECT transID,
min(travel_date) keep (dense_rank first order by travel_date) as travel_date
FROM travel_log_info
GROUP BY transID, trunc(travel_date)
ORDER BY transID, travel_date;
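Because the KEEP clause here orders by the same column that is being aggregated, a plain MIN() should return the same rows. The sketch below shows that simpler form; KEEP (DENSE_RANK FIRST ...) is mainly useful when the value you want comes from a different column than the one you order by.

```sql
-- Earliest travel_date per transID per calendar day, without KEEP
SELECT transID,
       MIN(travel_date) AS travel_date
FROM travel_log_info
GROUP BY transID, TRUNC(travel_date)
-- ordering by the aggregate itself avoids any ambiguity between
-- the travel_date alias and the underlying column
ORDER BY transID, MIN(travel_date);
```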