Fetch rows with maximum length of string in another column in Postgres
I have the following table in Postgres 11:
trial_id    | name_split               | drug_name_who
------------+--------------------------+--------------------------
NCT01877395 | imovax® rabies           | imovax
NCT01877395 | imovax® rabies           | imovax rabies
NCT01877395 | imovax® rabies           | rabies
NCT01877395 | imovax® rabies           | rabies imovax
NCT00000374 | olanzapine               | olanzapine
NCT00000390 | imipramine hydrochloride | imipramine hydrochloride
NCT00000390 | imipramine hydrochloride | imipramine
For each (trial_id, name_split) pair, I would like to fetch the rows whose drug_name_who value has the maximum length.
I tried the following query:
with x as (
    SELECT distinct on (trial_id, name_split) *
    FROM table
    WHERE
        regexp_replace(name_split, '[^\w]', '#', 'g') ~* ('\y'||regexp_replace(drug_name_who, '[^\w]', '#', 'g')||'\y')
        and (length(drug_name_who) > 2)
        or (drug_name_who is null)
    ORDER BY trial_id, name_split, length(drug_name_who) DESC NULLS LAST
)
select * from x;
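The query returns the correct rows when the drug_name_who values within a trial_id have different lengths. When several of them share the maximum length, however, it returns only one row. For NCT01877395, for example, this row is missing: NCT01877395, imovax® rabies, imovax.
The expected output is: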
trial_id    | name_split               | drug_name_who
------------+--------------------------+--------------------------
NCT01877395 | imovax® rabies           | imovax
NCT01877395 | imovax® rabies           | rabies
NCT00000374 | olanzapine               | olanzapine
NCT00000390 | imipramine hydrochloride | imipramine hydrochloride
Any help here is much appreciated.
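distinct on always returns exactly one row per group. If the order by clause is not deterministic, you get an arbitrary one of the tied rows.
If you want to allow ties, you can use rank() and a subquery instead: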
select *
from (
    select
        t.*,
        rank() over(
            partition by trial_id, name_split
            order by length(drug_name_who) desc
        ) rn
    from mytable t
    where ...
) t
where rn = 1
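
To try this against the sample data from the question, here is a minimal self-contained sketch. It makes a few assumptions: the CTE name trials and the inline values list are hypothetical stand-ins for the real table, and the filter is copied from the question's query, with parentheses added only to make the existing and/or precedence explicit. It also keeps nulls last from the question, because Postgres sorts NULLs first under desc by default, so a NULL drug_name_who would otherwise take rank 1 in its group.

```sql
-- Hypothetical stand-in for the real table, filled with the rows from the question.
with trials (trial_id, name_split, drug_name_who) as (
    values
        ('NCT01877395', 'imovax® rabies', 'imovax'),
        ('NCT01877395', 'imovax® rabies', 'imovax rabies'),
        ('NCT01877395', 'imovax® rabies', 'rabies'),
        ('NCT01877395', 'imovax® rabies', 'rabies imovax'),
        ('NCT00000374', 'olanzapine', 'olanzapine'),
        ('NCT00000390', 'imipramine hydrochloride', 'imipramine hydrochloride'),
        ('NCT00000390', 'imipramine hydrochloride', 'imipramine')
),
ranked as (
    select
        t.*,
        -- rank() assigns the same rank to ties, so every row that shares
        -- the maximum length within a group gets rn = 1.
        rank() over (
            partition by trial_id, name_split
            order by length(drug_name_who) desc nulls last
        ) as rn
    from trials t
    -- Filter taken from the question; the outer parentheses only spell out
    -- how "a and b or c" is already evaluated.
    where (
            regexp_replace(name_split, '[^\w]', '#', 'g')
                ~* ('\y' || regexp_replace(drug_name_who, '[^\w]', '#', 'g') || '\y')
            and length(drug_name_who) > 2
          )
       or drug_name_who is null
)
select trial_id, name_split, drug_name_who
from ranked
where rn = 1;
```

With these rows, the filter should drop "imovax rabies" and "rabies imovax", since neither matches "imovax##rabies" once the non-word characters are replaced. That leaves "imovax" and "rabies" tied at length 6, so both are kept. The final select has no order by, so the row order is not guaranteed.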