Oracle - filtering Cartesian coordinate
I have a mating_history table:
id   cage_id  code  event_date                   animal_id
---  -------  ----  ---------------------------  ---------
100  4163     FA    03-Aug-2016 10.51.55.000 AM  3570
101  4163     MA    03-Aug-2016 10.52.13.000 AM  2053
102  4163     MR    29-Aug-2016 10.23.24.000 AM  2053
103  4163     MA    11-Oct-2016 12.50.02.000 PM  5882
104  4163     MR    31-Oct-2016 01.37.28.000 PM  5882
105  4163     MA    07-Nov-2016 01.27.58.000 PM  5882
106  4163     FA    19-Apr-2017 11.46.50.000 AM  6011
107  4163     FA    19-Apr-2017 11.48.31.000 AM  6010
Legend:
MA = Male added to cage
MR = Male removed from cage
FA = Female added to cage
FR = Female removed from cage
In the table above, the first row says that on event_date a female animal (id 3570) was added to the cage for the purpose of mating.
If you follow the history log, you get these points of "actual mating":
female_id  male_id  event_date
---------  -------  ---------------------------
3570       2053     03-Aug-2016 10.52.13.000 AM
3570       5882     11-Oct-2016 12.50.02.000 PM
3570       5882     07-Nov-2016 01.27.58.000 PM
6011       5882     19-Apr-2017 11.46.50.000 AM
6010       5882     19-Apr-2017 11.48.31.000 AM
However, when I tried to turn this idea into SQL, I did not get the result above.
SQL:
SELECT
be.cage_id, be.code AS base_code, be.animal_id AS base_animal, be.event_date AS base_date,
se.code AS sub_code, se.animal_id AS sub_animal, se.event_date AS sub_date
FROM mating_history be
LEFT JOIN mating_history se ON se.cage_id = be.cage_id
WHERE be.cage_id = 4163
AND be.code != se.code
AND be.code IN ('MA', 'FA')
AND se.code IN ('MA', 'FA')
AND be.event_date < se.event_date
ORDER BY be.event_date ASC, se.event_date ASC
Result:
cage_id  base_code  base_animal  base_date                    sub_code  sub_animal  sub_date
-------  ---------  -----------  ---------------------------  --------  ----------  ---------------------------
4163     FA         3570         03-Aug-2016 10.51.55.000 AM  MA        2053        03-Aug-2016 10.52.13.000 AM
4163     FA         3570         03-Aug-2016 10.51.55.000 AM  MA        5882        11-Oct-2016 12.50.02.000 PM
4163     FA         3570         03-Aug-2016 10.51.55.000 AM  MA        5882        07-Nov-2016 01.27.58.000 PM
4163     MA         2053         03-Aug-2016 10.52.13.000 AM  FA        6011        19-Apr-2017 11.46.50.000 AM  --------> WRONG
4163     MA         2053         03-Aug-2016 10.52.13.000 AM  FA        6010        19-Apr-2017 11.48.31.000 AM  --------> WRONG
4163     MA         5882         11-Oct-2016 12.50.02.000 PM  FA        6011        19-Apr-2017 11.46.50.000 AM  --------> WRONG
4163     MA         5882         11-Oct-2016 12.50.02.000 PM  FA        6010        19-Apr-2017 11.48.31.000 AM  --------> WRONG
4163     MA         5882         07-Nov-2016 01.27.58.000 PM  FA        6011        19-Apr-2017 11.46.50.000 AM
4163     MA         5882         07-Nov-2016 01.27.58.000 PM  FA        6010        19-Apr-2017 11.48.31.000 AM
I don't know how to get just the 5 rows I need. How can I filter the result further so that, in this case, only those 5 rows come back?
Optional: is creating a Cartesian product the best solution for what I am trying to achieve? Is there a better way?
Let's keep track of who is in the cage . . . and assume there is only one male and one female at a time. The following gets the animals in the cage at each change:
select mh.*,
       -- lag() only sees earlier rows, so each row shows the occupants just before its own event
       -- male: keep the animal only if the most recent male event was an add
       (case when 'MA' = lag(case when code in ('MA', 'MR') then code end ignore nulls) over (partition by cage_id order by event_date)
             then lag(case when code in ('MA') then animal_id end ignore nulls) over (partition by cage_id order by event_date)
        end) as male_animal,
       -- female: same test against the most recent female event
       (case when 'FA' = lag(case when code in ('FA', 'FR') then code end ignore nulls) over (partition by cage_id order by event_date)
             then lag(case when code in ('FA') then animal_id end ignore nulls) over (partition by cage_id order by event_date)
        end) as female_animal,
       lead(event_date) over (partition by cage_id order by event_date) as next_event_date
from mating_history mh;
You want the rows where both animals are present:
select mh.*
from (select mh.*,
             -- male in the cage just before this event, if the last male event was an add
             (case when 'MA' = lag(case when code in ('MA', 'MR') then code end ignore nulls) over (partition by cage_id order by event_date)
                   then lag(case when code in ('MA') then animal_id end ignore nulls) over (partition by cage_id order by event_date)
              end) as male_animal,
             -- female in the cage just before this event, if the last female event was an add
             (case when 'FA' = lag(case when code in ('FA', 'FR') then code end ignore nulls) over (partition by cage_id order by event_date)
                   then lag(case when code in ('FA') then animal_id end ignore nulls) over (partition by cage_id order by event_date)
              end) as female_animal,
             lead(event_date) over (partition by cage_id order by event_date) as next_event_date
      from mating_history mh
     ) mh
where male_animal is not null and female_animal is not null;
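The bookkeeping described above (who is in the cage at each change) can also be sketched procedurally, which makes it easy to check the expected pairs by hand. The following is only an illustration in Python, not part of the SQL solution: the function name `replay` and the `EVENTS` list are made up for this example, the rows are the sample data from the question with timestamps rewritten as ISO strings, and it tracks sets of animals rather than assuming a single male and a single female.

```python
from datetime import datetime

# Sample rows from the question for cage 4163: (code, event_date, animal_id).
# Timestamps are rewritten as ISO strings for this sketch.
EVENTS = [
    ("FA", "2016-08-03 10:51:55", 3570),
    ("MA", "2016-08-03 10:52:13", 2053),
    ("MR", "2016-08-29 10:23:24", 2053),
    ("MA", "2016-10-11 12:50:02", 5882),
    ("MR", "2016-10-31 13:37:28", 5882),
    ("MA", "2016-11-07 13:27:58", 5882),
    ("FA", "2017-04-19 11:46:50", 6011),
    ("FA", "2017-04-19 11:48:31", 6010),
]


def replay(events):
    """Replay one cage's log in time order.

    Returns a list of (female_id, male_id, event_date) tuples, one for each
    moment an animal joins a cage that already holds the other sex.
    """
    present = {"F": set(), "M": set()}  # animals currently in the cage, by sex
    pairs = []
    ordered = sorted(events, key=lambda e: datetime.fromisoformat(e[1]))
    for code, stamp, animal in ordered:
        sex, action = code[0], code[1]  # e.g. "MA" -> sex "M", action "A"
        if action == "R":
            present[sex].discard(animal)
            continue
        # An arrival pairs the newcomer with every animal of the other sex
        # that is already in the cage.
        other = "M" if sex == "F" else "F"
        for partner in sorted(present[other]):
            female, male = (animal, partner) if sex == "F" else (partner, animal)
            pairs.append((female, male, stamp))
        present[sex].add(animal)
    return pairs


for row in replay(EVENTS):
    print(*row)
```

On the sample data this yields the five pairs listed in the question, because the removal events empty the cage of males before the later females arrive.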
This might work:
Setup:
create table mating_history (
id number primary key
, cage_id number not null
, code char(2) check (code in ('FA', 'FR', 'MA', 'MR'))
, event_date timestamp not null
, animal_id number not null
);
insert into mating_history
select 100, 4163, 'FA', timestamp '2016-08-03 10:51:55', 3570 from dual union all
select 101, 4163, 'MA', timestamp '2016-08-03 10:52:13', 2053 from dual union all
select 102, 4163, 'MR', timestamp '2016-08-29 10:23:24', 2053 from dual union all
select 103, 4163, 'MA', timestamp '2016-10-11 12:50:02', 5882 from dual union all
select 104, 4163, 'MR', timestamp '2016-10-31 13:37:28', 5882 from dual union all
select 105, 4163, 'MA', timestamp '2016-11-07 13:27:58', 5882 from dual union all
select 106, 4163, 'FA', timestamp '2017-04-19 11:46:50', 6011 from dual union all
select 107, 4163, 'FA', timestamp '2017-04-19 11:48:31', 6010 from dual
;
commit;
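This design is poor in several ways. There should be small "dimension" tables for cages and for animals, and the animal table should show the sex (instead of it being encoded in the "code" column of the current table). For now, I assume the data is as you provided it and that you are not inclined to fix the data model.
Query: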
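with
  -- split code into sex ('F'/'M') and event_code ('A'/'R'); grp numbers each animal's
  -- adds and removals separately, so the n-th add lines up with the n-th removal
  grouped ( cage_id, sex, event_code, event_date, animal_id, grp ) as (
    select cage_id, substr(code, 1, 1), substr(code, 2),
           event_date, animal_id,
           row_number() over (partition by animal_id, code order by event_date)
    from mating_history
  ),
  -- one row per stay in the cage: a = date added, r = date removed (null if still there)
  pivoted as (
    select *
    from grouped
    pivot ( max(event_date) for event_code in ('A' as a, 'R' as r) )
  )
-- pair every female stay with every male stay that overlaps it in the same cage;
-- the mating starts when the later of the two animals arrives
select f.animal_id as female_id,
       m.animal_id as male_id,
       greatest(f.a, m.a) as event_date
from ( select * from pivoted where sex = 'F' ) f
     join
     ( select * from pivoted where sex = 'M' ) m
     on f.cage_id = m.cage_id
     and ( f.r >= m.a or f.r is null )
     and ( m.r >= f.a or m.r is null )
order by event_date, female_id, male_id
;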
Output: (the event_date column uses my current NLS_TIMESTAMP_FORMAT)
FEMALE_ID MALE_ID EVENT_DATE
--------- ------- ------------------------------
     3570    2053 03-AUG-2016 10.52.13.000000000
     3570    5882 11-OCT-2016 12.50.02.000000000
     3570    5882 07-NOV-2016 13.27.58.000000000
     6011    5882 19-APR-2017 11.46.50.000000000
     6010    5882 19-APR-2017 11.48.31.000000000
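For anyone who wants to try the overlap idea without an Oracle instance, here is a rough, self-contained sketch that runs a similar query through Python's built-in sqlite3 module. Several things in it are assumptions made for the example and not part of the answer above: SQLite stands in for Oracle, timestamps are stored as ISO-8601 text (which sorts chronologically), the CTE name `stays` and its columns `added` and `removed` are invented here, and a correlated subquery replaces the PIVOT step. SQLite's two-argument `max()` plays the role of Oracle's `greatest()`.

```python
import sqlite3

# Sample rows from the question: (id, cage_id, code, event_date, animal_id).
ROWS = [
    (100, 4163, "FA", "2016-08-03 10:51:55", 3570),
    (101, 4163, "MA", "2016-08-03 10:52:13", 2053),
    (102, 4163, "MR", "2016-08-29 10:23:24", 2053),
    (103, 4163, "MA", "2016-10-11 12:50:02", 5882),
    (104, 4163, "MR", "2016-10-31 13:37:28", 5882),
    (105, 4163, "MA", "2016-11-07 13:27:58", 5882),
    (106, 4163, "FA", "2017-04-19 11:46:50", 6011),
    (107, 4163, "FA", "2017-04-19 11:48:31", 6010),
]

# Each "added" event opens a stay; the stay ends at the same animal's next
# "removed" event in that cage, or stays open (NULL) if there is none.
# Two stays of opposite sex in one cage form a pair when they overlap.
QUERY = """
with stays as (
    select a.cage_id,
           substr(a.code, 1, 1) as sex,
           a.animal_id,
           a.event_date as added,
           ( select min(r.event_date)
               from mating_history r
              where r.cage_id    = a.cage_id
                and r.animal_id  = a.animal_id
                and r.code       = substr(a.code, 1, 1) || 'R'
                and r.event_date > a.event_date ) as removed
      from mating_history a
     where substr(a.code, 2, 1) = 'A'
)
select f.animal_id           as female_id,
       m.animal_id           as male_id,
       max(f.added, m.added) as event_date
  from stays f
  join stays m
    on m.cage_id = f.cage_id
 where f.sex = 'F'
   and m.sex = 'M'
   and (f.removed is null or f.removed >= m.added)
   and (m.removed is null or m.removed >= f.added)
 order by event_date, female_id, male_id
"""


def actual_matings(rows):
    """Load rows into an in-memory SQLite table and return the overlapping pairs."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table mating_history ("
        " id integer primary key,"
        " cage_id integer not null,"
        " code text not null,"
        " event_date text not null,"
        " animal_id integer not null)"
    )
    con.executemany("insert into mating_history values (?, ?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


for female_id, male_id, event_date in actual_matings(ROWS):
    print(female_id, male_id, event_date)
```

The two overlap conditions are the same as in the Oracle query: a pair is kept only if neither animal had already left when the other one arrived.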