Count total distinct values across multiple columns in Oracle
I have the following table, and I want to count the distinct values across two columns.
ID_DATE DESCRIPT1 DESCRIPT2
20191001 A R
20191001 D B
20191001 B D
20191001 A B
20191002 A B
20191002 C A
20191002 A B
Below is my query, but the result is not accurate:
SELECT
COUNT(distinct DESCRIPT1 || ' - ' || DESCRIPT2) AS ALL_DESCRIPT,
COUNT(DISTINCT DESCRIPT1) AS DESCRIPT_A,
COUNT(DISTINCT DESCRIPT2) AS DESCRIPT_B,
ID_DATE FROM MY_TABLE GROUP BY ID_DATE;
My result:
ALL_DESCRIPT DESCRIPT_A DESCRIPT_B ID_DATE
           4          3          3 20191001
           2          2          2 20191002
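In my result, the ALL_DESCRIPT column for ID_DATE 20191002 gives a total of 2 instead of 3. It should be 3, because I have A, B, and C, three values in total, across the DESCRIPT1 and DESCRIPT2 columns.
What am I doing wrong?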
Below is the insert statement, tested in Oracle, in case it is needed.
INSERT all
INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191001','A','R')
INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191001','D','B')
INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191001','B','D')
INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191001','A','B')
INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191002','A','B')
INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191002','C','A')
INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191002','A','B')
SELECT * FROM dual;
I can't see the image, but from your description it looks like you want this:
SQL> select id_date, count(distinct descript) cnt
     from (select id_date, descript1 descript from src_data
           union all
           select id_date, descript2 descript from src_data
          )
     group by id_date
     order by id_date;

ID_DATE         CNT
-------- ----------
20191001          4
20191002          3
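To see why the original query returns 2 for 20191002, it may help to list what each expression actually counts. This is only a sketch against the SRC_DATA sample rows from the question; the alias pair is a name made up here for illustration. The concatenation counts distinct (DESCRIPT1, DESCRIPT2) combinations per row, whereas stacking the two columns counts distinct single values.

```sql
-- What COUNT(DISTINCT descript1 || ' - ' || descript2) counts:
-- distinct row-wise combinations, not distinct values.
select distinct id_date, descript1 || ' - ' || descript2 as pair
from src_data
order by id_date, pair;
-- For 20191002 this lists only 'A - B' and 'C - A', hence the count of 2.

-- What the stacked (unioned) column contains:
-- every value from either column, de-duplicated per id_date.
select id_date, descript1 as descript from src_data
union
select id_date, descript2 from src_data
order by id_date, descript;
-- For 20191002 this lists A, B and C, hence the expected count of 3.
```

With the sample data both approaches happen to return 4 for 20191001 (four distinct pairs, and four distinct values A, B, D, R), which is why the problem only shows up on the second date.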
If you add a column that shows the source (what in my example), then you get
SQL> select id_date,
            count(distinct descript) cnt,
            count(distinct case when what = 'A' then descript end) descript_a,
            count(distinct case when what = 'B' then descript end) descript_b
     from (select 'A' what, id_date, descript1 descript from src_data
           union all
           select 'B' what, id_date, descript2 descript from src_data
          )
     group by id_date
     order by id_date;

ID_DATE         CNT DESCRIPT_A DESCRIPT_B
-------- ---------- ---------- ----------
20191001          4          3          3
20191002          3          2          2
Adding to Littlefoot's query to give the other columns; this is a pivot operation, it seems:
select
id_date,
count(distinct descript) all_descript,
count(case when descript = 'A' then 1 end) as descript_a,
count(case when descript = 'B' then 1 end) as descript_B
from
(
select id_date, descript1 descript
from src_data
union all
select id_date, descript2 descript
from src_data
) x
group by id_date
order by id_date;
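You can add more columns for different letters by following the same pattern: put another letter in the string literal and give the column a different name. It works because the CASE expression returns a non-null value (1) when the data is, for example, A, and null when the data is not A. COUNT only counts non-null values. Using SUM(CASE WHEN descript = 'A' THEN 1 ELSE 0 END) might make more sense to you; the effect is the same.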
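As a small illustration of that equivalence, the two forms can be put side by side on the same stacked subquery. This is a sketch using the SRC_DATA sample rows; the column aliases a_via_count and a_via_sum are made up here for illustration. Note that both expressions count occurrences of the letter A, not distinct values.

```sql
select id_date,
       -- CASE yields 1 for 'A' and NULL otherwise; COUNT skips the NULLs
       count(case when descript = 'A' then 1 end)      as a_via_count,
       -- CASE yields 1 for 'A' and 0 otherwise; SUM adds them up
       sum(case when descript = 'A' then 1 else 0 end) as a_via_sum
from (
       select id_date, descript1 as descript from src_data
       union all
       select id_date, descript2 from src_data
     ) x
group by id_date
order by id_date;
-- With the sample rows both columns give 2 for 20191001 and 3 for 20191002.
```

One difference worth keeping in mind: if a column were added for a letter that never occurs, COUNT would still return 0, and so would this SUM because of the ELSE 0 branch; without ELSE 0, SUM would return NULL for that group.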
Edit: actually, I think I misunderstood the request. Try this:
select
id_date,
count(distinct descript) all_descript,
count(distinct descript1) as descript_a,
count(distinct descript2) as descript_B
from
(
select id_date, descript1 descript, descript1, descript2
from src_data
union all
select id_date, descript2 descript, null, null
from src_data
) x
group by id_date
order by id_date
When you run into an aggregation problem, you can always write separate aggregation queries and then join them. In your case, that could be:
select t1.all_descript, t2.descript_a, t2.descript_b, id_date
from -- this subquery gets you the overall distinct count
(
select id_date, count(*) as all_descript
from
(
select id_date, descript1 from my_table
union
select id_date, descript2 from my_table
)
group by id_date
) t1
join -- this subquery gets you the separate distinct counts
(
select
id_date,
count(distinct descript1) as descript_a,
count(distinct descript2) as descript_b
from my_table
group by id_date
) t2 using (id_date)
order by id_date;
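This should solve your query. I am just using a WITH clause as a temporary table to hold the different columns, and then selecting from it and counting them in different ways.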
WITH b AS (
  SELECT id_date, DESCRIPT1 col1, descript1, descript2 FROM SRC_DATA
  UNION
  SELECT id_date, DESCRIPT2 col1, descript1, descript2 FROM SRC_DATA
)
SELECT id_date,
       count(DISTINCT col1) col1,
       count(DISTINCT descript1) descript1,
       count(DISTINCT descript2) descript2
FROM b
GROUP BY id_date