Count total distinct values existing in multiple columns in Oracle

I have the following table, and I want to count the distinct values across two columns.

ID_DATE     DESCRIPT1   DESCRIPT2
20191001    A           R
20191001    D           B
20191001    B           D
20191001    A           B
20191002    A           B
20191002    C           A
20191002    A           B

Here is my query, but the result is not accurate:

SELECT  
COUNT(distinct DESCRIPT1 || ' - ' ||  DESCRIPT2) AS ALL_DESCRIPT,
COUNT(DISTINCT DESCRIPT1) AS DESCRIPT_A, 
COUNT(DISTINCT DESCRIPT2) AS DESCRIPT_B, 
ID_DATE FROM MY_TABLE  GROUP BY ID_DATE;

My result:

ALL_DESCRIPT    DESCRIPT_A  DESCRIPT_B  ID_DATE
4               3           3           20191001
2               2           2           20191002

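In my result, the ALL_DESCRIPT column for ID_DATE 20191002 gives a total of 2 instead of 3. It should be 3, because columns DESCRIPT1 and DESCRIPT2 together contain A, B, and C, which is 3 distinct values. What am I doing wrong?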

Here are the insert statements for testing in Oracle, in case they are needed.

   INSERT all 
   INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191001','A','R')
   INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191001','D','B')
   INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191001','B','D')
   INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191001','A','B')
   INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191002','A','B')
   INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191002','C','A')
   INTO SRC_DATA (ID_DATE, DESCRIPT1, DESCRIPT2) VALUES ('20191002','A','B')
   SELECT * FROM dual;

I can't see the picture, but from your description, it looks like you want this:

SQL> select id_date, count(distinct descript) cnt
  2  from (select id_date, descript1 descript from src_data
  3        union all
  4        select id_date, descript2 descript from src_data
  5       )
  6  group by id_date
  7  order by id_date;

ID_DATE         CNT
-------- ----------
20191001          4
20191002          3

SQL>
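
To see why the original query returned 2 for 20191002, it may help to list the concatenated strings it was counting. This is only a diagnostic sketch against the SRC_DATA table from the question; the column alias pair is made up for illustration.

```sql
-- COUNT(DISTINCT descript1 || ' - ' || descript2) counts distinct PAIRS,
-- not distinct single values. Listing the pairs makes that visible.
select distinct
       id_date,
       descript1 || ' - ' || descript2 as pair  -- "pair" is an illustrative alias
from src_data
order by id_date, pair;
```

For 20191002 the only distinct pairs are 'A - B' and 'C - A', so the count is 2 even though three different letters (A, B, C) occur. Stacking both columns into one with UNION ALL, as above, is what turns the problem into a count of single values.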

If you add a column that shows the source (what in my example), then you get:

SQL> select id_date,
  2    count(distinct descript) cnt,
  3    count(distinct case when what = 'A' then descript end) descript_a,
  4    count(distinct case when what = 'B' then descript end) descript_b
  5  from (select 'A' what, id_date, descript1 descript from src_data
  6        union all
  7        select 'B' what, id_date, descript2 descript from src_data
  8       )
  9  group by id_date
 10  order by id_date;

ID_DATE         CNT DESCRIPT_A DESCRIPT_B
-------- ---------- ---------- ----------
20191001          4          3          3
20191002          3          2          2

SQL>
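
As a possible alternative to the UNION ALL subquery, Oracle's UNPIVOT clause (available from 11g onward) can stack the two columns in a single pass. This is a sketch of a different technique, not the query used above; it assumes DESCRIPT1 and DESCRIPT2 have the same data type, and it reuses the labels 'A' and 'B' for the source column.

```sql
-- UNPIVOT turns each source row into one row per listed column.
-- "what" holds the label ('A' or 'B'), "descript" holds the value.
-- By default UNPIVOT excludes rows whose value is NULL.
select id_date,
       count(distinct descript) as cnt,
       count(distinct case when what = 'A' then descript end) as descript_a,
       count(distinct case when what = 'B' then descript end) as descript_b
from src_data
unpivot (descript for what in (descript1 as 'A', descript2 as 'B'))
group by id_date
order by id_date;
```

On the sample data this should give the same counts as the UNION ALL version, since COUNT(DISTINCT ...) ignores NULLs either way.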

Building on Littlefoot's query to supply the other columns; this looks like a pivot operation:

select 
  id_date, 
  count(distinct descript) all_descript,
  count(case when descript = 'A' then 1 end) as descript_a,
  count(case when descript = 'B' then 1 end) as descript_B
from 
(
  select id_date, descript1 descript
  from src_data
  union all
  select id_date, descript2 descript 
  from src_data
) x
group by id_date
order by id_date;

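You can add more columns for other letters by following the same pattern: put another letter in the string and give the column a different name. It works because the CASE expression returns a non-null value (1) when the data is, for example, A, and null when the data is not A. COUNT only counts non-null values. Using SUM(CASE WHEN descript = 'A' THEN 1 ELSE 0 END) might make more sense to you; the effect is the same.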
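
To make the equivalence concrete, here is a small sketch that puts the COUNT form and the SUM form side by side for the letter A. The aliases a_by_count and a_by_sum are made up for illustration; the table is SRC_DATA from the question.

```sql
select id_date,
       -- CASE yields 1 for 'A' and NULL otherwise; COUNT skips the NULLs
       count(case when descript = 'A' then 1 end)          as a_by_count,
       -- CASE yields 1 for 'A' and 0 otherwise; SUM adds them up
       sum(case when descript = 'A' then 1 else 0 end)     as a_by_sum
from (
  select id_date, descript1 descript from src_data
  union all
  select id_date, descript2 descript from src_data
) x
group by id_date
order by id_date;
```

With the sample rows, both columns come out as 2 for 20191001 and 3 for 20191002. Note that these are occurrence counts of the letter A, not distinct counts per source column.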


Edit: Actually, I think I misunderstood the request. Try this:

    select 
      id_date, 
      count(distinct descript) all_descript,
      count(distinct descript1) as descript_a,
      count(distinct descript2) as descript_B
    from 
    (
      select id_date, descript1 descript, descript1, descript2
      from src_data
      union all
      select id_date, descript2 descript, null, null
      from src_data
    ) x
    group by id_date
    order by id_date

When you run into aggregation problems, you can always write separate aggregate queries and then join them. In your case, that could be:

select t1.all_descript, t2.descript_a, t2.descript_b, id_date
from -- this subquery gets you the overall distinct count
(
  select id_date, count(*) as all_descript
  from 
  (
    select id_date, descript1 from my_table
    union
    select id_date, descript2 from my_table
  )
  group by id_date
) t1
join -- this subquery gets you the separate distinct counts
(
  select
    id_date,
    count(distinct descript1) as descript_a,
    count(distinct descript2) as descript_b
  from my_table
  group by id_date
) t2 using (id_date)
order by id_date;
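
To try the join approach without creating a table, the sample rows from the question can be inlined in a WITH clause. The sketch below is an adaptation, not the exact query above: it names the inline data src_data (as in the question's insert statements), and it uses count(descript) instead of count(*) so that a NULL value, if one ever appears, is not counted as a distinct value. The column list after the WITH name assumes Oracle 11gR2 or later.

```sql
-- Sample rows from the question, inlined so that no table is required.
with src_data (id_date, descript1, descript2) as (
  select '20191001', 'A', 'R' from dual union all
  select '20191001', 'D', 'B' from dual union all
  select '20191001', 'B', 'D' from dual union all
  select '20191001', 'A', 'B' from dual union all
  select '20191002', 'A', 'B' from dual union all
  select '20191002', 'C', 'A' from dual union all
  select '20191002', 'A', 'B' from dual
)
select id_date, t1.all_descript, t2.descript_a, t2.descript_b
from (
  -- UNION removes duplicates, so each (id_date, value) appears once
  select id_date, count(descript) as all_descript
  from (
    select id_date, descript1 as descript from src_data
    union
    select id_date, descript2 from src_data
  )
  group by id_date
) t1
join (
  -- per-column distinct counts, computed independently
  select id_date,
         count(distinct descript1) as descript_a,
         count(distinct descript2) as descript_b
  from src_data
  group by id_date
) t2 using (id_date)
order by id_date;
```

On these rows the result is 4, 3, 3 for 20191001 and 3, 2, 2 for 20191002.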

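This should solve your problem. I just use a WITH clause as a temporary named result set that holds the stacked values alongside the original columns, then count each of them distinctly.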

WITH b AS (
  SELECT id_date, descript1 AS col1, descript1, descript2 FROM src_data
  UNION
  SELECT id_date, descript2 AS col1, descript1, descript2 FROM src_data
)
SELECT id_date,
       COUNT(DISTINCT col1)      AS col1,
       COUNT(DISTINCT descript1) AS descript1,
       COUNT(DISTINCT descript2) AS descript2
FROM b
GROUP BY id_date;