Count distinct values for the first 'group by' field when grouping by two fields
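I have a list of documents created by users, with a date, a week number, and some description fields. I need to count how many documents were created each week and how many users were involved. With a query like the one below I can easily count the documents created. Unfortunately, the number of distinct users is inflated, because a single user may create documents in several categories, so some users are counted many times.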
sel WEEK
, DESCRIPTION
, count(*) as DOC_CNT
,count(distinct USER_ID) as USER_CNT
from THE_TABLE_1
group by 1,2;
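I would like to count the distinct users for each week, regardless of the DESCRIPTION field. Do you know an elegant way to get this?
If it matters, I am using the Teradata engine.
Thanks
Edit:
The input looks like this: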
DOC_NO USER_ID DT WEEK day_of_week DESCRIPTION
1 0019071988 AC_N490314 10/03/2020 10 3 Maintain Business Partner
2 0018864387 AC_SSELVAM1 03/03/2020 9 3 Customer Change
3 0018840898 AC_RHARIHAR1 02/03/2020 9 2 Change Asset
4 0018883336 AC_AGNANA1 03/03/2020 9 3 Create Asset
5 0017743110 AC_DKUPPUSA1 03/02/2020 5 2 Change Bank
6 0017946108 AC_SMADHESH 07/02/2020 5 6 Create Supplier
7 0019573163 AC_SJAYACHA1 26/03/2020 12 5 Select Idocs
8 0017660339 AC_SSELVAM1 31/01/2020 4 6 Create material
9 0018324802 AC_DKUPPUSA1 18/02/2020 7 3 VIM Workplace
10 0019161678 AC_N478361 14/03/2020 10 7 Release Blocked Invoices
The output should look like the one below, although the USER_CNT values here are calculated incorrectly.
WEEK DESCRIPTION DOC_CNT USER_CNT
1 10 Reset Cleared Items 229 13
2 3 Maintain Business Partner 600 10
3 4 Data Capture/Indexing Invoice 4,974 31
4 7 Other 1,207 54
5 9 Check VIM Business Rules 2,132 23
6 6 Check VIM Business Rules 2,863 29
7 5 Other 1,096 52
8 12 Check VIM Business Rules 1,390 19
9 4 Check VIM Business Rules 2,710 27
10 4 Other 1,462 56
I think the problem here is that your current query groups the rows by the description field as well as by week:
COUNT(*) AS doc_cnt, -- # rows per week / description
COUNT(DISTINCT user_id) AS user_cnt -- # distinct users per week / description
As I understand it, you want the number of documents and the number of distinct users per week, regardless of description. Assuming one row per document, you could do something like this:
SELECT
    src.week,
    MAX(src.docs_per_week) AS docs_per_week,
    COUNT(*) AS users_per_week
FROM (
    SELECT
        week,
        COUNT(*) OVER(PARTITION BY week) AS docs_per_week -- # rows per week
    FROM the_table_1
    -- Keep one row per user per week; ROW_NUMBER() needs an ORDER BY in Teradata
    QUALIFY ROW_NUMBER() OVER(PARTITION BY week, user_id ORDER BY doc_no) = 1
) src
GROUP BY src.week; -- One row per week
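If you also need to keep one row per week and description, as in the expected output from the question, a possible alternative is to compute the weekly distinct-user count in a separate derived table and join it back. This is only an untested sketch: it assumes the table and column names shown in the question, and the alias `w` is introduced here purely for illustration.

```sql
SELECT
    t.week,
    t.description,
    COUNT(*) AS doc_cnt,   -- # rows per week / description
    w.user_cnt             -- # distinct users per week, ignoring description
FROM the_table_1 t
INNER JOIN (
    -- One row per week with its distinct-user count
    SELECT
        week,
        COUNT(DISTINCT user_id) AS user_cnt
    FROM the_table_1
    GROUP BY week
) w
    ON w.week = t.week
GROUP BY t.week, t.description, w.user_cnt;
```

With this shape, every description row within a given week carries the same user_cnt, so the value should be read as a per-week figure rather than summed across descriptions.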