SQL - 创建具有另一列的非重复计数的列
SQL - Create column with distinct count of another column
我正在尝试向我的 SQl table 添加 3 列。
第 1 列:Total_Hours_Overall
我希望此列汇总每个 sequence_ID
的总小时数
第 2 列:Total_No_Codes
我希望此列是 'Client' 列中不同值的计数。
第 3 列:客户总数
我希望此列是 'Client' 列中不同值的计数,其中 'Job_Type' 列等于 'Client'
以下是我目前拥有的 table 的片段。有 800 多个唯一 sequence_ID,每个 ID 有多行。我希望上面的计算列显示每一行的值。
Sequence_ID |Date_European |Hours |Month |Day |Year |Day_of_Week |Client_Number |Client |Job_No |Job_Type
1001 |01/09/2017 |7.3 |9 |1 |2017 |Friday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE
1001 |04/09/2017 |7.3 |9 |4 |2017 |Monday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE
1001 |09/08/2017 |2 |8 |9 |2017 |Wednesday |1 |Admin |A1 |Non-Billable
1001 |24/08/2017 |1.3 |8 |24 |2017 |Thursday |2 |Client1 |A2 |Client
1001 |28/08/2017 |2.3 |8 |28 |2017 |Monday |2 |Client1 |A2 |Client
1001 |16/08/2017 |0.5 |8 |16 |2017 |Wednesday |3 |Client2 |A3 |Client
1001 |16/08/2017 |1 |8 |16 |2017 |Wednesday |2 |Client1 |A2 |Client
1001 |18/08/2017 |3 |8 |18 |2017 |Friday |3 |Client2 |A3 |Client
1001 |22/08/2017 |0.7 |8 |22 |2017 |Tuesday |4 |Client3 |A4 |Client
1001 |16/08/2017 |7.3 |8 |16 |2017 |Wednesday |5 |Client4 |A5 |Client
1001 |18/08/2017 |1.3 |8 |18 |2017 |Friday |5 |Client4 |A5 |Client
1001 |21/08/2017 |1 |8 |21 |2017 |Monday |5 |Client4 |A5 |Client
1001 |12/09/2017 |0.6 |9 |12 |2017 |Tuesday |5 |Client4 |A5 |Client
1002 |01/09/2017 |7.3 |9 |1 |2017 |Friday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE
1002 |04/09/2017 |7.3 |9 |4 |2017 |Monday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE
1002 |09/08/2017 |2 |8 |9 |2017 |Wednesday |1 |Admin |A1 |Non-Billable
1002 |24/08/2017 |1.3 |8 |24 |2017 |Thursday |4 |Client3 |A4 |Client
1002 |28/08/2017 |2.3 |8 |28 |2017 |Monday |5 |Client4 |A5 |Client
下面是我希望 table 的样子。
Sequence_ID |Date_European |Hours |Month |Day |Year |Day_of_Week |Client_Number |Client |Job_No |Job_Type |Total_Hours_Overall |Total_No_Codes |Total Clients
1001 |01/09/2017 |7.3 |9 |1 |2017 |Friday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE |35.6 |6 |4
1001 |04/09/2017 |7.3 |9 |4 |2017 |Monday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE |35.6 |6 |4
1001 |09/08/2017 |2 |8 |9 |2017 |Wednesday |1 |Admin |A1 |Non-Billable |35.6 |6 |4
1001 |24/08/2017 |1.3 |8 |24 |2017 |Thursday |2 |Client1 |A2 |Client |35.6 |6 |4
1001 |28/08/2017 |2.3 |8 |28 |2017 |Monday |2 |Client1 |A2 |Client |35.6 |6 |4
1001 |16/08/2017 |0.5 |8 |16 |2017 |Wednesday |3 |Client2 |A3 |Client |35.6 |6 |4
1001 |16/08/2017 |1 |8 |16 |2017 |Wednesday |2 |Client1 |A2 |Client |35.6 |6 |4
1001 |18/08/2017 |3 |8 |18 |2017 |Friday |3 |Client2 |A3 |Client |35.6 |6 |4
1001 |22/08/2017 |0.7 |8 |22 |2017 |Tuesday |4 |Client3 |A4 |Client |35.6 |6 |4
1001 |16/08/2017 |7.3 |8 |16 |2017 |Wednesday |5 |Client4 |A5 |Client |35.6 |6 |4
1001 |18/08/2017 |1.3 |8 |18 |2017 |Friday |5 |Client4 |A5 |Client |35.6 |6 |4
1001 |21/08/2017 |1 |8 |21 |2017 |Monday |5 |Client4 |A5 |Client |35.6 |6 |4
1001 |12/09/2017 |0.6 |9 |12 |2017 |Tuesday |5 |Client4 |A5 |Client |35.6 |6 |4
1002 |01/09/2017 |7.3 |9 |1 |2017 |Friday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE |20.2 |2 |2
1002 |04/09/2017 |7.3 |9 |4 |2017 |Monday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE |20.2 |2 |2
1002 |09/08/2017 |2 |8 |9 |2017 |Wednesday |1 |Admin |A1 |Non-Billable |20.2 |2 |2
1002 |24/08/2017 |1.3 |8 |24 |2017 |Thursday |4 |Client3 |A4 |Client |20.2 |2 |2
1002 |28/08/2017 |2.3 |8 |28 |2017 |Monday |5 |Client4 |A5 |Client |20.2 |2 |2
我已经尝试(并失败)了很多方法。我认为需要使用子查询,但我似乎无法获得正确的格式。
另一个我很难理解以产生我需要的格式的问题是计数函数,因为我知道这需要一个 group by 子句,但我想保留我的 table 中的所有行目前。
我想我对整个事情想得太多了,所以如果您能提供帮助,我们将不胜感激。
提前致谢
您可以使用 window 个函数:
select t.*, sum(hours) over (partition by sequence_id) as sum_hours,
max(codes_seqnum) over (partition by sequence_id) as num_codes,
max(clients_seqnum) over (partition by sequence_id) as num_clients
from (select t.*,
dense_rank() over (partition by sequence_id order by client) as codes_seqnum,
dense_rank() over (partition by sequence_id, job_type order by (case when job_type = 'Client' then client end)) as client_seqnum
from t
) t;
COUNT(DISTINCT)
在 SQL 服务器中使用 window 函数实际上很棘手。上面基本上可以,但是如果没有客户端就不会return 0
。在结果中只 JOIN
可能会更好:
select t.*, tt.sum_hours, tt.num_codes, tt.num_clients
from t join
(select sequence_id, sum(hours) as sum_hours,
count(distinct client) as num_codes,
count(distinct case when job_type = 'Client' then client end) as num_clients
from t
group by sequence_id
) tt
on tt.sequence_id = t.sequence_id;
我正在尝试向我的 SQl table 添加 3 列。
第 1 列:Total_Hours_Overall
我希望此列汇总每个 sequence_ID
的总小时数第 2 列:Total_No_Codes
我希望此列是 'Client' 列中不同值的计数。
第 3 列:客户总数
我希望此列是 'Client' 列中不同值的计数,其中 'Job_Type' 列等于 'Client'
以下是我目前拥有的 table 的片段。有 800 多个唯一 sequence_ID,每个 ID 有多行。我希望上面的计算列显示每一行的值。
Sequence_ID |Date_European |Hours |Month |Day |Year |Day_of_Week |Client_Number |Client |Job_No |Job_Type
1001 |01/09/2017 |7.3 |9 |1 |2017 |Friday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE
1001 |04/09/2017 |7.3 |9 |4 |2017 |Monday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE
1001 |09/08/2017 |2 |8 |9 |2017 |Wednesday |1 |Admin |A1 |Non-Billable
1001 |24/08/2017 |1.3 |8 |24 |2017 |Thursday |2 |Client1 |A2 |Client
1001 |28/08/2017 |2.3 |8 |28 |2017 |Monday |2 |Client1 |A2 |Client
1001 |16/08/2017 |0.5 |8 |16 |2017 |Wednesday |3 |Client2 |A3 |Client
1001 |16/08/2017 |1 |8 |16 |2017 |Wednesday |2 |Client1 |A2 |Client
1001 |18/08/2017 |3 |8 |18 |2017 |Friday |3 |Client2 |A3 |Client
1001 |22/08/2017 |0.7 |8 |22 |2017 |Tuesday |4 |Client3 |A4 |Client
1001 |16/08/2017 |7.3 |8 |16 |2017 |Wednesday |5 |Client4 |A5 |Client
1001 |18/08/2017 |1.3 |8 |18 |2017 |Friday |5 |Client4 |A5 |Client
1001 |21/08/2017 |1 |8 |21 |2017 |Monday |5 |Client4 |A5 |Client
1001 |12/09/2017 |0.6 |9 |12 |2017 |Tuesday |5 |Client4 |A5 |Client
1002 |01/09/2017 |7.3 |9 |1 |2017 |Friday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE
1002 |04/09/2017 |7.3 |9 |4 |2017 |Monday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE
1002 |09/08/2017 |2 |8 |9 |2017 |Wednesday |1 |Admin |A1 |Non-Billable
1002 |24/08/2017 |1.3 |8 |24 |2017 |Thursday |4 |Client3 |A4 |Client
1002 |28/08/2017 |2.3 |8 |28 |2017 |Monday |5 |Client4 |A5 |Client
下面是我希望 table 的样子。
Sequence_ID |Date_European |Hours |Month |Day |Year |Day_of_Week |Client_Number |Client |Job_No |Job_Type |Total_Hours_Overall |Total_No_Codes |Total Clients
1001 |01/09/2017 |7.3 |9 |1 |2017 |Friday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE |35.6 |6 |4
1001 |04/09/2017 |7.3 |9 |4 |2017 |Monday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE |35.6 |6 |4
1001 |09/08/2017 |2 |8 |9 |2017 |Wednesday |1 |Admin |A1 |Non-Billable |35.6 |6 |4
1001 |24/08/2017 |1.3 |8 |24 |2017 |Thursday |2 |Client1 |A2 |Client |35.6 |6 |4
1001 |28/08/2017 |2.3 |8 |28 |2017 |Monday |2 |Client1 |A2 |Client |35.6 |6 |4
1001 |16/08/2017 |0.5 |8 |16 |2017 |Wednesday |3 |Client2 |A3 |Client |35.6 |6 |4
1001 |16/08/2017 |1 |8 |16 |2017 |Wednesday |2 |Client1 |A2 |Client |35.6 |6 |4
1001 |18/08/2017 |3 |8 |18 |2017 |Friday |3 |Client2 |A3 |Client |35.6 |6 |4
1001 |22/08/2017 |0.7 |8 |22 |2017 |Tuesday |4 |Client3 |A4 |Client |35.6 |6 |4
1001 |16/08/2017 |7.3 |8 |16 |2017 |Wednesday |5 |Client4 |A5 |Client |35.6 |6 |4
1001 |18/08/2017 |1.3 |8 |18 |2017 |Friday |5 |Client4 |A5 |Client |35.6 |6 |4
1001 |21/08/2017 |1 |8 |21 |2017 |Monday |5 |Client4 |A5 |Client |35.6 |6 |4
1001 |12/09/2017 |0.6 |9 |12 |2017 |Tuesday |5 |Client4 |A5 |Client |35.6 |6 |4
1002 |01/09/2017 |7.3 |9 |1 |2017 |Friday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE |20.2 |2 |2
1002 |04/09/2017 |7.3 |9 |4 |2017 |Monday |0 |ANNUAL LEAVE |0 |ANNUAL LEAVE |20.2 |2 |2
1002 |09/08/2017 |2 |8 |9 |2017 |Wednesday |1 |Admin |A1 |Non-Billable |20.2 |2 |2
1002 |24/08/2017 |1.3 |8 |24 |2017 |Thursday |4 |Client3 |A4 |Client |20.2 |2 |2
1002 |28/08/2017 |2.3 |8 |28 |2017 |Monday |5 |Client4 |A5 |Client |20.2 |2 |2
我已经尝试(并失败)了很多方法。我认为需要使用子查询,但我似乎无法获得正确的格式。
另一个我很难理解以产生我需要的格式的问题是计数函数,因为我知道这需要一个 group by 子句,但我想保留我的 table 中的所有行目前。
我想我对整个事情想得太多了,所以如果您能提供帮助,我们将不胜感激。 提前致谢
您可以使用 window 个函数:
select t.*, sum(hours) over (partition by sequence_id) as sum_hours,
max(codes_seqnum) over (partition by sequence_id) as num_codes,
max(clients_seqnum) over (partition by sequence_id) as num_clients
from (select t.*,
dense_rank() over (partition by sequence_id order by client) as codes_seqnum,
dense_rank() over (partition by sequence_id, job_type order by (case when job_type = 'Client' then client end)) as client_seqnum
from t
) t;
COUNT(DISTINCT)
在 SQL 服务器中使用 window 函数实际上很棘手。上面基本上可以,但是如果没有客户端就不会return 0
。在结果中只 JOIN
可能会更好:
select t.*, tt.sum_hours, tt.num_codes, tt.num_clients
from t join
(select sequence_id, sum(hours) as sum_hours,
count(distinct client) as num_codes,
count(distinct case when job_type = 'Client' then client end) as num_clients
from t
group by sequence_id
) tt
on tt.sequence_id = t.sequence_id;