Presto SQL - Count 1 for each row after a condition and -1 for each row before

I have data like this:

date         group   state value
2018-01-01   A       A        20
2018-01-02   A       A        0
2018-01-03   A       A        0
2018-01-04   A       B        70
2018-01-05   A       B        0
2018-01-06   A       B        80

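I want to count rows relative to the move from state A to state B: the first date in state B is 0, the day after is 1, and so on, adding 1 for each date. I also want the dates before the transition to count down by -1 each. I want to do this per value of the group column, so that each group gets its own separate count.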

This would be the output:

date         group   state value  count
2018-01-01   A       A        20     -3
2018-01-02   A       A         0     -2
2018-01-03   A       A         0     -1
2018-01-04   A       B        70      0
2018-01-05   A       B         0      1
2018-01-06   A       B        80      2

I tried something like this:

SELECT ROW_NUMBER() OVER (PARTITION BY group, state ORDER BY date)

But I ended up with a column of 1s.

You can use SUM as a window function over a CASE expression.

This sqlfiddle uses SQL Server, but PrestoDB supports window functions too; just replace DATEDIFF with the date_diff function.

CREATE TABLE T(
    date DATE,
    [group] VARCHAR(50),
    state VARCHAR(50),
    value INT
);

INSERT INTO T VALUES ('2018-01-01','A','A' ,20);
INSERT INTO T VALUES ('2018-01-02','A','A' ,0);
INSERT INTO T VALUES ('2018-01-03','A','A' ,0);
INSERT INTO T VALUES ('2018-01-04','A','B' ,70);
INSERT INTO T VALUES ('2018-01-05','A','B' ,0);
INSERT INTO T VALUES ('2018-01-06','A','B' ,80);

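Query 1:

-- SQL Server syntax; in Presto, write date_diff('day', a, b) instead of DATEDIFF(day, a, b)
SELECT *,
       SUM(CASE
               WHEN state = 'B' AND MINDT = date THEN 0  -- first B row is the origin
               WHEN state = 'B' THEN 1                   -- later B rows count up
               ELSE -1                                   -- A rows count down
           END
          ) OVER (PARTITION BY [group], state
                  ORDER BY CASE WHEN state = 'B' THEN DATEDIFF(day, MAXDT, date)
                                WHEN state = 'A' THEN DATEDIFF(day, date, MINDT)
                           END) AS count
FROM (
  -- MAXDT / MINDT: last and first date of each (group, state) block
  SELECT *,
         MAX(date) OVER (PARTITION BY [group], state ORDER BY date DESC) AS MAXDT,
         MIN(date) OVER (PARTITION BY [group], state ORDER BY date) AS MINDT
  FROM T
) tt
ORDER BY date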

Results:

|       date | group | state | value |      MAXDT |      MINDT | count |
|------------|-------|-------|-------|------------|------------|-------|
| 2018-01-01 |     A |     A |    20 | 2018-01-03 | 2018-01-01 |    -3 |
| 2018-01-02 |     A |     A |     0 | 2018-01-03 | 2018-01-01 |    -2 |
| 2018-01-03 |     A |     A |     0 | 2018-01-03 | 2018-01-01 |    -1 |
| 2018-01-04 |     A |     B |    70 | 2018-01-06 | 2018-01-04 |     0 |
| 2018-01-05 |     A |     B |     0 | 2018-01-06 | 2018-01-04 |     1 |
| 2018-01-06 |     A |     B |    80 | 2018-01-06 | 2018-01-04 |     2 |
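For Presto itself, the same idea might look like the sketch below. It is an adaptation, not the answerer's tested query: the inline VALUES list stands in for table T, the reserved word group is double-quoted, date_diff takes the unit as a string, and the MIN/MAX windows omit ORDER BY because only the partition-wide first and last dates are needed. With the sample rows it should reproduce the count column from the results above.

```sql
-- Presto sketch; the VALUES list is a stand-in for table T (assumption).
WITH t AS (
    SELECT *
    FROM (VALUES
        (DATE '2018-01-01', 'A', 'A', 20),
        (DATE '2018-01-02', 'A', 'A', 0),
        (DATE '2018-01-03', 'A', 'A', 0),
        (DATE '2018-01-04', 'A', 'B', 70),
        (DATE '2018-01-05', 'A', 'B', 0),
        (DATE '2018-01-06', 'A', 'B', 80)
    ) AS v ("date", "group", state, value)
),
bounds AS (
    -- first and last date of every (group, state) block
    SELECT t.*,
           max("date") OVER (PARTITION BY "group", state) AS maxdt,
           min("date") OVER (PARTITION BY "group", state) AS mindt
    FROM t
)
SELECT "date", "group", state, value,
       sum(CASE
               WHEN state = 'B' AND mindt = "date" THEN 0  -- first B row
               WHEN state = 'B' THEN 1                     -- later B rows
               ELSE -1                                     -- A rows
           END) OVER (
               PARTITION BY "group", state
               -- B rows accumulate forward in time, A rows backward
               ORDER BY CASE
                            WHEN state = 'B' THEN date_diff('day', maxdt, "date")
                            WHEN state = 'A' THEN date_diff('day', "date", mindt)
                        END
           ) AS "count"
FROM bounds
ORDER BY "date";
```

The running sum relies on dates being unique within each (group, state) block; duplicate dates would tie in the ORDER BY and receive the same total.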

Try this logic:

SELECT t.*,
       ( case when state = 'B' then
          -- rows in state B: 1, 2, 3, ... in date order
          ROW_NUMBER() OVER (PARTITION BY group_, state ORDER BY date)
         else
          -- rows before the transition: 1 - n, 2 - n, ..., 0, where n is the block's row count
          ROW_NUMBER() OVER (PARTITION BY group_, state ORDER BY date) -
          COUNT(1) over (PARTITION BY group_, state)
         end ) - 1 as count
  FROM tab t;

If there is only one transition, I would recommend:

select t.*,
       (seqnum - min(case when state = 'B' then seqnum end) over ()) as counter
from (select t.*, row_number() over (order by date) as seqnum
      from t
     ) t;

In other words, generate a sequential numbering, then subtract the value at which B first appears.
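Since the question asks for a separate count per group, the sequence approach can be partitioned by the group column. The following is a minimal Presto sketch under a few assumptions: the VALUES list stands in for the real table, the two rows with group 'B' are hypothetical and added only to show that each group is numbered independently, and each group has a single A-to-B transition.

```sql
-- Presto sketch: one sequence per group, offset by that group's first B row.
SELECT s."date", s."group", s.state, s.value,
       s.seqnum
         - min(CASE WHEN s.state = 'B' THEN s.seqnum END)
               OVER (PARTITION BY s."group") AS counter
FROM (
    SELECT v.*,
           -- 1, 2, 3, ... in date order, restarting for every group
           row_number() OVER (PARTITION BY v."group" ORDER BY v."date") AS seqnum
    FROM (VALUES
        (DATE '2018-01-01', 'A', 'A', 20),
        (DATE '2018-01-02', 'A', 'A', 0),
        (DATE '2018-01-03', 'A', 'A', 0),
        (DATE '2018-01-04', 'A', 'B', 70),
        (DATE '2018-01-05', 'A', 'B', 0),
        (DATE '2018-01-06', 'A', 'B', 80),
        (DATE '2018-01-01', 'B', 'A', 5),   -- hypothetical second group
        (DATE '2018-01-02', 'B', 'B', 7)    -- hypothetical second group
    ) AS v ("date", "group", state, value)
) AS s
ORDER BY s."group", s."date";
```

A group that never reaches state B gets NULL in counter, because the min over its rows has no non-NULL input; wrap the subtraction in coalesce if a different default is wanted.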