Using SQL to pad missing values when calculating prior-week values

Sample table

   CREATE TABLE SAMPLE_TABLE (WEEK DATE, TYPE VARCHAR(50), Movie VARCHAR(50), Question VARCHAR(50), Answer VARCHAR(50), value NUMBER(38,0));

Sample data

INSERT INTO sample_table VALUES('10/1/2020',    'A',    'Contaco',  '1',    'N/A',  '1'),
('10/1/2020',   'A',    'Contaco',  '1',    'Definitely not',   '4'),
('10/1/2020',   'A',    'Contaco',  '1',    'Definitely',   '2'),
('10/1/2020',   'A',    'Contaco',  '1',    'Probably', '2'),
('10/1/2020',   'A',    'Contaco',  '1',    'Maybe',    '1'),
('10/8/2020',   'A',    'Contaco',  '1',    'N/A',  '3'),
('10/8/2020',   'A',    'Contaco',  '1',    'Definitely not',   '1'),
('10/8/2020',   'A',    'Contaco',  '1',    'Definitely',   '2'),
('10/8/2020',   'A',    'Contaco',  '1',    'Probably', '4'),
('10/8/2020',   'A',    'Contaco',  '1',    'Maybe',    '1'),
('10/15/2020',  'A',    'Contaco',  '1',    'N/A',  '2'),
('10/15/2020',  'A',    'Contaco',  '1',    'Definitely not',   '1'),
('10/15/2020',  'A',    'Contaco',  '1',    'Definitely',   '2'),
('10/15/2020',  'A',    'Contaco',  '1',    'Probably', '3'),
('10/15/2020',  'A',    'Contaco',  '1',    'Maybe',    '2'),
('10/1/2020',   'B',    'Contaco',  '1',    'N/A',  '1'),
('10/1/2020',   'B',    'Contaco',  '1',    'Definitely not',   '4'),
('10/1/2020',   'B',    'Contaco',  '1',    'Definitely',   '2'),
('10/1/2020',   'B',    'Contaco',  '1',    'Maybe',    '1'),
('10/8/2020',   'B',    'Contaco',  '1',    'N/A',  '3'),
('10/8/2020',   'B',    'Contaco',  '1',    'Definitely',   '1'),
('10/8/2020',   'B',    'Contaco',  '1',    'Probably', '2'),
('10/8/2020',   'B',    'Contaco',  '1',    'Maybe',    '1'),
('10/15/2020',  'B',    'Contaco',  '1',    'N/A',  '2'),
('10/15/2020',  'B',    'Contaco',  '1',    'Definitely not',   '1'),
('10/15/2020',  'B',    'Contaco',  '1',    'Definitely',   '2'),
('10/15/2020',  'B',    'Contaco',  '1',    'Maybe',    '2') ;

Current query

 Select week, type, movie, question, answer, value,
 LAG(value, 1, 0) OVER (PARTITION BY movie, question, answer, type ORDER BY movie, type, week ASC) AS one_week_prior_value,
 LAG(value, 2, 0) OVER (PARTITION BY movie, question, answer, type ORDER BY movie, type, week ASC) AS two_week_prior_value
 from sample_table ;

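With this query I am trying to use the LAG function to derive the "one week prior value" and "two week prior value" for the same type, movie, question, and answer. It works well when Type = A, because every answer to each question is present in every week's data.

The problem is Type = B, where not every "Answer" option is available every week. LAG picks up the most recent previous row that exists rather than the previous calendar week, and no row is produced for the current week when an "Answer" value is absent. Examples below.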

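Two issues:

  1. For 10/15/2020, the "one week prior value" of the "Definitely not" row should be 0 instead of 4, because there was no "Definitely not" value in the previous week.

  2. 10/15/2020 should have a row for "Probably" with 0 as the "value", because it has no value for that week but does have a value within the prior two weeks.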

+--------------------------+------+---------+----------+----------------+-------+----------------------+----------------------+
| Month, Day, Year of Week | Type | Movie   | Question | Answer         | Value | One Week Prior Value | Two Week Prior Value |
+--------------------------+------+---------+----------+----------------+-------+----------------------+----------------------+
| 1-Oct-20                 | A    | Contaco | 1        | Definitely     | 2     | 0                    | 0                    |
| 1-Oct-20                 | A    | Contaco | 1        | Definitely not | 4     | 0                    | 0                    |
| 1-Oct-20                 | A    | Contaco | 1        | Maybe          | 1     | 0                    | 0                    |
| 1-Oct-20                 | A    | Contaco | 1        | N/A            | 1     | 0                    | 0                    |
| 1-Oct-20                 | A    | Contaco | 1        | Probably       | 2     | 0                    | 0                    |
| 8-Oct-20                 | A    | Contaco | 1        | Definitely     | 2     | 2                    | 0                    |
| 8-Oct-20                 | A    | Contaco | 1        | Definitely not | 1     | 4                    | 0                    |
| 8-Oct-20                 | A    | Contaco | 1        | Maybe          | 1     | 1                    | 0                    |
| 8-Oct-20                 | A    | Contaco | 1        | N/A            | 3     | 1                    | 0                    |
| 8-Oct-20                 | A    | Contaco | 1        | Probably       | 4     | 2                    | 0                    |
| 15-Oct-20                | A    | Contaco | 1        | Definitely     | 2     | 2                    | 2                    |
| 15-Oct-20                | A    | Contaco | 1        | Definitely not | 1     | 1                    | 4                    |
| 15-Oct-20                | A    | Contaco | 1        | Maybe          | 2     | 1                    | 1                    |
| 15-Oct-20                | A    | Contaco | 1        | N/A            | 2     | 3                    | 1                    |
| 15-Oct-20                | A    | Contaco | 1        | Probably       | 3     | 4                    | 2                    |
| 1-Oct-20                 | B    | Contaco | 1        | Definitely     | 2     | 0                    | 0                    |
| 1-Oct-20                 | B    | Contaco | 1        | Definitely not | 4     | 0                    | 0                    |
| 1-Oct-20                 | B    | Contaco | 1        | Maybe          | 1     | 0                    | 0                    |
| 1-Oct-20                 | B    | Contaco | 1        | N/A            | 1     | 0                    | 0                    |
| 8-Oct-20                 | B    | Contaco | 1        | Definitely     | 1     | 2                    | 0                    |
| 8-Oct-20                 | B    | Contaco | 1        | Maybe          | 1     | 1                    | 0                    |
| 8-Oct-20                 | B    | Contaco | 1        | N/A            | 3     | 1                    | 0                    |
| 8-Oct-20                 | B    | Contaco | 1        | Probably       | 2     | 0                    | 0                    |
| 15-Oct-20                | B    | Contaco | 1        | Definitely     | 2     | 1                    | 2                    |
| 15-Oct-20                | B    | Contaco | 1        | Definitely not | 1     | 4                    | 0                    |
| 15-Oct-20                | B    | Contaco | 1        | Maybe          | 2     | 1                    | 1                    |
| 15-Oct-20                | B    | Contaco | 1        | N/A            | 2     | 3                    | 1                    |
+--------------------------+------+---------+----------+----------------+-------+----------------------+----------------------+
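
The first issue can be reproduced in miniature. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for the original database, ISO date strings instead of the DATE column, and a reduced table without the movie and question columns. It shows that LAG steps back one row within the partition, not one calendar week.

```python
import sqlite3

# In-memory SQLite stands in for the original database (an assumption);
# window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample_table (week TEXT, type TEXT, answer TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO sample_table VALUES (?, ?, ?, ?)",
    [
        ("2020-10-01", "B", "Definitely not", 4),
        # No row for 2020-10-08: this answer was not recorded that week.
        ("2020-10-15", "B", "Definitely not", 1),
    ],
)

rows = conn.execute(
    """
    SELECT week, value,
           LAG(value, 1, 0) OVER (
               PARTITION BY type, answer ORDER BY week
           ) AS one_week_prior_value
    FROM sample_table
    ORDER BY week
    """
).fetchall()

for row in rows:
    print(row)
```

The 2020-10-15 row reports 4 as its prior value, which is the 2020-10-01 value: the missing week is skipped silently.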

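This is my ideal output. Any suggestions on how to achieve this with the available data? I took some time to write this post, so I hope it's worth it! Thanks in advance.

  1. You can see that for 10/15/2020, the prior week value of "Definitely not" is 0 instead of 4
  2. You can see that for 10/15/2020, "Probably" has a value of 0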

+------------------+------+-----------+----------+----------------+-------+------------------+--------------------+
| Week             | Type | Movie     | Question | Answer         | value | prior week value | 2 week prior value |
+------------------+------+-----------+----------+----------------+-------+------------------+--------------------+
| October 1, 2020  | A    | Contaco   | 1        | N/A            | 1.000 | 0.000            | 0.000              |
| October 1, 2020  | A    | Contaco   | 1        | Definitely not | 4.000 | 0.000            | 0.000              |
| October 1, 2020  | A    | Contaco   | 1        | Definitely     | 2.000 | 0.000            | 0.000              |
| October 1, 2020  | A    | Contaco   | 1        | Probably       | 2.000 | 0.000            | 0.000              |
| October 1, 2020  | A    | Contaco   | 1        | Maybe          | 1.000 | 0.000            | 0.000              |
| October 8, 2020  | A    | Contaco   | 1        | N/A            | 3.000 | 1.000            | 0.000              |
| October 8, 2020  | A    | Contaco   | 1        | Definitely not | 1.000 | 4.000            | 0.000              |
| October 8, 2020  | A    | Contaco   | 1        | Definitely     | 2.000 | 2.000            | 0.000              |
| October 8, 2020  | A    | Contaco   | 1        | Probably       | 4.000 | 2.000            | 0.000              |
| October 8, 2020  | A    | Contaco   | 1        | Maybe          | 1.000 | 1.000            | 0.000              |
| October 15, 2020 | A    | Contaco   | 1        | N/A            | 2.000 | 3.000            | 1.000              |
| October 15, 2020 | A    | Contaco   | 1        | Definitely not | 1.000 | 1.000            | 4.000              |
| October 15, 2020 | A    | Contaco   | 1        | Definitely     | 2.000 | 2.000            | 2.000              |
| October 15, 2020 | A    | Contaco   | 1        | Probably       | 3.000 | 4.000            | 2.000              |
| October 15, 2020 | A    | Contaco   | 1        | Maybe          | 2.000 | 1.000            | 1.000              |
| October 1, 2020  | B    | Contaco   | 1        | N/A            | 1.000 | 0.000            | 0.000              |
| October 1, 2020  | B    | Contaco   | 1        | Definitely not | 4.000 | 0.000            | 0.000              |
| October 1, 2020  | B    | Contaco   | 1        | Definitely     | 2.000 | 0.000            | 0.000              |
| October 1, 2020  | B    | Contaco   | 1        | Probably       | 0.000 | 0.000            | 0.000              |
| October 1, 2020  | B    | Contaco   | 1        | Maybe          | 1.000 | 0.000            | 0.000              |
| October 8, 2020  | B    | Contaco   | 1        | N/A            | 3.000 | 1.000            | 0.000              |
| October 8, 2020  | B    | Contaco   | 1        | Definitely not | 0.000 | 4.000            | 0.000              |
| October 8, 2020  | B    | Contaco   | 1        | Definitely     | 1.000 | 2.000            | 0.000              |
| October 8, 2020  | B    | Contaco   | 1        | Probably       | 2.000 | 0.000            | 0.000              |
| October 8, 2020  | B    | Contaco   | 1        | Maybe          | 1.000 | 1.000            | 0.000              |
| October 15, 2020 | B    | Contaco   | 1        | N/A            | 2.000 | 3.000            | 1.000              |
| October 15, 2020 | B    | Contaco   | 1        | Definitely not | 1.000 | 0.000            | 4.000              |
| October 15, 2020 | B    | Contaco   | 1        | Definitely     | 2.000 | 1.000            | 2.000              |
| October 15, 2020 | B    | Contaco   | 1        | Probably       | 0.000 | 2.000            | 0.000              |
| October 15, 2020 | B    | Contaco   | 1        | Maybe          | 2.000 | 1.000            | 1.000              |
+------------------+------+-----------+----------+----------------+-------+------------------+--------------------+

You can solve it with a couple of correlated subqueries (equivalent to a left self-join):

  select week, type, movie, question, answer, value
   , (select any_value(value)
      from sample_table 
      where week=a.week-7 
      and (movie, question, answer, type) = (a.movie, a.question, a.answer, a.type)
     ) prev1
   , (select any_value(value) 
      from sample_table 
      where week=a.week-14 
      and (movie, question, answer, type) = (a.movie, a.question, a.answer, a.type)
     ) prev2
 from sample_table a
 order by movie, question, answer, type, week;
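
For comparison, here is a rough sketch of the left self-join form mentioned above. It rests on assumptions that are not part of the original answer: SQLite through Python's sqlite3 module, ISO date strings with SQLite's date() modifier in place of `week-7`, and a reduced table without the movie and question columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample_table (week TEXT, type TEXT, answer TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO sample_table VALUES (?, ?, ?, ?)",
    [
        ("2020-10-01", "B", "Definitely not", 4),
        ("2020-10-15", "B", "Definitely not", 1),
        ("2020-10-01", "B", "Maybe", 1),
        ("2020-10-08", "B", "Maybe", 1),
        ("2020-10-15", "B", "Maybe", 2),
    ],
)

# Each LEFT JOIN looks up the row exactly 7 or 14 days earlier for the
# same type and answer; a missing week yields NULL instead of a stale value.
rows = conn.execute(
    """
    SELECT a.week, a.answer, a.value,
           p1.value AS prev1,
           p2.value AS prev2
    FROM sample_table a
    LEFT JOIN sample_table p1
           ON p1.week = date(a.week, '-7 days')
          AND p1.type = a.type AND p1.answer = a.answer
    LEFT JOIN sample_table p2
           ON p2.week = date(a.week, '-14 days')
          AND p2.type = a.type AND p2.answer = a.answer
    ORDER BY a.answer, a.week
    """
).fetchall()

for row in rows:
    print(row)
```

Matching on the exact date fixes the stale prior value, but missing weeks come back as NULL (None in Python) rather than 0, and no rows are created for weeks in which an answer is absent.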

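Thanks for the easy-to-reproduce setup!


Now, the comments asked for the 0s to be added. I find this SQL a bit crazy, but since you invested so much in the setup...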

with combos as (
    select *
    from (select distinct movie from sample_table) a
    , (select distinct question from sample_table) b
    , (select distinct answer from sample_table) c
    , (select distinct type from sample_table) d
    , (select distinct week from sample_table) e
)

select week, type, movie, question, answer
, ifnull((select any_value(value)
  from sample_table 
  where (week, movie, question, answer, type) = (a.week, a.movie, a.question, a.answer, a.type)
 ), (select 0*max(value)
  from sample_table 
  where (movie, question, answer, type) = (a.movie, a.question, a.answer, a.type)
  and week<a.week
 )) value
, ifnull((select any_value(value)
  from sample_table 
  where (week, movie, question, answer, type) = (a.week-7, a.movie, a.question, a.answer, a.type)
 ), (select 0*max(value)
  from sample_table 
  where (movie, question, answer, type) = (a.movie, a.question, a.answer, a.type)
  and week<a.week
 )) prev1
, ifnull((select any_value(value)
  from sample_table 
  where (week, movie, question, answer, type) = (a.week-14, a.movie, a.question, a.answer, a.type)
 ), (select 0*max(value)
  from sample_table 
  where (movie, question, answer, type) = (a.movie, a.question, a.answer, a.type)
  and week<a.week-7
 )) prev2
from combos a
order by movie, question, answer, type, week;

You're close. You need to restore the missing B entries, e.g. 10/15/2020, "Probably", and then apply your window functions. For example, like this:

SELECT
   t.*,
   LAG(t.value, 1, 0) OVER (
       PARTITION BY t.movie, t.question, t.answer, t.type 
       ORDER BY t.movie, t.type, t.week ASC
   ) AS one_week_prior_value,
   LAG(t.value, 2, 0) OVER (
       PARTITION BY t.movie, t.question, t.answer, t.type 
       ORDER BY t.movie, t.type, t.week ASC
   ) AS two_week_prior_value
FROM (
  SELECT
    weeks.week, 
    movies.type, 
    movies.movie, 
    movies.question, 
    movies.answer, 
    COALESCE(t.value,0) AS value
  FROM (
      SELECT DISTINCT week 
      FROM sample_table
  ) weeks
  CROSS JOIN (
      SELECT DISTINCT type, movie, question, answer 
      FROM sample_table
  ) movies
  LEFT JOIN sample_table t 
      ON weeks.week = t.week AND 
         movies.movie = t.movie AND 
         movies.type = t.type AND
         movies.question = t.question AND
         movies.answer = t.answer
) t
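
As a quick check of the fill-then-LAG idea, the following sketch runs the same shape of query on a reduced dataset. The assumptions are mine and not part of the answer above: SQLite via Python's sqlite3 module, ISO date strings, and a table trimmed to type, answer, and value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample_table (week TEXT, type TEXT, answer TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO sample_table VALUES (?, ?, ?, ?)",
    [
        ("2020-10-01", "B", "Definitely not", 4),
        ("2020-10-15", "B", "Definitely not", 1),
        ("2020-10-01", "B", "Maybe", 1),
        ("2020-10-08", "B", "Maybe", 1),
        ("2020-10-15", "B", "Maybe", 2),
        ("2020-10-08", "B", "Probably", 2),
    ],
)

# Step 1 (dense): every week crossed with every (type, answer), gaps filled with 0.
# Step 2: LAG over the dense rows, so one row back is always one week back.
rows = conn.execute(
    """
    WITH weeks AS (SELECT DISTINCT week FROM sample_table),
         combos AS (SELECT DISTINCT type, answer FROM sample_table),
         dense AS (
             SELECT w.week, c.type, c.answer, COALESCE(t.value, 0) AS value
             FROM weeks w
             CROSS JOIN combos c
             LEFT JOIN sample_table t
                    ON t.week = w.week AND t.type = c.type AND t.answer = c.answer
         )
    SELECT week, answer, value,
           LAG(value, 1, 0) OVER (PARTITION BY type, answer ORDER BY week)
               AS one_week_prior_value,
           LAG(value, 2, 0) OVER (PARTITION BY type, answer ORDER BY week)
               AS two_week_prior_value
    FROM dense
    ORDER BY answer, week
    """
).fetchall()

for row in rows:
    print(row)
```

On this data the 2020-10-15 "Definitely not" row gets a one-week prior value of 0, and a 2020-10-15 "Probably" row appears with value 0 and a one-week prior value of 2, which is what both issues in the question call for. Note that this approach relies on every week appearing at least once somewhere in the table.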