用最后的结果填充 table

Question

我有一个 table，其中包含以下信息：

id | amount |   date   | customer_id
 1 |  0.00  | 11/12/17 | 1
 2 | 54.00  | 11/12/17 | 1
 3 | 60.00  | 02/12/18 | 1
 4 |  0.00  | 01/18/17 | 2
 5 | 14.00  | 03/12/17 | 2
 6 | 24.00  | 02/22/18 | 2
 7 |  0.00  | 09/12/16 | 3
 8 | 74.00  | 10/01/17 | 3

我需要它看起来像下面这样：

ranked_id | id | amount |   date   | customer_id
        1 |  1 |  0.00  | 11/12/17 | 1
        2 |  2 | 54.00  | 11/12/17 | 1            
        3 |  3 | 60.00  | 02/12/18 | 1
        4 |  3 | 60.00  | 02/12/18 | 1
        5 |  3 | 60.00  | 02/12/18 | 1
        6 |  3 | 60.00  | 02/12/18 | 1
        7 |  3 | 60.00  | 02/12/18 | 1
        8 |  4 |  0.00  | 01/18/17 | 2
        9 |  5 | 14.00  | 03/12/17 | 2
       10 |  6 | 24.00  | 02/22/18 | 2
       11 |  6 | 24.00  | 02/22/18 | 2
       12 |  6 | 24.00  | 02/22/18 | 2
       13 |  6 | 24.00  | 02/22/18 | 2
       14 |  6 | 24.00  | 02/22/18 | 2
       15 |  7 |  0.00  | 09/12/16 | 3
       16 |  8 | 74.00  | 10/01/17 | 3
       17 |  8 | 74.00  | 10/01/17 | 3
       18 |  8 | 74.00  | 10/01/17 | 3
       19 |  8 | 74.00  | 10/01/17 | 3
       20 |  8 | 74.00  | 10/01/17 | 3
       21 |  8 | 74.00  | 10/01/17 | 3

我知道分区和排序（在 ranked_id 上），但我不知道如何将最后一行重复 7 次。

Answer 1

在 Postgres 中，您可以使用 generate_series() 和交叉连接来生成所有行。然后你可以选择你想要的：

select row_number() over (order by customer_id, id) as ranking_id,
       coalesce(t.id, cid) as id, coalesce(t.amount, c.amount) as amount
       coalesce(t.date, c.date) as date, t.customer_id
from (select distinct on (customer_id) t.*
      from t
      order by customer_id, date desc
     ) c cross join
     generate_series(1, 7) g(i) left join
     (select t.*, row_number() over (partition by customer_id order by date) as i
      from t
     ) t
     on t.customer_id = c.customer_id and t.i = g.i;

Answer 2

正如@Gordon Linoff 所建议的那样，您可以使用 generate_series() 函数与不同的 customer_ids 交叉来生成下面 T1 中所需的所有行。然后在 T2（也在下方）中，row_number 函数用于生成从 t1 到外部连接的顺序值以及 customer_id.

从那里开始，当没有原始数据要连接到 case 语句和分析 first_value 函数时，就可以得到每个 customer_id 的最后一个值在。我无法使 last_value 分析函数工作可能是由于 postgresql 缺少忽略空值指令，所以我使用 first_Value 和降序排序，并且只使用 return不存在其他数据时的分析值。

with t1 as (
select distinct 
       dense_rank() over (order by customer_id, generate_series) ranked_id
     , customer_id
     , generate_series
  from table1
  cross join generate_series(1,7)
), t2 as (
  select row_number() over (partition by customer_id order by id) rn
       , table1.*
    from table1
)
select t1.ranked_id
     , case when t2.customer_id is not null
            then t2.id
            else  first_value(t2.id)
                 over (partition by t1.customer_id
                       order by id desc nulls last)
       end id
     , case when t2.customer_id is not null
            then t2.amount
            else  first_value(t2.amount)
                 over (partition by t1.customer_id
                       order by id desc nulls last)
       end amount
     , case when t2.customer_id is not null
            then t2.date
            else  first_value(t2.date)
                 over (partition by t1.customer_id
                       order by id desc nulls last)
       end date
     , t1.customer_id
  from t1
  left join t2
    on t2.customer_id = t1.customer_id
   and t2.id = t1.generate_series
 order by ranked_id;

这里是 SQL Fiddle 代码演示。

用最后的结果填充 table

Fill table with last result

sql

postgresql

ranking