Fill in missing rows based on mixed dates

I am building a mapping in Informatica IICS and trying to fill in rows that are missing from a dataset, based on several fields.

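Below is a sample of the data. There is an ID field; a Week_Start field, which is the start date of the week the data is reported for; a corresponding Week_Number; and a Year field, which indicates whether the data belongs to the prior or the current year. Sales is the number of sales for that ID, and Sales_Type is the sales category.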

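However, on some dates a particular person made no sales, so the row for that data is missing. I want to fill in these rows with all of the relevant information and set the Sales field to 0.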

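My actual data covers a 6-week window, for both the prior and the current year, across 7 different sales types. So I expect 6 x 2 x 7 = 84 rows per ID; i.e., with 100 unique IDs, my final table should have 8,400 rows.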
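To make the target row count concrete, here is a minimal sketch that enumerates the full key grid in Python. The ID range and the sales type names beyond A and B are placeholders assumed for illustration; only the counts (6 weeks, 2 years, 7 types, 100 IDs) come from the description above.

```python
from itertools import product

# Placeholder values: only the counts come from the question.
ids = range(1, 101)           # assumed: 100 unique IDs
weeks = range(1, 7)           # 6-week window
years = ("Prior", "Current")  # prior and current year
sales_types = "ABCDEFG"       # 7 sales types; names C..G are hypothetical

# Every (ID, year, week, sales type) combination that the final table should contain.
grid = list(product(ids, years, weeks, sales_types))
rows_per_id = len(grid) // len(ids)

print(rows_per_id, len(grid))
```

Any combination in this grid that has no matching source row is one that needs a filler row with Sales = 0.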

Table with missing rows:

+----+------------+-------------+---------+-------+------------+
| ID | Week_Start | Week_Number |  Year   | Sales | Sales_Type |
+----+------------+-------------+---------+-------+------------+
|  1 | 01/01/2018 |           1 | Prior   |     1 | A          |
|  1 | 01/08/2018 |           2 | Prior   |     3 | A          |
|  1 | 01/15/2018 |           3 | Prior   |     3 | A          |
|  1 | 01/29/2018 |           5 | Prior   |     4 | A          |
|  1 | 01/01/2019 |           1 | Current |     2 | A          |
|  1 | 01/08/2019 |           2 | Current |     4 | A          |
|  1 | 01/15/2019 |           3 | Current |     1 | A          |
|  1 | 01/22/2019 |           4 | Current |     1 | A          |
|  1 | 01/01/2018 |           1 | Prior   |     1 | B          |
|  1 | 01/08/2018 |           2 | Prior   |     3 | B          |
|  1 | 01/15/2018 |           3 | Prior   |     3 | B          |
|  1 | 01/29/2018 |           5 | Prior   |     4 | B          |
|  1 | 01/01/2019 |           1 | Current |     2 | B          |
|  1 | 01/08/2019 |           2 | Current |     4 | B          |
|  1 | 01/15/2019 |           3 | Current |     1 | B          |
|  1 | 01/22/2019 |           4 | Current |     1 | B          |
+----+------------+-------------+---------+-------+------------+

Expected result with the missing rows filled in:

+----+------------+-------------+---------+-------+------------+
| ID | Week_Start | Week_Number |  Year   | Sales | Sales_Type |
+----+------------+-------------+---------+-------+------------+
|  1 | 01/01/2018 |           1 | Prior   |     1 | A          |
|  1 | 01/08/2018 |           2 | Prior   |     3 | A          |
|  1 | 01/15/2018 |           3 | Prior   |     3 | A          |
|  1 | 01/22/2018 |           4 | Prior   |     0 | A          |
|  1 | 01/29/2018 |           5 | Prior   |     4 | A          |
|  1 | 01/01/2019 |           1 | Current |     2 | A          |
|  1 | 01/08/2019 |           2 | Current |     4 | A          |
|  1 | 01/15/2019 |           3 | Current |     1 | A          |
|  1 | 01/22/2019 |           4 | Current |     1 | A          |
|  1 | 01/29/2019 |           5 | Current |     0 | A          |
|  1 | 01/01/2018 |           1 | Prior   |     1 | B          |
|  1 | 01/08/2018 |           2 | Prior   |     3 | B          |
|  1 | 01/15/2018 |           3 | Prior   |     3 | B          |
|  1 | 01/22/2018 |           4 | Prior   |     0 | B          |
|  1 | 01/29/2018 |           5 | Prior   |     4 | B          |
|  1 | 01/01/2019 |           1 | Current |     2 | B          |
|  1 | 01/08/2019 |           2 | Current |     4 | B          |
|  1 | 01/15/2019 |           3 | Current |     1 | B          |
|  1 | 01/22/2019 |           4 | Current |     1 | B          |
|  1 | 01/29/2019 |           5 | Current |     0 | B          |
+----+------------+-------------+---------+-------+------------+

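I have tried using transformations in ICS, but none of them do what I am trying to accomplish. My best guess at how to do this is to use a recursive CTE in SQL and bring in a SQL script to generate the missing rows.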

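My question is: how do I do this across multiple partitions? It is not just missing dates I am concerned with; the dates are missing across two years and several different sales types. The Week_Start column contains mixed data, which complicates things further. My earlier attempt ended up generating every row between the 2018 dates and the 2019 data.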

Use a cross join to generate the rows and a left join to bring in the values:

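-- Build every (id, week, sales_type) combination, then left join the real rows.
-- Year stays paired with week_start so 2018 dates are never labeled Current.
select i.id, w.week_start, w.week_number, w.year,
       coalesce(t.sales, 0) as sales, s.sales_type
from (select distinct id from t) i cross join
     (select distinct week_start, week_number, year from t) w cross join
     (select distinct sales_type from t) s left join
     t
     on t.id = i.id and
        t.week_start = w.week_start and
        t.sales_type = s.sales_type;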
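
One limitation of deriving the weeks from the data itself is that a week no row contains (week 5 of the current year in the sample above) cannot be produced by `select distinct`. Below is a minimal sketch of the same cross join / left join idea with the weeks coming from a separate table instead. The `calendar` table and its contents are assumptions made for illustration, and the query runs in SQLite through Python's `sqlite3` module only to keep the example self-contained; the target database in an IICS mapping may need different syntax.

```python
import sqlite3
from datetime import date, timedelta

# Sample rows for ID 1 from the question; sales types A and B share the same values.
base = [
    ("01/01/2018", 1, "Prior", 1), ("01/08/2018", 2, "Prior", 3),
    ("01/15/2018", 3, "Prior", 3), ("01/29/2018", 5, "Prior", 4),
    ("01/01/2019", 1, "Current", 2), ("01/08/2019", 2, "Current", 4),
    ("01/15/2019", 3, "Current", 1), ("01/22/2019", 4, "Current", 1),
]
rows = [(1, ws, wn, yr, s, st) for st in ("A", "B") for (ws, wn, yr, s) in base]

# Hypothetical calendar: five weekly start dates per year, matching the sample.
calendar = [
    ((start + timedelta(weeks=k)).strftime("%m/%d/%Y"), k + 1, label)
    for start, label in ((date(2018, 1, 1), "Prior"), (date(2019, 1, 1), "Current"))
    for k in range(5)
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id, week_start, week_number, year, sales, sales_type)")
conn.execute("create table calendar (week_start, week_number, year)")
conn.executemany("insert into t values (?, ?, ?, ?, ?, ?)", rows)
conn.executemany("insert into calendar values (?, ?, ?)", calendar)

# Every ID x calendar week x sales type, with sales defaulting to 0 when no row matches.
filled = conn.execute("""
    select i.id, c.week_start, c.week_number, c.year,
           coalesce(t.sales, 0) as sales, s.sales_type
    from (select distinct id from t) i
    cross join calendar c
    cross join (select distinct sales_type from t) s
    left join t
      on t.id = i.id
     and t.week_start = c.week_start
     and t.sales_type = s.sales_type
    order by i.id, s.sales_type, c.year desc, c.week_number
""").fetchall()

for row in filled:
    print(row)
```

With the sample data this yields 20 rows for ID 1, including zero-sales rows for week 4 of the prior year and week 5 of the current year under both sales types.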