Complex SQL Query with a Pivot

Question

I have 3 tables: users, trips and tripDetails. When a user creates a trip, a row is created in the trips table with the following fields:

id(INT), user_id(INT) & dateCreated(DATE)

and 4 rows are created in the tripDetails table:

id(INT), trip_id(INT), field(VARCHAR) & value(VARCHAR)

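where field is one of 'smartbox.destination', 'smartbox.dateFrom', 'smartbox.dateTo', 'smartbox.numberOfPeople' and the value column is a varchar. But say the user changes the destination and saves the change: a new record is created in the trips table, and only one record (the updated destination) is created in the tripDetails table.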

Now I would like to create a select that gives me a snapshot of a user's trips on a given day, with the following column headers:

user_id, trip_id, destination, dateFrom, dateTo, numberOfPeople, givenDay(DATE)

so that if one field was changed on the given day, all the other columns show their most recent value relative to that day.

I have set up a sqlfiddle here.

Answer 1

First of all, allow me to say this: your "overriding key/value pair" way of storing the data is seriously flawed.

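Now to your problem. Assuming that

the tripDetails.value column is declared not null,
your SQL client prompts you for givenDate,
you enter it in the (exact) format yyyy-mm-dd,

your query might look like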

with pivot$ as (
    select
        U.id as user_id, T.id as trip_id, max(T.dateCreated) as trip_date,
        max(decode(TD.field, 'smartbox.destination', TD.value)) as trip_destination,
        max(decode(TD.field, 'smartbox.dateFrom', TD.value)) as trip_date_from,
        max(decode(TD.field, 'smartbox.dateTo', TD.value)) as trip_date_to,
        max(decode(TD.field, 'smartbox.numberOfPeople', TD.value)) as trip_no_of_people
    from users U
        join trips T
            on T.user_id = U.id
        join tripDetails TD
            on TD.trip_id = T.id
            and TD.field in ('smartbox.destination', 'smartbox.dateFrom', 'smartbox.dateTo', 'smartbox.numberOfPeople')
    where T.dateCreated <= date'&givenDate'
    group by U.id, T.id
),
resolve_versioning$ as (
    select user_id, trip_id, trip_date,
        first_value(trip_destination) ignore nulls over (partition by user_id order by trip_date desc rows between current row and unbounded following) as trip_destination,
        first_value(trip_date_from) ignore nulls over (partition by user_id order by trip_date desc rows between current row and unbounded following) as trip_date_from,
        first_value(trip_date_to) ignore nulls over (partition by user_id order by trip_date desc rows between current row and unbounded following) as trip_date_to,
        first_value(trip_no_of_people) ignore nulls over (partition by user_id order by trip_date desc rows between current row and unbounded following) as trip_no_of_people,
        row_number() over (partition by user_id order by trip_date desc) as relevance$
    from pivot$
)
select user_id, trip_id,
    trip_destination, trip_date_from, trip_date_to, trip_no_of_people,
    date'&givenDate' as given_date
from resolve_versioning$
where relevance$ <= 1
;
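
If you want to see what the pivot$ step of the query above produces without an Oracle instance at hand, here is a minimal sketch using Python's sqlite3 module. The sample rows are made up, the join to users is left out for brevity, and CASE stands in for Oracle's decode() because SQLite does not have it. Treat it as an illustration of the conditional-aggregation idea under those assumptions, not as a drop-in replacement.

```python
import sqlite3

# Hypothetical sample data: trip 1 is the original trip,
# trip 2 records only a changed destination.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table trips (id integer primary key, user_id integer, dateCreated text);
    create table tripDetails (id integer primary key, trip_id integer,
                              field text, value text not null);
    insert into trips values (1, 10, '2014-01-01'), (2, 10, '2014-01-05');
    insert into tripDetails (trip_id, field, value) values
        (1, 'smartbox.destination', 'Paris'),
        (1, 'smartbox.dateFrom', '2014-02-01'),
        (1, 'smartbox.dateTo', '2014-02-08'),
        (1, 'smartbox.numberOfPeople', '2'),
        (2, 'smartbox.destination', 'Rome');
""")

# A CASE without ELSE yields NULL for non-matching rows, and max() skips
# NULLs, so each column ends up with the one matching value or NULL.
PIVOT_SQL = """
    select T.user_id, T.id as trip_id, T.dateCreated as trip_date,
        max(case when TD.field = 'smartbox.destination' then TD.value end) as trip_destination,
        max(case when TD.field = 'smartbox.dateFrom' then TD.value end) as trip_date_from,
        max(case when TD.field = 'smartbox.dateTo' then TD.value end) as trip_date_to,
        max(case when TD.field = 'smartbox.numberOfPeople' then TD.value end) as trip_no_of_people
    from trips T
        join tripDetails TD
            on TD.trip_id = T.id
    where T.dateCreated <= ?
    group by T.user_id, T.id, T.dateCreated
    order by T.id
"""

pivot_rows = conn.execute(PIVOT_SQL, ("2014-01-10",)).fetchall()
for row in pivot_rows:
    print(row)
```

The row for trip 2 comes back with only the destination filled in and NULL (None in Python) everywhere else, which is exactly the gap the next subquery has to fill.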

This works in three steps:

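The pivot$ subquery denormalizes the key/value pairs into wider rows, using trip_id as the logical primary key of the data set, which effectively leaves a column NULL whenever there is no key/value pair for that trip_id. (By the way, this is where the non-nullability of the tripDetails.value column is crucial for the query to succeed.)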
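The resolve_versioning$ subquery uses the first_value() analytic function: for each individual trip detail, across all of a user's trips (partition by user_id), it finds the first (first_value) non-NULL (ignore nulls) value of that detail, searching from the "youngest" trip date back toward older ones (order by trip_date desc). Or, if you look at it the other way around, it finds the last non-NULL value of the trip detail in trip-date order.
The rows between current row and unbounded following clause is a bit of "magic" that is necessary for the analytic window to work correctly with this order by. Without an explicit frame, a window with an order by defaults to range between unbounded preceding and current row, so the youngest row would see only itself and never reach the older trips. (Read here for an explanation.)
The row_number() over (partition by user_id order by trip_date desc) expression simply numbers each user's result rows from 1 upward, with 1 assigned to the "youngest" row by trip date. The outermost select then filters the result down to that youngest row (relevance$ <= 1).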
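
The versioning step can be approximated in a similar way. SQLite, at least in the versions I am aware of, does not accept ignore nulls on first_value(), so the sketch below swaps in a different technique: one correlated subquery per column that picks the youngest non-NULL value. The pivoted table, its sample rows and the snapshot() helper are all hypothetical, only two of the four detail columns are shown, and the sketch assumes at most one trip per user per day.

```python
import sqlite3

# Hypothetical, already pivoted rows: NULL means "not changed in this trip".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table pivoted (user_id integer, trip_id integer, trip_date text,
                          trip_destination text, trip_no_of_people text);
    insert into pivoted values
        (10, 1, '2014-01-01', 'Paris', '2'),
        (10, 2, '2014-01-05', 'Rome', null),
        (10, 3, '2014-01-09', null, '4');
""")

# For each column, take the youngest non-NULL value on or before the given
# day (a stand-in for first_value(...) ignore nulls), and keep only the
# youngest trip per user (a stand-in for relevance$ <= 1).
RESOLVE_SQL = """
    select P.user_id, P.trip_id,
        (select X.trip_destination from pivoted X
          where X.user_id = P.user_id and X.trip_date <= :given_day
            and X.trip_destination is not null
          order by X.trip_date desc limit 1) as trip_destination,
        (select X.trip_no_of_people from pivoted X
          where X.user_id = P.user_id and X.trip_date <= :given_day
            and X.trip_no_of_people is not null
          order by X.trip_date desc limit 1) as trip_no_of_people
    from pivoted P
    where P.trip_date = (select max(Y.trip_date) from pivoted Y
                          where Y.user_id = P.user_id
                            and Y.trip_date <= :given_day)
"""

def snapshot(given_day):
    """Return one resolved row per user as of given_day (yyyy-mm-dd)."""
    return conn.execute(RESOLVE_SQL, {"given_day": given_day}).fetchall()

print(snapshot("2014-01-06"))  # [(10, 2, 'Rome', '2')]
print(snapshot("2014-01-10"))  # [(10, 3, 'Rome', '4')]
```

On 2014-01-06 the number of people still comes from trip 1, because trip 2 changed only the destination; on 2014-01-10 the destination is carried over from trip 2 in the same way.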

Enjoy!

sql

oracle

pivot