Understanding INNER JOIN logic
I have the following schema for an exchange rates table:
name | type | kind | null? | default | primary key | unique key |
---|---|---|---|---|---|---|
COUNTRY | VARCHAR(10) | COLUMN | Y | | N | N |
RATETYPE | VARCHAR(6) | COLUMN | Y | | N | N |
FROMCURRENCY | VARCHAR(3) | COLUMN | Y | | N | N |
TOCURRENCY | VARCHAR(3) | COLUMN | Y | | N | N |
STARTDATE | VARCHAR(12) | COLUMN | Y | | N | N |
RATE | NUMBER(15,7) | COLUMN | Y | | N | N |
From it I only want the USD/MTHEND rows, i.e.:
SELECT FromCurrency, ToCurrency, Date(StartDate, 'YYYYMMDD') AS StartDate, Rate
FROM EXCHANGERATES
WHERE DATE(StartDate, 'YYYYMMDD') > CURRENT_DATE - 15000 AND RATETYPE = 'MTHEND' AND ToCurrency = 'USD'
ORDER BY FromCurrency, ToCurrency, StartDate;
FROMCURRENCY | TOCURRENCY | STARTDATE | RATE |
---|---|---|---|
JPY | USD | 2018-12-01 | 113.4700000 |
JPY | USD | 2019-03-30 | 0.0090342 |
JPY | USD | 2019-06-28 | 0.0092721 |
JPY | USD | 2019-08-02 | 0.0093388 |
JPY | USD | 2019-08-30 | 0.0093967 |
JPY | USD | 2019-09-27 | 0.0092729 |
JPY | USD | 2019-11-01 | 0.0092592 |
JPY | USD | 2019-11-29 | 0.0091315 |
JPY | USD | 2019-12-28 | 0.0091174 |
JPY | USD | 2020-02-01 | 0.0091675 |
JPY | USD | 2020-02-29 | 0.0091802 |
JPY | USD | 2020-03-28 | 0.0092157 |
JPY | USD | 2020-05-02 | 0.0093431 |
JPY | USD | 2020-05-30 | 0.0093266 |
JPY | USD | 2020-06-27 | 0.0093361 |
JPY | USD | 2020-08-01 | 0.0095812 |
JPY | USD | 2020-08-29 | 0.0094144 |
JPY | USD | 2020-09-26 | 0.0094966 |
JPY | USD | 2020-10-31 | 0.0095739 |
JPY | USD | 2020-11-27 | 0.0096061 |
JPY | USD | 2020-12-26 | 0.0096525 |
JPY | USD | 2021-01-30 | 0.0095693 |
JPY | USD | 2021-02-27 | 0.0094197 |
... | ... | ... | ... |
JPY | USD | 2022-02-26 | 0.0086700 |
But there is no end date column, so I have the following query, which uses a self INNER JOIN to set the end date:
SELECT
EX.FromCurrency,
EX.ToCurrency,
DATE(EX.StartDate,'YYYYMMDD') AS StartDate, DATE(EX2.EndDate,'YYYYMMDD') AS EndDate,
EX.Rate
FROM
EXCHANGERATES EX
INNER JOIN(
SELECT
FromCurrency,
ToCurrency,
Max(StartDate) AS StartDate,
20251231 AS EndDate
FROM
EXCHANGERATES
WHERE
RateType = 'MTHEND'
GROUP BY
Fromcurrency,
ToCurrency
UNION
SELECT
E2.FromCurrency,
E2.ToCurrency,
Max(E.StartDate) AS StartDate,
to_number(to_char(DateAdd(DAY,-1,To_Date(to_char(E2.StartDate),'YYYYMMDD')),'YYYYMMDD')) AS EndDate
FROM
EXCHANGERATES E
INNER JOIN
EXCHANGERATES E2 ON
E.StartDate < E2.StartDate
AND E.RateType = E2.RateType
WHERE
E.RateType = 'MTHEND'
GROUP BY
E2.FromCurrency,
E2.ToCurrency,
E2.StartDate) AS EX2 ON
EX.FromCurrency = EX2.FromCurrency
AND EX.ToCurrency = EX2.ToCurrency
AND EX.StartDate = EX2.StartDate
AND EX.RateType = 'MTHEND'
WHERE
Ex.tocurrency = 'USD'
ORDER BY 1, 2, 3;
FROMCURRENCY | TOCURRENCY | STARTDATE | ENDDATE | RATE |
---|---|---|---|---|
JPY | USD | 2019-12-28 | 2020-01-31 | 0.0091174 |
JPY | USD | 2020-05-02 | 2020-05-29 | 0.0093431 |
JPY | USD | 2020-05-30 | 2020-06-26 | 0.0093266 |
JPY | USD | 2020-06-27 | 2020-07-31 | 0.0093361 |
JPY | USD | 2020-08-01 | 2020-08-28 | 0.0095812 |
JPY | USD | 2020-09-26 | 2020-10-30 | 0.0094966 |
JPY | USD | 2020-10-31 | 2020-11-26 | 0.0095739 |
JPY | USD | 2020-12-26 | 2021-01-29 | 0.0096525 |
JPY | USD | 2021-01-30 | 2021-02-26 | 0.0095693 |
JPY | USD | 2021-02-27 | 2021-03-26 | 0.0094197 |
Why does the INNER JOIN result differ from tinazmu's query below, which uses LEAD? The query below captures every unique USD/MTHEND row with the correct end date:
SELECT
FromCurrency,
ToCurrency,
DATE(StartDate,'YYYYMMDD') AS StartDate,
LEAD(DateAdd(DAY, -1, Date(StartDate, 'YYYYMMDD')),1,'2025-12-31')
OVER (PARTITION BY FromCurrency, ToCurrency, RateType
ORDER BY StartDate) as EndDate,
Rate
FROM
EXCHANGERATES
WHERE RateType = 'MTHEND' AND ToCurrency = 'USD'
ORDER BY FromCurrency, ToCurrency, StartDate;
FROMCURRENCY | TOCURRENCY | STARTDATE | ENDDATE | RATE |
---|---|---|---|---|
JPY | USD | 2018-12-01 | 2019-03-29 | 113.4700000 |
JPY | USD | 2019-03-30 | 2019-06-27 | 0.0090342 |
JPY | USD | 2019-06-28 | 2019-08-01 | 0.0092721 |
JPY | USD | 2019-08-02 | 2019-08-29 | 0.0093388 |
JPY | USD | 2019-08-30 | 2019-09-26 | 0.0093967 |
JPY | USD | 2019-09-27 | 2019-10-31 | 0.0092729 |
JPY | USD | 2019-11-01 | 2019-11-28 | 0.0092592 |
JPY | USD | 2019-11-29 | 2019-12-27 | 0.0091315 |
JPY | USD | 2019-12-28 | 2020-01-31 | 0.0091174 |
JPY | USD | 2020-02-01 | 2020-02-28 | 0.0091675 |
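You didn't show your EXCHANGERATES table, but it seems to have just one date, StartDate (it should have been called EffectiveDate), and it keeps one row per currency pair and date for which a rate is available. In fact, exchange rates change every day except on public holidays, and there is not much to be saved by not storing rates for holidays (by copying the previous day's rate). With a row for every day, one could run rate conversion queries for day N by simply saying ON ... EXCHANGERATES.StartDate=DayN, and all of the above would be unnecessary.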
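If you have no control over how the underlying EXCHANGERATES table is populated, then you have to find a way to get the rate for DayN and, if that is not available, the rate for DayN-1, and so on. If you know that the only missing rates are for weekends, you can simply join to this table three times, all with LEFT JOIN, the first on StartDate=DayN, the second on StartDate=DayN-1, and so on, and pick the latest one available.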
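A minimal sketch of the three-way LEFT JOIN idea, written in Snowflake syntax but not run against Snowflake. The TRANSACTIONS table, its columns (TxnId, TxnDate, Currency, Amount) and the 'DAILY' rate type are hypothetical names used only for illustration; only EXCHANGERATES and its columns come from the question. It assumes TxnDate is a DATE and that StartDate holds 'YYYYMMDD' strings.

```sql
-- Sketch only: look up the rate for DayN, falling back to DayN-1 and DayN-2.
-- TRANSACTIONS and the 'DAILY' rate type are hypothetical.
SELECT
    T.TxnId,
    T.TxnDate,
    T.Amount,
    -- Take the rate for DayN if present, otherwise DayN-1, otherwise DayN-2.
    COALESCE(R0.Rate, R1.Rate, R2.Rate) AS Rate
FROM TRANSACTIONS T
LEFT JOIN EXCHANGERATES R0
    ON  R0.FromCurrency = T.Currency
    AND R0.ToCurrency   = 'USD'
    AND R0.RateType     = 'DAILY'
    AND TO_DATE(R0.StartDate, 'YYYYMMDD') = T.TxnDate
LEFT JOIN EXCHANGERATES R1
    ON  R1.FromCurrency = T.Currency
    AND R1.ToCurrency   = 'USD'
    AND R1.RateType     = 'DAILY'
    AND TO_DATE(R1.StartDate, 'YYYYMMDD') = DATEADD(day, -1, T.TxnDate)
LEFT JOIN EXCHANGERATES R2
    ON  R2.FromCurrency = T.Currency
    AND R2.ToCurrency   = 'USD'
    AND R2.RateType     = 'DAILY'
    AND TO_DATE(R2.StartDate, 'YYYYMMDD') = DATEADD(day, -2, T.TxnDate);
```

Rate comes back NULL when none of the three days has a row, which is a quick way to spot gaps longer than a weekend.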
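On the other hand, if the table has gaps of unpredictable duration, your problem becomes a gaps-and-islands problem, and the query you posted is one way of solving it. There are other ways, not necessarily better; search for SQL gaps-and-islands problems and consolidating/packing islands.
I don't know the Snowflake platform, but in SQL Server (or Teradata) this could replace your query: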
SELECT
FromCurrency,
ToCurrency,
RateType,
Rate,
StartDate,
LEAD(DateAdd(day, -1, StartDate),1,'2025-12-31')
OVER (partition by FromCurrency, ToCurrency, RateType
ORDER BY StartDate) as EndDate
FROM EXCHANGERATES E
Update, 28 February 2022: based on my understanding of your data, this should be able to replace your query:
SELECT
FromCurrency,
ToCurrency,
DATE(StartDate, 'YYYYMMDD') as StartDate,
LEAD(DateAdd(day, -1, DATE(StartDate, 'YYYYMMDD')),1,'2025-12-31')
OVER (PARTITION by FromCurrency, ToCurrency, RateType
ORDER BY StartDate) as EndDate,
Rate
FROM EXCHANGERATES E
WHERE ToCurrency='USD'
and RateType='MTHEND'
ORDER BY 1, 2, 3;
Can you check it?
Update, 1 March 2022:
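The unioned subquery EX2 simply finds all the date intervals for 'Month End Rates':
Part 1 of the union (SELECT ... Max(StartDate) AS StartDate, 20251231 AS EndDate) finds the latest StartDate for each From/ToCurrency combination that has a month-end rate and calls it valid from StartDate to 2025-12-31 (a date in the future). This way, the latest rate can be used for any date >= max(StartDate).
Part 2 of the UNION then adds the older records as follows: for each month-end rate in the table (E2), it finds the previous rate in the table (E), i.e. the latest E.StartDate earlier than E2.StartDate, and treats that rate as valid until the day before the new rate starts.
The outer query (EX) then fetches the rates themselves and combines them with the intervals derived in EX2.
For this to work correctly, the join condition in the second part of the UNION must specify the same currencies (otherwise we would find the rate of a different currency as the previous record):
E.StartDate < E2.StartDate
AND E.RateType = E2.RateType
AND E.FromCurrency = E2.FromCurrency
AND E.ToCurrency=E2.ToCurrency
Perhaps this explains the difference...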
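For reference, a sketch of the question's query with the two currency conditions added to the self-join. It is written in Snowflake syntax but has not been run on Snowflake. One assumption goes beyond the original: EndDate is kept as a 'YYYYMMDD' string instead of a number, so that TO_DATE with a format string applies to it directly.

```sql
-- Sketch only: the question's query plus the missing currency conditions.
SELECT
    EX.FromCurrency,
    EX.ToCurrency,
    TO_DATE(EX.StartDate, 'YYYYMMDD') AS StartDate,
    TO_DATE(EX2.EndDate, 'YYYYMMDD')  AS EndDate,
    EX.Rate
FROM EXCHANGERATES EX
INNER JOIN (
    -- Part 1: the latest rate per currency pair stays valid until 2025-12-31.
    SELECT
        FromCurrency,
        ToCurrency,
        MAX(StartDate) AS StartDate,
        '20251231'     AS EndDate   -- assumption: string instead of number
    FROM EXCHANGERATES
    WHERE RateType = 'MTHEND'
    GROUP BY FromCurrency, ToCurrency
    UNION
    -- Part 2: each earlier rate ends the day before the next rate starts.
    SELECT
        E2.FromCurrency,
        E2.ToCurrency,
        MAX(E.StartDate) AS StartDate,
        TO_CHAR(DATEADD(day, -1, TO_DATE(E2.StartDate, 'YYYYMMDD')), 'YYYYMMDD') AS EndDate
    FROM EXCHANGERATES E
    INNER JOIN EXCHANGERATES E2
        ON  E.StartDate    < E2.StartDate
        AND E.RateType     = E2.RateType
        AND E.FromCurrency = E2.FromCurrency   -- added
        AND E.ToCurrency   = E2.ToCurrency     -- added
    WHERE E.RateType = 'MTHEND'
    GROUP BY E2.FromCurrency, E2.ToCurrency, E2.StartDate
) AS EX2
    ON  EX.FromCurrency = EX2.FromCurrency
    AND EX.ToCurrency   = EX2.ToCurrency
    AND EX.StartDate    = EX2.StartDate
WHERE EX.RateType = 'MTHEND'
  AND EX.ToCurrency = 'USD'
ORDER BY 1, 2, 3;
```

Without the two added conditions, MAX(E.StartDate) is taken over every currency pair, so it can return a date on which JPY/USD has no row at all. The outer join on EX.StartDate = EX2.StartDate then finds no match and the interval is silently dropped, which would be consistent with the missing rows in the INNER JOIN output.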