Simplifying a SELECT statement
So I have a statement that I believe should work... but it feels pretty suboptimal, and I can't for the life of me figure out how to optimise it.
I have the following tables:
- Transactions
  - [Id] is PRIMARY KEY IDENTITY
  - [Hash] has a UNIQUE constraint
  - [BlockNumber] has an index
- Transfers
  - [Id] is PRIMARY KEY IDENTITY
  - [TransactionId] is a foreign key referencing [Transactions].[Id]
- TokenPrices
  - [Id] is PRIMARY KEY IDENTITY
- TokenPriceAttempts
  - [Id] is PRIMARY KEY IDENTITY
  - [TransferId] is a foreign key referencing [Transfers].[Id]
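What I want to do is select all transfers, plus some data from their related transaction (one transaction to many transfers), for which I don't currently have a token price stored that relates to that transfer.
In the first part of the query, I get a list of all transfers and calculate the DIFF between the transaction timestamp and the closest token price found. If no price is found, the diff is null (which is ultimately what I want to select). I allow 3 hours on either side of the transaction timestamp; if nothing is found within that window, it will be null.
Secondly, I select from this set, first ensuring diff is null, since that means the price is missing, and then that TokenPriceAttempts either has no entries for attempts to get a price, or lists fewer than 5 attempts with the last attempt more than a week ago.
The way I've laid this out results in essentially 3 identical/similar SELECT statements in the WHERE clause, which feels very suboptimal...
How could I improve this approach?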
WITH [transferDateDiff] AS
(
SELECT
[t1].[Id],
[t1].[TransactionId],
[t1].[From],
[t1].[To],
[t1].[Value],
[t1].[Type],
[t1].[ContractAddress],
[t1].[TokenId],
[t2].[Hash],
[t2].[Timestamp],
ABS(DATEDIFF(SECOND, [tp].[Timestamp], [t2].[Timestamp])) AS diff
FROM
[dbo].[Transfers] AS [t1]
LEFT JOIN
[dbo].[Transactions] AS [t2]
ON [t1].[TransactionId] = [t2].[Id]
LEFT JOIN
(
SELECT
*
FROM
[dbo].[TokenPrices]
)
AS [tp]
ON [tp].[ContractAddress] = [t1].[ContractAddress]
AND [tp].[Timestamp] >= DATEADD(HOUR, - 3, [t2].[Timestamp])
AND [tp].[Timestamp] <= DATEADD(HOUR, 3, [t2].[Timestamp])
WHERE
[t1].[Type] < 2
)
SELECT
[tdd].[Id],
[tdd].[TransactionId],
[tdd].[From],
[tdd].[To],
[tdd].[Value],
[tdd].[Type],
[tdd].[ContractAddress],
[tdd].[TokenId],
[tdd].[Hash],
[tdd].[Timestamp]
FROM
[transferDateDiff] AS tdd
WHERE
[tdd].[diff] IS NULL AND
(
(
SELECT
COUNT(*)
FROM
[dbo].[TokenPriceAttempts] tpa
WHERE
[tpa].[TransferId] = [tdd].[Id]
)
= 0 OR
(
(
SELECT
COUNT(*)
FROM
[dbo].[TokenPriceAttempts] tpa
WHERE
[tpa].[TransferId] = [tdd].[Id]
)
< 5 AND
(
DATEDIFF(DAY,
(
SELECT
MAX([tpa].[Created])
FROM
[dbo].[TokenPriceAttempts] tpa
WHERE
[tpa].[TransferId] = [tdd].[Id]
),
CURRENT_TIMESTAMP
) >= 7
)
)
)
I don't understand why you're doing this:
LEFT JOIN
(
SELECT
*
FROM
[dbo].[TokenPrices]
)
AS [tp]
ON [tp].[ContractAddress] = [t1].[ContractAddress]
AND [tp].[Timestamp] >= DATEADD(HOUR, - 3, [t2].[Timestamp])
AND [tp].[Timestamp] <= DATEADD(HOUR, 3, [t2].[Timestamp])
Isn't that just
LEFT JOIN [dbo].[TokenPrices] as TP ...
And this:
SELECT
COUNT(*)
FROM
[dbo].[TokenPriceAttempts] tpa
WHERE
[tpa].[TransferId] = [tdd].[Id]
could be another CTE rather than a sub...
In fact, any of your subqueries could be CTEs; that is part of what CTEs are for: making things easier to read.
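,TPA
AS
(
-- A CTE cannot reference [tdd] from the outer query, so aggregate
-- per transfer here and join on [TransferId] afterwards.
SELECT [tpa].[TransferId],
COUNT(*) AS [Attempts],
MAX([tpa].[Created]) AS [MaxCreated]
FROM [dbo].[TokenPriceAttempts] tpa
GROUP BY [tpa].[TransferId]
)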
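To make the CTE suggestion concrete, here is a minimal sketch of how the whole query might look with the attempts aggregated once in a second CTE. It assumes the table and column names from the question; the CTE name [attempts] and the aliases [AttemptCount] and [LastAttempt] are made up for illustration. Because a CTE cannot see [tdd] from the outer query, it groups by [TransferId] and is joined rather than correlated.

```sql
WITH [transferDateDiff] AS
(
    SELECT
        [t1].[Id], [t1].[TransactionId], [t1].[From], [t1].[To], [t1].[Value],
        [t1].[Type], [t1].[ContractAddress], [t1].[TokenId],
        [t2].[Hash], [t2].[Timestamp],
        ABS(DATEDIFF(SECOND, [tp].[Timestamp], [t2].[Timestamp])) AS [diff]
    FROM [dbo].[Transfers] AS [t1]
    LEFT JOIN [dbo].[Transactions] AS [t2]
        ON [t1].[TransactionId] = [t2].[Id]
    LEFT JOIN [dbo].[TokenPrices] AS [tp]
        ON [tp].[ContractAddress] = [t1].[ContractAddress]
        AND [tp].[Timestamp] >= DATEADD(HOUR, -3, [t2].[Timestamp])
        AND [tp].[Timestamp] <= DATEADD(HOUR, 3, [t2].[Timestamp])
    WHERE [t1].[Type] < 2
),
[attempts] AS  -- hypothetical name: one row per transfer that has attempts
(
    SELECT
        [tpa].[TransferId],
        COUNT(*) AS [AttemptCount],
        MAX([tpa].[Created]) AS [LastAttempt]
    FROM [dbo].[TokenPriceAttempts] AS [tpa]
    GROUP BY [tpa].[TransferId]
)
SELECT
    [tdd].[Id], [tdd].[TransactionId], [tdd].[From], [tdd].[To], [tdd].[Value],
    [tdd].[Type], [tdd].[ContractAddress], [tdd].[TokenId],
    [tdd].[Hash], [tdd].[Timestamp]
FROM [transferDateDiff] AS [tdd]
LEFT JOIN [attempts] AS [a]
    ON [a].[TransferId] = [tdd].[Id]
WHERE [tdd].[diff] IS NULL
  AND (
        [a].[TransferId] IS NULL  -- no attempts recorded for this transfer
        OR ([a].[AttemptCount] < 5
            AND DATEDIFF(DAY, [a].[LastAttempt], CURRENT_TIMESTAMP) >= 7)
      );
```

The three correlated subqueries collapse into one aggregation, and a transfer with no attempts shows up as a NULL row from the LEFT JOIN rather than a count of 0.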
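Here is an attempt to help simplify. I removed all the [brackets], which really aren't needed unless you run into something like a reserved keyword or a column with spaces in its name (a bad idea to begin with).
Anyhow, your main query had 3 SELECT instances per Id. To get rid of that, I LEFT JOIN to a subquery that pulls all transfers of type < 2 and JOINs them to the price attempts ONCE. That way the COUNT(*) and MAX(Created) are pre-aggregated a single time, on the same basis of transfers as your WITH CTE declaration. So you are not running 3 queries for every row, and you are not querying the entire Transfers table, just the rows with the same underlying type < 2 condition. The resulting subquery is aliased "PQ" (preQuery).
This now makes the outer WHERE clause easier to read, without the redundant counts per Id.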
WITH transferDateDiff AS
(
SELECT
t1.Id,
t1.TransactionId,
t1.[From], -- FROM and TO are reserved keywords in T-SQL, so these two stay bracketed
t1.[To],
t1.Value,
t1.Type,
t1.ContractAddress,
t1.TokenId,
t2.Hash,
t2.Timestamp,
ABS( DATEDIFF( SECOND, tp.Timestamp, t2.Timestamp )) AS diff
FROM
dbo.Transfers t1
LEFT JOIN dbo.Transactions t2
ON t1.TransactionId = t2.Id
LEFT JOIN dbo.TokenPrices tp
ON t1.ContractAddress = tp.ContractAddress
AND tp.Timestamp >= DATEADD(HOUR, - 3, t2.Timestamp)
AND tp.Timestamp <= DATEADD(HOUR, 3, t2.Timestamp)
WHERE
t1.Type < 2
)
SELECT
tdd.Id,
tdd.TransactionId,
tdd.[From], -- reserved keywords must stay bracketed
tdd.[To],
tdd.Value,
tdd.Type,
tdd.ContractAddress,
tdd.TokenId,
tdd.Hash,
tdd.Timestamp
FROM
transferDateDiff tdd
LEFT JOIN
( SELECT
t1.Id,
COUNT(*) Attempts,
MAX(tpa.Created) MaxCreated
FROM
dbo.Transfers t1
JOIN dbo.TokenPriceAttempts tpa
on t1.Id = tpa.TransferId
WHERE
t1.Type < 2
GROUP BY
t1.Id ) PQ
on tdd.Id = PQ.Id
WHERE
tdd.diff IS NULL
AND ( PQ.Attempts IS NULL
OR PQ.Attempts = 0
OR ( PQ.Attempts < 5
AND DATEDIFF(DAY, PQ.MaxCreated, CURRENT_TIMESTAMP ) >= 7
)
)
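The PQ.Attempts IS NULL test is the one doing the work for transfers without attempts, and it is worth seeing why. Below is a small self-contained demo using table variables with made-up sample data (only the column names mirror the question). It shows what an aggregate returns for a transfer with no attempts under an inner JOIN versus a LEFT JOIN.

```sql
-- Hypothetical sample data; only the shape mirrors the question's tables.
DECLARE @Transfers TABLE (Id INT PRIMARY KEY);
DECLARE @TokenPriceAttempts TABLE
(
    Id INT IDENTITY PRIMARY KEY,
    TransferId INT NOT NULL,
    Created DATETIME2 NOT NULL
);

INSERT INTO @Transfers (Id) VALUES (1), (2);
INSERT INTO @TokenPriceAttempts (TransferId, Created)
VALUES (1, '2022-01-01'), (1, '2022-01-05');  -- transfer 2 has no attempts

-- Inner JOIN: transfer 2 produces no row at all, so an outer
-- LEFT JOIN to this result sees NULL for Attempts, never 0.
SELECT t.Id, COUNT(*) AS Attempts, MAX(a.Created) AS MaxCreated
FROM @Transfers t
JOIN @TokenPriceAttempts a ON a.TransferId = t.Id
GROUP BY t.Id;

-- LEFT JOIN: transfer 2 gets one NULL-extended row, so COUNT(*) is 1
-- while COUNT(a.Id) is 0, because COUNT(column) skips NULLs.
SELECT t.Id,
       COUNT(*)       AS CountStar,
       COUNT(a.Id)    AS CountAttempts,
       MAX(a.Created) AS MaxCreated
FROM @Transfers t
LEFT JOIN @TokenPriceAttempts a ON a.TransferId = t.Id
GROUP BY t.Id;
```

The first query returns a single row for transfer 1. The second returns both transfers, with CountStar = 1 and CountAttempts = 0 for transfer 2, so any pre-query built on a LEFT JOIN should count a column from the attempts table rather than use COUNT(*).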
Revised to remove the WITH CTE and fold everything into a single query:
SELECT
t1.Id,
t1.TransactionId,
t1.[From], -- FROM and TO are reserved keywords in T-SQL, so these two stay bracketed
t1.[To],
t1.Value,
t1.Type,
t1.ContractAddress,
t1.TokenId,
t2.Hash,
t2.Timestamp
FROM
-- Now, this pre-query is left-joined to token price attempts
-- so ALL Transfers of type < 2 are considered
( SELECT
t1.Id,
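-- COUNT(tpa.Id) counts only matched attempts; with this LEFT JOIN, COUNT(*)
-- would report 1 for a transfer that has no attempts at all
COUNT(tpa.Id) Attempts,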
MAX(tpa.Created) MaxCreated
FROM
dbo.Transfers t1
LEFT JOIN dbo.TokenPriceAttempts tpa
on t1.Id = tpa.TransferId
WHERE
t1.Type < 2
GROUP BY
t1.Id ) PQ
-- Now, we can just directly join to transfers for the rest
JOIN dbo.Transfers t1
on PQ.Id = t1.Id
-- and the rest from the WITH CTE construct
LEFT JOIN dbo.Transactions t2
ON t1.TransactionId = t2.Id
LEFT JOIN dbo.TokenPrices tp
ON t1.ContractAddress = tp.ContractAddress
AND tp.Timestamp >= DATEADD(HOUR, - 3, t2.Timestamp)
AND tp.Timestamp <= DATEADD(HOUR, 3, t2.Timestamp)
WHERE
ABS( DATEDIFF( SECOND, tp.Timestamp, t2.Timestamp )) IS NULL
AND ( PQ.Attempts = 0
OR ( PQ.Attempts < 5
AND DATEDIFF(DAY, PQ.MaxCreated, CURRENT_TIMESTAMP ) >= 7 )
)
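Since the diff value itself is never read, only tested for NULL, the missing-price check could also be written as NOT EXISTS. This is an alternative technique, not something taken from the answers above, and is sketched on the assumption that the tables and columns are exactly as in the question. It should return the same rows: a transfer qualifies when no TokenPrices row falls within 3 hours either side of its transaction timestamp.

```sql
SELECT
    t1.Id, t1.TransactionId, t1.[From], t1.[To], t1.Value, t1.Type,
    t1.ContractAddress, t1.TokenId, t2.Hash, t2.Timestamp
FROM dbo.Transfers t1
LEFT JOIN dbo.Transactions t2
    ON t1.TransactionId = t2.Id
LEFT JOIN
    ( -- attempts aggregated once per transfer
      SELECT tpa.TransferId,
             COUNT(*) AS Attempts,
             MAX(tpa.Created) AS MaxCreated
      FROM dbo.TokenPriceAttempts tpa
      GROUP BY tpa.TransferId ) PQ
    ON PQ.TransferId = t1.Id
WHERE t1.Type < 2
  -- no price within 3 hours either side of the transaction timestamp
  AND NOT EXISTS
      ( SELECT 1
        FROM dbo.TokenPrices tp
        WHERE tp.ContractAddress = t1.ContractAddress
          AND tp.Timestamp >= DATEADD(HOUR, -3, t2.Timestamp)
          AND tp.Timestamp <= DATEADD(HOUR, 3, t2.Timestamp) )
  -- never attempted, or fewer than 5 attempts with the last one 7+ days ago
  AND ( PQ.TransferId IS NULL
        OR ( PQ.Attempts < 5
             AND DATEDIFF(DAY, PQ.MaxCreated, CURRENT_TIMESTAMP) >= 7 ) );
```

Whether this performs better than the LEFT JOIN form depends on the data and indexes; an index on TokenPrices (ContractAddress, Timestamp) would likely help either version, but that is a guess without seeing the execution plans.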