Shift operation grouped by a column in SQL
I'm working with GreenPlum on a large table named purchases that contains more than 4 million rows. Here is a sample of the table:
userId | purchaseTime | timeDiff
------------------------------------------
17 | 2016-02-01 11:01:02 |
17 | 2016-02-01 13:24:58 |
17 | 2016-02-01 21:12:36 |
67 | 2016-02-01 17:04:49 |
84 | 2016-02-01 16:13:20 |
94 | 2016-02-01 05:46:13 |
94 | 2016-02-01 21:33:19 |
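The table is shown ordered by userId and purchaseTime to make my goal easier to follow.
My objective is to update this table so that each row holds the time difference between the current purchase and the same user's previous purchase.
It should look like this: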
userId | purchaseTime | timeDiff
------------------------------------------
17 | 2016-02-01 11:01:02 | NULL
17 | 2016-02-01 13:24:58 | 2:23:56
17 | 2016-02-01 21:12:36 | 7:47:38
67 | 2016-02-01 17:04:49 | NULL
84 | 2016-02-01 16:13:20 | NULL
94 | 2016-02-01 05:46:13 | NULL
94 | 2016-02-01 21:33:19 | 15:47:06
The SELECT in your answer helped me. Now I need to run the UPDATE, but I get a syntax error when I execute it:
WITH tmp_table AS
(
SELECT userId ,
purchaseTime ,
purchaseTime - LAG(purchaseTime )
OVER (PARTITION BY userId ORDER BY purchaseTime) AS timeDiff
FROM purchases
)
UPDATE purchases SET timeDiff = tmp_table.timeDiff
FROM tmp_table
WHERE userId = tmp_table.userId
AND purchaseTime = tmp_table.purchaseTime;
Can anyone help me update the table?
You can use the lag window function to find the previous purchase time and subtract it from the current one:
SELECT userId,
purchaseTime,
purchaseTime -
LAG(purchaseTime) OVER
(PARTITION BY userId ORDER BY purchaseTime) AS timeDiff
FROM purchases
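To try the query without touching the real table, a small self-contained sketch may help. The inline VALUES list below is only a stand-in for purchases (three rows copied from the question's sample), and it assumes a PostgreSQL-compatible engine that accepts VALUES in the FROM clause:

```sql
-- Illustrative only: the VALUES list stands in for the real purchases table.
SELECT userId,
       purchaseTime,
       purchaseTime - LAG(purchaseTime)
           OVER (PARTITION BY userId ORDER BY purchaseTime) AS timeDiff
FROM  (VALUES (17, TIMESTAMP '2016-02-01 11:01:02'),
              (17, TIMESTAMP '2016-02-01 13:24:58'),
              (67, TIMESTAMP '2016-02-01 17:04:49')) AS purchases (userId, purchaseTime)
ORDER BY userId, purchaseTime;
```

The first row of each user comes back with a NULL timeDiff, because LAG has no earlier row inside that user's partition. The second row for user 17 yields an interval of 2:23:56.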
Building on @mureinik's query, you can perform the update like this:
UPDATE purchases
SET timeDiff = tmp_table.timeDiff
FROM (SELECT userId, purchaseTime ,
(EXTRACT(epoch FROM purchaseTime - LAG(purchaseTime) OVER
(PARTITION BY userId ORDER BY purchaseTime))/60)::integer AS timeDiff
FROM purchases) AS tmp_table
WHERE purchases.userId = tmp_table.userId
AND purchases.purchaseTime = tmp_table.purchaseTime;
The update uses EXTRACT(epoch FROM ...), which returns the number of seconds in the interval. Divide by 60 if you want minutes, and cast the result to integer if you want it rounded to a whole number.
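If you would rather store the full interval, as in the desired output in the question, a variant along these lines might work. It is a sketch under two assumptions that the question does not confirm: that timeDiff is an INTERVAL column, and that (userId, purchaseTime) identifies a row uniquely. It keeps the subquery in FROM instead of a WITH clause in front of UPDATE, which PostgreSQL releases before 9.1 (and possibly the engines derived from them) do not accept, and it qualifies every column in the WHERE clause so that userId and purchaseTime are not ambiguous between the two relations:

```sql
-- Assumption: purchases.timeDiff is of type INTERVAL.
-- Assumption: (userId, purchaseTime) is unique, so each row matches exactly once.
UPDATE purchases
SET    timeDiff = tmp_table.timeDiff
FROM  (SELECT userId,
              purchaseTime,
              purchaseTime - LAG(purchaseTime)
                  OVER (PARTITION BY userId ORDER BY purchaseTime) AS timeDiff
       FROM   purchases) AS tmp_table
WHERE  purchases.userId = tmp_table.userId
AND    purchases.purchaseTime = tmp_table.purchaseTime;
```

Rows that are a user's first purchase keep a NULL timeDiff, since the subquery produces NULL for them.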