SQL: The second oldest date
Suppose you have a table like this:
|email | purchase_date |
|:--------------|:---------------------|
|stan@gmail.com | Jun 30 2020 12:00AM |
|stan@gmail.com | Aug 05 2020 5:00PM |
|stan@gmail.com | Mar 22 2018 3:00AM |
|eric@yahoo.com | Aug 05 2020 5:00PM |
|eric@yahoo.com | Mar 22 2018 3:00PM |
|kyle@gmail.com | Mar 22 2018 3:00PM |
|kyle@gmail.com | Jun 30 2020 12:00AM |
|kyle@gmail.com | Aug 05 2020 5:00PM |
|kenny@gmail.com| Aug 05 2020 5:00PM |
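The data is completely random. The actual database I'm working with is more complex and has many more columns.
Both columns are of type STRING, which is inconvenient; the purchase date should really be a DATE. Kenny has made only one purchase, so the result table should not contain any row for him.
Also note that many of the dates are identical.
I want to select, for each email address, the email and the second-earliest purchase date (named 'second_purchase'), so that the result looks like this: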
|email | second_purchase |
|:--------------|:-------------------- |
|stan@gmail.com | Jun 30 2020 12:00AM |
|eric@yahoo.com | Aug 05 2020 5:00PM |
|kyle@gmail.com | Jun 30 2020 12:00AM |
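I can't seem to get the logic or the syntax right. I don't want to paste all of my code here, because I've tried many variations of my idea...
Somehow none of them worked. I'd be happy to see example code from someone experienced in SQL. My approach may not have been a good one anyway. :-)
The dialect is actually SOQL (Salesforce Object Query Language). That may be important.
Sorry for the poorly styled tables. I couldn't get them to work even with the recommended styling, and I wasn't able to post. That was really frustrating.
Anyway, thanks for your help!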
You can try the following SQL, which applies DENSE_RANK partitioned by each user's email and ordered by purchase_date cast to a timestamp.
Query #1

    WITH date_converted_table AS (
        SELECT
            email,
            purchase_date,
            DENSE_RANK() OVER (
                PARTITION BY email
                ORDER BY CAST(purchase_date AS timestamp) ASC
            ) dr
        FROM
            mytable
    )
    SELECT
        email,
        purchase_date AS second_purchase
    FROM
        date_converted_table
    WHERE dr = 2;

|email | second_purchase |
|:--------------|:--------------------|
|eric@yahoo.com | Aug 05 2020 5:00PM |
|kyle@gmail.com | Jun 30 2020 12:00AM |
|stan@gmail.com | Jun 30 2020 12:00AM |
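
To check Query #1 against the sample data without a database server, here is a minimal sketch that runs it in SQLite through Python's `sqlite3` module. Several things here are assumptions and not part of the answer above: the choice of SQLite (version 3.25 or later, for window functions), the extra `purchase_ts` column, and the helper names `to_iso` and `second_purchases`. SQLite cannot cast strings such as `Jun 30 2020 12:00AM` to a timestamp, so the sketch parses them in Python and orders by the resulting ISO text, which stands in for `CAST(purchase_date AS timestamp)`.

```python
import sqlite3
from datetime import datetime

# Sample rows from the question: (email, purchase_date as a string).
ROWS = [
    ("stan@gmail.com", "Jun 30 2020 12:00AM"),
    ("stan@gmail.com", "Aug 05 2020 5:00PM"),
    ("stan@gmail.com", "Mar 22 2018 3:00AM"),
    ("eric@yahoo.com", "Aug 05 2020 5:00PM"),
    ("eric@yahoo.com", "Mar 22 2018 3:00PM"),
    ("kyle@gmail.com", "Mar 22 2018 3:00PM"),
    ("kyle@gmail.com", "Jun 30 2020 12:00AM"),
    ("kyle@gmail.com", "Aug 05 2020 5:00PM"),
    ("kenny@gmail.com", "Aug 05 2020 5:00PM"),
]


def to_iso(text):
    """Parse 'Jun 30 2020 12:00AM' into sortable ISO text (hypothetical helper)."""
    return datetime.strptime(text, "%b %d %Y %I:%M%p").isoformat(sep=" ")


def second_purchases(rows):
    """Run Query #1 on an in-memory SQLite table and return its rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (email TEXT, purchase_date TEXT, purchase_ts TEXT)"
    )
    conn.executemany(
        "INSERT INTO mytable VALUES (?, ?, ?)",
        [(email, raw, to_iso(raw)) for email, raw in rows],
    )
    result = conn.execute(
        """
        WITH date_converted_table AS (
            SELECT
                email,
                purchase_date,
                DENSE_RANK() OVER (
                    PARTITION BY email
                    ORDER BY purchase_ts ASC  -- stands in for the CAST
                ) AS dr
            FROM mytable
        )
        SELECT email, purchase_date AS second_purchase
        FROM date_converted_table
        WHERE dr = 2
        ORDER BY email
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in second_purchases(ROWS):
        print(row)
```

The `ORDER BY email` at the end is only there to make the output order predictable; kenny@gmail.com drops out because his single purchase never reaches rank 2.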
Query #2

    SELECT
        email,
        purchase_date AS second_purchase
    FROM (
        SELECT
            email,
            purchase_date,
            DENSE_RANK() OVER (
                PARTITION BY email
                ORDER BY CAST(purchase_date AS timestamp) ASC
            ) dr
        FROM
            mytable
    ) tb
    WHERE dr = 2;

|email | second_purchase |
|:--------------|:--------------------|
|eric@yahoo.com | Aug 05 2020 5:00PM |
|kyle@gmail.com | Jun 30 2020 12:00AM |
|stan@gmail.com | Jun 30 2020 12:00AM |
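
Both queries rely on DENSE_RANK, which gives rows with the same timestamp the same rank. The sketch below compares it with ROW_NUMBER on made-up rows in which one email has the same first purchase recorded twice. The data, the SQLite setup, the `purchase_ts` column, and the helper name `second_by` are all assumptions for illustration; the timestamps are stored as ISO text so that they sort chronologically.

```python
import sqlite3

# Hypothetical rows: the earliest purchase appears twice for the same email.
TIED_ROWS = [
    ("a@example.com", "2018-03-22 15:00:00"),
    ("a@example.com", "2018-03-22 15:00:00"),
    ("a@example.com", "2020-08-05 17:00:00"),
]


def second_by(rank_function, rows):
    """Return the rows that the given window function ranks second per email."""
    if rank_function not in ("DENSE_RANK", "ROW_NUMBER"):
        raise ValueError("unsupported window function: " + rank_function)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (email TEXT, purchase_ts TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    # The function name is checked against a fixed whitelist above,
    # so interpolating it into the SQL text is safe here.
    query = f"""
        SELECT email, purchase_ts
        FROM (
            SELECT
                email,
                purchase_ts,
                {rank_function}() OVER (
                    PARTITION BY email
                    ORDER BY purchase_ts ASC
                ) AS rnk
            FROM mytable
        ) tb
        WHERE rnk = 2
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print("DENSE_RANK:", second_by("DENSE_RANK", TIED_ROWS))
    print("ROW_NUMBER:", second_by("ROW_NUMBER", TIED_ROWS))
```

With DENSE_RANK the duplicated first purchase shares rank 1, so rank 2 is the next distinct timestamp. With ROW_NUMBER the duplicate itself becomes row 2. Note also that if the second-oldest timestamp is duplicated, the DENSE_RANK queries return one row per duplicate for that email.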
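Update 1
In response to the follow-up question in the comments:
> Is it possible to upgrade the result so that there are first_purchase dates (where dr=1) and second_purchase dates (where dr=2) in separate columns?

A CASE expression combined with aggregation may help, as shown below. The HAVING clause ensures that both a first and a second purchase date exist.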

    SELECT
        email,
        MAX(CASE
            WHEN dr = 1 THEN purchase_date
        END) AS first_purchase,
        MAX(CASE
            WHEN dr = 2 THEN purchase_date
        END) AS second_purchase
    FROM (
        SELECT
            email,
            purchase_date,
            DENSE_RANK() OVER (
                PARTITION BY email
                ORDER BY CAST(purchase_date AS timestamp) ASC
            ) dr
        FROM
            mytable
    ) tb
    GROUP BY email
    HAVING
        SUM(
            CASE WHEN dr = 1 THEN 1 ELSE 0 END
        ) > 0 AND
        SUM(
            CASE WHEN dr = 2 THEN 1 ELSE 0 END
        ) > 0;

|email | first_purchase | second_purchase |
|:--------------|:--------------------|:--------------------|
|eric@yahoo.com | Mar 22 2018 3:00PM | Aug 05 2020 5:00PM |
|kyle@gmail.com | Mar 22 2018 3:00PM | Jun 30 2020 12:00AM |
|stan@gmail.com | Mar 22 2018 3:00AM | Jun 30 2020 12:00AM |
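
The pivot query can be tried on the sample data in the same way. The sketch below runs it in SQLite through Python and also tries a shorter HAVING condition, `MAX(dr) >= 2`. That shorter form is a suggestion, not part of the answer above; it relies on every email group containing rank 1, so only the presence of rank 2 needs checking. The SQLite setup, the `purchase_ts` column, and the names `PIVOT_SQL`, `ORIGINAL_HAVING`, `SHORTER_HAVING`, and `first_and_second` are likewise assumptions.

```python
import sqlite3
from datetime import datetime

# Sample rows from the question: (email, purchase_date as a string).
ROWS = [
    ("stan@gmail.com", "Jun 30 2020 12:00AM"),
    ("stan@gmail.com", "Aug 05 2020 5:00PM"),
    ("stan@gmail.com", "Mar 22 2018 3:00AM"),
    ("eric@yahoo.com", "Aug 05 2020 5:00PM"),
    ("eric@yahoo.com", "Mar 22 2018 3:00PM"),
    ("kyle@gmail.com", "Mar 22 2018 3:00PM"),
    ("kyle@gmail.com", "Jun 30 2020 12:00AM"),
    ("kyle@gmail.com", "Aug 05 2020 5:00PM"),
    ("kenny@gmail.com", "Aug 05 2020 5:00PM"),
]

# The pivot query from the update, with the HAVING condition left as a slot.
PIVOT_SQL = """
    SELECT
        email,
        MAX(CASE WHEN dr = 1 THEN purchase_date END) AS first_purchase,
        MAX(CASE WHEN dr = 2 THEN purchase_date END) AS second_purchase
    FROM (
        SELECT
            email,
            purchase_date,
            DENSE_RANK() OVER (
                PARTITION BY email
                ORDER BY purchase_ts ASC  -- stands in for the CAST
            ) AS dr
        FROM mytable
    ) tb
    GROUP BY email
    HAVING {having}
    ORDER BY email
"""

ORIGINAL_HAVING = (
    "SUM(CASE WHEN dr = 1 THEN 1 ELSE 0 END) > 0 "
    "AND SUM(CASE WHEN dr = 2 THEN 1 ELSE 0 END) > 0"
)
SHORTER_HAVING = "MAX(dr) >= 2"  # assumed alternative, see the note above


def first_and_second(rows, having):
    """Run the pivot query with the given HAVING condition."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (email TEXT, purchase_date TEXT, purchase_ts TEXT)"
    )
    conn.executemany(
        "INSERT INTO mytable VALUES (?, ?, ?)",
        [
            (email, raw,
             datetime.strptime(raw, "%b %d %Y %I:%M%p").isoformat(sep=" "))
            for email, raw in rows
        ],
    )
    result = conn.execute(PIVOT_SQL.format(having=having)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_and_second(ROWS, ORIGINAL_HAVING):
        print(row)
```

On the sample data both HAVING conditions select the same three emails, and kenny@gmail.com is excluded either way.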
Let me know if this works for you.