SQL: The second oldest date

Suppose you have a table similar to this:

 |email          |   purchase_date      |
 |:--------------|:---------------------|
 |stan@gmail.com |  Jun 30 2020 12:00AM |  
 |stan@gmail.com |  Aug 05 2020 5:00PM  |  
 |stan@gmail.com |  Mar 22 2018 3:00AM  |  
 |eric@yahoo.com |  Aug 05 2020 5:00PM  |  
 |eric@yahoo.com |  Mar 22 2018 3:00PM  |  
 |kyle@gmail.com |  Mar 22 2018 3:00PM  |  
 |kyle@gmail.com |  Jun 30 2020 12:00AM |  
 |kyle@gmail.com |  Aug 05 2020 5:00PM  |  
 |kenny@gmail.com|  Aug 05 2020 5:00PM  |

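The data is completely random. The actual database I'm working with is more complex and has many more columns.

Both columns are of type STRING, which is inconvenient: purchase_date should really be a DATE type. Kenny made only one purchase, so the result table should not contain any row for him. Also note that many of the dates are identical.

For each email address, I want to select the email and the 2nd earliest purchase date (named 'second_purchase'), so that the result looks like this: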

|email          | second_purchase      |
|:--------------|:-------------------- |
|stan@gmail.com | Jun 30 2020 12:00AM  | 
|eric@yahoo.com | Aug 05 2020 5:00PM   | 
|kyle@gmail.com | Jun 30 2020 12:00AM  | 

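I can't seem to get the logic or the syntax right. I don't want to post all of my code here, because I've tried many variations of my idea and none of them worked. I would love to see example code from someone experienced with SQL. My idea may not be that good anyway. :-)

The dialect is actually SOQL (Salesforce Object Query Language). That might be important.

Sorry for not styling the tables properly. I couldn't get it to work even with the recommended styling, and I was unable to post. That was really frustrating.

Thanks for your help anyway!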

You can try the following SQL, which applies DENSE_RANK partitioned by each user's email and ordered by purchase_date cast to a timestamp.

Query #1

WITH date_converted_table AS (
    SELECT
        email,
        purchase_date,
        DENSE_RANK() OVER (
          PARTITION BY email
          ORDER BY CAST(purchase_date as timestamp) ASC
        ) dr
    FROM
        mytable
)
SELECT
    email,
    purchase_date as second_purchase
FROM 
    date_converted_table
WHERE dr=2;

|email          | second_purchase      |
|:--------------|:---------------------|
|eric@yahoo.com | Aug 05 2020 5:00PM   |
|kyle@gmail.com | Jun 30 2020 12:00AM  |
|stan@gmail.com | Jun 30 2020 12:00AM  |
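
The cast in the ORDER BY matters because purchase_date is stored as text. As a rough illustration, the Python sketch below takes the two rows for eric@yahoo.com from the question and compares plain text ordering with chronological ordering. The format string is an assumption inferred from the sample rows, and `parse` is a hypothetical helper name.

```python
from datetime import datetime

# The two purchase_date strings for eric@yahoo.com, copied from the question.
raw_dates = ["Aug 05 2020 5:00PM", "Mar 22 2018 3:00PM"]


def parse(value: str) -> datetime:
    # Assumed format, inferred from the sample rows: "Mon DD YYYY H:MMAM/PM".
    return datetime.strptime(value, "%b %d %Y %I:%M%p")


by_text = sorted(raw_dates)             # lexicographic order of the raw strings
by_time = sorted(raw_dates, key=parse)  # chronological order after parsing

print("second by text:", by_text[1])
print("second by time:", by_time[1])
```

Ordered as text, "Aug" sorts before "Mar", so the second entry is actually the oldest purchase. Ordered as timestamps, the second entry is the one the question asks for.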

Query #2

SELECT
    email,
    purchase_date as second_purchase
FROM (
    SELECT
        email,
        purchase_date,
        DENSE_RANK() OVER (
          PARTITION BY email
          ORDER BY CAST(purchase_date as timestamp) ASC
        ) dr
    FROM
        mytable
) tb
WHERE dr=2;

|email          | second_purchase      |
|:--------------|:---------------------|
|eric@yahoo.com | Aug 05 2020 5:00PM   |
|kyle@gmail.com | Jun 30 2020 12:00AM  |
|stan@gmail.com | Jun 30 2020 12:00AM  |
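
Both queries use DENSE_RANK rather than ROW_NUMBER, which makes a difference if the same email can appear twice with the same timestamp. The sketch below shows that difference, assuming SQLite (3.25 or later, for window functions) driven through Python's sqlite3 module. The duplicated row is invented for illustration, and `to_iso` is a hypothetical helper standing in for `CAST(purchase_date AS timestamp)`, since SQLite cannot parse this date format on its own.

```python
import sqlite3
from datetime import datetime


def to_iso(value):
    # Hypothetical helper: converts "Mar 22 2018 3:00PM" into a sortable ISO string.
    return datetime.strptime(value, "%b %d %Y %I:%M%p").isoformat()


conn = sqlite3.connect(":memory:")
conn.create_function("to_iso", 1, to_iso)
conn.execute("CREATE TABLE mytable (email TEXT, purchase_date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("eric@yahoo.com", "Mar 22 2018 3:00PM"),
        ("eric@yahoo.com", "Mar 22 2018 3:00PM"),  # duplicate row, added for illustration
        ("eric@yahoo.com", "Aug 05 2020 5:00PM"),
    ],
)

rows = conn.execute(
    """
    SELECT purchase_date,
           ROW_NUMBER() OVER (PARTITION BY email ORDER BY to_iso(purchase_date)) AS rn,
           DENSE_RANK() OVER (PARTITION BY email ORDER BY to_iso(purchase_date)) AS dr
    FROM mytable
    ORDER BY rn
    """
).fetchall()

for purchase_date, rn, dr in rows:
    print(purchase_date, "rn =", rn, "dr =", dr)
```

With ROW_NUMBER, the row numbered 2 is the duplicate of the earliest purchase. With DENSE_RANK, both duplicates share rank 1 and rank 2 is the next distinct date. If the second date itself is duplicated, filtering on dr=2 returns one row per duplicate, so a SELECT DISTINCT may be needed.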


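Update 1

In response to the follow-up question in the comments:

Is it possible to upgrade the result so that there are first_purchase dates (where dr=1) and second_purchase dates (where dr=2) in separate columns?

A CASE expression combined with aggregation may help here, as shown below. The HAVING clause ensures that both a first and a second purchase date exist.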

SELECT
    email,
    MAX(CASE
        WHEN dr=1 THEN purchase_date
    END) as first_purchase,
    MAX(CASE
        WHEN dr=2 THEN purchase_date
    END) as second_purchase
FROM (
    SELECT
        email,
        purchase_date,
        DENSE_RANK() OVER (
          PARTITION BY email
          ORDER BY CAST(purchase_date as timestamp) ASC
        ) dr
    FROM
        mytable
) tb
GROUP BY email
HAVING
    SUM(
        CASE WHEN dr=1 THEN 1 ELSE 0 END  
    ) > 0 AND
     SUM(
        CASE WHEN dr=2 THEN 1 ELSE 0 END  
    ) > 0;

|email          | first_purchase       | second_purchase      |
|:--------------|:---------------------|:---------------------|
|eric@yahoo.com | Mar 22 2018 3:00PM   | Aug 05 2020 5:00PM   |
|kyle@gmail.com | Mar 22 2018 3:00PM   | Jun 30 2020 12:00AM  |
|stan@gmail.com | Mar 22 2018 3:00AM   | Jun 30 2020 12:00AM  |
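
Because DENSE_RANK numbers the distinct dates consecutively from 1, any email that has a row with dr=2 also has one with dr=1, so the two SUM conditions could probably be collapsed into a single `HAVING MAX(dr) >= 2`. The sketch below checks that variant against the sample data from the question. It assumes SQLite (3.25 or later) through Python's sqlite3 module, and `to_iso` is a hypothetical helper standing in for `CAST(purchase_date AS timestamp)`.

```python
import sqlite3
from datetime import datetime


def to_iso(value):
    # Hypothetical helper: converts "Mar 22 2018 3:00PM" into a sortable ISO string.
    return datetime.strptime(value, "%b %d %Y %I:%M%p").isoformat()


conn = sqlite3.connect(":memory:")
conn.create_function("to_iso", 1, to_iso)
conn.execute("CREATE TABLE mytable (email TEXT, purchase_date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("stan@gmail.com", "Jun 30 2020 12:00AM"),
        ("stan@gmail.com", "Aug 05 2020 5:00PM"),
        ("stan@gmail.com", "Mar 22 2018 3:00AM"),
        ("eric@yahoo.com", "Aug 05 2020 5:00PM"),
        ("eric@yahoo.com", "Mar 22 2018 3:00PM"),
        ("kyle@gmail.com", "Mar 22 2018 3:00PM"),
        ("kyle@gmail.com", "Jun 30 2020 12:00AM"),
        ("kyle@gmail.com", "Aug 05 2020 5:00PM"),
        ("kenny@gmail.com", "Aug 05 2020 5:00PM"),
    ],
)

result = conn.execute(
    """
    SELECT email,
           MAX(CASE WHEN dr = 1 THEN purchase_date END) AS first_purchase,
           MAX(CASE WHEN dr = 2 THEN purchase_date END) AS second_purchase
    FROM (
        SELECT email,
               purchase_date,
               DENSE_RANK() OVER (
                 PARTITION BY email
                 ORDER BY to_iso(purchase_date)
               ) AS dr
        FROM mytable
    ) tb
    GROUP BY email
    HAVING MAX(dr) >= 2   -- at least two distinct purchase dates
    ORDER BY email
    """
).fetchall()

for row in result:
    print(row)
```

Kenny has only one purchase, so his highest rank is 1 and the HAVING clause drops him, the same as in the original query.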


Let me know if this works for you.