BigQuery Nested Challenge Involving Joins and Having (or Where) Clauses

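I've run into a challenge that is a bit beyond my ability, so I'll dive right in.

I have a sample dataset in BigQuery, which you can use for testing: https://bigquery.cloud.google.com/table/robotic-charmer-726:bl_test_data.complex_problem

I need to figure out the SQL to query my table and do the following: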

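Aggregate using the following rules (I'll start with one email address and add another at the end):

As a general note up front, everything should be lowercased so that Ben = ben when aggregating.

Email is the broadest aggregation; aggregate by its lowercased version.

Sum the amounts across all rows for each lowercased email, as shown in the blue portion of the picture below.

Next come first name and last name, selected by the summed amount for each lowercased first name and last name pair.

Note that first name and last name are never considered on their own. See below: Ben has a total amount of 160 while Kathleen has a total of only 150, but Kathleen is still selected because her full name has a higher total than any other full name.

Next, select the lowercased full address of the SELECTED NAME, based on the highest amount.

As with the name, the full address is considered with all of its columns together.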

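Now I'll add another email address, and we'll do the same thing.

Each lowercased email address is considered separately. I realize now that I should have made this clearer in my pictures, but I don't want to redo them... too much work. So I hope I've been clear enough.

I hope you find this a very interesting challenge!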

There may be a more concise way to do this, but this will give you the answer you need:

select email, first_name, last_name, address, city, state, zip, total_amount amount
from (
    select d.email email, d.first_name first_name, d.last_name last_name, d.amount amount, d.total_amount total_amount, e.address address, e.city city, e.state state, e.zip zip, row_number() over (partition by e.email order by e.amount desc) ord
    from (
        select a.email email, a.first_name first_name, a.last_name last_name, b.amount amount, c.amount total_amount
        from (
          SELECT  
            lower(email) email, lower(first_name) first_name, lower(last_name) last_name, lower(concat(first_name, last_name)) as name_group, lower(address) address, lower(city) city, lower(state) state, lower(concat(address,city,state)) as location_group, zip, sum(amount) amount 
          FROM [robotic-charmer-726:bl_test_data.complex_problem]
          group by 1,2,3,4,5,6,7,8,9
        ) a
        inner join (
          select email, first_name, last_name, name_group, amount
          from (
            select email, first_name, last_name, name_group, amount, row_number() over (partition by email order by amount desc) as ord
            from (
              select lower(email) email, lower(first_name) first_name, lower(last_name) last_name, lower(concat(first_name,last_name)) as name_group, sum(amount) amount
              from [robotic-charmer-726:bl_test_data.complex_problem]
              group by 1, 2, 3, 4
            )
          )
          where ord = 1
        ) b
        on a.name_group = b.name_group and a.email = b.email
        inner join (
          select lower(email) email, sum(amount) amount
          from [robotic-charmer-726:bl_test_data.complex_problem]
          group by 1
        ) c
        on a.email = c.email
        group by 1,2,3,4,5
    ) d
    inner join (
        select lower(email) email, lower(first_name) first_name, lower(last_name) last_name, lower(address) address, lower(city) city, lower(state) state, zip,lower(concat(lower(address),lower(city), lower(state), zip)) as location_group, sum(amount) amount
        from [robotic-charmer-726:bl_test_data.complex_problem]
        group by 1,2,3,4,5,6,7,8
    ) e
    on d.email = e.email and d.first_name = e.first_name and d.last_name = e.last_name
)
where ord = 1
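
For comparison, here is a rough sketch of the same three steps (total per email, top full name per email, top full address for that name) written in BigQuery standard SQL with CTEs instead of nested legacy SQL subqueries. It is not tested against the sample table. It assumes the columns are named email, first_name, last_name, address, city, state, zip and amount as in the query above, that none of the name or address fields are NULL (NULLs would drop out of the join), and that ties can be broken arbitrarily by row_number(). The CTE names (normalized, email_totals, name_totals, top_name, address_totals, top_address) are made up for this example.

```sql
-- Sketch only: BigQuery standard SQL, column names assumed from the query above.
with normalized as (
  -- Lowercase everything once so that Ben = ben in every later step.
  select lower(email) as email, lower(first_name) as first_name, lower(last_name) as last_name,
         lower(address) as address, lower(city) as city, lower(state) as state, zip, amount
  from `robotic-charmer-726.bl_test_data.complex_problem`
),
email_totals as (
  -- Step 1: total amount per lowercased email.
  select email, sum(amount) as total_amount
  from normalized
  group by email
),
name_totals as (
  -- Step 2a: total amount per full name within each email.
  select email, first_name, last_name, sum(amount) as name_amount
  from normalized
  group by email, first_name, last_name
),
top_name as (
  -- Step 2b: keep the full name with the highest total for each email.
  select email, first_name, last_name
  from (
    select *, row_number() over (partition by email order by name_amount desc) as ord
    from name_totals
  )
  where ord = 1
),
address_totals as (
  -- Step 3a: total amount per full address, restricted to the selected name.
  select n.email, n.address, n.city, n.state, n.zip, sum(n.amount) as address_amount
  from normalized as n
  join top_name as t
    on n.email = t.email and n.first_name = t.first_name and n.last_name = t.last_name
  group by n.email, n.address, n.city, n.state, n.zip
),
top_address as (
  -- Step 3b: keep the full address with the highest total for each email.
  select email, address, city, state, zip
  from (
    select *, row_number() over (partition by email order by address_amount desc) as ord
    from address_totals
  )
  where ord = 1
)
select t.email, t.first_name, t.last_name, a.address, a.city, a.state, a.zip, e.total_amount as amount
from top_name as t
join top_address as a on a.email = t.email
join email_totals as e on e.email = t.email
```

Grouping on the individual columns rather than on a concatenated name_group or location_group avoids accidental collisions (for example, "ab" + "c" and "a" + "bc" concatenate to the same string) and sidesteps any type issue with concatenating zip.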