ATHENA/PRESTO complex query with multiple unnested tables

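I want to create a join across multiple tables:
Table login: I want to retrieve all data from login.
Table logging: compute Nb_of_sessions for each db and user, for a specific event type.
Table meeting: compute Nb_of_meetings for each db and each user.
Table live: compute Nb_of_live for each db and each user.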

Run separately, these queries return the correct results:

SELECT db.id,_id as userid,firstname,lastname
FROM  "logins"."login",
UNNEST(dbs) AS a1 (db)

SELECT dbid,userid,count(distinct(sessionid)) as no_of_visits,
       array_join(array_agg(value.from_url),',') as from_url
FROM "loggings"."logging"
where event='url_event'
group by dbid,userid;

SELECT dbid,userid AS userid,count(*) as nb_interviews,
      array_join(array_agg(interviewer),',') as interviewer 
FROM  "meetings"."meeting" 
group by dbid,userid;

SELECT dbid,r1.user._id AS userid,count(_id) as nb_chat 
FROM "lives"."live",
UNNEST(users) AS r1 (user)
group by dbid,r1.user._id;

But when I try to put them all together, I seem to retrieve the wrong data (I only get the db back), and the query does not seem efficient.

select a1.db.id,a._id as userid,a.firstname,a.lastname,count(rl._id) as nb_chat
FROM 
"logins"."login" a,
"loggings"."logging" b,
"meetings"."meeting" c,
"lives"."live" d,
UNNEST(dbs) AS a1 (db),
UNNEST(users) AS r1 (user)
where a._id = b.userid AND a._id = c.userid AND a._id = r1.user._id
group by 1,2,3,4

Do you have any ideas?

Regards.

The simplest approach is to build each subquery with WITH and then reference them in a final query.

Reference for the WITH parameter:

You can use WITH to flatten nested queries, or to simplify subqueries.

The WITH clause precedes the SELECT list in a query and defines one or more subqueries for use within the SELECT query.

Each subquery defines a temporary table, similar to a view definition, which you can reference in the FROM clause. The tables are used only when the query runs.
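As a minimal sketch of that pattern, the query below defines two named subqueries and reads from the second one. The table name meetings, the name meeting_counts, and the inline VALUES rows are made up for illustration and are not the asker's data.

```sql
-- Toy data only: names and rows are assumptions for illustration.
WITH meetings (dbid, userid) AS (
    -- Inline rows standing in for a real table
    VALUES (1, 'u1'), (1, 'u1'), (2, 'u2')
),
meeting_counts AS (
    -- Aggregate once per (dbid, userid) before anything is joined
    SELECT dbid, userid, count(*) AS nb_meetings
    FROM meetings
    GROUP BY dbid, userid
)
SELECT dbid, userid, nb_meetings
FROM meeting_counts
ORDER BY dbid, userid;
```

Each name defined in the WITH list can be used in the FROM clause of later subqueries and of the final SELECT, as if it were a table.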

Since you already have working subqueries, the following should work:

with logins as 
(
    SELECT db.id AS dbid,_id as userid,firstname,lastname
    FROM  "logins"."login",
    UNNEST(dbs) AS a1 (db)
)
,visits as
(
    SELECT dbid,userid,count(distinct(sessionid)) as no_of_visits,
           array_join(array_agg(value.from_url),',') as from_url
    FROM "loggings"."logging"
    where event='url_event'
    group by dbid,userid
)
,meetings as
(
    SELECT dbid,userid AS userid,count(*) as nb_interviews,
          array_join(array_agg(interviewer),',') as interviewer 
    FROM  "meetings"."meeting" 
    group by dbid,userid
)
,chats as 
(
    SELECT dbid,r1.user._id AS userid,count(_id) as nb_chat 
    FROM "lives"."live",
    UNNEST(users) AS r1 (user)
    group by dbid,r1.user._id
)
select *
from logins l
left join visits v
    on l.dbid = v.dbid
    and l.userid = v.userid
left join meetings m
    on l.dbid = m.dbid
    and l.userid = m.userid
left join chats c
    on l.dbid = c.dbid
    and l.userid = c.userid;
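
The reason for aggregating inside each subquery before joining is that joining the raw rows multiplies them: a user with 2 meeting rows and 3 chat rows produces 2 x 3 = 6 joined rows, so any count taken afterwards is inflated. The sketch below illustrates this on toy data. All names and rows are assumptions for illustration, and the coalesce calls are an optional addition that turns the NULLs produced by the left joins into 0 for users with no activity.

```sql
-- Toy data only: names and rows are assumptions for illustration.
WITH users (userid) AS (
    VALUES 'u1', 'u2'
),
meetings (userid) AS (
    VALUES 'u1', 'u1'            -- u1 has 2 meetings, u2 has none
),
chats (userid) AS (
    VALUES 'u1', 'u1', 'u1'      -- u1 has 3 chats, u2 has none
),
meeting_counts AS (
    SELECT userid, count(*) AS nb_meetings
    FROM meetings
    GROUP BY userid
),
chat_counts AS (
    SELECT userid, count(*) AS nb_chat
    FROM chats
    GROUP BY userid
)
SELECT u.userid,
       coalesce(m.nb_meetings, 0) AS nb_meetings,  -- 0 instead of NULL
       coalesce(c.nb_chat, 0)     AS nb_chat
FROM users u
LEFT JOIN meeting_counts m ON u.userid = m.userid
LEFT JOIN chat_counts c ON u.userid = c.userid
ORDER BY u.userid;
```

Because each counted subquery returns at most one row per user, the left joins cannot duplicate rows, and u2 still appears in the result with zero counts.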