BigQuery - JOIN two tables
We have two tables, as shown below:
Table A
Name | Question | Answer
-----+-----------+-------
Bob | Interest | art_and_theatre
Sue | Interest | finances_and_investments
Sue | Interest | art_and_theatre
Joe | Interest | cooking_and_nutrition
Joe | Interest | nutrition_and_drinks
Joe | Interest | eco_life
Joe | Interest | beauty
Bob | Interest | nutrition_and_drinks
Table B (static)
Interest | Segment
--------------------------------------------+------------------
art_and_theatre | S1
cooking_and_nutrition, nutrition_and_drinks | S2
finances_and_investments | S3
finances_and_investments | S4
technology | S5
telecommunications | S6
art_and_theatre | S7
art_and_theatre | S8
eco_life, cooking_and_nutrition, beauty | S9
Expected table
Name | Question | Answer
-----+-----------+-------
Bob | Interest | art_and_theatre
Sue | Interest | finances_and_investments
Sue | Interest | art_and_theatre
Joe | Interest | cooking_and_nutrition
Joe | Interest | nutrition_and_drinks
Bob | Interest | nutrition_and_drinks
(+)
Bob | Segment | S1
Bob | Segment | S7
Bob | Segment | S8
Sue | Segment | S3
Sue | Segment | S4
Sue | Segment | S1
Sue | Segment | S7
Sue | Segment | S8
Joe | Segment | S2
Joe | Segment | S9
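As you can see, a user can have multiple interests, and multiple interests can belong to one segment. Is this kind of JOIN possible in BigQuery?
Note: the Interest column in Table B will contain one or more values. A segment should be joined only when all of its values match.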
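To make the "all values must match" rule concrete, here is a minimal sketch in BigQuery Standard SQL. It is one possible reading of the rule, not a definitive solution. The table paths `project.dataset.tableA` and `project.dataset.tableB` are placeholders, and the CTE names `b_items` and `matched` are made up for illustration. It assumes the values in the Interest column of Table B are separated by exactly ', ', that no value is repeated within a row, and that each Segment appears in only one row of Table B.

```sql
#standardSQL
with b_items as (
  -- one row per (segment, required interest), plus the number of
  -- interests that segment requires (assumes ', ' as the separator)
  select
    segment,
    item,
    array_length(split(interest, ', ')) as required
  from `project.dataset.tableB`,
    unnest(split(interest, ', ')) as item
),
matched as (
  -- keep a (name, segment) pair only when the user has
  -- every interest the segment requires
  select a.name, b.segment
  from (
    select distinct name, answer
    from `project.dataset.tableA`
    where question = 'Interest'
  ) a
  join b_items b
    on a.answer = b.item
  group by a.name, b.segment, b.required
  having count(distinct b.item) = b.required
)
select name, question, answer from `project.dataset.tableA`
union all
select name, 'Segment' as question, segment as answer from matched
```

With the sample data, Bob has only nutrition_and_drinks out of the two interests S2 requires, so under this reading he should not receive S2, while Joe has both and should.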
Hmm . . . I am thinking union all and join:
select a.name, a.question, a.answer
from a
union all
select a.name, 'Segment', b.segment
from a join b
on a.answer = b.interest;
Yes, it is possible. You should be able to do this with the following SQL:
with temp as (
SELECT a.*, b.*
FROM TABLEA a
JOIN TABLEB b
on a.answer = b.interest
)
SELECT t.Name, t.Question, t.Answer from temp t
UNION ALL
SELECT t.Name, 'Segment' as Question, t.Segment as Answer from temp t
The following is for BigQuery Standard SQL:
#standardSQL
select name, question, answer from `project.dataset.tableA`
union all
select distinct name, 'segment' as question, segment as answer
from (
select answer, segment
from `project.dataset.tableB`,
unnest(split(interest, ', ')) answer
)
join `project.dataset.tableA`
using(answer)
-- order by question, name, answer
If applied to the sample data in your question, the output is