Query that displays which attributes an instance possesses, out of a superset of attributes
I have a relational dataset in BigQuery that contains two tables.
The first table holds customer data:
+-------------+--------+
| Customer ID | Name |
+-------------+--------+
| 1 | Bob |
+-------------+--------+
| 2 | Jenny |
+-------------+--------+
| 3 | Janice |
+-------------+--------+
The second table contains various name/value pairs associated with the customers in the first table:
+-------------+----------+-------+
| Customer ID | Category | Value |
+-------------+----------+-------+
| 1 | A | A |
+-------------+----------+-------+
| 1 | A | B |
+-------------+----------+-------+
| 1 | B | A |
+-------------+----------+-------+
| 2 | B | B |
+-------------+----------+-------+
I want to generate a report that lists every customer and puts a TRUE under each name:value pair found for that customer in table 2, for example:
+-------------+------+------+-----+------+------+
| Customer ID | A:A | A:B | A:C | B:A | B:B |
+-------------+------+------+-----+------+------+
| 1 | TRUE | TRUE | | TRUE | |
+-------------+------+------+-----+------+------+
| 2 | | | | | TRUE |
+-------------+------+------+-----+------+------+
| 3 | | | | | |
+-------------+------+------+-----+------+------+
I have tried specifying each category:value combination as a column in my select statement:
select
customer id,
a:a,
a:b,
a:c,
b:a,
b:b
from
table_1 t1
join
table_2 t2
on
t1.customer_id = t2.customer_id
But this gets me nowhere, because I don't know how to make the query set a cell to TRUE once the value is found.
Apologies if this is obvious; I'm new to SQL.
You need some sort of aggregation, such as:
-- bool_or is the PostgreSQL name; BigQuery calls the same aggregate LOGICAL_OR.
-- String comparison is case-sensitive, so the literals match the upper-case sample data.
select t1.customer_id,
       bool_or(t2.category = 'A' and t2.value = 'A') as a_a,
       bool_or(t2.category = 'A' and t2.value = 'B') as a_b,
       bool_or(t2.category = 'A' and t2.value = 'C') as a_c,
       bool_or(t2.category = 'B' and t2.value = 'A') as b_a,
       bool_or(t2.category = 'B' and t2.value = 'B') as b_b
from table_1 t1 join
     table_2 t2
     on t1.customer_id = t2.customer_id
group by t1.customer_id;
The following is for BigQuery Standard SQL:
#standardSQL
SELECT customer_id,
LOGICAL_OR((category, value) = ('A', 'A')) AS a_a,
LOGICAL_OR((category, value) = ('A', 'B')) AS a_b,
LOGICAL_OR((category, value) = ('A', 'C')) AS a_c,
LOGICAL_OR((category, value) = ('B', 'A')) AS b_a,
LOGICAL_OR((category, value) = ('B', 'B')) AS b_b
FROM `project.dataset.table1`
JOIN `project.dataset.table2`
USING (customer_id)
GROUP BY customer_id
You can test and play with the query above using the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.table1` AS (
SELECT 1 Customer_ID, 'Bob' Name UNION ALL
SELECT 2, 'Jenny' UNION ALL
SELECT 3, 'Janice'
), `project.dataset.table2` AS (
SELECT 1 Customer_ID, 'A' Category, 'A' Value UNION ALL
SELECT 1, 'A', 'B' UNION ALL
SELECT 1, 'B', 'A' UNION ALL
SELECT 2, 'B', 'B'
)
SELECT customer_id,
LOGICAL_OR((category, value) = ('A', 'A')) AS a_a,
LOGICAL_OR((category, value) = ('A', 'B')) AS a_b,
LOGICAL_OR((category, value) = ('A', 'C')) AS a_c,
LOGICAL_OR((category, value) = ('B', 'A')) AS b_a,
LOGICAL_OR((category, value) = ('B', 'B')) AS b_b
FROM `project.dataset.table1`
JOIN `project.dataset.table2`
USING (customer_id)
GROUP BY customer_id
Result:
Row customer_id a_a a_b a_c b_a b_b
1 1 true true false true false
2 2 false false false false true
If you need/want the output to look exactly as in your question, you can use the adjusted version below:
#standardSQL
SELECT customer_id,
IF(LOGICAL_OR((category, value) = ('A', 'A')), 'TRUE', '') AS a_a,
IF(LOGICAL_OR((category, value) = ('A', 'B')), 'TRUE', '') AS a_b,
IF(LOGICAL_OR((category, value) = ('A', 'C')), 'TRUE', '') AS a_c,
IF(LOGICAL_OR((category, value) = ('B', 'A')), 'TRUE', '') AS b_a,
IF(LOGICAL_OR((category, value) = ('B', 'B')), 'TRUE', '') AS b_b
FROM `project.dataset.table1`
JOIN `project.dataset.table2`
USING (customer_id)
GROUP BY customer_id
Result:
Row  customer_id  a_a   a_b   a_c   b_a   b_b
1    1            TRUE  TRUE        TRUE
2    2                                    TRUE
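Note: in the example above you don't actually need the join, because no fields from table 1 are used; the join only acts as a filter (showing only customers that exist in table 1).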
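The results above list only customers 1 and 2, while the report in the question also shows customer 3 with every cell empty. A minimal sketch of one way to keep such customers, assuming that is wanted: switch to a LEFT JOIN and replace LOGICAL_OR with COUNTIF(...) > 0. Both changes are my own substitution rather than part of the answers above, and the table names are the same placeholders used earlier.

```sql
#standardSQL
-- Sample data copied from the question; table names are placeholders.
WITH `project.dataset.table1` AS (
  SELECT 1 AS customer_id, 'Bob' AS name UNION ALL
  SELECT 2, 'Jenny' UNION ALL
  SELECT 3, 'Janice'
), `project.dataset.table2` AS (
  SELECT 1 AS customer_id, 'A' AS category, 'A' AS value UNION ALL
  SELECT 1, 'A', 'B' UNION ALL
  SELECT 1, 'B', 'A' UNION ALL
  SELECT 2, 'B', 'B'
)
SELECT customer_id,
  -- COUNTIF counts only rows where the condition is TRUE. For a customer
  -- with no match, category and value are NULL, the condition is NULL,
  -- the count is 0, and the column comes out FALSE instead of NULL.
  COUNTIF(category = 'A' AND value = 'A') > 0 AS a_a,
  COUNTIF(category = 'A' AND value = 'B') > 0 AS a_b,
  COUNTIF(category = 'A' AND value = 'C') > 0 AS a_c,
  COUNTIF(category = 'B' AND value = 'A') > 0 AS b_a,
  COUNTIF(category = 'B' AND value = 'B') > 0 AS b_b
FROM `project.dataset.table1`
-- LEFT JOIN keeps every customer from table 1, matched or not.
LEFT JOIN `project.dataset.table2`
USING (customer_id)
GROUP BY customer_id
ORDER BY customer_id
```

With this variant the join is no longer optional, since table 1 is the only source of customers that have no rows in table 2. Customer 3 comes back with false in every column; wrapping each expression in IF(..., 'TRUE', '') would give the blank-cell layout from the question.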