Retrieving a flat result - BigQuery Standard SQL
I have the following Standard SQL query in BigQuery:
SELECT
Chr,
start_position,
reference_bases,
call.name,
call.genotype,
alternate_bases.alt,
alternate_bases_CSQ_VT.*
FROM
`mutable`,
UNNEST(call) AS call,
UNNEST(call.genotype) AS genotype,
UNNEST(alternate_bases) AS alternate_bases,
UNNEST(alternate_bases.CSQ_VT) AS alternate_bases_CSQ_VT
WHERE
call.name = "sample name"
AND CLIN_SIG = "pathogenic"
AND genotype > 0
LIMIT
100
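The returned table is flat except for the resulting genotype field (two records per returned row). I'd like to return a flat table in which the two genotype values for each row are converted into two new columns (allele 1 and allele 2), but I'm struggling to find the right way to do it. Any pointers would be great.
Your question is a bit hard to follow. In general, the best approach is a subquery using UNNEST(). I don't fully understand your data model (the query and the sample data are inconsistent). But something like this: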
SELECT . . . ,
(SELECT MAX(call.name)
FROM UNNEST(call) genotype
WHERE call.genotype = 0
) as genotype_0,
(SELECT MAX(call.name)
FROM UNNEST(call) genotype
WHERE call.genotype = 1
) as genotype_1
FROM `mutable`,
UNNEST(alternate_bases) AS alternate_bases,
UNNEST(alternate_bases.CSQ_VT) AS alternate_bases_CSQ_VT
WHERE call.name = 'sample name'
AND CLIN_SIG = 'pathogenic'
LIMIT 100
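I appreciate your answer and will try it as well. Sorry the question wasn't very clear. I did come up with a solution, although I don't know whether it is valid.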
SELECT
Chr,
start_position,
reference_bases,
call.name,
call.genotype[OFFSET(0)] as all1,
call.genotype[OFFSET(1)] as all2,
alternate_bases.alt,
alternate_bases_CSQ_VT.*
FROM
`mutable`,
UNNEST(call) AS call,
UNNEST(call.genotype) AS genotype,
UNNEST(alternate_bases) AS alternate_bases,
UNNEST(alternate_bases.CSQ_VT) AS alternate_bases_CSQ_VT
WHERE
call.name = "sample name"
AND CLIN_SIG = "pathogenic"
AND genotype > 0
LIMIT
100
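One thing to watch for in the query above: it still cross-joins UNNEST(call.genotype) and filters on genotype > 0, so a call returns one row per non-zero allele, and a genotype such as [1, 1] would appear twice. Below is a minimal sketch of one way to avoid that, assuming call.genotype is an ARRAY<INT64> with up to two elements. The inline `variants` CTE, the alias `c`, and the column names allele_1 and allele_2 are hypothetical stand-ins for illustration, not part of the original table.

```sql
-- Hypothetical inline data standing in for `mutable`; only the fields needed here.
WITH variants AS (
  SELECT
    '1' AS Chr,
    100 AS start_position,
    [STRUCT('sample name' AS name, [0, 1] AS genotype),
     STRUCT('other sample' AS name, [1, 1] AS genotype)] AS call
)
SELECT
  Chr,
  start_position,
  c.name,
  -- SAFE_OFFSET returns NULL rather than raising an error when the element is missing
  c.genotype[SAFE_OFFSET(0)] AS allele_1,
  c.genotype[SAFE_OFFSET(1)] AS allele_2
FROM
  variants,
  UNNEST(call) AS c
WHERE
  c.name = 'sample name'
  -- Keep calls with at least one non-reference allele without unnesting genotype,
  -- so each call yields exactly one output row
  AND EXISTS (SELECT 1 FROM UNNEST(c.genotype) AS g WHERE g > 0)
```

With the sample data above this should return a single row with allele_1 = 0 and allele_2 = 1. The same pattern could be applied to the full query by keeping the UNNEST(alternate_bases) and UNNEST(alternate_bases.CSQ_VT) joins and dropping only the UNNEST(call.genotype) join.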