Modify an SQL query to include additional parameters
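I want to extract data with an SQL query, but the code I was given does not produce a report containing all the data I need.
Basically, the report combines data from many samples (95, to be exact) and then gives me the sequences from those samples. It also compares the sequences to see whether they occur in more than one sample.
I want to include the parameters "v_family" and "j_gene" as additional columns. The query needs to take these parameters from one of the samples, in the same way that it takes the amino acid sequence ("amino_acid") from one of the samples in which the sequence occurs.
How can I add my two additional parameters to this report?
Here is the current query, which produces 6 columns (see also the attached screenshot):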
select
value,
rank,
count(*) over (partition by amino_acid) as contributors,
total,
amino_acid,
sample_name
from ( select
value,
row_number() over (partition by sample_name order by rank desc) as rank,
sum(value) over (partition by amino_acid) as total,
amino_acid,
sample_name
from ( select
sum(productive_frequency) as value,
sum(productive_frequency) as rank,
amino_acid,
sample_name
from sequences
group by
amino_acid,
sample_name
order by
value desc
)inner_query
) outer_inner
order by
sample_name asc,
rank
The following edit was suggested, but it did not give me the data I want (see the attached screenshots):
select
value,
rank,
count(*) over (partition by amino_acid) as contributors,
total,
amino_acid,
sample_name
from ( select
value,
row_number() over (partition by sample_name order by rank desc) as rank,
sum(value) over (partition by amino_acid) as total,
amino_acid,
sample_name
from ( select
sum(productive_frequency) as value,
sum(productive_frequency) as rank,
amino_acid,
sample_name,
v_family
from sequences
group by
amino_acid,
sample_name,
v_family
order by
value desc
) inner_query
) outer_inner
order by
sample_name asc,
rank
[screenshot: old query]
[screenshot: new query]
This was also suggested, but it did not change the result:
select
value,
rank,
total,
amino_acid,
sample_name
from ( select
value,
row_number() over (partition by sample_name order by rank desc) as rank,
sum(value) over (partition by amino_acid) as total,
count(*) over (partition by amino_acid,v_family,j_gene) as contributors,
amino_acid,
sample_name
from ( select
sum(productive_frequency) as value,
sum(productive_frequency) as rank,
v_family,
j_gene,
amino_acid,
sample_name
from sequences
group by
amino_acid,
sample_name,
v_family,
j_gene
order by
value desc
) inner_query
) outer_inner
order by
sample_name asc,
rank
OK, solved! The correct code is below. Many thanks to everyone for the help!
SELECT value
,rank
,count(*) OVER (PARTITION BY amino_acid,v_family,j_gene) AS contributors
,total
,amino_acid
,sample_name
,v_family
,j_gene
FROM (
SELECT value
,row_number() OVER (PARTITION BY sample_name ORDER BY rank DESC) AS rank
,sum(value) OVER (PARTITION BY amino_acid,v_family,j_gene) AS total
,amino_acid
,sample_name
,v_family
,j_gene
FROM (
SELECT sum(productive_frequency) AS value
,sum(productive_frequency) AS rank
,v_family
,j_gene
,amino_acid
,sample_name
FROM sequences
GROUP BY amino_acid
,sample_name
,v_family
,j_gene
ORDER BY value DESC
) inner_query
) outer_inner
ORDER BY sample_name ASC
,rank
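Your multi-level query is built entirely on the innermost query, which selects data from the table/view "sequences". So if you need 2 more parameters, they have to be added in the innermost query; if they are not columns of "sequences", one or more additional tables will most likely have to be joined to it.
Instead of the current 4 columns (value, rank, amino_acid, sample_name), the innermost query will then have 6 columns (the extra ones being gene and family). These 2 extra columns must be included in the GROUP BY and listed in the select list of each enclosing level, so that they are available in the topmost query.
The changed result is probably an effect of adding the 2 new columns to the base query. The "group by" clause groups the data by the unique combinations of the listed columns, and those combinations differ between the two cases.
Compare the results of these queries: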
SELECT sum(productive_frequency) AS value
,sum(productive_frequency) AS rank
,v_family
,j_gene
,amino_acid
,sample_name
FROM sequences
GROUP BY amino_acid
,sample_name
,v_family
,j_gene
ORDER BY value DESC
and
SELECT sum(productive_frequency) AS value
,sum(productive_frequency) AS rank
,amino_acid
,sample_name
FROM sequences
GROUP BY amino_acid
,sample_name
ORDER BY value DESC
If the row counts differ, you can check the unique combinations of the columns in question with the following statement:
select distinct
v_family
,j_gene
,amino_acid
,sample_name
FROM sequences
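To make the comparison concrete, here is a minimal sketch that runs both GROUP BY variants against an in-memory SQLite database from Python. The four rows are made-up toy data, not values from the real "sequences" table, and SQLite is only an assumption for illustration; the thread does not say which database is in use.

```python
import sqlite3

# Hypothetical toy data: in sample s1 the same amino_acid occurs with two
# different v_family values, so the finer grouping produces one extra row.
rows = [
    # (sample_name, amino_acid, v_family, j_gene, productive_frequency)
    ("s1", "CASSL", "V1", "J1", 0.25),
    ("s1", "CASSL", "V2", "J1", 0.125),
    ("s2", "CASSL", "V1", "J1", 0.5),
    ("s2", "CASSF", "V3", "J2", 0.25),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences ("
    "sample_name TEXT, amino_acid TEXT, v_family TEXT, j_gene TEXT, "
    "productive_frequency REAL)"
)
conn.executemany("INSERT INTO sequences VALUES (?, ?, ?, ?, ?)", rows)

# Original grouping: one row per (amino_acid, sample_name).
coarse = conn.execute(
    "SELECT sum(productive_frequency) AS value, amino_acid, sample_name "
    "FROM sequences GROUP BY amino_acid, sample_name"
).fetchall()

# Extended grouping: one row per (amino_acid, sample_name, v_family, j_gene).
fine = conn.execute(
    "SELECT sum(productive_frequency) AS value, v_family, j_gene, "
    "amino_acid, sample_name "
    "FROM sequences GROUP BY amino_acid, sample_name, v_family, j_gene"
).fetchall()

print(len(coarse), len(fine))
```

With this toy data the second query returns one more row than the first, because the sequence CASSL in sample s1 is split across two v_family values.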
Now that the inner_query result has changed, also change the window function from
sum(value) over (partition by amino_acid) as total,
to
sum(value) over (partition by amino_acid,v_family,j_gene) as total,
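The effect of the two PARTITION BY clauses can be seen side by side in the following sketch. It again uses Python's sqlite3 module with made-up toy rows; the column aliases total_by_aa and total_by_clone are hypothetical names introduced only for this illustration, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical toy data, not taken from the real "sequences" table.
rows = [
    # (sample_name, amino_acid, v_family, j_gene, productive_frequency)
    ("s1", "CASSL", "V1", "J1", 0.25),
    ("s1", "CASSL", "V2", "J1", 0.125),
    ("s2", "CASSL", "V1", "J1", 0.5),
    ("s2", "CASSF", "V3", "J2", 0.25),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences ("
    "sample_name TEXT, amino_acid TEXT, v_family TEXT, j_gene TEXT, "
    "productive_frequency REAL)"
)
conn.executemany("INSERT INTO sequences VALUES (?, ?, ?, ?, ?)", rows)

# Compute both totals over the same inner query so they can be compared.
query = """
SELECT amino_acid, v_family, j_gene, sample_name,
       sum(value) OVER (PARTITION BY amino_acid) AS total_by_aa,
       sum(value) OVER (PARTITION BY amino_acid, v_family, j_gene) AS total_by_clone
FROM (
    SELECT sum(productive_frequency) AS value,
           v_family, j_gene, amino_acid, sample_name
    FROM sequences
    GROUP BY amino_acid, sample_name, v_family, j_gene
) inner_query
ORDER BY amino_acid, v_family, j_gene, sample_name
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Partitioning by amino_acid alone sums every row that shares the sequence, whereas the three-column partition only sums rows that also share v_family and j_gene, so the two totals diverge as soon as one sequence occurs with more than one family or gene.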
Apologies for the extra replies that got posted every time I pressed [Enter].
New version:
SELECT value
,rank
,count(*) OVER (PARTITION BY amino_acid,v_family,j_gene) AS contributors
,total
,amino_acid
,sample_name
,v_family
,j_gene
FROM (
SELECT value
,row_number() OVER (PARTITION BY sample_name ORDER BY rank DESC) AS rank
,sum(value) OVER (PARTITION BY amino_acid,v_family,j_gene) AS total
,amino_acid
,sample_name
,v_family
,j_gene
FROM (
SELECT sum(productive_frequency) AS value
,sum(productive_frequency) AS rank
,v_family
,j_gene
,amino_acid
,sample_name
FROM sequences
GROUP BY amino_acid
,sample_name
,v_family
,j_gene
ORDER BY value DESC
) inner_query
) outer_inner
ORDER BY sample_name ASC
,rank
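If the row granularity of the original report (one row per amino_acid and sample_name) has to stay unchanged, one possible alternative, which was not proposed in this thread, is to leave the GROUP BY as it was and pick a single v_family and j_gene per group with an aggregate such as min(). The sketch below shows only the innermost query, run through Python's sqlite3 module on made-up toy rows. Note that min(v_family) and min(j_gene) are chosen independently, so the resulting pair is not guaranteed to come from the same source row.

```python
import sqlite3

# Hypothetical toy data, not taken from the real "sequences" table.
rows = [
    # (sample_name, amino_acid, v_family, j_gene, productive_frequency)
    ("s1", "CASSL", "V1", "J1", 0.25),
    ("s1", "CASSL", "V2", "J1", 0.125),
    ("s2", "CASSL", "V1", "J1", 0.5),
    ("s2", "CASSF", "V3", "J2", 0.25),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences ("
    "sample_name TEXT, amino_acid TEXT, v_family TEXT, j_gene TEXT, "
    "productive_frequency REAL)"
)
conn.executemany("INSERT INTO sequences VALUES (?, ?, ?, ?, ?)", rows)

# Grouping stays at (amino_acid, sample_name); min() picks one representative
# v_family and one representative j_gene per group (assumption: any one will do).
query = """
SELECT sum(productive_frequency) AS value,
       min(v_family) AS v_family,
       min(j_gene) AS j_gene,
       amino_acid,
       sample_name
FROM sequences
GROUP BY amino_acid, sample_name
ORDER BY amino_acid, sample_name
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With this variant the outer window functions could keep partitioning by amino_acid only, at the cost of hiding cases where one sequence maps to several families or genes.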