Specific frequency per sample
This is my table:
chr pos refalt
---------------
chr1 123 AA
chr1 123 AA
chr1 123 AA
chr1 123 AA
chr1 123 AA
chr1 123 AC
chr1 123 AC
chr1 123 AC
chr2 456 TC
chr3 789 GC
I need to calculate a specific frequency. Here is an example:
Each row is one patient, so "chr1 123 AA" has 5 patients and "chr1 123 AC" has 3 patients.
I want to know the frequency of A.
The calculation is:
13 (A) / 16, because 13 of the 16 alleles at "chr1 123" are A. The 16 alleles are 5×A (ref) + 5×A (alt) + 3×A (ref) + 3×C (alt).
For C:
3 (C) / 16, because only 3 patients have C.
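To make the arithmetic above concrete, here is a minimal Python sketch that counts the alleles for the "chr1 123" rows only. The variable names are illustrative and not part of the original table or schema.

```python
from collections import Counter
from fractions import Fraction

# Genotypes at "chr1 123" from the table: 5 patients with AA, 3 patients with AC.
genotypes = ["AA"] * 5 + ["AC"] * 3

# Each two-letter genotype contributes two alleles: ref (1st letter) and alt (2nd letter).
allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
total_alleles = sum(allele_counts.values())  # 2 alleles per patient

frequencies = {a: Fraction(n, total_alleles) for a, n in allele_counts.items()}
for allele, freq in sorted(frequencies.items()):
    print(allele, allele_counts[allele], "/", total_alleles, "=", float(freq))
# A 13 / 16 = 0.8125
# C 3 / 16 = 0.1875
```

The denominator is twice the number of patients at that position, because every row carries both a ref and an alt allele.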
Is this too complicated to do in SQL?
refalt is a varchar column, so I need to split each value to get ref and alt.
I know it is a bit complicated, so please ask me for more details.
For anyone (especially biologists) wondering how to implement this:
-- Frequency of each allele at one position.
-- Every row is one patient with two alleles: character 1 (ref) and character 2 (alt).
select allele,
       count(*)::numeric /
       (select 2*count(*) from ft_variants where pos_chr like 'chr1 12783') as frequency
from (
    select substring(refalt from 1 for 1) as allele
    from ft_variants
    where pos_chr like 'chr1 12783'
    union all  -- keep duplicates so every allele copy is counted
    select substring(refalt from 2 for 1) as allele
    from ft_variants
    where pos_chr like 'chr1 12783'
) as alleles
group by allele;
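To try the idea on the sample table from the question, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a hypothetical table named variants with columns chr, pos and refalt (modelled on the table shown in the question, not on the ft_variants / pos_chr schema used in the query), and it uses SQLite's substr(...) in place of PostgreSQL's substring(... from ... for ...).

```python
import sqlite3

# Sample rows from the question: one row per patient.
rows = (
    [("chr1", 123, "AA")] * 5
    + [("chr1", 123, "AC")] * 3
    + [("chr2", 456, "TC"), ("chr3", 789, "GC")]
)

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names, modelled on the table in the question.
conn.execute("CREATE TABLE variants (chr TEXT, pos INTEGER, refalt TEXT)")
conn.executemany("INSERT INTO variants VALUES (?, ?, ?)", rows)

# Stack ref and alt alleles with UNION ALL (duplicates kept), then count per allele.
QUERY = """
SELECT allele, COUNT(*) AS n
FROM (
    SELECT substr(refalt, 1, 1) AS allele FROM variants WHERE chr = ? AND pos = ?
    UNION ALL
    SELECT substr(refalt, 2, 1) AS allele FROM variants WHERE chr = ? AND pos = ?
)
GROUP BY allele
ORDER BY allele
"""

def allele_frequencies(conn, chrom, pos):
    """Return {allele: frequency} for one chromosome position."""
    counts = conn.execute(QUERY, (chrom, pos, chrom, pos)).fetchall()
    total = sum(n for _, n in counts)  # equals 2 * number of patients
    return {allele: n / total for allele, n in counts}

freqs = allele_frequencies(conn, "chr1", 123)
print(freqs)  # {'A': 0.8125, 'C': 0.1875}
```

Grouping by the extracted allele rather than by the whole refalt value is what merges the A copies from AA and AC into a single 13/16 row.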