How to get fragments from a DNA sequence

I want to cut a DNA genome into k-mers of any size, so I wrote the function Sliding_DNA(dna_list, size_to_split), but it does not work.

Can anyone help me?

When I print the variable pedazos, I get the following:

'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC', 'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC', 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA', 'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT', 'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT']

Code:

def Sliding_DNA(dna_list,size_to_split):
    # range to slide over
    #vecesRecorrer = int(len(dna_list) / 500)
    lista_temp = []

    #dna_to_split = dna_list[0]
    #print(dna_to_split)
    posiInicial = 0
    posiFinal = 0
    test = 'AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGGGTTGCCGATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTGAAAAAACCATT'

    for nucleotide in test:
        pedazo = ""
        posiFinal = posiInicial + size_to_split
        for posiInicial in xrange(posiFinal):
            pedazo += nucleotide
            if len(pedazo)==size_to_split:
                lista_temp.append(pedazo)
        posiInicial += size_to_split

    return lista_temp


pedazos = Sliding_DNA(dna_list,100)

The problem is caused by this line:

pedazo += posiInicial

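You assign an empty string to pedazo, so it is a string, while posiInicial holds an integer. Python cannot apply + to a string and an integer, so the statement fails with a TypeError.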
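To make the type mismatch concrete, here is a small sketch, assuming Python 3. The value 3 is only an illustrative stand-in for a loop index, and `message`, `total` and `joined` are names made up for this example.

```python
pedazo = ""        # a str, as in the question
posiInicial = 3    # an int, standing in for a loop index

try:
    pedazo += posiInicial        # str + int is not defined
except TypeError as error:
    message = str(error)         # keep the error text for inspection

# The + operator works once both operands have the same type.
total = 0 + posiInicial          # int + int
joined = "" + str(posiInicial)   # str + str
```

After the failed `+=`, `pedazo` is still the empty string, because the exception is raised before any assignment happens.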

So initialize pedazo to 0 instead:

pedazo = 0
cont += 1
posiFinal = posiInicial + 500
for posiInicial in xrange(posiFinal):
    pedazo += posiInicial
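For reference, fragments can also be taken directly with string slicing instead of building each piece one character at a time. The following is only a minimal sketch, assuming Python 3 (so `range` rather than `xrange`). `split_dna` and its `step` parameter are hypothetical names, not part of the code above, and dropping a trailing remainder shorter than `size` is a choice made here for simplicity.

```python
def split_dna(sequence, size, step=None):
    """Return the fragments of length `size` found every `step` positions."""
    if size <= 0:
        raise ValueError("size must be positive")
    if step is None:
        step = size  # default: consecutive, non-overlapping chunks
    if step <= 0:
        raise ValueError("step must be positive")
    # Last index at which a full-length fragment can still start.
    last_start = len(sequence) - size
    return [sequence[start:start + size]
            for start in range(0, last_start + 1, step)]


chunks = split_dna("AGCTTTTCATTC", 4)        # ['AGCT', 'TTTC', 'ATTC']
kmers = split_dna("AGCTT", 3, step=1)        # ['AGC', 'GCT', 'CTT']
```

With the default `step`, the sequence is cut into consecutive pieces. With `step=1`, the window slides one base at a time and yields every overlapping k-mer.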