How to get fragments from a DNA sequence
I want to cut a DNA genome into k-mers of any size, so I wrote the function Sliding_DNA(dna_list, size_to_split), but it does not work.
Can anyone help me?
When I print the variable pedazos, I get the following:
'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC', 'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC', 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA', 'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT', 'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT']
Code:
def Sliding_DNA(dna_list,size_to_split):
    # range over which to slide
    #vecesRecorrer = int(len(dna_list) / 500)
    lista_temp = []
    #dna_to_split = dna_list[0]
    #print(dna_to_split)
    posiInicial = 0
    posiFinal = 0
    test = 'AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGGGTTGCCGATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATGATTGAAAAAACCATT'
    for nucleotide in test:
        pedazo = ""
        posiFinal = posiInicial + size_to_split
        for posiInicial in xrange(posiFinal):
            pedazo += nucleotide
            if len(pedazo)==size_to_split:
                lista_temp.append(pedazo)
                posiInicial += size_to_split
    return lista_temp
pedazos = Sliding_DNA(dna_list,100)
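The problem comes from this line:
pedazo += posiInicial
You assigned an empty string to pedazo, so it is a string, while posiInicial holds an integer. Python does not define + between a string and an integer, so this statement raises a TypeError.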
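To make the type mismatch concrete, here is a small sketch. It assumes Python 3 (the exact error text differs between Python versions, so it is not reproduced here), and the value 3 is only a stand-in for a loop index.

```python
pedazo = ""        # a str, as in the question
posiInicial = 3    # an int, standing in for the loop index

try:
    pedazo += posiInicial          # str + int is not defined
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# Both operands must have the same type. Converting explicitly works:
pedazo += str(posiInicial)
print(repr(pedazo))
```

Whether converting the index to a string is actually what the function needs is a separate question; the snippet only shows why the original line fails.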
So change the initial value of pedazo to 0:
pedazo = 0
cont += 1
posiFinal = posiInicial + 500
for posiInicial in xrange(posiFinal):
    pedazo += posiInicial
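A side note on the goal itself: with pedazo = 0 the loop adds integers together, and with pedazo += nucleotide the inner loop appends the same base over and over, which would explain the runs of a single letter in the printed output. If the aim is to cut the sequence into fragments, string slicing may be closer to what is wanted. The following is only a sketch under some assumptions: it is written for Python 3 (range instead of xrange), the function names sliding_kmers and split_fragments are made up for illustration, and it guesses that either overlapping windows or consecutive chunks are acceptable.

```python
def sliding_kmers(sequence, k):
    """Return every overlapping k-mer, moving the window one base at a time."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # The last valid start index is len(sequence) - k.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def split_fragments(sequence, size):
    """Return consecutive, non-overlapping fragments; the last one may be shorter."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


example = "AGCTTTTCATTC"  # first 12 bases of the test string in the question
print(sliding_kmers(example, 4))
print(split_fragments(example, 5))
```

Calling split_fragments(test, 100) on the full test string would give chunks of 100 bases each, with a shorter final chunk if the length is not a multiple of 100.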