Find the longest palindromic substring of a piece of DNA
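I have to write a function that prints the longest palindromic substring of a piece of DNA. I have already written a function that checks whether a piece of DNA is itself a palindrome. See the function below.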
def make_complement_strand(DNA):
    # Build the complement base by base (A<->T, C<->G).
    complement = []
    rules_for_complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    for letter in DNA:
        complement.append(rules_for_complement[letter])
    return complement

def is_this_a_palindrome(DNA):
    # A DNA palindrome equals the reverse of its complement strand.
    DNA = list(DNA)
    if DNA != make_complement_strand(DNA)[::-1]:
        print("false")
        return False
    else:
        print("true")
        return True

is_this_a_palindrome("GGGCCC")
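But now: how do I make a function print the longest palindromic substring of a DNA string?
The meaning of "palindrome" in the context of genetics is slightly different from the definition used for words and sentences. A double helix is formed by two paired strands of nucleotides that run in opposite directions in the 5'-to-3' sense, and the nucleotides always pair in the same way: adenine (A) with thymine (T) in DNA (with uracil (U) in RNA), and cytosine (C) with guanine (G). A (single-stranded) nucleotide sequence is therefore called a palindrome if it is equal to its reverse complement. For example, the DNA sequence ACCTAGGT is palindromic because its nucleotide-by-nucleotide complement is TGGATCCA, and reversing the order of the nucleotides in the complement gives back the original sequence.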
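To make the definition concrete, here is a minimal sketch of the reverse-complement check applied to the ACCTAGGT example. The names `COMPLEMENT`, `reverse_complement` and `is_reverse_palindrome` are made up for illustration and are not part of the code in this post; the sketch assumes uppercase input containing only A, C, G and T.

```python
# Translation table for the pairing rules A<->T and C<->G.
# Note: str.translate leaves any other character unchanged.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Complement every base, then reverse the order."""
    return dna.translate(COMPLEMENT)[::-1]

def is_reverse_palindrome(dna):
    """True if the sequence equals its own reverse complement."""
    return dna == reverse_complement(dna)

print("ACCTAGGT".translate(COMPLEMENT))   # TGGATCCA (the complement)
print(reverse_complement("ACCTAGGT"))     # ACCTAGGT (same as the input)
print(is_reverse_palindrome("ACCTAGGT"))  # True
```

Since no base pairs with itself, a sequence of odd length can never pass this check: its middle base would have to be its own complement.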
Here, this should be a decent starting point for getting the longest palindromic substring.
def make_complement_strand(DNA):
    # Build the complement base by base (A<->T, C<->G).
    complement = []
    rules_for_complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    for letter in DNA:
        complement.append(rules_for_complement[letter])
    return complement

def is_this_a_palindrome(DNA):
    # Palindrome in the DNA sense: equal to the reversed complement.
    DNA = list(DNA)
    return DNA == make_complement_strand(DNA)[::-1]

def longest_palindrome_ss(org_dna, palindrone_func):
    '''
    Naive implementation-
    We start with 2 pointers.
    i marks the start of the current substring and j runs from i+1 to the end.
    i is incremented on every pass of the outer loop.
    Uses the palindrome function provided by the user.
    Further improvements-
    1. Start with the longest substring instead of the shortest, i.e. start with i=0 and j=final_i and decrement.
    '''
    longest_palin = ""
    last_i = len(org_dna)
    i = 0
    while i < last_i:
        j = i + 1
        while j < last_i:
            # Substring from i to j inclusive (length >= 2).
            current_subsequence = org_dna[i:j+1]
            if palindrone_func(current_subsequence):
                # Keep only strictly longer matches, so the leftmost one wins ties.
                if len(current_subsequence) > len(longest_palin):
                    longest_palin = current_subsequence
            j += 1
        i += 1
    print(org_dna, longest_palin)
    return longest_palin

longest_palindrome_ss("GGGCCC", is_this_a_palindrome)
longest_palindrome_ss("GAGCTT", is_this_a_palindrome)
longest_palindrome_ss("GGAATTCGA", is_this_a_palindrome)
Here are some sample runs:
mahorir@mahorir-Vostro-3446:~/Desktop$ python3 dna_paln.py
GGGCCC GGGCCC
GAGCTT AGCT
GGAATTCGA GAATTC
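The "further improvement" mentioned in the docstring (try the longest substrings first and stop at the first hit) could look roughly like the sketch below. This is only one possible way to do it, not a tuned implementation; `longest_palindrome_longest_first` and `is_reverse_palindrome` are hypothetical names introduced for this example, and the input is assumed to contain only uppercase A, C, G and T.

```python
def is_reverse_palindrome(dna):
    """True if dna equals its reverse complement (assumes only A, C, G, T)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    # Position k from the left must pair with position k from the right.
    return all(pairs[a] == b for a, b in zip(dna, reversed(dna)))

def longest_palindrome_longest_first(dna, is_palindrome):
    """Check window lengths from longest to shortest; return the first match."""
    n = len(dna)
    for length in range(n, 1, -1):            # n, n-1, ..., 2
        for start in range(n - length + 1):   # slide the window left to right
            candidate = dna[start:start + length]
            if is_palindrome(candidate):
                return candidate              # nothing longer exists, so stop
    return ""                                 # no palindrome of length >= 2

for seq in ("GGGCCC", "GAGCTT", "GGAATTCGA"):
    print(seq, longest_palindrome_longest_first(seq, is_reverse_palindrome))
```

For the three sequences above this returns the same substrings as the naive version (GGGCCC, AGCT, GAATTC). The worst case is still cubic in the length of the input, but the search stops as soon as a palindrome is found, which helps when a long one exists.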