Longest and shortest sequence, Python

I have this program that generates N random sequences.

import random
N = 5
def randseq(abc, length):
    return "".join([random.choice(abc) for i in range(random.randint(1, length))])
for i in range(N):
    print(f'Sequence {i+1}:')
    print(randseq("ATCG", 120))

I get sequences like these:

Sequence 1:

TGGTACACGTGCTTAATGTTAACCTGTCTGGCGCAGGGTAACTATTTCATCCCT

Sequence 2:

CGTATATAATGCTTCCTCTTCAGGCGACCTTGCGATAGTGTCCGGCCATGTGAGTCCCTGTGGAGTGCCTTTAGATGACCTATACGTCTTTAGACTATGTTTATGGGG

Sequence 3:

CACAGCCTTCTCTCCAATG . . .

Sequence N:

How can I print the longest and the shortest of the N sequences, along with their lengths?

.....

Please check my code below. The explanation is in the comments.

import random


def randseq(abc, length):
    return "".join([random.choice(abc) for i in range(random.randint(1, length))])


# Set the input value in the main part of the script, after the function
# definitions (at module level it is a global variable either way)
N = 5

# Start longest_seq as the shortest possible string (the empty string),
# so that every generated sequence is longer than it
longest_seq = ""

# Start shortest_seq as a long random sequence, so that every generated
# sequence should be shorter than it. This assumes randseq("ATCG", 1000)
# happens to return something long enough, which is not guaranteed
shortest_seq = randseq("ATCG", 1000)

for i in range(N):
    print(f'Sequence {i+1}:')
    seq = randseq("ATCG", 120)
    
    # If this sequence is the longest so far, store it in longest_seq
    if len(seq) > len(longest_seq):
        longest_seq = seq
    
    # If this sequence is the shortest so far, store it in shortest_seq
    if len(seq) < len(shortest_seq):
        shortest_seq = seq
    
    print(seq)
   
print("") 
print('The longest seq is', longest_seq)
print('The length of longest seq is', len(longest_seq))
print('The shortest seq is', shortest_seq)
print('The length of shortest seq is', len(shortest_seq))

Example output (it is random, so yours will differ when you run it):

Sequence 1:
CGGTGATCGCGATTACTGCCCGGCCTTGTCCACTCACAGCGATAACAGTGCTTATAGATCTCTCAAGTCTACCGTCTCACCCGTTGATTACCAA
Sequence 2:
AAGGTCAAGATTCGAATTCGTATCGCCGTATGGATAGGCGAAACGAGGGGTGGCTAAGGGGTAGACAGCAGAGCCGCTTTTGTACACCGTAAAACGGACGGTTCAGAACCGGAGGTACG
Sequence 3:
ACGGCCTCATGGATAATGCCCGGGGGAACAGGGAAGGAAAGATTTTGTCAAACTGATTCAGTTAC
Sequence 4:
GATACA
Sequence 5:
ATCGAAAGGAATATCTGTACGGGACGTTTGGTCTCGAGCCTAGCGTAAGCCGCCCGCAATTCGCTCTGATGAGCTACCG

The longest seq is AAGGTCAAGATTCGAATTCGTATCGCCGTATGGATAGGCGAAACGAGGGGTGGCTAAGGGGTAGACAGCAGAGCCGCTTTTGTACACCGTAAAACGGACGGTTCAGAACCGGAGGTACG
The length of longest seq is 119
The shortest seq is GATACA
The length of shortest seq is 6

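Note:

In some (rare) cases, the initial value of shortest_seq may turn out too short, that is, shorter than every generated sequence, because randseq("ATCG", 1000) returns a sequence of random length between 1 and 1000. If that happens, the program does not crash, but it reports the initial value as the shortest sequence, which is wrong. You can increase the length passed to randseq to make this less likely.

For example, you can change this:

shortest_seq = randseq("ATCG", 1000)

to:

shortest_seq = randseq("ATCG", 10000)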
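
A larger bound only makes the problem less likely. One way to avoid it completely is to store the generated sequences in a list and let the built-in max() and min() compare them by length, so no initial guess is needed. The following is a minimal sketch of that idea, not part of the original answer; the helper name longest_and_shortest is made up for illustration.

```python
import random


def randseq(abc, length):
    # Same generator as above: a random string of 1..length characters from abc
    return "".join(random.choice(abc) for _ in range(random.randint(1, length)))


def longest_and_shortest(seqs):
    # Hypothetical helper (not in the original code): returns the longest and
    # the shortest string of a non-empty list, compared by length.
    # On ties, max() and min() both return the first matching item.
    return max(seqs, key=len), min(seqs, key=len)


N = 5

# Generate all sequences first, so they can be compared afterwards
seqs = [randseq("ATCG", 120) for _ in range(N)]
for i, seq in enumerate(seqs, start=1):
    print(f"Sequence {i}:")
    print(seq)

longest_seq, shortest_seq = longest_and_shortest(seqs)

print()
print("The longest seq is", longest_seq)
print("The length of longest seq is", len(longest_seq))
print("The shortest seq is", shortest_seq)
print("The length of shortest seq is", len(shortest_seq))
```

Because both results are always taken from the generated list, the reported shortest sequence is guaranteed to be one of the N sequences. The trade-off is that all sequences are kept in memory, which is not a concern for small N.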