BioPython: iterating through sequences from a fasta file

Question

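I'm new to BioPython and I'm trying to import a fasta/fastq file and iterate through each sequence, performing some operation on each one. I know this seems basic, but for some reason my code below does not print correctly.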

from Bio import SeqIO

newfile = open("new.txt", "w")
records = list(SeqIO.parse("rosalind_gc.txt", "fasta"))

i = 0
dna = records[i]

while i <= len(records):
    print (dna.name)
    i = i + 1

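I'm basically trying to iterate through the records and print each name, but my code only ever prints "records[0]", whereas I want it to print "records[1-10]". Can someone explain why it only prints "records[0]"?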

Answer 1

The cause of your problem is here:

i = 0
dna = records[i]

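Your object 'dna' is fixed to index 0 of records, i.e. records[0]. Since you never assign it again, dna stays bound to that record for the whole loop. In the print statement inside your while loop, use something like this: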

while i < len(records):  # '<', not '<=': the last valid index is len(records) - 1
    print(records[i].name)
    i = i + 1
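
The effect can be reproduced without Biopython. A minimal sketch, using a plain list of strings as a hypothetical stand-in for records:

```python
# Hypothetical stand-in for the parsed records; these are not real SeqRecord objects.
records = ["seq_a", "seq_b", "seq_c"]

i = 0
dna = records[i]  # dna is bound to records[0] once, right here

seen = []
while i < len(records):
    seen.append(dna)  # still records[0]: changing i does not rebind dna
    i = i + 1

print(seen)
```

Every entry collected in seen is the first element, because only i changes inside the loop.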

If you want the object dna to refer to the current record, you need to reassign dna at each index inside the while loop, like this:

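while i < len(records):  # '<', not '<=': the last valid index is len(records) - 1
    dna = records[i]     # rebind dna to the current record on every pass
    print(dna.name)
    i = i + 1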

However, this is not the most efficient way. Finally, for your learning, a better approach than a while loop with i = i + 1 is a for loop, like this:

for i in range(0,len(records)):
    print (records[i].name)

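A for loop iterates automatically, one item after another. range() gives the integers from 0 up to, but not including, the length of records. There are other ways, but I'm keeping it simple.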
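
One of those other ways is to loop over the records themselves, with no index at all. A minimal sketch: FakeRecord and record_names are hypothetical names used only so the example runs without a FASTA file, standing in for the SeqRecord objects that SeqIO.parse yields.

```python
from collections import namedtuple

# Hypothetical stand-in for Bio.SeqRecord.SeqRecord; only .name and .seq are modeled.
FakeRecord = namedtuple("FakeRecord", ["name", "seq"])


def record_names(records):
    """Collect the name of each record by iterating directly, without an index."""
    names = []
    for record in records:
        names.append(record.name)
    return names


records = [FakeRecord("Rosalind_0001", "ACGT"), FakeRecord("Rosalind_0002", "GGCC")]
for name in record_names(records):
    print(name)
```

With Biopython, the same kind of loop should work on the parser's output directly, e.g. for record in SeqIO.parse("rosalind_gc.txt", "fasta"), which also avoids loading every record into a list first.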
