Problem with splitting a string and translating characters with a dictionary - bioinformatics OOP

I have a problem with my program. This part of the code is failing:

    def revcmpl(self):
        
        # TODO:convert sequence contained in the object
        #      to a list called seq
        
        seq = list(self.seq)
        
        # TODO: reverse the list in-place
        
        seq.reverse()
        
        # TODO: using string method join(), the class dictionary ALPH and a
        #       list comprehension, translate the reversed sequence and
        #       convert into a string
        
        seq = list(seq)
        seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())
        seq_revcmpl = str(seq_revcmpl)
        
        # TODO: create seqid variable and assign to it the object's seqid
        #       and the suffix '_revcmpl'
        
        seqid = f'{self.seqid}_revcmpl'
        
        # TODO: create a new object of DNASeq type using the new seqid,
        #       title contained in the object and
        #       reversed and translated sequence,
        #       return the new object
        
        obj1 = DNASeq(seqid, title, seq_revcmpl)

        return obj1

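I tried to use the string method join(), the class dictionary ALPH and a list comprehension to translate the reversed sequence and convert it into a string. I tried to run this: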

# reload the sequences to have a collection of objects
# that are instances of the up-to-date DNASeq class

seqs = DNASeq.from_file('input/Staphylococcus_MLST_genes.fasta')

# select one of the sequences by its sequence id (seqid)
seq = seqs['yqiL']

new_seq = seq.revcmpl()

print( new_seq )

But I get an error:

KeyError                                  Traceback (most recent call last)
<ipython-input-57-a28b468b9cfe> in <module>
      7 seq = seqs['yqiL']
      8 
----> 9 new_seq = seq.revcmpl()
     10 
     11 print( new_seq )

<ipython-input-43-07d175957482> in revcmpl(self)
    211 
    212         seq = list(seq)
--> 213         seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())
    214         seq_revcmpl = str(seq_revcmpl)
    215 

<ipython-input-43-07d175957482> in <genexpr>(.0)
    211 
    212         seq = list(seq)
--> 213         seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())
    214         seq_revcmpl = str(seq_revcmpl)
    215 

KeyError: 'GCGTTTAAAGACGTGCCAGCCTATGATTTAGGTGCGACTTTAATAGAACATATTATTAAAGAGACGGGTTTGAATCCAAGTGAGATTGATGAAGTTATCATCGGTAACGTACTACAAGCAGGACAAGGACAAAATCCAGCACGAATTGCTGCTATGAAAGGTGGCTTGCCAGAAACAGTACCTGCATTTACAGTGAATAAAGTATGTGGTTCTGGGTTAAAGTCGATTCAATTAGCATATCAATCTATTGTGACTGGTGAAAATGACATCGTGCTAGCTGGCGGTATGGAGAATATGTCTCAGTCACCAATGCTTGTCAACAACAGTCGCTTCGGTTTTAAAATGGGACATCAATCAATGGTTGATAGCATGGTATATGATGGTTTAACAGATGTATTTAATCAATATCATATGGGTATTACTGCTGAAAATTTAGTGGAGCAATATGGTATTTCAAGAGAAGAACAAGATACATTTGCTGTAAACTCACAACAAAAAGCAGTACGTGCACAGCAA'

But why? I did split the sequence: seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())

The problem is here:

seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq.split())

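self.seq does not contain any whitespace, so self.seq.split() returns a list with a single item: the sequence itself.

The generator expression therefore runs only once (the list holds just one item, one big string), and key is the whole sequence. That string is not a key in ALPH, hence the KeyError.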
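The behaviour can be reproduced on a small scale. The sketch below uses a short made-up sequence and an assumed four-letter complement table (the real ALPH may hold more symbols); it only illustrates how split() with no whitespace differs from iterating over the string.

```python
# Hypothetical complement table and short sequence, for illustration only.
ALPH = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}
sequence = 'GCGTTT'

# The string has no whitespace, so split() returns a one-element list.
parts = sequence.split()
print(parts)   # ['GCGTTT']

# Looking up that single element fails: the whole string is not a key.
try:
    ALPH[parts[0]]
except KeyError as err:
    print('KeyError:', err)

# Iterating over the string itself yields one character at a time.
chars = list(sequence)
print(chars)   # ['G', 'C', 'G', 'T', 'T', 'T']
```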

I think what you want is:

seq_revcmpl = ''.join(DNASeq.ALPH[key] for key in self.seq)
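Two further details may be worth checking. Iterating over self.seq translates the sequence in its original order, so a reverse complement probably needs the reversed list seq instead. Also, title is not defined inside the method, so the constructor call most likely needs self.title. Below is a minimal sketch of the whole method under those assumptions; the stand-in class, its constructor and the contents of ALPH are guesses based on the call DNASeq(seqid, title, seq_revcmpl) in the question, not the actual course code.

```python
class DNASeq:
    # Assumed complement table; the real class may include more symbols.
    ALPH = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}

    def __init__(self, seqid, title, seq):
        # Assumed constructor, mirroring the argument order used in the question.
        self.seqid = seqid
        self.title = title
        self.seq = seq

    def revcmpl(self):
        # Convert the sequence to a list and reverse it in place.
        seq = list(self.seq)
        seq.reverse()
        # Translate the reversed list one base at a time, then join into a string.
        seq_revcmpl = ''.join([DNASeq.ALPH[base] for base in seq])
        seqid = f'{self.seqid}_revcmpl'
        # Use self.title: a bare `title` is not defined inside the method.
        return DNASeq(seqid, self.title, seq_revcmpl)


# Made-up example data, not taken from the FASTA file in the question.
original = DNASeq('demo', 'hypothetical example', 'GCGTTTAAAG')
result = original.revcmpl()
print(result.seqid)  # demo_revcmpl
print(result.seq)    # CTTTAAACGC
```

A KeyError from this version would point to a character in the sequence that is missing from ALPH, such as a lowercase base or an ambiguity code.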