Biopython: How to get the compound name of the pdb file of a protein?

I have been trying to solve it with the following expression:

structure.header['compound']

But all I get is the molecule ID, not its name!

There may be an answer here: http://biopython.org/wiki/The_Biopython_Structural_Bioinformatics_FAQ

keywords = structure.header['keywords']

The available keys are name, head, deposition_date, release_date, structure_method, resolution, structure_reference (maps to a list of references), journal_reference, author and compound (maps to a dictionary with various information about the crystallized compound).

To get the name of the crystal structure, i.e. the name shown on the PDB site, you can use:

print(structure.header['name'])

For example (assuming you have 1iah.pdb in your current working directory):

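from Bio.PDB import PDBParser  # import only what is needed instead of *
parser = PDBParser()
# First argument is an arbitrary structure id, second is the file path
structure = parser.get_structure('1IAH', '1iah.pdb')
print(structure.header['name'])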

will give you

' crystal structure of the atypical protein kinase domain of a trp ca-channel, chak (adp-mg complex)'

This matches the name shown here: http://www.rcsb.org/pdb/explore/explore.do?structureId=1IAH


Update in response to a comment

To get the name of the compound, you can use:

print(structure.header['compound']['1']['molecule'])
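
A file can describe more than one molecule, in which case the compound dictionary has several keys ('1', '2', ...). Below is a minimal sketch that collects the name for every molecule ID rather than hard-coding '1'. The helper name molecule_names and the sample_header values are hypothetical and only mimic the shape of the parsed header (compound maps molecule IDs to dictionaries with a 'molecule' entry); they are not taken from 1iah.pdb.

```python
def molecule_names(header):
    """Return {molecule ID: molecule name} for each entry in header['compound']."""
    # 'compound' maps molecule IDs (strings such as '1') to dictionaries of details.
    compound = header.get("compound", {})
    # Fall back to an empty string when an entry has no 'molecule' field.
    return {mol_id: info.get("molecule", "") for mol_id, info in compound.items()}


# Hypothetical header for illustration only; real values come from structure.header.
sample_header = {
    "name": "example structure",
    "compound": {
        "1": {"molecule": "example kinase", "chain": "a, b"},
        "2": {"molecule": "example peptide", "chain": "c"},
    },
}

names = molecule_names(sample_header)
for mol_id, name in sorted(names.items()):
    print(mol_id, name)
```

With a structure parsed as shown earlier, the same helper can be called as molecule_names(structure.header).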