Using RDKit in a for loop to produce PNG files, but the same PNG file is produced each time
I have a dataframe that looks like this:
    np_id    SMILES                                             standard_inchi_key
0   NPC4665  OC(=O)Cc1ccc(c(c1)O)O                              CFFZDZCDUFSOFZ-UHFFFAOYSA-N
2   NPC4668  OC(=O)C1=CCCNC1                                    QTDZOWFRBNTPQR-UHFFFAOYSA-N
32  NPC4962  CCCCCCCCC(=O)C                                     ZAJNGDIORYACQU-UHFFFAOYSA-N
36  NPC4986  CC1=CC[C@]23[C@H]1[C@H]1OC(=O)C(=C)[C@@H]1CC[C...  UVJYAKBJSGRTHA-CUZKYEQNSA-N
38  NPC5292  CC(=O)OC[C@]12CC[C@H]3[C@H]([C@]1(O)CC[C@@]2(O...  RGHQRULWHKEQHE-GRVQADPTSA-N
I am trying to generate 2D depictions of the molecules, using RDKit to process the SMILES. I wrote the following code:
from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem import AllChem
import pandas as pd

image_path = '/DTA_training/short/'
list_of_paths = []
for inchi_key in df['standard_inchi_key']:
    image_name = str(inchi_key)
    full_path = image_path+image_name
    list_of_paths.append(full_path)
df['paths'] = list_of_paths

image_size = 500
for full_path in df['paths']:
    for smile in df["SMILES"]:
        mol = Chem.MolFromSmiles(str(smile))
        if mol is None:
            print(("Unable to read original SMILES"+full_path))
        else:
            _discard = AllChem.Compute2DCoords(mol)
            Draw.MolToFile(mol, full_path, size=(image_size,image_size), fitImage=False, imageType='png')
The same PNG is produced every time. I can't work out what is wrong with my for loop. Can anyone offer some advice?
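In each iteration of the outer loop (for full_path in df['paths']), the inner loop draws every SMILES in the dataframe to that same path. Each image overwrites the previous one, so every file ends up showing the last molecule in the dataframe.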
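The overwrite can be reproduced without RDKit. The following is only a minimal sketch: the dict `files` stands in for the filesystem, and the file names and SMILES strings are made up for illustration.

```python
def nested(paths, smiles):
    """Mimic the nested loops from the question, with a dict as a fake filesystem."""
    files = {}  # path -> content most recently written to that path
    for path in paths:
        for s in smiles:
            # Every SMILES is written to the same path, so each write
            # replaces the one before it and only the last survives.
            files[path] = s
    return files


# Hypothetical inputs, not taken from the dataframe above
paths = ["a.png", "b.png", "c.png"]
smiles = ["CCO", "CCN", "CCC"]

print(nested(paths, smiles))
# {'a.png': 'CCC', 'b.png': 'CCC', 'c.png': 'CCC'}
```

Every path ends up holding the last SMILES, which matches the identical PNG files described in the question.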
Try this instead:
# The row labels are 0, 2, 32, ... so reset them to 0..len-1; otherwise df[...][n] would not find row n
df.reset_index(drop=True, inplace=True) # thanks to mnis
for n in range(len(df["paths"])):
    # Take the path and the SMILES from the same row
    full_path = df["paths"][n]
    mol = Chem.MolFromSmiles(df["SMILES"][n])
    if mol is None:
        print(("Unable to read original SMILES"+full_path))
    else:
        _discard = AllChem.Compute2DCoords(mol)
        Draw.MolToFile(mol, full_path, size=(image_size,image_size), fitImage=False, imageType='png')
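As a possible alternative, the two columns can be walked in step with zip, which pairs values by position and ignores the dataframe index, so reset_index should not be needed. This is a sketch under assumptions rather than a definitive version: `render_png` and `draw_all` are hypothetical helper names, and appending ".png" to the InChIKey is my addition (the paths in the question carry no extension).

```python
from pathlib import Path


def render_png(smiles, path, size=500):
    """Draw one SMILES to one PNG file; return False if RDKit cannot parse it."""
    # Imported inside the function so the pairing logic below can be
    # exercised with a stand-in renderer when RDKit is not installed.
    from rdkit import Chem
    from rdkit.Chem import AllChem, Draw

    mol = Chem.MolFromSmiles(str(smiles))
    if mol is None:
        return False
    AllChem.Compute2DCoords(mol)
    Draw.MolToFile(mol, str(path), size=(size, size), imageType='png')
    return True


def draw_all(smiles_list, keys, out_dir, render=render_png):
    """Write one image per (SMILES, key) pair; return the keys that failed."""
    out_dir = Path(out_dir)
    failed = []
    for smiles, key in zip(smiles_list, keys):
        # Assumption: name each file after its key, with a .png extension
        path = out_dir / f"{key}.png"
        if not render(smiles, path):
            failed.append(key)
    return failed
```

With the dataframe from the question this would be called as draw_all(df["SMILES"], df["standard_inchi_key"], image_path); the returned list holds the keys whose SMILES could not be parsed.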