Iterate over a text file and store the lowest value in a dictionary
I have a very large text file (Summary_post_docking.txt) that I want to filter to find the lowest score for each molecule.
This is what I came up with:
class Ranker:
    def __init__(self):
        self.results = {}
        with open('HTS_post_docking/Summary_post_docking.txt', 'r') as summary:
            for line in summary:
                score = float(line.split()[2])
                frag_name = str(line.split()[0].split('/')[9]).split('_')[0]
                if 0 >= score >= -200:
                    self.results[frag_name] = score
                    old = self.results[frag_name]
                    if frag_name in self.results.keys():
                        new = float(line.split()[2])
                        if new < old:
                            self.results[frag_name] = new
        print(self.results)
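Unfortunately, all this does is keep the last value it reads for each molecule, instead of overwriting the stored value only when a lower one is found.
str(line.split()[0].split('/')[9]).split('_')[0] is the name of the molecule, and float(line.split()[2]) is the score associated with it.
I want the script to store the molecule name as the key and the score as the value. For every line, whenever it finds a lower score for the same key, I want it to update the value to the lowest score found so far.
Edit:
I have added a few lines from the txt file: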
/scratch/ludovico3/spike/stalk/vs_docking_smiles/HTS_postdock/1_600/HTS_post_docking/Z385446130_pose1 SCORE_sum: -70.13763978228677 avg_score: -0.7 SD_score: 0.44 avg_GBSA: -5.92 SD_GBSA: 2.96 avg_RMSD: 9.75 SD_RMSD: 3.49
/scratch/ludovico3/spike/stalk/vs_docking_smiles/HTS_postdock/1_600/HTS_post_docking/Z385446130_pose2 SCORE_sum: -18.39638945104759 avg_score: -0.18 SD_score: 0.26 avg_GBSA: -5.2 SD_GBSA: 4.57 avg_RMSD: 34.57 SD_RMSD: 9.29
/scratch/ludovico3/spike/stalk/vs_docking_smiles/HTS_postdock/1_600/HTS_post_docking/Z385446130_pose3 SCORE_sum: -206.23402454507794 avg_score: -2.06 SD_score: 1.15 avg_GBSA: -6.8 SD_GBSA: 1.66 avg_RMSD: 4.05 SD_RMSD: 1.73
/scratch/ludovico3/spike/stalk/vs_docking_smiles/HTS_postdock/1_600/HTS_post_docking/Z385446130_pose4 SCORE_sum: -27.56483931516906 avg_score: -0.28 SD_score: 0.64 avg_GBSA: -2.2 SD_GBSA: 3.13 avg_RMSD: 15.43 SD_RMSD: 6.74
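To make the indexing explicit, here is a minimal sketch of how one of these lines splits into a name and a score. The helper name parse_line is hypothetical, the sample string is abbreviated to its first few fields, and taking the last path component with rsplit is a substitute for the hard-coded index 9 used in the question, not the original approach.

```python
def parse_line(line):
    """Return (molecule_name, score) for one summary line."""
    fields = line.split()
    # fields[0] is the path, fields[1] is the 'SCORE_sum:' label,
    # fields[2] is the score itself.
    pose = fields[0].rsplit('/', 1)[-1]   # e.g. 'Z385446130_pose1'
    name = pose.split('_')[0]             # e.g. 'Z385446130'
    score = float(fields[2])
    return name, score


# Abbreviated copy of the first sample line (trailing fields dropped).
sample = ("/scratch/ludovico3/spike/stalk/vs_docking_smiles/HTS_postdock/1_600/"
          "HTS_post_docking/Z385446130_pose1 SCORE_sum: -70.13763978228677 "
          "avg_score: -0.7 SD_score: 0.44")

name, score = parse_line(sample)
print(name, score)  # Z385446130 -70.13763978228677
```

Using the last path component means the parsing would keep working if the file were moved to a directory of a different depth.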
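I have updated the code following the hints!
The script needs to update the value associated with each key to the lowest score it finds.
Your old value may be None. Also, shouldn't the old value be tracked per molecule? Your code does not do that.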
class Ranker:
    def __init__(self):
        self.results = {}
        with open('HTS_post_docking/Summary_post_docking.txt', 'r') as summary:
            for line in summary:
                molecule_score = float(line.split()[2])
                molecule_name = str(line.split()[0].split('/')[9]).split('_')[0]
                # First score seen for this molecule: store it.
                if molecule_name not in self.results:
                    self.results[molecule_name] = molecule_score
                # Otherwise replace the stored score only if the new one is lower.
                elif self.results[molecule_name] > molecule_score:
                    self.results[molecule_name] = molecule_score
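As a small illustration of why the order of operations matters, the sketch below contrasts the assign-then-compare order from the question with the compare-then-assign order from this answer. The two (name, score) pairs are hypothetical in-memory data, rounded from the sample lines, so no file is needed.

```python
# Hypothetical data: two poses of the same molecule, scores rounded.
pairs = [("Z385446130", -70.14), ("Z385446130", -18.40)]

# Order used in the question: assign first, then compare.
buggy = {}
for name, score in pairs:
    buggy[name] = score        # the stored value is overwritten here
    old = buggy[name]          # so old is always equal to score
    if score < old:            # never true
        buggy[name] = score

# Compare first, and assign only when the new score is lower.
fixed = {}
for name, score in pairs:
    if name not in fixed or score < fixed[name]:
        fixed[name] = score

print(buggy)  # {'Z385446130': -18.4}
print(fixed)  # {'Z385446130': -70.14}
```

In the first loop the previous value is lost before it can be compared, which is why the last line read always wins.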
Solved!
class Ranker:
    def __init__(self):
        self.results = {}
        with open('HTS_post_docking/Summary_post_docking.txt', 'r') as summary:
            for line in summary:
                self.set_score(line)
        # Sort the molecules by score, lowest first.
        self.sorted = dict(sorted(self.results.items(), key=lambda item: item[1]))
        print(self.sorted)

    def set_score(self, line):
        new_score = float(line.split()[2])
        frag_name = str(line.split()[0].split('/')[9]).split('_')[0]
        # Ignore scores outside the accepted range.
        if not (0 >= new_score >= -250):
            return
        # Keep the stored score if it is already lower than the new one.
        if frag_name in self.results.keys():
            old_score = self.results[frag_name]
            if new_score > old_score:
                return
        self.results[frag_name] = new_score
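For comparison, the same logic can be sketched as a standalone function that takes any iterable of lines, which makes it easy to try on the sample lines without the file. The function name rank_lowest and the low/high parameters are assumptions for illustration, the sample lines use shortened paths, and min() with dict.get() is simply one alternative way to write the update.

```python
import math


def rank_lowest(lines, low=-250.0, high=0.0):
    """Map each molecule name to its lowest score within [low, high]."""
    best = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        score = float(fields[2])
        if not (low <= score <= high):
            continue  # outside the accepted score range
        name = fields[0].rsplit('/', 1)[-1].split('_')[0]
        # math.inf stands in for "no score stored yet".
        best[name] = min(score, best.get(name, math.inf))
    # Sort by score, lowest first.
    return dict(sorted(best.items(), key=lambda item: item[1]))


# Sample lines with shortened paths and only the fields that are used.
sample = [
    "HTS_post_docking/Z385446130_pose1 SCORE_sum: -70.13763978228677",
    "HTS_post_docking/Z385446130_pose2 SCORE_sum: -18.39638945104759",
    "HTS_post_docking/Z385446130_pose3 SCORE_sum: -206.23402454507794",
    "HTS_post_docking/Z385446130_pose4 SCORE_sum: -27.56483931516906",
]

ranked = rank_lowest(sample)
print(ranked)  # {'Z385446130': -206.23402454507794}
```

With the real file, the open file object could be passed directly as lines, since iterating over it yields one line at a time and avoids loading the whole file into memory.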