Unexpected String Translation from Dictionary
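I want to write a program that reads a file and converts a short, 4-character text string into a new 4-character string. Currently, I read in a tab-delimited text file containing two columns: "old tag" and "new tag". I'm able to successfully build a dictionary that maps the "old tag" as the key and the "new tag" as the value.
My problem comes in when I try to use maketrans() and str.translate(). Somehow my "old_tag" is getting converted into a "new_tag" that I don't even have in my dictionary! I've attached a screenshot of what I mean.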
"P020" should get converted to "AGAC" as outlined in my dictionary.
The error is that the variable "old_tag" should be converted to "AGAC", as outlined in my dictionary, but it is instead converted to "ACAC" (see the variable "new_tag"). I don't even have ACAC in my translation table!
Here's the function where I perform the string translation:
def translate_tag(f_in, old_tag, trn_dict):
    """Function to convert any old tags to their new format based on the translation dictionary (variable "trn_dict")."""
    try:
        # tag_lookup = trn_dict[old_tag]
        # trans = maketrans(old_tag, tag_lookup)
        trans = maketrans(old_tag, trn_dict[old_tag])  # Just did the above two lines on one line
    except KeyError:
        print("Error in file {}! The tag {} wasn't found in the translation table. "
              "Make sure the translation table is up to date. "
              "The program will continue with the rest of the file, but this tag will be skipped!".format(f_in, old_tag))
        return None
    new_tag = old_tag.translate(trans)
    return new_tag
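Here's my translation table. It's a tab-delimited text file, with the old tag in column 1 and the new tag in column 2. I translate from the old tag to the new tag.
The strange thing is that it converts just fine for some tags. For example, "P010" gets translated correctly. What is causing the problem?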
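You should not use maketrans, as it works on individual characters (per the official documentation). Make it a dictionary, with the original text (first column) as the key and the new text (second column) as the value.
Then you can look up any tag x with trn_dict[x], either wrapped in a try or after first testing if x in trn_dict.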
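To make the "individual characters" point concrete, here is a minimal sketch of why "P020" came out as "ACAC". It assumes Python 3's `str.maketrans`; the question's bare `maketrans` looks like the Python 2 `string` module function, which pairs characters in the same way as far as this example is concerned.

```python
# str.maketrans pairs characters position by position; it does not map
# the whole string "P020" to the whole string "AGAC".
old_tag = "P020"
new_tag = "AGAC"

table = str.maketrans(old_tag, new_tag)

# Show the mapping as characters. '0' occurs twice in old_tag, so its second
# pairing ('0' -> 'C') overwrites the first ('0' -> 'G').
mapping = {chr(src): chr(dst) for src, dst in table.items()}
print(mapping)

result = old_tag.translate(table)
print(result)  # ACAC

# "P010" -> "ATAT" only works by coincidence: both '0's pair with 'T'.
print("P010".translate(str.maketrans("P010", "ATAT")))  # ATAT
```

Any tag with a repeated character whose two positions need different replacements will be mistranslated this way, which is why only some tags looked correct. The plain dictionary lookup below avoids the issue.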
database = """P001 AAAA
P002 AAAT
P003 AAAG
P004 AAAC
P005 AATA
P006 AATT
P007 AATG
P008 AATC
P009 ATAA
P010 ATAT
P011 ATAG
P012 ATAC
P013 ATTA
P014 ATTT
P015 ATTG
P016 ATTC
P017 AGAA
P018 AGAT
P019 AGAG
P020 AGAC
P021 AGTA
P022 AGTT
P023 AGTG
P024 AGTC
""".splitlines()
trn_dict = {line.split()[0]: line.split()[1] for line in database}

def translate_tag(old_tag, trn_dict):
    """Function to convert any old tags to their new format based on the translation dictionary (variable "trn_dict")."""
    try:
        return trn_dict[old_tag]
    except KeyError:
        print("Error! The tag {} wasn't found in the translation table. "
              "Make sure the translation table is up to date. "
              "The program will continue with the rest of the file, but this tag will be skipped!".format(old_tag))
        return None

print(translate_tag('P020', trn_dict))
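displays the expected value AGAC.
(The string-to-list-to-dictionary code is a quick hack to get the data into the program, and is not really part of this how-to.)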
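Since the real data lives in a tab-delimited file, the dictionary could also be built with the standard `csv` module instead of the inline string. This is only a sketch under assumptions: the helper name `load_translation_table` and the file name `tags.txt` are hypothetical, and an in-memory `io.StringIO` stands in for the file so that the example runs on its own.

```python
import csv
import io


def load_translation_table(lines):
    """Build {old_tag: new_tag} from tab-delimited lines (old tag, then new tag)."""
    table = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        old_tag, new_tag = row[0].strip(), row[1].strip()
        table[old_tag] = new_tag
    return table


def translate_tag(old_tag, trn_dict):
    """Return the new tag, or None if old_tag is not in the table."""
    return trn_dict.get(old_tag)


# Stand-in for an open file; with a real file one would pass
# open("tags.txt", newline="") instead (the file name is hypothetical).
sample = io.StringIO("P010\tATAT\nP020\tAGAC\n")
trn_dict = load_translation_table(sample)

print(translate_tag("P020", trn_dict))  # AGAC
print(translate_tag("P999", trn_dict))  # None
```

Here `dict.get` replaces the try/except and returns None for a missing tag, so the caller decides whether and how to report the problem.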