Unexpected String Translation from Dictionary

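I want to write a program that reads a file and converts a short, 4-character text string into a new 4-character string. Currently, I read in a tab-delimited text file with two columns: "old tag" and "new tag". I can successfully build a dictionary that maps each "old tag" (key) to its "new tag" (value).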

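My problem arises when I try to use maketrans() and str.translate(). Somehow, my "old_tag" is getting converted to a "new_tag" that I don't even have in my dictionary! I've attached a screenshot of what I mean.

"P020" should get converted to "AGAC" as outlined in my dictionary.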

The error is that variable "old_tag" should get converted to "AGAC" as outlined in my dictionary, but it's instead getting converted to "ACAC" (look at variable "new_tag"). I don't even have ACAC in my translation table!

Here's my function that performs the string translation:

def translate_tag(f_in, old_tag, trn_dict):
    """Function to convert any old tags to their new format based on the translation dictionary (variable "trn_dict")."""
    try:
        # tag_lookup = trn_dict[old_tag]
        # trans = maketrans(old_tag, tag_lookup)
        trans = maketrans(old_tag, trn_dict[old_tag])  # Just did the above two lines on one line
    except KeyError:
        print("Error in file {}! The tag {} wasn't found in the translation table. "
              "Make sure the translation table is up to date. "
              "The program will continue with the rest of the file, but this tag will be skipped!".format(f_in,
                                                                                                          old_tag))
        return None
    new_tag = old_tag.translate(trans)
    return new_tag

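Here's my translation table. It's a tab-delimited text file in which the old tag is column 1 and the new tag is column 2. I translate from the old tag to the new tag.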

The weird thing is that it converts just fine for some tags. For example, "P010" gets translated correctly. What could be causing the problem?

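You should not use maketrans, because it works on individual characters (per the official documentation). Make it a dictionary instead, with the original text (first column) as the key and the new text (second column) as the value.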
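To see why a character-level mapping produces "ACAC", here is a minimal sketch. It uses Python 3's `str.maketrans`, which is an assumption: the question's bare `maketrans` looks like Python 2's `string.maketrans`, which appears to resolve repeated characters the same way, judging by the output reported in the question. The tag "P020" contains "0" twice, so the table receives two competing entries for "0" and the later one wins.

```python
# Each character of the first argument maps to the character at the same
# position in the second: P->A, 0->G, 2->A, 0->C. The second "0" entry
# overwrites the first, so every "0" becomes "C".
table = str.maketrans("P020", "AGAC")
bad = "P020".translate(table)
print(bad)  # ACAC, not AGAC

# "P010" -> "ATAT" only works by coincidence: both "0" entries map to "T".
table = str.maketrans("P010", "ATAT")
ok = "P010".translate(table)
print(ok)  # ATAT
```

Any tag whose repeated characters need different replacements will break in this way, which is why only some tags appear to translate correctly.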

Then you can look up any tag x with trn_dict[x], either wrapped in a try or pre-tested with if x in trn_dict.

database = """P001    AAAA
P002    AAAT
P003    AAAG
P004    AAAC
P005    AATA
P006    AATT
P007    AATG
P008    AATC
P009    ATAA
P010    ATAT
P011    ATAG
P012    ATAC
P013    ATTA
P014    ATTT
P015    ATTG
P016    ATTC
P017    AGAA
P018    AGAT
P019    AGAG
P020    AGAC
P021    AGTA
P022    AGTT
P023    AGTG
P024    AGTC
""".splitlines()

# Use "line" rather than "str" so the built-in str type is not shadowed
trn_dict = {line.split()[0]: line.split()[1] for line in database}

def translate_tag(old_tag, trn_dict):
    """Function to convert any old tags to their new format based on the translation dictionary (variable "trn_dict")."""
    try:
        return trn_dict[old_tag]
    except KeyError:
        # Report the missing tag and signal the skip by returning None
        print("Error! The tag {} wasn't found in the translation table. "
              "Make sure the translation table is up to date. "
              "This tag will be skipped!".format(old_tag))
        return None

print(translate_tag('P020', trn_dict))

This displays the expected value, AGAC.

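(The code that goes from string to list to dictionary is just a quick hack to get the data into the program; it isn't really part of the solution.)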
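For the real tab-delimited file, the dictionary could be built along these lines. This is only a sketch under stated assumptions: the helper name `load_translation_table` and the file name in the comment are hypothetical, the file is assumed to have no header row, and an in-memory `io.StringIO` stands in for the file so the example runs on its own. It also shows the `in` pre-test as an alternative to try/except.

```python
import csv
import io

def load_translation_table(handle):
    """Build {old_tag: new_tag} from a tab-delimited file object."""
    reader = csv.reader(handle, delimiter="\t")
    # Skip blank lines; each remaining row is expected to have two columns.
    return {row[0]: row[1] for row in reader if row}

# Stand-in for open("translation_table.txt", newline="") (hypothetical file name).
sample = io.StringIO("P019\tAGAG\nP020\tAGAC\n")
trn_dict_from_file = load_translation_table(sample)

tag = "P020"
if tag in trn_dict_from_file:  # pre-test instead of try/except
    print(trn_dict_from_file[tag])  # AGAC
```

If the file does start with a header line, it would need to be skipped (for example with `next(reader)`) before building the dictionary.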