Eval on tricky string forms of defaultdict

Somehow I received a text file containing a defaultdict that was saved with str(). The UTF-8 string is:

defaultdict(<class 'set'>, {'protection': {'1058c_204062v_00:39:16->00:39:18_ko'}, 'protect': {'50c_45523v_00:01:22->00:01:24_ko', '5457c_150765v_00:08:34->00:08:37_ko', '5457c_144739v_00:34:25->00:34:28_ko', '1058c_204062v_00:39:36->00:39:39_ko', '504c_68856v_00:15:47->00:15:49_ko'}})

When I use eval() on it, it throws:

Traceback (most recent call last):
  File "consolidate.py", line 9, in <module>
    print (eval(translation_counter), ast.literal_eval(location))
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ast.py", line 46, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 1
    defaultdict(<class 'set'>, {'protection': {'1058c_204062v_00:39:16->00:39:18_ko'}, 'protect': {'50c_45523v_00:01:22->00:01:24_ko', '5457c_150765v_00:08:34->00:08:37_ko', '5457c_144739v_00:34:25->00:34:28_ko', '1058c_204062v_00:39:36->00:39:39_ko', '504c_68856v_00:15:47->00:15:49_ko'}})
                ^
SyntaxError: invalid syntax

I also tried ast.literal_eval(), and it throws the same error as above.

Then I tried to escape the underscores with .replace('_', r'\_'), and it threw this:

Traceback (most recent call last):
  File "consolidate.py", line 9, in <module>
    print (eval(translation_counter), ast.literal_eval(location.replace('_', r'\_')))
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ast.py", line 46, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 1
    defaultdict(<class 'set'>, {'protection': {'1058c\_204062v\_00:39:16->00:39:18\_ko'}, 'protect': {'50c\_45523v\_00:01:22->00:01:24\_ko', '5457c\_150765v\_00:08:34->00:08:37\_ko', '5457c\_144739v\_00:34:25->00:34:28\_ko', '1058c\_204062v\_00:39:36->00:39:39\_ko', '504c\_68856v\_00:15:47->00:15:49\_ko'}})
                ^
SyntaxError: invalid syntax

Full code:

# -*- coding: utf-8 -*-

from collections import defaultdict, Counter
import ast

with open('related.txt', 'r', encoding='utf8') as fin:
    for line in fin:
        location = line.strip()
        print (eval(location))

The output of head -n1 related.txt looks like this:

defaultdict(<class 'set'>, {'protection': {'1058c_204062v_00:39:16->00:39:18_ko'}, 'protect': {'50c_45523v_00:01:22->00:01:24_ko', '5457c_150765v_00:08:34->00:08:37_ko', '5457c_144739v_00:34:25->00:34:28_ko', '1058c_204062v_00:39:36->00:39:39_ko', '504c_68856v_00:15:47->00:15:49_ko'}})

That's because <class 'set'> is not valid Python syntax, so it cannot be evaluated. You need to extract the class name from it:

import re
# s holds one line of the file; capture the type name inside <class '...'>
p = re.compile(r"^defaultdict\(<class '(\w+)'>")
c = p.findall(s)[0]

Then replace the <class '...'> part with the bare name of the class:

new_s = s.replace("<class '%s'>" % c, c)

The resulting string can then be passed to eval() (with defaultdict imported), which gives this result:

defaultdict(set,
        {'protect': {'1058c_204062v_00:39:36->00:39:39_ko',
          '504c_68856v_00:15:47->00:15:49_ko',
          '50c_45523v_00:01:22->00:01:24_ko',
          '5457c_144739v_00:34:25->00:34:28_ko',
          '5457c_150765v_00:08:34->00:08:37_ko'},
         'protection': {'1058c_204062v_00:39:16->00:39:18_ko'}})
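
If the file comes from a source you don't fully trust, a variation that avoids eval() altogether may be worth considering. The sketch below is an alternative to the replace-then-eval approach above, not the same technique: it splits the string into the factory name and the dict body with a regular expression, parses the body with ast.literal_eval(), and rebuilds the defaultdict by hand. The names FACTORIES, PATTERN and parse_defaultdict are made up for illustration, and the shortened sample line is an assumption standing in for a line of related.txt.

```python
import ast
import re
from collections import defaultdict

# Hypothetical whitelist of default factories we are willing to rebuild.
FACTORIES = {"set": set, "list": list, "dict": dict, "int": int}

# Matches "defaultdict(<class 'NAME'>, BODY)" and captures NAME and BODY.
PATTERN = re.compile(r"^defaultdict\(<class '(\w+)'>, (.*)\)$", re.DOTALL)


def parse_defaultdict(text):
    """Rebuild a defaultdict from its str() form without calling eval()."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError("not a defaultdict string")
    factory_name, body = match.groups()
    factory = FACTORIES[factory_name]  # raises KeyError for unknown factories
    # literal_eval only accepts literals, so the dict body cannot run code.
    return defaultdict(factory, ast.literal_eval(body))


# Shortened sample in the same shape as a line of the file (assumed data).
line = "defaultdict(<class 'set'>, {'protection': {'a_1'}, 'protect': {'b_2', 'c_3'}})"
result = parse_defaultdict(line)
print(result["protect"])   # the set containing 'b_2' and 'c_3'
print(result["missing"])   # an empty set, since the default factory is restored
```

One caveat: an empty set is printed as set(), which ast.literal_eval() accepts only on Python 3.9 and later, so on the Python 3.5 shown in the traceback this sketch would fail on lines that contain empty sets.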