Read Bunch() from string
I have the following string in a report file:
"Bunch(conditions=['s1', 's2', 's3', 's4', 's5', 's6'], durations=[[30.0], [30.0], [30.0], [30.0], [30.0], [30.0]], onsets=[[172.77], [322.77], [472.77], [622.77], [772.77], [922.77]])"
I would like to turn it into a Bunch() object or a dict, so that I can access the information inside (via either my_var.conditions or my_var["conditions"]).
This works very well with eval():
eval("Bunch(conditions=['s1', 's2', 's3', 's4', 's5', 's6'], durations=[[30.0], [30.0], [30.0], [30.0], [30.0], [30.0]], onsets=[[172.77], [322.77], [472.77], [622.77], [772.77], [922.77]])")
but I would like to avoid using it.
I tried writing a few string replacements to convert it to dict syntax and then parse it with json.loads(), but that looks very hackish and will break as soon as a future string contains a new field; e.g.:
("{" + "Bunch(conditions=['s1', 's2', 's3', 's4', 's5', 's6'], durations=[[30.0], [30.0], [30.0], [30.0], [30.0], [30.0]], onsets=[[172.77], [322.77], [472.77], [622.77], [772.77], [922.77]])"[6:-1] + "}").replace("conditions=", "'conditions':")
You get the idea.
Do you know if there is a better way to parse it?
This is my ugly piece of code, please check:
import re
import json

l = "Bunch(conditions=['s1', 's2', 's3', 's4', 's5', 's6'], durations=[[30.0], [30.0], [30.0], [30.0], [30.0], [30.0]], onsets=[[172.77], [322.77], [472.77], [622.77], [772.77], [922.77]])"
# strip the leading "Bunch(" and the trailing ")"
body = l[6:-1]
# split on "=" and on the space before each keyword;
# the lookahead keeps the keyword's first letter intact
sb = re.split(r"=| (?=[a-zA-Z])", body)
# quote the keyword names so they become JSON keys
temp = ['"{}"'.format(x) if x.isalpha() else x for x in sb]
temp2 = ','.join(temp)
temp3 = temp2.replace('",[', '":[')
temp4 = temp3.replace(',,', ',')
# JSON requires double quotes around strings
temp5 = temp4.replace("'", '"')
temp6 = "{%s}" % temp5
rslt = json.loads(temp6)
Finally, the output:
rslt
Out[12]:
{'conditions': ['s1', 's2', 's3', 's4', 's5', 's6'],
 'durations': [[30.0], [30.0], [30.0], [30.0], [30.0], [30.0]],
 'onsets': [[172.77], [322.77], [472.77], [622.77], [772.77], [922.77]]}
rslt["conditions"]
Out[13]: ['s1', 's2', 's3', 's4', 's5', 's6']
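Overall, I think re is the package you need, but given my limited experience with it, I couldn't apply it very well here. Hopefully someone else will offer a more elegant solution.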
FYI, you said you could easily use eval to get what you want, but when I tried it I got TypeError: 'str' object is not callable. Which Python version are you using? (I tried Python 2.7 and Python 3.3, and neither worked.)
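On the TypeError mentioned above: eval() looks up the name Bunch in whatever scope it runs in, so it only works if Bunch resolves to a callable class there. The message 'str' object is not callable is what Python reports when Bunch happens to be bound to a string, which may be what happened in that session (this is a guess, not something the post confirms). A minimal sketch, where the Bunch class is a hypothetical stand-in rather than the real one:

```python
class Bunch(object):
    """Hypothetical stand-in for the real Bunch class (assumed interface)."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


text = "Bunch(conditions=['s1', 's2'], durations=[[30.0], [30.0]])"

# With a callable named Bunch in scope, eval() succeeds.
obj = eval(text, {"Bunch": Bunch})

# If the name Bunch is bound to a string instead, the call fails.
try:
    eval(text, {"Bunch": "not a class"})
    error = None
except TypeError as exc:
    error = str(exc)
```

Either way this still executes whatever the string contains, so it does not address the wish to avoid eval().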
This pyparsing code defines a parsing expression for your Bunch declaration.
from pyparsing import (pyparsing_common, Suppress, Keyword, Forward, quotedString,
Group, delimitedList, Dict, removeQuotes, ParseResults)
# define pyparsing parser for the Bunch declaration
LBRACK,RBRACK,LPAR,RPAR,EQ = map(Suppress, "[]()=")
integer = pyparsing_common.integer
real = pyparsing_common.real
ident = pyparsing_common.identifier
# define a recursive expression for nested lists
listExpr = Forward()
listItem = real | integer | quotedString.setParseAction(removeQuotes) | Group(listExpr)
listExpr << LBRACK + delimitedList(listItem) + RBRACK
# define an expression for the Bunch declaration
BUNCH = Keyword("Bunch")
arg_defn = Group(ident + EQ + listItem)
bunch_decl = BUNCH + LPAR + Dict(delimitedList(arg_defn))("args") + RPAR
Here is the parser run against your sample input:
# run the sample input as a test
sample = """Bunch(conditions=['s1', 's2', 's3', 's4', 's5', 's6'],
durations=[[30.0], [30.0], [30.0], [30.0], [30.0], [30.0]],
onsets=[[172.77], [322.77], [472.77], [622.77], [772.77], [922.77]])"""
bb = bunch_decl.parseString(sample)
# print the parsed output as-is
print(bb)
Gives:
['Bunch', [['conditions', ['s1', 's2', 's3', 's4', 's5', 's6']],
['durations', [[30.0], [30.0], [30.0], [30.0], [30.0], [30.0]]],
['onsets', [[172.77], [322.77], [472.77], [622.77], [772.77], [922.77]]]]]
With pyparsing, you can also add a parse-time callback, so that pyparsing does the tokens->Bunch conversion for you:
# define a simple placeholder class for Bunch
class Bunch(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __repr__(self):
        return "Bunch:(%s)" % ', '.join("%r: %s" % item for item in vars(self).items())

# add this as a parse action, and pyparsing will autoconvert the parsed data to a Bunch
bunch_decl.addParseAction(lambda t: Bunch(**t.args.asDict()))
Now the parser will give you an actual Bunch instance:
[Bunch:('durations': [[30.0], [30.0], [30.0], [30.0], [30.0], [30.0]],
'conditions': ['s1', 's2', 's3', 's4', 's5', 's6'],
'onsets': [[172.77], [322.77], [472.77], [622.77], [772.77], [922.77]])]
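Another way to avoid eval(), sketched here under the assumption that every argument value is a plain Python literal (lists, strings, numbers): parse the string with the standard-library ast module, check that it is a single call to Bunch with keyword arguments only, and run ast.literal_eval on each keyword value. The helper name parse_bunch is made up for this illustration.

```python
import ast
from types import SimpleNamespace


def parse_bunch(text, name="Bunch"):
    """Parse 'Bunch(key=literal, ...)' into a dict without executing code."""
    call = ast.parse(text.strip(), mode="eval").body
    # Accept only a bare call to the expected name, keyword arguments only.
    if not (isinstance(call, ast.Call)
            and isinstance(call.func, ast.Name)
            and call.func.id == name
            and not call.args):
        raise ValueError("expected %s(key=value, ...)" % name)
    result = {}
    for kw in call.keywords:
        if kw.arg is None:  # a '**mapping' argument has no keyword name
            raise ValueError("** arguments are not supported")
        # literal_eval raises ValueError for anything that is not a literal
        result[kw.arg] = ast.literal_eval(kw.value)
    return result


sample = ("Bunch(conditions=['s1', 's2'], "
          "durations=[[30.0], [30.0]], onsets=[[172.77], [322.77]])")
parsed = parse_bunch(sample)   # dict access: parsed["conditions"]
ns = SimpleNamespace(**parsed)  # attribute access: ns.conditions
```

New fields in future strings need no changes to the helper, as long as their values stay literals; a value such as a function call makes literal_eval raise ValueError instead of running it.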