Map operators extracted from substring
I have a list of dicts:
print (L)
[{0: 'x==1', 1: 'y==2', 2: 'z!=1'}, {0: 'x==1', 1: 'y<=3', 2: 'z>1'}]
I want to create tuples holding the value before the operator, the operator itself, and the value after it:
#first step
wanted = [[('x', '==', '1'), ('y', '==', '2'), ('z', '!=', '1')],
[('x', '==', '1'), ('y', '<=', '3'), ('z', '>', '1')]]
Then I want to map the second element of each tuple to a function through a dictionary of operators:
import operator
ops = {'>': operator.gt,
'<': operator.lt,
'>=': operator.ge,
'<=': operator.le,
'==': operator.eq,
'!=': operator.ne}
#expected final output
wanted = [[('x', <built-in function eq>, '1'),
('y', <built-in function eq>, '2'),
('z', <built-in function ne>, '1')],
[('x', <built-in function eq>, '1'),
('y', <built-in function le>, '3'),
('z', <built-in function gt>, '1')]]
I tried:
import re
L = [[re.findall(r'(.*)([<>=!]+)(.*)', v)[0] for k, v in x.items()] for x in L]
print (L)
[[('x=', '=', '1'), ('y=', '=', '2'), ('z!', '=', '1')],
[('x=', '=', '1'), ('y<', '=', '3'), ('z', '>', '1')]]
L = [[ops[y[1]] for y in x] for x in L]
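But the middle substring (the operator) is matched incorrectly, so looking the operator up in ops then fails.
What is the correct regex for this? Or is there another possible solution, for example with str.partition? I am open to all possible solutions.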
If your input really is this simple, I think the most straightforward approach is to split on the operator characters:
In [1]: import re
In [2]: data = [{0: 'x==1', 1: 'y==2', 2: 'z!=1'}, {0: 'x==1', 1: 'y<=3', 2: 'z>1'}]
In [3]: rgx = re.compile(r'([<>=!]+)')
In [4]: [[rgx.split(v) for v in d.values()] for d in data]
Out[4]:
[[['x', '==', '1'], ['y', '==', '2'], ['z', '!=', '1']],
[['x', '==', '1'], ['y', '<=', '3'], ['z', '>', '1']]]
Note that if you add a capturing group to the splitter regex, the matched separator is included in the result!
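To make the effect of the capturing group concrete, here is a minimal sketch comparing the two forms of the splitter on one of the sample strings. The variable names are illustrative only.

```python
import re

text = 'y<=3'

# Without a capturing group, the separator is discarded.
without_group = re.split(r'[<>=!]+', text)

# With a capturing group, the matched separator is kept in the result.
with_group = re.split(r'([<>=!]+)', text)

print(without_group)  # ['y', '3']
print(with_group)     # ['y', '<=', '3']
```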
Then, to finish it off:
In [10]: import operator
In [11]: ops = {'>': operator.gt,
...: '<': operator.lt,
...: '>=': operator.ge,
...: '<=': operator.le,
...: '==': operator.eq,
...: '!=': operator.ne}
...:
In [12]: parsed = [[rgx.split(v) for v in d.values()] for d in data]
In [13]: [[(x, ops[op], y) for x,op,y in ps] for ps in parsed]
Out[13]:
[[('x', <function _operator.eq>, '1'),
('y', <function _operator.eq>, '2'),
('z', <function _operator.ne>, '1')],
[('x', <function _operator.eq>, '1'),
('y', <function _operator.le>, '3'),
('z', <function _operator.gt>, '1')]]
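The question also asks about str.partition. A regex-free variant of the same splitting idea might look like the sketch below. The helper name parse and the ordering of OPS are assumptions made for illustration: two-character operators are tried first so that '<=' is not mistaken for '<'.

```python
import operator

# Two-character operators come first; dicts keep insertion order (Python 3.7+).
OPS = {
    '>=': operator.ge,
    '<=': operator.le,
    '==': operator.eq,
    '!=': operator.ne,
    '>': operator.gt,
    '<': operator.lt,
}


def parse(expr):
    """Hypothetical helper: split expr at the first operator symbol found in it."""
    for symbol, func in OPS.items():
        left, sep, right = expr.partition(symbol)
        if sep:  # partition returns an empty separator when symbol is absent
            return left, func, right
    raise ValueError(f'no operator found in {expr!r}')


data = [{0: 'x==1', 1: 'y==2', 2: 'z!=1'}, {0: 'x==1', 1: 'y<=3', 2: 'z>1'}]
result = [[parse(v) for v in d.values()] for d in data]
print(result)
```

This only suits inputs with exactly one operator per string, as in the sample data.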
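Change the greedy first group of the regex so that it matches only a single word character (use \w+ instead of \w if the names can be longer than one character):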
L = [{0: 'x==1', 1: 'y==2', 2: 'z!=1'}, {0: 'x==1', 1: 'y<=3', 2: 'z>1'}]
L = [[re.findall(r'(\w)([<>=!]+)(.*)', v)[0] for k, v in x.items()] for x in L]
[[(y[0],ops[y[1]],y[2]) for y in x] for x in L]
[[('x', <function _operator.eq>, '1'),
('y', <function _operator.eq>, '2'),
('z', <function _operator.ne>, '1')],
[('x', <function _operator.eq>, '1'),
('y', <function _operator.le>, '3'),
('z', <function _operator.gt>, '1')]]
Or, as jezrael suggested in the comments, as a one-line list comprehension:
L = [[[(z[0], ops[z[1]], z[2]) for z in re.findall(r'(\w)([<>=!]+)(.*)', v)][0] for k, v in x.items()] for x in L]
Or, since the keys are not needed, iterate over the values directly:
L = [[[(z[0], ops[z[1]], z[2]) for z in re.findall(r'(\w)([<>=!]+)(.*)', v)][0] for v in x.values()] for x in L]
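One caveat with (\w): it captures exactly one character, which is fine for x, y and z but truncates longer names. The sketch below uses a made-up input, 'total>=10', which is not part of the question, to show the difference from (\w+).

```python
import re

text = 'total>=10'  # hypothetical input with a multi-character name

# (\w) captures a single word character, so only the last letter survives.
single = re.findall(r'(\w)([<>=!]+)(.*)', text)

# (\w+) captures the whole name; it cannot run into the operator,
# because operator characters are not word characters.
multi = re.findall(r'(\w+)([<>=!]+)(.*)', text)

print(single)  # [('l', '>=', '10')]
print(multi)   # [('total', '>=', '10')]
```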
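The problem is that * is a greedy quantifier. In x==1, the first group (.*) therefore takes as many characters as it can, including the first =, and leaves only a single = character for the second group ([<>=!]+).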
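The greedy behaviour can be reproduced in isolation. As a point of comparison, the sketch below also shows the lazy quantifier .*? for the first group; that variant is an addition for illustration and not one of the fixes proposed in this answer.

```python
import re

text = 'x==1'

# Greedy: (.*) grabs 'x=' and leaves just one '=' for the operator group.
greedy = re.findall(r'(.*)([<>=!]+)(.*)', text)

# Lazy: (.*?) stops as early as possible, so the operator group gets '=='.
lazy = re.findall(r'(.*?)([<>=!]+)(.*)', text)

print(greedy)  # [('x=', '=', '1')]
print(lazy)    # [('x', '==', '1')]
```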
Solutions:
Assuming the non-operator groups never contain <, >, = or !, use a negated character set instead of .*:
re.findall(r'([^<>=!]+)([<>=!]+)([^<>=!]+)', v)
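Or capture the operator with an alternation (vertical bars). List the two-character operators first: alternatives are tried left to right, so putting < before <= would split y<=3 into ('y', '<', '=3').
re.findall(r'(.*)((?:<=|>=|==|!=|<|>))(.*)', v)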
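As a quick check of the two ideas above, the sketch below runs a negated-character-set pattern and an alternation pattern over the sample strings. The alternation here lists the two-character operators first and uses a lazy first group; both choices are assumptions of this sketch, since Python tries alternatives left to right rather than picking the longest one.

```python
import re

# Negated character set: operands may not contain operator characters.
negated = re.compile(r'([^<>=!]+)([<>=!]+)([^<>=!]+)')

# Alternation: longer operators are listed before their one-character prefixes.
alternation = re.compile(r'(.*?)(<=|>=|==|!=|<|>)(.*)')

samples = ['x==1', 'y<=3', 'z!=1', 'z>1']

negated_results = [negated.findall(s)[0] for s in samples]
alternation_results = [alternation.findall(s)[0] for s in samples]

print(negated_results)
print(alternation_results)
# Both print:
# [('x', '==', '1'), ('y', '<=', '3'), ('z', '!=', '1'), ('z', '>', '1')]
```

The alternation form has the side benefit of rejecting operator-like runs such as '=!' that the character-set form would accept.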