Destructuring bind for regex matches
In elisp, how can I get a destructuring bind for a regex match?
For example,
;; what is the equivalent of this with destructuring?
(with-temp-buffer
  (save-excursion (insert "a b"))
  (re-search-forward "\\(a\\) \\(b\\)")
  (cons (match-string 1)
        (match-string 2)))

;; trying to do something like the following
(with-temp-buffer
  (save-excursion (insert "a b"))
  (cl-destructuring-bind (a b) (re-search-forward "\\(a\\) \\(b\\)")
    (cons a b)))
I was thinking that, if there is no other way, I would have to write a macro to expand the matches.
Here's one way: first extend pcase to accept a new re-match pattern, with a definition such as:
(pcase-defmacro re-match (re)
  "Matches a string if that string matches RE.
RE should be a regular expression (a string).
It can use the special syntax \\(?VAR: to bind a sub-match
to variable VAR.  All other subgroups will be treated as shy.

Multiple uses of this macro in a single `pcase' are not optimized
together, so don't expect lex-like performance.  But in order for
such optimization to be possible in some distant future, back-references
are not supported."
  (let ((start 0)
        (last 0)
        (new-re '())
        (vars '())
        (gn 0))
    ;; Find every group opener: \( optionally followed by ?NAME:
    (while (string-match "\\\\(\\(?:\\?\\([-[:alnum:]]*\\):\\)?" re start)
      (setq start (match-end 0))
      (let ((beg (match-beginning 0))
            (name (match-string 1 re)))
        ;; Skip false positives, either backslash-escaped or within [...].
        (when (subregexp-context-p re start last)
          (cond
           ((null name)
            ;; Unnamed group: rewrite it as a shy group.
            (push (concat (substring re last beg) "\\(?:") new-re))
           ((string-match "\\`[0-9]" name)
            (error "Variable can't start with a digit: %S" name))
           (t
            ;; Named group: rewrite it as an explicitly numbered group.
            (let* ((var (intern name))
                   (id (cdr (assq var vars))))
              (unless id
                (setq gn (1+ gn))
                (setq id gn)
                (push (cons var gn) vars))
              (push (concat (substring re last beg) (format "\\(?%d:" id))
                    new-re))))
          (setq last start))))
    (push (substring re last) new-re)
    (setq new-re (mapconcat #'identity (nreverse new-re) ""))
    `(and (pred stringp)
          (app (lambda (s)
                 (save-match-data
                   (when (string-match ,new-re s)
                     (vector ,@(mapcar (lambda (x) `(match-string ,(cdr x) s))
                                       vars)))))
               (,'\` [,@(mapcar (lambda (x) (list '\, (car x))) vars)])))))
Once this is done, you can use it as follows:
(pcase X
  ((re-match "\\(?var:[[:alpha:]]*\\)=\\(?val:.*\\)")
   (cons var val)))
or
(pcase-let
    (((re-match "\\(?var:[[:alpha:]]*\\)=\\(?val:.*\\)") X))
  (cons var val))
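This hasn't been heavily tested and, as mentioned in the docstring, it doesn't work as efficiently as it (c|sh)ould when matching one string against several regexps at the same time. Also, you only get the matched substrings, not their positions. Finally, it applies the regexp search to a string, whereas in many/most cases regexp searches are run in a buffer. But you may still find it useful.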
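For the buffer case from the question, a simpler route may be enough: collect the sub-matches into a list after the search and destructure that list. The following is only a minimal sketch under that assumption; my-match-strings and my-demo-pair are hypothetical names made up for illustration, not part of Emacs or of the re-match pattern above.

```elisp
;; Sketch only: `my-match-strings' and `my-demo-pair' are hypothetical
;; helper names, not part of Emacs.
(require 'cl-lib)

(defun my-match-strings (count)
  "Return a list of sub-matches 1..COUNT from the last search.
Reads the match data against the current buffer."
  (mapcar #'match-string (number-sequence 1 count)))

(defun my-demo-pair ()
  "Search a temporary buffer and destructure the two sub-matches."
  (with-temp-buffer
    ;; `save-excursion' leaves point at the start after inserting.
    (save-excursion (insert "a b"))
    ;; NOERROR is t, so a failed search returns nil instead of signaling.
    (when (re-search-forward "\\(a\\) \\(b\\)" nil t)
      (cl-destructuring-bind (a b) (my-match-strings 2)
        (cons a b)))))

(my-demo-pair) ; evaluates to ("a" . "b")
```

Unlike the pcase pattern, this keeps the match data available, so match-beginning and match-end can still be used for positions right after the search.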