Destructuring bind for regex matches

In elisp, how can I get a destructuring bind for regex matches?

For example,

;; what is the equivalent of this with destructuring?
(with-temp-buffer
  (save-excursion (insert "a b"))
  (re-search-forward "\\(a\\) \\(b\\)")
  (cons (match-string 1)
        (match-string 2)))

;; trying to do something like the following
(with-temp-buffer
  (save-excursion (insert "a b"))
  (cl-destructuring-bind (a b) (re-search-forward "\\(a\\) \\(b\\)")
    (cons a b)))

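I'm thinking I will have to write a macro to expand the matches if there isn't another way.

Here is one way: first extend pcase to accept a new re-match pattern, defined as follows: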

(pcase-defmacro re-match (re)
  "Matches a string if that string matches RE.
RE should be a regular expression (a string).
It can use the special syntax \\(?VAR: to bind a sub-match
to variable VAR.  All other subgroups will be treated as shy.

Multiple uses of this macro in a single `pcase' are not optimized
together, so don't expect lex-like performance.  But in order for
such optimization to be possible in some distant future, back-references
are not supported."
  (let ((start 0)
        (last 0)
        (new-re '())
        (vars '())
        (gn 0))
    (while (string-match "\\\\(\\(?:\\?\\([-[:alnum:]]*\\):\\)?" re start)
      (setq start (match-end 0))
      (let ((beg (match-beginning 0))
            (name (match-string 1 re)))
        ;; Skip false positives, either backslash-escaped or within [...].
        (when (subregexp-context-p re start last)
          (cond
           ((null name)
            (push (concat (substring re last beg) "\\(?:") new-re))
           ((string-match "\\`[0-9]" name)
            (error "Variable can't start with a digit: %S" name))
           (t
            (let* ((var (intern name))
                   (id (cdr (assq var vars))))
              (unless id
                (setq gn (1+ gn))
                (setq id gn)
                (push (cons var gn) vars))
              (push (concat (substring re last beg) (format "\\(?%d:" id))
                    new-re))))
          (setq last start))))
    (push (substring re last) new-re)
    (setq new-re (mapconcat #'identity (nreverse new-re) ""))
    `(and (pred stringp)
          (app (lambda (s)
                 (save-match-data
                   (when (string-match ,new-re s)
                     (vector ,@(mapcar (lambda (x) `(match-string ,(cdr x) s))
                                       vars)))))
               (,'\` [,@(mapcar (lambda (x) (list '\, (car x))) vars)])))))

Once that is done, you can use it as follows:

(pcase X
  ((re-match "\\(?var:[[:alpha:]]*\\)=\\(?val:.*\\)")
   (cons var val)))

(pcase-let
    (((re-match "\\(?var:[[:alpha:]]*\\)=\\(?val:.*\\)") X))
  (cons var val))
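
To make the mechanism more concrete, here is a hand-written sketch of roughly what the first pcase form turns into once the re-match pattern is expanded. It was worked out by reading the macro, not copied from macroexpand output, so treat the exact shape as an approximation. The function name my-re-match-expansion-demo is hypothetical and only serves to give X a value.

```elisp
;; Hand-expanded approximation of
;;   (pcase X ((re-match "\\(?var:[[:alpha:]]*\\)=\\(?val:.*\\)") (cons var val)))
;; `my-re-match-expansion-demo' is a hypothetical name used for illustration.
(defun my-re-match-expansion-demo (x)
  "Return (VAR . VAL) if string X looks like \"VAR=VAL\", else nil."
  (pcase x
    ((and (pred stringp)
          ;; The named groups have been rewritten to explicitly numbered ones.
          (app (lambda (s)
                 (save-match-data
                   (when (string-match "\\(?1:[[:alpha:]]*\\)=\\(?2:.*\\)" s)
                     ;; The macro collects variables with `push', so the
                     ;; most recently seen variable comes first.
                     (vector (match-string 2 s) (match-string 1 s)))))
               ;; Destructure the vector back into the two variables.
               `[,val ,var]))
     (cons var val))))

(my-re-match-expansion-demo "foo=bar")
```

When the string does not match, the lambda returns nil, the vector pattern fails, and pcase falls through to the next clause (here, none, so the result is nil).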

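This hasn't been heavily tested, and as mentioned in the docstring, it doesn't work as efficiently as it could (or should) when matching a string against several regexps at the same time. Also, you only get the matched substrings, not their positions. Finally, it applies the regexp search to a string, whereas in many/most cases regexp searches are performed in a buffer. But you may still find it useful.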
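
For the buffer case from the question, a much smaller approach may be enough: collect the sub-matches of the last search into a list and hand that list to cl-destructuring-bind. The following is only a minimal sketch; my-match-groups and my-demo-buffer-destructure are hypothetical helper names, not existing Emacs functions, and the caller has to say how many groups to collect.

```elisp
(require 'cl-lib)

;; Hypothetical helper: not part of Emacs.
(defun my-match-groups (count)
  "Return a list of sub-matches 1..COUNT of the last search in the current buffer."
  (mapcar #'match-string (number-sequence 1 count)))

;; Hypothetical demo mirroring the example from the question.
(defun my-demo-buffer-destructure ()
  "Search a temporary buffer and destructure the two sub-matches."
  (with-temp-buffer
    (save-excursion (insert "a b"))
    ;; The third argument t makes the search return nil instead of
    ;; signaling an error when nothing matches.
    (when (re-search-forward "\\(a\\) \\(b\\)" nil t)
      (cl-destructuring-bind (a b) (my-match-groups 2)
        (cons a b)))))

(my-demo-buffer-destructure)
```

If positions are needed rather than substrings, the same idea should work with match-beginning and match-end in place of match-string.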