Racket pattern matching of lists

Question

I am trying to pattern match against lists, but for some reason I get an unexpected match when I do the following:

> (define code '(h1 ((id an-id-here)) Some text here))
> (define code-match-expr '(pre ([class brush: python]) ...))
> (match code
    [code-match-expr #t]
    [_ #f])
#t

Question: why does code match code-match-expr?

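Practical use case

I tried this in the Racket REPL because I actually want to solve a different, real problem: using Pollen's Pygments wrapper functions to highlight code that will later be output as HTML. For that I wrote the following code, which is where the problem shows up: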

(define (read-post-from-file path)
  (Post-from-content (replace-code-xexprs (parse-markdown path))))

(define (replace-code-xexprs list-of-xexprs)
  ;; define known languages
  (define KNOWN-LANGUAGE-SYMBOLS
    (list 'python
          'racket
          'html
          'css
          'javascript
          'erlang
          'rust))
  ;; check if it matches for a single language's match expression
  ;; if it matches any language, return that language's name as a symbol
  (define (get-matching-language an-xexpr)
    (define (matches-lang-match-expr? an-xexpr lang-symbol)
      (display "XEXPR:") (displayln an-xexpr)
      (match an-xexpr
        [`(pre ([class brush: ,lang-symbol]) (code () ,more ...)) lang-symbol]
        [`(pre ([class brush: ,lang-symbol]) ,more ...) lang-symbol]
        [_ #f]))

    (ormap (lambda (lang-symbol)
             ;; (display "trying to match ")
             ;; (display an-xexpr)
             ;; (display " against ")
             ;; (displayln lang-symbol)
             (matches-lang-match-expr? an-xexpr lang-symbol))
           KNOWN-LANGUAGE-SYMBOLS))

  ;; replace code in an xexpr with highlightable code
  ;; TODO: What happens if the code is in a lower level of the xexpr?
  (define (replace-code-in-single-xexpr an-xexpr)
    (let ([matching-language (get-matching-language an-xexpr)])
      (cond [matching-language (code-highlight an-xexpr matching-language)]
            [else an-xexpr])))

  ;; apply the check to all xexpr
  (map replace-code-in-single-xexpr list-of-xexprs))

(define (code-highlight language code)
  (highlight language code))

In this example, I am parsing a Markdown file with the following content:

# Code Demo

```python
def hello():
    print("Hello World!")
```

I get the following xexprs:

1.

(h1 ((id code-demo)) Code Demo)

2.

(pre ((class brush: python)) (code () def hello():
    print("Hello World!")))

However, for some reason none of them match.

Answer 1

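match is syntax and does not evaluate its patterns. Since code-match-expr appears in the pattern as a bare identifier, match binds the whole value (the result of evaluating code) to a new variable named code-match-expr and then evaluates the rest of the clause because the pattern matched. The result is always #t.

Notice that the second pattern, the symbol _, is the same kind of pattern. It also matches the whole value, but _ is special in that it does not get bound the way code-match-expr does.

The important point is that the variable code-match-expr you defined is never used: because match binds a new variable with the same name, your original binding is shadowed in the body of the match clause.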
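The shadowing can be seen directly in a small sketch. The names expected, binds-anything and equal-to-expected? are made up for illustration; the == pattern from racket/match is one way to compare against an existing value instead of binding a new variable.

```racket
#lang racket

;; A value we might naively hope to use as a pattern.
(define expected '(pre ([class brush: python])))

;; An identifier in pattern position is a binding pattern: it matches
;; any value and shadows the outer `expected` inside the clause.
(define (binds-anything v)
  (match v
    [expected (list 'matched expected)]))

;; `(== expr)` matches only values that are equal? to the value of expr,
;; so here the outer `expected` really is consulted.
(define (equal-to-expected? v)
  (match v
    [(== expected) #t]
    [_ #f]))

(binds-anything '(h1 ((id an-id-here)) Some text here))
; ==> '(matched (h1 ((id an-id-here)) Some text here))

(equal-to-expected? '(h1 ((id an-id-here)) Some text here))
; ==> #f

(equal-to-expected? '(pre ((class brush: python))))
; ==> #t
```

Note that == compares whole values; it does not interpret a ... inside the stored list as an ellipsis.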

Code that works the way you intended might look like this:

(define (test code)
  (match code 
    [`(pre ([class brush: python]) ,more ...) #t]
    [_ #f]))

(test '(h1 ((id an-id-here)) Some text here))
; ==> #f

(test '(pre ((class brush: python))))
; ==> #t

(test '(pre ((class brush: python)) a b c))
; ==> #t

As you can see, the pattern ,more ... means zero or more elements, and the kind of bracket used is ignored, since in Racket [] is the same as () and {}.
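To make the "zero or more" behavior concrete, here is a small sketch that returns what more is bound to. The function name rest-of-pre is hypothetical and only serves the illustration.

```racket
#lang racket

;; `,more ...` collects zero or more trailing elements into a list
;; and binds that list to `more`.
(define (rest-of-pre v)
  (match v
    [`(pre ([class brush: python]) ,more ...) more]
    [_ #f]))

(rest-of-pre '(pre ((class brush: python))))
; ==> '()

;; Curly braces read as ordinary parentheses in quoted data.
(rest-of-pre '(pre {(class brush: python)} a b c))
; ==> '(a b c)

(rest-of-pre '(h1 ((id an-id-here)) Some text here))
; ==> #f
```

Keep in mind that the empty list is a true value in Racket, so a caller that only tests the result for truthiness still treats the first case as a match.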

EDIT

You still have it a little backwards. In this code:

(define (matches-lang-match-expr? an-xexpr lang-symbol)
  (display "XEXPR:") (displayln an-xexpr)
  (match an-xexpr
    [`(pre ([class brush: ,lang-symbol]) (code () ,more ...)) lang-symbol]
    [`(pre ([class brush: ,lang-symbol]) ,more ...) lang-symbol]
    [_ #f]))

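When the pattern is processed, lang-symbol is unquoted, so it matches whatever value appears in that position and is bound to it as a new variable in that clause. It has nothing to do with the already-bound variable of the same name, because match does not use variables in patterns, it creates them. You then return this new variable. Thus: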

(matches-lang-match-expr? '(pre ([class brush: jiffy]) bla bla bla) 'ignored-argument)
; ==> jiffy

Here is something that does what you want:

(define (get-matching-language an-xexpr)
  (define (get-language an-xexpr)
    (match an-xexpr
      [`(pre ([class brush: ,lang-symbol]) (code () ,more ...)) lang-symbol]
      [`(pre ([class brush: ,lang-symbol]) ,more ...) lang-symbol]
      [_ #f]))
  (let* ((matched-lang-symbol (get-language an-xexpr))
         (in-known-languages (memq matched-lang-symbol KNOWN-LANGUAGE-SYMBOLS)))
    (and in-known-languages (car in-known-languages))))

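Again: match repurposes quasiquote syntax for something completely different from building list structure. It uses it to match literals and to capture the unquoted symbols as variables.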
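If you would rather keep the original shape of checking one known language at a time, a possible variation (not part of the code above) is to escape into an == pattern, which compares against the existing variable rather than creating a new one. This is only a sketch; matches-lang? is a hypothetical name.

```racket
#lang racket

(define KNOWN-LANGUAGE-SYMBOLS
  '(python racket html css javascript erlang rust))

;; `,(== lang-symbol)` unquotes into an ordinary pattern that requires
;; the element to be equal? to the value of the argument lang-symbol.
(define (matches-lang? an-xexpr lang-symbol)
  (match an-xexpr
    [`(pre ([class brush: ,(== lang-symbol)]) ,more ...) lang-symbol]
    [_ #f]))

;; Return the first known language that matches, or #f.
(define (get-matching-language an-xexpr)
  (for/or ([lang (in-list KNOWN-LANGUAGE-SYMBOLS)])
    (matches-lang? an-xexpr lang)))

(get-matching-language '(pre ([class brush: python]) (code () x)))
; ==> 'python

(get-matching-language '(pre ([class brush: jiffy]) bla bla bla))
; ==> #f
```

The memq version above does the same job with a single match, so this variant is mainly useful for seeing how a pattern can refer to an outer variable.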

Answer 2

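Make sure you are clear about what you are matching. In Racket x-expressions, attribute names are symbols, but the values are strings. So the expression you need to match looks like (pre ([class "brush: js"]) ___), not (pre ([class brush: js]) ___).

To match that string and extract the part after "brush: ", you can use a pregexp match pattern. Here is a snippet that Frog uses to extract the language to give to Pygments: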

(for/list ([x xs])
  (match x
    [(or `(pre ([class ,brush]) (code () ,(? string? texts) ...))
         `(pre ([class ,brush]) ,(? string? texts) ...))
     (match brush
       ;; backslashes must be doubled inside a Racket string literal
       [(pregexp "\\s*brush:\\s*(.+?)\\s*$" (list _ lang))
        `(div ([class ,(str "brush: " lang)])
              ,@(pygmentize (apply string-append texts) lang
                            #:python-executable python-executable
                            #:line-numbers? line-numbers?
                            #:css-class css-class))]
       [_ `(pre ,@texts)])]
    [x x]))

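(Here pygmentize is a function defined elsewhere in the Frog source code; it is a wrapper that runs Pygments as a separate process and pipes text to and from it. You could substitute some other way of calling Pygments, or any other syntax highlighter. That part is not relevant to your question about match. I mention it only so that it does not become a distraction or another embedded question. :))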
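For experimenting without Frog or Pygments, the language-extraction part can be isolated. This is a minimal sketch, assuming the Markdown parser yields the class attribute as a single string such as "brush: python"; the helper name xexpr-language is hypothetical and not part of Frog.

```racket
#lang racket

;; Returns the language as a symbol, or #f when the x-expression is not
;; a pre element carrying a "brush: <lang>" class attribute.
(define (xexpr-language x)
  (match x
    [(or `(pre ([class ,(? string? brush)]) (code () ,_ ...))
         `(pre ([class ,(? string? brush)]) ,_ ...))
     (match brush
       ;; the second sub-pattern destructures the pregexp match result
       [(pregexp #px"^\\s*brush:\\s*(.+?)\\s*$" (list _ lang))
        (string->symbol lang)]
       [_ #f])]
    [_ #f]))

(xexpr-language
 '(pre ([class "brush: python"])
       (code () "def hello():\n    print(\"Hello World!\")")))
; ==> 'python

(xexpr-language '(h1 ((id "code-demo")) "Code Demo"))
; ==> #f
```

The returned symbol can then be checked against a list of known languages with memq, as in the first answer.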
