Parsing an infixed string with a macro

I am trying to evaluate an infix expression that is held in a string.

Some sample data to evaluate my code against:

(def data {:Location "US-NY-Location1"
           :Priority 3})

(def qual "(Location = \"US\")")

I would like the qual string to be converted into something like this and evaluated by Clojure:

(= (:Location data) "US")

I wrote the following macro to do this:

(defmacro parse-qual [[data-key op val] data-map]
  `(~op ((keyword (str (quote ~data-key))) ~data-map) ~val))

and a helper function:

(defn eval-qual [qual-str data]
  (eval `(parse-qual ~(clojure.edn/read-string qual-str) ~data)))

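(eval-qual qual data) gives me the expected result.

This is the first macro I have written, and I am still trying to wrap my head around all the quoting and unquoting.

  1. Is there a more efficient way to achieve the above? (Perhaps without needing a macro at all.)

  2. How can I extend the macro to handle nested expressions, such as ((Location = "US") or (Priority > 2))? Any pointers would be appreciated. I am currently experimenting with tree-seq to solve this.

  3. How can I make this more robust and more graceful in case of an invalid qual string?

I also wrote a second iteration of the parse-qual macro, shown below: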

(defmacro parse-qual-2 [qual-str data-map]
  (let [[data-key op val] (clojure.edn/read-string qual-str)]
    `(~op ((keyword (str (quote ~data-key))) ~data-map) ~val)))

which throws the following on macroexpand:

playfield.core> (macroexpand `(parse-qual-2 qual data))
java.lang.ClassCastException: clojure.lang.Symbol cannot be cast to java.lang.String

and I have no idea how to debug it!

Some additional info:

macroexpand of parse-qual at the REPL gives me the following:

playfield.core> (macroexpand
 `(parse-qual ~(clojure.edn/read-string qual) data))

(= ((clojure.core/keyword (clojure.core/str (quote Location))) playfield.core/data) "US")

Thanks to @Alan Thompson, I was able to write this as a function, as follows, which also allows nested expressions to be evaluated.

(def qual "(Location = \"US\")")
(def qual2 "((Location = \"US\") or (Priority > 2))")
(def qual3 "(Priority > 2)")
(def qual4 "(((Location = \"US\") or (Priority > 2)) and (Active = true))")

(defn eval-qual-2 [qual-str data]
  (let [[l op r] (clojure.edn/read-string qual-str)]
    (cond
      (and (seq? l)
           (seq? r)) (eval (list op (list eval-qual-2 (str l) data) (list eval-qual-2 (str r) data)))
      (seq? l)       (eval (list op (list eval-qual-2 (str l) data) r))
      (seq? r)       (eval (list op (list (keyword  l) data) (list eval-qual-2 (str r) data)))
      :else          (eval (list op (list (keyword  l) data) r)))))

(eval-qual-2 qual data) ; => false
(eval-qual-2 qual2 data) ; => true
(eval-qual-2 qual3 data) ; => true
(eval-qual-2 qual4 data) ; => false

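You don't need or want a macro for this. A plain function can process data like this.

Macros are only for transforming source code: when you write a macro you are effectively adding a compiler extension.

To transform data, just use a plain function.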
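
The difference can be seen in a small sketch. The names form-type and value-type are made up for this illustration. A macro is handed the unevaluated form, so it sees the symbol qual rather than the string it names, which is likely why parse-qual-2 in the question fails when it passes its argument to read-string:

```clojure
(def qual "(Location = \"US\")")

;; Hypothetical macro: reports the class of the form it was handed.
;; Macro arguments arrive unevaluated, at expansion time.
(defmacro form-type [form]
  (.getName ^Class (class form)))

;; Hypothetical function: reports the class of the value it was handed.
;; Function arguments arrive evaluated, at run time.
(defn value-type [v]
  (.getName ^Class (class v)))

(form-type qual)   ; the macro sees the symbol `qual`
(value-type qual)  ; the function sees the string that `qual` names
```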

Here is an outline of how you could do it:

(ns tst.demo.core
  (:use demo.core tupelo.core tupelo.test)
  (:require
    [clojure.tools.reader.edn :as edn] ))

(def data {:Location "US-NY-Location1"
           :Priority 3})

(def qual "(Location = \"US\")")

(dotest
  (let-spy [
        ast       (spyx (edn/read-string qual))
        ident-str (first ast)
        ident-kw  (keyword ident-str)
        op        (second ast)
        data-val  (last ast)
        expr      (list op (list ident-kw data) data-val)
        result (eval expr)
        ] 
    ))

Result:

----------------------------------
   Clojure 1.9.0    Java 10.0.1
----------------------------------

(edn/read-string qual) => (Location = "US")
ast => (Location = "US")
ident-str => Location
ident-kw => :Location
op => =
data-val => "US"
expr => (= (:Location {:Location "US-NY-Location1", :Priority 3}) "US")
result => false

Note that you still need to fix up the "US" part of the location before this will give you a true result.
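
Two possible ways to do that are sketched below, assuming the location string is always dash-separated with the country code first (an assumption based only on the sample data). The country helper is hypothetical:

```clojure
(require '[clojure.string :as str])

(def data {:Location "US-NY-Location1"
           :Priority 3})

;; Hypothetical helper: the first dash-separated segment of a location string.
(defn country [location]
  (first (str/split location #"-")))

;; Option 1: compare only the country segment.
(= (country (:Location data)) "US")

;; Option 2: treat the qualifier value as a prefix.
(str/starts-with? (:Location data) "US")
```

Which one fits depends on what "=" is supposed to mean in the qual language, something only the author of the qualifiers can decide.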

The docs for let-spy are here and here.


Update

For nested expressions, you generally want to use postwalk.
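
A minimal sketch of that idea, not a definitive implementation: postwalk visits the innermost lists first, so each (lhs op rhs) triple can be replaced by its result on the way back up. The ops table and the eval-infix name are assumptions made for this example, and no eval is involved:

```clojure
(require '[clojure.edn :as edn]
         '[clojure.walk :as walk])

(def data {:Location "US-NY-Location1"
           :Priority 3})

;; Supported operators (an assumed whitelist for this sketch).
;; `or` and `and` are macros, so they are wrapped in functions.
(def ops {'=   =
          '>   >
          'or  (fn [a b] (or a b))
          'and (fn [a b] (and a b))})

(defn eval-infix [qual-str data]
  (walk/postwalk
    (fn [node]
      (if (and (seq? node) (= 3 (count node)))
        (let [[l op r] node
              f (or (ops op)
                    (throw (ex-info "Unsupported operator" {:op op})))]
          ;; A symbol on the left is a field name to look up in data;
          ;; anything else is a sub-result that was already computed.
          (f (if (symbol? l) (get data (keyword l)) l) r))
        node))
    (edn/read-string qual-str)))

(eval-infix "((Location = \"US\") or (Priority > 2))" data)
```

Note that a comparison such as > will still throw if the field is missing from data, so a real version would want to check for that.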

Also, don't forget the Clojure CheatSheet!

Here is an example that uses Instaparse to define a grammar for the criteria and to parse a string input into a syntax tree:

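;; Aliases inferred from the calls used in this answer.
(require '[instaparse.core :as p]
         '[instaparse.transform :as pt]
         '[clojure.edn :as edn]
         '[clojure.walk])

(def expr-parser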
  (p/parser
    "<S> = SIMPLE | COMPLEX
     SIMPLE = <'('> NAME <' '> OP <' '> VAL <')'>
     COMPLEX = <'('> S <' '> BOOLOP <' '> S <')'>
     <BOOLOP> = 'or' | 'and'
     NAME = #'[A-Za-z]+'
     VAL = #'[0-9]+' | #'\".+?\"' | 'true' | 'false'
     OP = '=' | '>'"))

And a function that parses the string and then translates parts of the parsed tree, to make later evaluation easier:

(defn parse [s]
  (pt/transform
    {:NAME keyword
     :OP   (comp resolve symbol)
     :VAL  edn/read-string}
    (expr-parser s)))

Some sample output:

(parse "(Location = \"US\")")
=> ([:SIMPLE :Location #'clojure.core/= "US"])
(parse "(((Location = \"US\") or (Priority > 2)) and (Active = true))")
=>
([:COMPLEX
  [:COMPLEX [:SIMPLE :Location #'clojure.core/= "US"] "or" [:SIMPLE :Priority #'clojure.core/> 2]]
  "and"
  [:SIMPLE :Active #'clojure.core/= true]])

And then a function that evaluates the criteria against a map, without using eval:

(defn evaluate [m expr]
  (clojure.walk/postwalk
    (fn [v]
      (cond
        (and (coll? v) (= :SIMPLE (first v)))
        (let [[_ k f c] v]
          (f (get m k) c))

        (and (coll? v) (= :COMPLEX (first v)))
        (let [[_ lhs op rhs] v]
          (case op
            "or" (or lhs rhs)
            "and" (and lhs rhs)))

        :else v))
    (parse expr)))

(evaluate {:location "US"} "(location = \"US\")")
=> (true)

It works for nested expressions too:

(evaluate
  {:distance 1 :location "MS"}
  "((distance > 0) and ((location = \"US\") or ((distance = 1) and (location = \"MS\"))))")
=> (true)

"How can I make this more robust and more graceful in case of an invalid qual string?"

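Another benefit of using Instaparse (or something similar) is "free" error reporting. Instaparse's errors are pretty-printed at the REPL, but they can also be treated as maps that contain the failure details.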

(defn parse [s]
  (let [parsed (expr-parser s)]
    (or (p/get-failure parsed) ;; check for failure
        (pt/transform
          {:NAME keyword
           :OP   (comp resolve symbol)
           :VAL  edn/read-string}
          parsed))))

(parse "(distance > 2") ;; missing closing paren
=> Parse error at line 1, column 14:
(distance > 2
             ^
Expected:
")" (followed by end-of-string)

Overall, as long as your parser grammar is relatively limited, this approach should be safer than eval-ing arbitrary input.
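
If adding a parser library is not an option, a similar "report instead of throw" behaviour can be roughly sketched with plain clojure.edn. The read-qual name and the shape of the returned map are assumptions for this illustration, and the check covers only the outermost form:

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical helper: returns {:ok form} for a well-shaped qualifier,
;; or {:error message} instead of throwing.
(defn read-qual [qual-str]
  (try
    (let [form (edn/read-string qual-str)]
      (if (and (seq? form) (= 3 (count form)))
        {:ok form}
        {:error "expected a list of the form (lhs op rhs)"}))
    (catch Exception e
      ;; e.g. an unbalanced paren makes the EDN reader throw
      {:error (.getMessage e)})))

(read-qual "(Priority > 2)")  ; well-formed input
(read-qual "(Priority > 2")   ; missing closing paren
```

The error messages here come from the EDN reader and are far less precise than Instaparse's line and column reports.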