tree-sitter match for similar structures

I am trying to create a tree-sitter grammar for the Minecraft function syntax.

The structure of the language looks like this:

command @e[key=value] other args

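I have a problem with the value in the second argument of the example above (the target selector). That value can be many things: a string, a number, a boolean, or one of two similar object structures (NBT and scoreboard objects).

Here is an example of each: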

NBT

{key:value}

Scoreboard object

{key=number} // where number is: N, ..N, N.., or N..N
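To make the four range forms concrete, here is a small standalone JavaScript sketch, separate from the grammar itself. The helper name `classifyRange` and the form labels are hypothetical and only for illustration; the `NUMBER` pattern is assumed to mirror the `number` rule of the grammar in this question.

```javascript
// Illustration only: not part of the tree-sitter grammar.
// NUMBER is assumed to mirror the grammar's `number` regex.
const NUMBER = String.raw`-?\d+(?:\.\d+)?`;

// One anchored pattern per range form: N, ..N, N.., N..N
// (the labels are hypothetical names chosen for this sketch).
const RANGE_FORMS = [
  ["exact", new RegExp(`^${NUMBER}$`)],
  ["at_most", new RegExp(`^\\.\\.${NUMBER}$`)],
  ["at_least", new RegExp(`^${NUMBER}\\.\\.$`)],
  ["between", new RegExp(`^${NUMBER}\\.\\.${NUMBER}$`)],
];

// Returns the label of the first form that matches, or null if none does.
function classifyRange(text) {
  const hit = RANGE_FORMS.find(([, pattern]) => pattern.test(text));
  return hit ? hit[0] : null;
}

for (const sample of ["5", "..5", "1..", "1..5"]) {
  console.log(sample, classifyRange(sample));
}
```

In the grammar these forms are expressed as a `choice` of `seq` rules rather than as regexes; the sketch only shows which inputs each form is meant to accept.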

My grammar file contains the following code:

// unrelated code removed

module.exports = grammar({
  name: "mcfunction",
  rules: {
    root: $ => repeat(
      choice(
        $.command
      )
    ),
    command: $ => prec.right(seq(
      field("command_name", $.identifier),
      repeat(
        choice(
          $.selector
        )
      ),
      "\n"
    )),
    identifier: $ => /[A-Za-z][\w-]+/,
    number: $ => prec(1, /-?\d+(\.\d+)?/),
    boolean: $ => choice(
      "true",
      "false"
    ),
    string: $ => seq(
      "\"",
      repeat(
        choice(
          $._escape_sequence,
          /[^"]/
        )
      ),
      "\""
    ),
    _escape_sequence: $ => seq("\\", "\""),
    selector: $ => seq(
      token(
        seq(
          "@",
          choice(
            "p", "a", "e", "s", "r"
          )
        )
      ),
      optional(
        seq(
          token.immediate("["),
          optional(
            repeat(
              seq(
                $.selector_option,
                optional(",")
              )
            )
          ),
          "]"
        )
      ),
    ),
    selector_option: $ => seq(
      $.selector_key,
      "=",
      $.selector_value
    ),
    selector_key: $ => /[a-z_-]+/,
    selector_value: $ => choice(
      $.item,
      $.path,
      $.selector_key,
      $.selector_number,
      $.number,
      $.boolean,
      $.selector_object
    ),
    selector_number: $ => prec.right(1, choice(
      seq(
        "..",
        $.number
      ),
      seq(
        $.number,
        "..",
        $.number
      ),
      seq(
        $.number,
        ".."
      ),
      $.number
    )),
    selector_object: $ => choice(
      seq(
        "{",
        repeat(
          seq(
            $.selector_score,
            optional(",")
          )
        ),
        "}"
      ),
      seq(
        "{",
        repeat(
          seq(
            $.selector_nbt,
            optional(",")
          )
        ),
        "}"
      )
    ),
    selector_nbt: $ => seq(
      $.nbt_object_key,
      ":",
      $.nbt_object_value
    ),
    selector_score: $ => seq(
      field("selector_score_key", $.selector_key),
      "=",
      field("selector_score_value", $.selector_number)
    ),
    _namespace: $ => /[a-z_-]+:/,
    item: $ => seq(
      $._namespace,
      $.selector_key
    ),
    path: $ => seq(
      choice($.item, /[a-z_]+/),
      repeat1(
        token(seq("/", /[a-z_]/))
      )
    ),
    nbt: $ => choice(
      $.nbt_array,
      $.nbt_object
    ),
    nbt_object: $ => seq(
      "{",
      repeat(
        seq(
          $.nbt_object_key,
          ":",
          $.nbt_object_value,
          optional(",")
        )
      ),
      "}"
    ),
    nbt_array: $ => seq(
      "[",
      repeat(
        seq(
          $.nbt_object_value,
          optional(",")
        )
      ),
      "]"
    ),
    nbt_object_key: $ => choice(
      $.string,
      $.number,
      $.identifier
    ),
    nbt_object_value: $ => choice(
      $.string,
      $.nbt_number,
      $.boolean,
      $.nbt
    ),
    nbt_number: $ => seq(
      $.number,
      field("nbt_number_suffix", optional(choice("l","s","d","f","b")))
    )
  }
});

However, if I compile the grammar and parse test @e[scores={example=1..}], I get:

(root [0, 0] - [6, 0]
  (command [0, 0] - [1, 0]
    command_name: (identifier [0, 0] - [0, 4])
    (selector [0, 5] - [0, 29]
      (selector_option [0, 8] - [0, 28]
        (selector_key [0, 8] - [0, 14])
        (selector_value [0, 15] - [0, 28]
          (selector_object [0, 15] - [0, 28]
            (ERROR [0, 16] - [0, 27]
              (nbt_object_key [0, 16] - [0, 23]
                (identifier [0, 16] - [0, 23]))))))))
tests/test.mcfunction  0 ms    (ERROR [0, 16] - [0, 27])

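Expected: instead of the ERROR there should be a selector_score, and it should have a score_key and a score_value.

This does not happen if I remove the selector_nbt sequence from selector_object. However, if I parse a command that uses NBT data (with both sequences, or with only selector_nbt), there is no error.

What am I doing wrong?

I solved this by using a choice between the two conflicting keys, like this: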

choice(
  alias($.key_1, $.key_2),
  $.key_2
)

ahlinc answered on GitHub:

You can fix the error in the above grammar by giving the selector_key terminal lexer precedence over the identifier terminal, like this:

selector_key: $ => token(prec(1, /[a-z_-]+/)),

But note that you are using regexps that clash:

identifier: $ => /[A-Za-z][\w-]+/,
selector_key: $ => token(prec(1, /[a-z_-]+/)),

If it's impossible to rewrite the above regexps so that they don't conflict, you may need the workaround described here: #1287 (reply in thread)
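
The overlap between the two patterns can be checked outside tree-sitter with plain JavaScript. The sketch below is an illustration under assumptions: the regexes are anchored copies of the `identifier` and `selector_key` patterns from the grammar, and the helper name `matchingTokens` is hypothetical. It only shows which words both patterns accept; it does not model how tree-sitter's lexer chooses between them.

```javascript
// Anchored copies of the two token patterns from the grammar in the question.
const identifier = /^[A-Za-z][\w-]+$/;
const selectorKey = /^[a-z_-]+$/;

// Hypothetical helper: lists which of the two patterns accept the whole word.
function matchingTokens(word) {
  const names = [];
  if (identifier.test(word)) names.push("identifier");
  if (selectorKey.test(word)) names.push("selector_key");
  return names;
}

for (const word of ["example", "Example", "_x"]) {
  console.log(word, matchingTokens(word));
}
```

A lowercase word such as `example` is accepted by both patterns, which is the clash the answer points out. A word with an uppercase letter matches only `identifier`, and a word starting with an underscore matches only `selector_key`.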