How do I extract useful information from the payload of a GADT / existential type?

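I'm trying to use Menhir's incremental parsing API and introspection APIs in a generated parser. Say I want to determine the semantic value associated with a particular LR(1) stack entry; i.e., a token previously consumed by the parser.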

Given an abstract parsing checkpoint, encapsulated in Menhir's type 'a env, I can extract a "stack element" from the LR automaton; it looks like this:

type element =
  | Element: 'a lr1state * 'a * position * position -> element

The type element describes one entry in the stack of the LR(1) automaton. In a stack element of the form Element (s, v, startp, endp), s is a (non-initial) state and v is a semantic value. The value v is associated with the incoming symbol A of the state s. In other words, the value v was pushed onto the stack just before the state s was entered. Thus, for some type 'a, the state s has type 'a lr1state and the value v has type 'a ...

In order to do anything useful with the value v, one must gain information about the type 'a, by inspection of the state s. So far, the type 'a lr1state is abstract, so there is no way of inspecting s. The inspection API (§9.3) offers further tools for this purpose.

Okay, great! So I go digging into the inspection API:

The type 'a terminal is a generalized algebraic data type (GADT). A value of type 'a terminal represents a terminal symbol (without a semantic value). The index 'a is the type of the semantic values associated with this symbol ...

type _ terminal =
| T_A : unit terminal
| T_B : int terminal

The type 'a nonterminal is also a GADT. A value of type 'a nonterminal represents a nonterminal symbol (without a semantic value). The index 'a is the type of the semantic values associated with this symbol ...

type _ nonterminal =
| N_main : thing nonterminal

Piecing these together, I get something like the following (where "command" is one of my grammar's nonterminals, so N_command is a string nonterminal):

let current_command (env : 'a env) =
   let rec f i =
      match Interpreter.get i env with
      | None -> None
      | Some Interpreter.Element (lr1state, v, _startp, _endp) ->
      match Interpreter.incoming_symbol lr1state with
      | Interpreter.N Interpreter.N_command -> Some v
      | _ -> f (i + 1)
   in
   f 0

Unfortunately, this gives me a very confusing type error:

File "src/incremental.ml", line 110, characters 52-53:
Error: This expression has type string but an expression was expected of type
         string
       This instance of string is ambiguous:
       it would escape the scope of its equation

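This is a bit above my level! I'm pretty sure I understand why I can't do what I was trying to do above; but I don't understand what my options are. In fact, the Menhir manual specifically mentions this complexity: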

This function can be used to gain access to the semantic value v in a stack element Element (s, v, _, _). Indeed, by case analysis on the symbol incoming_symbol s, one gains information about the type 'a, hence one obtains the ability to do something useful with the value v.

Okay, but that's what I thought I did above: a case analysis, by match'ing on incoming_symbol s, that picks out the case where v has a single, specific type: string.

tl;dr: how do I extract the string payload from this GADT, and do something useful with it?

If your error sounds like

This instance of string is ambiguous: it would escape the scope of its equation

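this means that the type checker is unsure whether, outside of the pattern-matching branch, the type of v should be string, or some other type that is equal to string only inside the branch. You just need to add a type annotation when leaving the branch to remove this ambiguity: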

 | Interpreter.(N N_command) -> Some (v:string)
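
To see the same pattern without a generated parser, here is a minimal self-contained sketch. The types nonterminal and element and the function current_command below are toy stand-ins I made up for illustration (a plain list plays the role of the LR stack); they are not Menhir's actual API. The point is only where the annotation goes: on v, at the moment it leaves the branch in which its type is known to equal string.

```ocaml
(* Toy stand-ins for the generated types; NOT Menhir's real API. *)
type _ nonterminal =
  | N_command : string nonterminal
  | N_count : int nonterminal

(* An existential wrapper: the type 'a is hidden once packed. *)
type element =
  | Element : 'a nonterminal * 'a -> element

(* Walk a hypothetical stack (modelled as a list) and return the
   semantic value of the first "command" entry, if any. *)
let rec current_command (stack : element list) : string option =
  match stack with
  | [] -> None
  | Element (symbol, v) :: rest ->
    (match symbol with
     (* In this branch the hidden type is known to equal string;
        the annotation fixes the type of v as string on the way out. *)
     | N_command -> Some (v : string)
     | _ -> current_command rest)
```

For example, current_command [Element (N_count, 3); Element (N_command, "ls")] skips the int entry and evaluates to Some "ls". Annotating the return type of the function, as done here, is an additional way to give the type checker the expected type up front.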