How do I extract useful information from the payload of a GADT / existential type?
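I'm trying to use Menhir's incremental parsing API and introspection APIs in a generated parser. Say I want to determine the semantic value associated with a particular LR(1) stack entry; i.e., a token previously consumed by the parser.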
Given an abstract parsing checkpoint, encapsulated in Menhir's type 'a env, I can extract a "stack element" from the LR automaton; it looks like this:
type element =
  | Element: 'a lr1state * 'a * position * position -> element
The type element describes one entry in the stack of the LR(1) automaton. In a stack element of the form Element (s, v, startp, endp), s is a (non-initial) state and v is a semantic value. The value v is associated with the incoming symbol A of the state s. In other words, the value v was pushed onto the stack just before the state s was entered. Thus, for some type 'a, the state s has type 'a lr1state and the value v has type 'a ...
In order to do anything useful with the value v, one must gain information about the type 'a, by inspection of the state s. So far, the type 'a lr1state is abstract, so there is no way of inspecting s. The inspection API (§9.3) offers further tools for this purpose.
OK, great! So I went digging into the inspection API:
The type 'a terminal is a generalized algebraic data type (GADT). A value of type 'a terminal represents a terminal symbol (without a semantic value). The index 'a is the type of the semantic values associated with this symbol ...
type _ terminal =
  | T_A : unit terminal
  | T_B : int terminal
The type 'a nonterminal is also a GADT. A value of type 'a nonterminal represents a nonterminal symbol (without a semantic value). The index 'a is the type of the semantic values associated with this symbol ...
type _ nonterminal =
  | N_main : thing nonterminal
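To make the role of the index concrete, here is a minimal sketch of how matching on such a GADT refines 'a. The terminal type below is a hand-written stand-in copied from the manual excerpt, not the output of an actual grammar, and describe is a hypothetical helper name.

```ocaml
(* Stand-in for the generated type quoted above; not real Menhir output. *)
type _ terminal =
  | T_A : unit terminal
  | T_B : int terminal

(* Hypothetical helper: the locally abstract type a is refined in each
   branch, so v can be used at its concrete type there. *)
let describe : type a. a terminal -> a -> string =
  fun sym v ->
    match sym with
    | T_A -> "A"                              (* here a = unit *)
    | T_B -> "B carrying " ^ string_of_int v  (* here a = int *)
```

In the T_B branch the type checker knows that v is an int, which is the kind of information the manual says one gains by case analysis on the symbol.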
Piecing these together, I get something like the following (where "command" is one of my grammar's nonterminals, so N_command is a string nonterminal):
let current_command (env : 'a env) =
  let rec f i =
    match Interpreter.get i env with
    | None -> None
    | Some Interpreter.Element (lr1state, v, _startp, _endp) ->
      match Interpreter.incoming_symbol lr1state with
      | Interpreter.N Interpreter.N_command -> Some v
      | _ -> f (i + 1)
  in
  f 0
Unfortunately, this gives me a very confusing type error:
File "src/incremental.ml", line 110, characters 52-53:
Error: This expression has type string but an expression was expected of type
         string
       This instance of string is ambiguous:
       it would escape the scope of its equation
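This is a bit above my level! I'm pretty sure I understand why I can't do what I tried to do above, but I don't understand what my options are. In fact, the Menhir manual specifically mentions this complication: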
This function can be used to gain access to the semantic value v in a stack element Element (s, v, _, _). Indeed, by case analysis on the symbol incoming_symbol s, one gains information about the type 'a, hence one obtains the ability to do something useful with the value v.
OK, but that's what I thought I did above: match'ing on incoming_symbol s is a case analysis, and it yields a branch where v has a single, specific type: string.
tl;dr: How do I extract the string payload from this GADT and do something useful with it?
If your error reads like
This instance of string is ambiguous:
it would escape the scope of its equation
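it means that the type checker is not sure whether, outside the pattern-matching branch, the type of v should be string, or some other type that is equal to string only inside the branch. You just need to add a type annotation when leaving the branch to remove this ambiguity: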
| Interpreter.(N N_command) -> Some (v:string)
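For anyone who wants to try this without a Menhir-generated parser, here is a minimal self-contained sketch of the same pattern. Everything in it is a stand-in I made up for illustration: the nonterminal and element types only imitate the shape of Menhir's, N_count is a hypothetical second symbol, and a plain list plays the role of the parser stack. The result-type annotation on the function is an extra precaution of mine, not something the fix above requires.

```ocaml
(* Hypothetical stand-ins that imitate the shape of Menhir's generated
   types, so that this sketch compiles on its own. *)
type _ nonterminal =
  | N_command : string nonterminal
  | N_count : int nonterminal

(* As in Menhir's element type, 'a is hidden (existential). *)
type element = Element : 'a nonterminal * 'a -> element

(* Return the payload of the first N_command entry, if there is one. *)
let rec first_command (stack : element list) : string option =
  match stack with
  | [] -> None
  | Element (sym, v) :: rest ->
    (match sym with
     | N_command -> Some (v : string)  (* annotation pins v to string *)
     | _ -> first_command rest)
```

A call such as first_command [Element (N_count, 3); Element (N_command, "ls")] skips the int entry and returns the string carried by the command entry.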