How to access len and pos in `StringCvt.scanString (RE.find compiledComment) input`
Background:
I'm trying to use regular expressions to parse the comments of a language, which start with `//`:
structure Main =
struct
  structure RE = RegExpFn(
    structure P = AwkSyntax
    structure E = ThompsonEngine
  )
  val regexes = [
    ("[a-zA-z@= ]* *//.*", fn match => ("comment", match)),
    ("[0-9]*", fn match => ("2nd", match)),
    ("1tom|2jerry", fn match => ("3rd", match))
  ]
  fun main () =
    let
      val input = "@=abs //sdfasdfdfa sdf as"
      val comment = "[a-zA-z@= ]* *//"
      val compiledComment = RE.compileString comment
    in
      (* #1 StringCvt.scanString (RE.match regexes) input *)
      (* #2 StringCvt.scanString (RE.find compiledComment) input *)
      (* #3 case ... of ... *)
    end
end
`input` is my test case: I want to trim `//sdfasdfdfa sdf as` and keep only `@=abs`.
Here are some of my attempts:
- With `StringCvt.scanString (RE.find compiledComment) input` as the return value of `fun main`:
- Main.main();
[autoloading]
[autoloading done]
val it = SOME (Match ({len=8,pos=0},[])) : StringCvt.cs Main.RE.match option
- With `StringCvt.scanString (RE.match regexes) input` as the return value:
- Main.main();
[autoloading]
[autoloading done]
val it = SOME ("comment",Match ({len=#,pos=#},[]))
: (string * StringCvt.cs Main.RE.match) option
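These two cases tell me that `StringCvt.scanString (RE.find compiledComment) input` is what I want, because its value contains `({len=8,pos=0},[])`, which can be used to trim all comments. But I'm a bit confused by its value and type: `val it = SOME (Match ({len=8,pos=0},[])) : StringCvt.cs Main.RE.match option`. How do I access `len` and `pos` here? And why are `StringCvt.cs` and `Main.RE.match` separated only by a space?
After searching Google for the SML documentation, I've collected everything I found below: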
#+BEGIN_SRC sml
StringCvt.scanString (RE.match regexes) input
val it = SOME (
"comment" , Match ({len=#,pos=#},[]))
: (string * StringCvt.cs Main.RE.match)
option
StringCvt.scanString (RE.find compiledComment) input
val it = SOME (
Match ({len=8,pos=0},[]))
: StringCvt.cs Main.RE.match
option
val find : regexp ->
(char,'a) StringCvt.reader -> ({pos : 'a, len : int} option MatchTree.match_tree,'a) StringCvt.reader
val scanString :
((char, cs) reader -> ('a, cs) reader) -> string -> 'a option
val match : (string * ({pos : 'a, len : int} option MatchTree.match_tree -> 'b)) list -> (char,'a) StringCvt.reader -> ('b,'a) StringCvt.reader
#+END_SRC
type cs
The abstract type of the character stream used by scanString. A value of this type represents the state of a character stream. The concrete type is left unspecified to allow implementations a choice of representations. Typically, cs will be an integer index into a string.
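To make the `scanString` signature above concrete, here is a minimal sketch that uses only the Basis Library. `Int.scan StringCvt.DEC` is a function from a character reader to an int reader, which is the shape `scanString` expects; `RE.find compiledComment` plugs in the same way. The names `n` and `none` are just for illustration.

```sml
(* Int.scan : StringCvt.radix
              -> (char, 'a) StringCvt.reader -> (int, 'a) StringCvt.reader
   so (Int.scan StringCvt.DEC) turns a char reader into an int reader. *)

(* scanString builds a char reader over the string and returns only the
   scanned value; the remaining stream state (of type cs) is discarded. *)
val n : int option = StringCvt.scanString (Int.scan StringCvt.DEC) "42 apples"

(* If the scanner cannot read a value at the start, the result is NONE. *)
val none : int option = StringCvt.scanString (Int.scan StringCvt.DEC) "apples"
```

Here `n` is `SOME 42` and `none` is `NONE`.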
IIUC, `Match` should have type `StringCvt.cs`, and `({len=8,pos=0},[])` and `({len=#,pos=#},[])` should have type `Main.RE.match`. So I started pattern matching:
let
...
in
case StringCvt.scanString (RE.find compiledComment) input of
NONE => ""
| SOME (
StringCvt.cs ({len = b, pos = a}, _)) => String.substring (input a b)
Unfortunately:
main.sml:23.19-23.39 Error: non-constructor applied to argument in pattern
main.sml:23.92 Error: unbound variable or constructor: a
main.sml:23.94 Error: unbound variable or constructor: b
main.sml:23.86-23.95 Error: operator is not a function [tycon mismatch]
operator: string
in expression:
input <errorvar>
[autoloading failed: unable to load module(s)]
stdIn:1.2-1.11 Error: unbound structure: Main in path Main.main
It seems I can't use `StringCvt.cs` in a pattern, because it is not a constructor. Then I tried a wildcard:
case StringCvt.scanString (RE.find compiledComment) input of
NONE => ""
| SOME (_ ({len = b, pos = a}, _)) => String.substring (input a b)
which gives:
main.sml:23.19 Error: non-constructor applied to argument in pattern
So, is the `Match` constructor required here? I can't dig any deeper. Any ideas? Thanks in advance.
Solved:
case StringCvt.scanString (RE.find compiledComment) input
of NONE => ""
| SOME match =>
let
val {pos, len} = MatchTree.root match
in
String.substring (input, 0, pos)
end
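For reference, here is a small self-contained sketch of the two ways to reach `pos` and `len`. It does not load the SML/NJ library; the `match_tree` datatype and `root` below are local stand-ins whose shape I am assuming from the REPL output `Match ({len=8,pos=0},[])`, and `pos` is taken to be an `int` as it printed above. It also shows why the type reads `StringCvt.cs Main.RE.match`: SML applies type constructors postfix, as in `int list`, so `StringCvt.cs` is the argument of `Main.RE.match`, and `Match` is a value constructor of the tree, not a value of type `cs`.

```sml
(* Local stand-in for the library's MatchTree.match_tree (assumed shape). *)
datatype 'a match_tree = Match of 'a * 'a match_tree list

(* Plays the role of MatchTree.root: the payload of the root node. *)
fun root (Match (x, _)) = x

(* The value the REPL printed for the find attempt, rebuilt by hand.
   Note the postfix type application: the record type is the argument. *)
val m : {pos : int, len : int} match_tree = Match ({len = 8, pos = 0}, [])

(* Option 1: put the constructor in the pattern. *)
val Match ({pos = p1, len = l1}, _) = m

(* Option 2: go through root and a record pattern. *)
val {pos = p2, len = l2} = root m

val input = "@=abs //sdfasdfdfa sdf as"

(* The text the regex matched: everything up to and including "//". *)
val matched = String.substring (input, p1, l1)
```

With the real library the constructor would presumably be written `MatchTree.Match` in a pattern, since it lives in the `MatchTree` structure.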