Str.global_replace in OCaml putting carets where they shouldn't be

I'm trying to convert a multiline string into a list of tokens so that it is easier for me to work with.

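Because of the specific needs of my project, I pad any caret symbol that appears in my input with spaces, so that "^" becomes " ^ ". I'm using something like the following function to do so: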

let bad_function string = Str.global_replace (Str.regexp "^") " ^ " (string)

I then use something like the function below to turn this multiline string into a list of tokens (ignoring whitespace).

let string_to_tokens string = (Str.split (Str.regexp "[ \n\r\x0c\t]+") (string));;

For some reason, bad_function adds carets in places where it shouldn't. Take the following call:

(bad_function " This is some 
            multiline input 
            with newline characters 
            and tabs. When I convert this string
            into a list of tokens I get ^s showing up where 
            they shouldn't. ")

The first line of the string becomes:

^  This is some \n ^

When I feed the output of bad_function into string_to_tokens, I get the following list:

string_to_tokens (bad_function " This is some 
            multiline input 
            with newline characters 
            and tabs. When I convert this string
            into a list of tokens I get ^s showing up where 
            they shouldn't. ")

["^"; "This"; "is"; "some"; "^"; "multiline"; "input"; "^"; "with";
 "newline"; "characters"; "^"; "and"; "tabs."; "When"; "I"; "convert";
 "this"; "string"; "^"; "into"; "a"; "list"; "of"; "tokens"; "I"; "get";
 "^s"; "showing"; "up"; "where"; "^"; "they"; "shouldn't."]

Why is this happening, and how can I fix it so that these functions behave the way I want?

As stated in the documentation for the Str module:

^ Matches at beginning of line: either at the beginning of the matched string, or just after a '\n' character.
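The quoted rule can be seen in a small sketch. Here the unescaped "^" is used as a regular expression on a two-line string, with "|" as an arbitrary marker chosen only for this illustration. It assumes the str library is linked (for example `#load "str.cma";;` in the toplevel, or `-package str` with ocamlfind).

```ocaml
(* Unescaped "^" is a zero-width anchor: it matches at the start of the
   string and just after every '\n', not at literal caret characters. *)
let mark_line_starts s =
  Str.global_replace (Str.regexp "^") "|" s

let () =
  (* The literal caret in the input is left alone; a marker is added
     at the beginning of each line instead. *)
  print_string (mark_line_starts "a^b\ncd")
```

This is the same effect as in the question: one extra token per line, while the real carets stay unpadded.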

So you have to use the escape character "\" to quote the "^" character. Note, however (also from the documentation):

any backslash character in the regular expression must be doubled to make it past the OCaml string parser.

This means that you have to write a doubled "\" in the OCaml string literal to do what you want without getting a warning.
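To make the doubling concrete, here is a minimal sketch of what the OCaml string parser hands on to Str. The binding name escaped_caret is only for illustration. The quoted-string form `{|...|}` is shown as one possible alternative, since it performs no escape processing.

```ocaml
(* The literal "\\^" is a two-character string: a backslash and a caret.
   That two-character string is what the regex engine receives. *)
let escaped_caret = "\\^"

let () =
  assert (String.length escaped_caret = 2);
  assert (escaped_caret.[0] = '\\');
  assert (escaped_caret.[1] = '^');
  (* A quoted string literal yields the same two characters
     without any doubling. *)
  assert ({|\^|} = escaped_caret);
  print_endline escaped_caret
```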

This should do the job:

let bad_function string = Str.global_replace (Str.regexp "\\^") " ^ " (string);;
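If escaping feels error-prone, one alternative worth trying is Str.regexp_string, which builds a regular expression that matches its argument literally. The sketch below combines it with the tokenizer from the question. The name pad_carets is hypothetical, and the str library is assumed to be linked.

```ocaml
(* Pad every literal caret with spaces. Str.regexp_string treats "^"
   as a plain character, so no backslash escaping is needed. *)
let pad_carets s =
  Str.global_replace (Str.regexp_string "^") " ^ " s

(* Tokenizer from the question: split on runs of whitespace. *)
let string_to_tokens s =
  Str.split (Str.regexp "[ \n\r\x0c\t]+") s

let () =
  let tokens = string_to_tokens (pad_carets "I get ^s\n  on a new line") in
  (* Only the real caret becomes a token; line starts are untouched. *)
  List.iter print_endline tokens
```

With this input, "^s" is split into the two tokens "^" and "s", and no caret token appears at the start of the second line.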