How do I split a string without deleting delimiters?
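My AutoIt script parses text sentence by sentence. Because sentences most likely end with a period, question mark, or exclamation mark, I use this to split the text into sentences: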
$LineArray = StringSplit($displayed_file, "!?.", 2)
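The problem: it deletes the delimiters (the period, question mark, and exclamation mark at the end of each sentence). For example, the string One. Two. Three. is split into One, Two, and Three.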
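How can I split the text into sentences while keeping the period, question mark, or exclamation mark that ends each sentence?
Try this: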
#include<Array.au3>
Global $str = "One. Two. Three. This is a test! Does it work? Yes, man! "
$re = StringRegExp($str, '(.*?[.!?])', 3)
_ArrayDisplay($re)
This pattern also works when a sentence does not start with a space, and it drops the leading space when there is one:
#include<Array.au3>
Global $str = "One. Two. Three.This is a test! Does it work? Yes, man! "
$re = StringRegExp($str, '(\S.*?[.!?])', 3)
_ArrayDisplay($re)
With StringSplit(), the delimiters are consumed in the process (and so are lost from the result). Use StringRegExp() instead:
#include <array.au3>
$string="This is a text. It has several sentences. Really? Of Course!"
$a = stringregexp($string,"(?U)(.*[.?!])",3)
_ArrayDisplay($a)
To remove leading space(s), change the pattern to "(?U)[ ]*?(.*[.?!])" or "(?U) *?(.*[.?!] )".
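As a minimal sketch of the first variant, the snippet below runs it against the same sample text and prints each match in brackets so that any leftover leading space would be visible. The variable names ($sText, $aSentences, $sSentence) are only illustrative and not from the answer above.

```autoit
; Sample text taken from the example above; variable names are illustrative.
Local $sText = "This is a text. It has several sentences. Really? Of Course!"

; (?U) makes quantifiers lazy by default, so .* stops at the first [.?!].
; Under (?U) a trailing "?" flips a quantifier back to greedy, so [ ]*?
; swallows all leading spaces before the capture group starts.
Local $aSentences = StringRegExp($sText, "(?U)[ ]*?(.*[.?!])", 3)

If @error Then
    ; With flag 3, @error is set when the pattern finds no match at all.
    ConsoleWrite("No sentence found" & @CRLF)
Else
    For $sSentence In $aSentences
        ; Brackets make leading or trailing whitespace easy to spot.
        ConsoleWrite("[" & $sSentence & "]" & @CRLF)
    Next
EndIf
```

Checking @error before looping is worthwhile here, because StringRegExp() does not return an array when nothing matches.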
To split at [.!?] followed by a <space> (a space is appended to the string so that the last sentence matches too):
#include <array.au3>
$string = "Do you know Pi? Yes! What's it? It's 3.14159! That's correct."
$a = StringRegExp($string & " ", "(?U)[ ]*?(.*[.?!] )", 3)
_ArrayDisplay($a)
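If appending a space to the input is undesirable, one possible variation (my assumption, not part of the answer above) is a lookahead that requires a space or the end of the string after the delimiter. The lookahead does not consume anything, so the returned sentences should carry no trailing space either. Variable names are illustrative.

```autoit
#include <Array.au3>

; Same sample text as above; "3.14159" must not be split at its decimal point.
Local $sText = "Do you know Pi? Yes! What's it? It's 3.14159! That's correct."

; (?= |$) is a lookahead: the delimiter only counts as a sentence end when it
; is followed by a space or by the end of the string. Nothing is appended to
; the input, and the space itself stays outside the capture group.
Local $aSentences = StringRegExp($sText, "(?U)[ ]*?(.*[.?!])(?= |$)", 3)

If @error Then
    ConsoleWrite("No sentence found" & @CRLF)
Else
    _ArrayDisplay($aSentences, "Sentences")
EndIf
```

This sketch only treats a plain space or the end of the string as a sentence boundary; text with line breaks would need the boundary extended accordingly.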
To preserve @CRLF (\r\n) inside sentences:
#include <array.au3>
$string = "Do you " & @CRLF & "know Pi? Yes! What's it? It's" & @CRLF & "3.14159! That's correct."
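; (?s) lets . match line breaks; a sentence ends at [.?!] followed by a space or a line break (\R)
$a = StringRegExp($string & " ", "(?s)(?U)[ ]*?(.*[.?!](?: |\R))", 3)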
_ArrayDisplay($a,"Sentences") ;_ArrayDisplay doesn't show @CRLF
For $i In $a
;MsgBox(0,"",$i)
ConsoleWrite(StringStripWS($i, 3) & @CRLF & "---------" & @CRLF)
Next
This does not preserve the @CRLF when a line end coincides with a sentence end: ...line end!" & @CRLF & "Next line...