How to set the delimiter to "\\p{Punct}" excluding the quotation mark?
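I want to set a scanner's delimiter with scanner.useDelimiter("\\p{Punct}");
but I don't want the quotation mark to be included in that set. Is there an easy way to exclude it?
I tried s.useDelimiter("(\\p{Digit}|\\s|\\p{Punct}&&[^"])+");
but the quotation mark inside the brackets closes the string literal.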
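You can call the method Scanner#useDelimiter(Pattern):
scanner.useDelimiter(Pattern.compile("[\\p{Punct}&&[^\"]]"))
The pattern [\\p{Punct}&&[^\"]] (shown as it is written inside the Java string literal) matches every character covered by \p{Punct} except the double quote, which is escaped as \" so that it does not end the literal.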
This is called character class subtraction; see the Java Trail on Regular Expressions, section Character Classes:
Finally, you can use subtraction to negate one or more nested character classes, such as [0-9&&[^345]]. This example creates a single character class that matches everything from 0 to 9, except the numbers 3, 4, and 5.
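The tutorial's example can be checked directly. The following is a minimal sketch; the class name SubtractionDemo and the helper matches are made up for illustration and do not come from the tutorial.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the quoted tutorial.
class SubtractionDemo {
    // One character from 0 to 9, with 3, 4, and 5 subtracted from the class.
    static final Pattern DIGITS_EXCEPT_345 = Pattern.compile("[0-9&&[^345]]");

    // True when the whole input is a single character accepted by the class.
    static boolean matches(String s) {
        return DIGITS_EXCEPT_345.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("2")); // true
        System.out.println(matches("4")); // false
    }
}
```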
For the given request, this is the pattern [\p{Punct}&&[^"]]
(with the usual escaping applied when it is written as a string literal).
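To see the delimiter in action, here is a small sketch that tokenizes a string with the pattern from the answer. The class name PunctDelimiterDemo, the helper tokens, and the sample input are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical demo class; only the pattern itself comes from the answer.
class PunctDelimiterDemo {
    // Any \p{Punct} character except the double quote.
    static final Pattern DELIMITER = Pattern.compile("[\\p{Punct}&&[^\"]]");

    // Collects the tokens a Scanner produces for the input with this delimiter.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(DELIMITER);
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The comma, semicolon, and exclamation mark split the input; the quotes stay in the token.
        System.out.println(tokens("a,b;\"c\"!d")); // [a, b, "c", d]
    }
}
```

Because the pattern matches exactly one character, two adjacent punctuation characters would produce an empty token between them; appending + to the pattern, as the question's attempt did, is one way to treat a run of delimiters as a single separator.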