How to split a string into words in Java while preserving punctuation?
This is the input:
hello; this is cool?
great, awesome
I want my output to be:
hello;
this
is
cool?
great,
awesome
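I basically consider a word to include any punctuation attached to it. That is the definition for my application. I want to split words on spaces, tabs, and newlines. Most Stack Overflow questions and answers assume that words do not contain punctuation, so how do I solve this?
The comments in the code explain it directly: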
// 1st possibility: every single whitespace character (space, tab, newline, carriage return, vertical tab) is treated as a separator
// Note the three spaces between "great," and "awesome"; the regex backslash must be doubled inside a Java string literal
String s = "hello; this is cool?\ngreat,   awesome";
String[] array1 = s.split("\\s");
System.out.println("======first case=====");
for (int i = 0; i < array1.length; i++)
    System.out.println(array1[i]);
// 2nd possibility: groups of consecutive whitespace characters (space, tab, newline, carriage return, vertical tab) are treated as a single separator
String[] array2 = s.split("\\s+");
System.out.println("=====second case=====");
for (int i = 0; i < array2.length; i++)
    System.out.println(array2[i]);
// Notice the difference in the output!
Output:
======first case=====
hello;
this
is
cool?
great,
<----- Notice the empty string
<----- Notice the empty string
awesome
=====second case=====
hello;
this
is
cool?
great,
awesome
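One case the snippet above does not cover is input that starts with whitespace: `split("\\s+")` then returns an empty string as its first element. The following is a minimal sketch of one way to handle that, not the only one. The class name `WhitespaceSplit` and the helper `splitWords` are hypothetical names introduced here for illustration; only `String.trim`, `String.isEmpty`, and `String.split` are standard library calls.

```java
public class WhitespaceSplit {

    // Hypothetical helper (not from the original answer): splits on runs of
    // whitespace and keeps punctuation attached to each word.
    static String[] splitWords(String text) {
        // Trim first, so leading whitespace does not produce an empty first element.
        String trimmed = text.trim();
        // "".split("\\s+") would return one empty string, so handle that case explicitly.
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        // "\\s+" treats any run of spaces, tabs, or line breaks as a single separator.
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // Sample input with leading/trailing spaces, a tab, a newline, and a run of spaces.
        String s = "  hello; this\tis cool?\ngreat,   awesome  ";
        for (String word : splitWords(s)) {
            System.out.println(word);
        }
    }
}
```

Running `main` should print the six words `hello;`, `this`, `is`, `cool?`, `great,`, and `awesome`, one per line, with no empty lines in between.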