Removing all punctuation using POSIX classes in Java and C# produces different output
Here is my attempt:
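Java:
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String text = "This && is **^^ a ~~@@ test.";
        // The backslash must be doubled inside a Java string literal.
        System.out.println(Pattern.compile("\\p{Punct}").matcher(text).replaceAll(""));
        // OUT: This  is  a  test --> As I expected
    }
}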
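C#:
using System;
using System.Text.RegularExpressions;

class Program {
    static void Main(string[] args) {
        string text = "This && is **^^ a ~~@@ test.";
        // A verbatim string (@"...") keeps the backslash as-is.
        Console.WriteLine(Regex.Replace(text, @"\p{P}", ""));
        // OUT: This  is ^^ a ~~ test
        // expected: This  is  a  test
        Console.ReadLine();
    }
}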
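Any ideas? Thanks!
\p{P} means the same thing in Java and C#: it matches Unicode category P (punctuation). The characters ^ and ~ belong to Unicode category S (symbols), not P, which is why \p{P} leaves them in place.
Java's \p{Punct} means something else. By default it is the POSIX character class, documented as:
Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
So the equivalent C# pattern, written as a verbatim string, is @"[!""#$%&'()*+,\-./:;<=>?@\[\\\]^_`{|}~]"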
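The difference can be checked from the Java side alone, since Java supports both \p{Punct} and \p{P}. The following is a minimal sketch, not part of the original answer: the class name PunctDemo, the helper strip, and the constant ASCII_PUNCT are hypothetical names introduced for illustration. It runs the question's input through all three patterns, assuming the default regex flags (no UNICODE_CHARACTER_CLASS).

```java
import java.util.regex.Pattern;

public class PunctDemo {
    // Hypothetical constant: an explicit class listing the ASCII characters
    // that the Java documentation gives for \p{Punct}.
    public static final String ASCII_PUNCT =
            "[!\"#$%&'()*+,\\-./:;<=>?@\\[\\\\\\]^_`{|}~]";

    // Hypothetical helper: removes every match of regex from text.
    public static String strip(String regex, String text) {
        return Pattern.compile(regex).matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String text = "This && is **^^ a ~~@@ test.";

        // POSIX punctuation class: removes ^ and ~ as well.
        System.out.println(strip("\\p{Punct}", text));  // This  is  a  test

        // Unicode category P: ^ and ~ are symbols (category S), so they stay.
        System.out.println(strip("\\p{P}", text));      // This  is ^^ a ~~ test

        // Explicit character class: same result as \p{Punct}.
        System.out.println(strip(ASCII_PUNCT, text));   // This  is  a  test
    }
}
```

The second line of output matches what the C# program in the question prints, which suggests the discrepancy comes from the choice of character class rather than from the regex engine.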