Extract string content from an RTF string in Java
I have the following RTF string:
\af31507 \ltrch\fcs0 \insrsid6361256 Study Title: {Test for 14431 process\'27s \u8805 1000 Testing2 14432 \u8805 8000}}{\rtlch\fcs1 \af31507 \ltrch\fcs0 \insrsid12283827
I want to extract the Study Title content, i.e.
Study Title: {Test for 14431 process\'27s \u8805 1000 Testing2 14432 \u8805 8000}
Here is my code:
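// Split on runs of whitespace (the backslash must be doubled in a Java string literal).
String[] arr = value.split("\\s+");
//System.out.println(arr.length);
for (int j = 0; j < arr.length; j++) {
    if (isNumeric(arr[j])) {
        // Prefix purely numeric tokens with a literal "\?".
        arr[j] = "\\?" + arr[j];
    }
}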
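In the code above, I split the string on whitespace and loop over the array to check whether each token is numeric. However, isNumeric cannot handle the 8000 that follows \u8805, because the token it receives is 8000}}{\rtlch\fcs1. I am not sure how to use a regular expression to find the Study Title and its content.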
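To make the failure concrete, here is a minimal sketch that splits the string from the question on whitespace and prints the offending token. The class name SplitDemo and the tokens helper are hypothetical and exist only for this illustration.

```java
public class SplitDemo {
    // The RTF fragment from the question; every backslash is doubled for the Java literal.
    static final String RTF = "\\af31507 \\ltrch\\fcs0 \\insrsid6361256 Study Title: "
            + "{Test for 14431 process\\'27s \\u8805 1000 Testing2 14432 \\u8805 8000}}"
            + "{\\rtlch\\fcs1 \\af31507 \\ltrch\\fcs0 \\insrsid12283827";

    // Hypothetical helper: splits on runs of whitespace, as in the question.
    static String[] tokens(String rtf) {
        return rtf.split("\\s+");
    }

    public static void main(String[] args) {
        String[] arr = tokens(RTF);
        // The token after the second Unicode control word still carries the closing
        // braces and the next control words, so a plain numeric check rejects it.
        System.out.println(arr[14]); // prints 8000}}{\rtlch\fcs1
    }
}
```

Because RTF does not require whitespace before a closing brace, whitespace tokenization alone cannot isolate the last number in the group.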
The pattern
Study Title: {[^}]*}
matches what you expect. Demo: https://regex101.com/r/FZl1WL/1
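import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Every backslash of the RTF text is doubled so the Java literal compiles
// (an undoubled \u8805 would be read as a Unicode escape).
String s = "{\\af31507 \\ltrch\\fcs0 \\insrsid6361256 Study Title: {Test for 14431 process\\'27s \\u8805 1000 Testing2 14432 \\u8805 8000}}{\\rtlch\\fcs1 \\af31507 \\ltrch\\fcs0 \\insrsid12283827";
// Match "Study Title: " followed by a brace group with no nested closing brace.
Pattern p = Pattern.compile("Study Title: \\{[^}]*\\}");
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group());
}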
Output:
Study Title: {Test for 14431 process\'27s \u8805 1000 Testing2 14432 \u8805 8000}
Update per the OP's request, returning only the text inside the braces:
String s = "{\\af31507 \\ltrch\\fcs0 \\insrsid6361256 Study Title: {Test for 14431 process\\'27s \\u8805 1000 Testing2 14432 \\u8805 8000}}{\\rtlch\\fcs1 \\af31507 \\ltrch\\fcs0 \\insrsid12283827";
// The lookbehind and lookahead anchor on the label and braces without including them in the match.
Pattern p = Pattern.compile("(?<=Study Title: \\{)[^}]*(?=\\})");
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group());
}
Test for 14431 process\'27s \u8805 1000 Testing2 14432 \u8805 8000
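The same result can also be obtained with a capturing group instead of lookarounds. The following is a minimal, self-contained sketch under that assumption; the class name StudyTitleExtractor and the method extractTitle are hypothetical names chosen for illustration, not part of the original answer.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StudyTitleExtractor {
    // Group 1 captures everything between the braces that follow "Study Title: ".
    private static final Pattern TITLE = Pattern.compile("Study Title: \\{([^}]*)\\}");

    // Hypothetical helper: returns the first title found, or empty if there is none.
    static Optional<String> extractTitle(String rtf) {
        Matcher m = TITLE.matcher(rtf);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Backslashes are doubled so the RTF control words survive the Java literal.
        String s = "{\\af31507 \\ltrch\\fcs0 \\insrsid6361256 Study Title: "
                + "{Test for 14431 process\\'27s \\u8805 1000 Testing2 14432 \\u8805 8000}}"
                + "{\\rtlch\\fcs1 \\af31507 \\ltrch\\fcs0 \\insrsid12283827";
        System.out.println(extractTitle(s).orElse("(no title found)"));
    }
}
```

Note that [^}]* stops at the first closing brace, so a title that itself contains a nested RTF group would be cut short; for such input a real RTF parser is likely a better fit than a regular expression.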