Java: scan for the word following a keyword

I want to solve a very specific problem. What I am trying to do is:

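  1. Take an incoming string, like "we got a new member. he's name is: Peter. He is pretty nice. he is a member of the group: devceloper. Also he's birthday is: 13.08.2001. As well we got a new member..."
  2. Scan the string for certain keywords, which always appear in a fixed order, e.g. "name", "group" and "birthday"
  3. Recognise each keyword together with the "non-needed" characters that follow it (these are the same every time)
  4. Extract the relevant information and put it into a 2D array. So my output should look like {{"peter", "developer", "13.08.2001"}, {"susan", "marketing", "02.03.1997"}...}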

To accomplish this, I found a basic script that is supposed to "extract" the separate words, but it is still buggy, so it is not very useful.

int indexOfSpace = 0; 
int nextIndexOfSpace = 0;

String sentence = "This is a sentence";

int lastIndexOfSpace = sentence.lastIndexOf(" "); 
while(indexOfSpace != lastIndexOfSpace) { 
    nextIndexOfSpace = sentence.indexOf(" ",indexOfSpace);
    String word = sentence.substring(indexOfSpace,nextIndexOfSpace);
    System.out.println("Word: " + word + " Length: " + word.length());
    indexOfSpace = nextIndexOfSpace;
}

String lastWord = sentence.substring(lastIndexOfSpace);
System.out.println("Word: " + lastWord + " Length: " + lastWord.length());

I don't expect you to hand me a ready-made solution, but I could use some hints on the programming steps ;)

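If you are sure the string will always follow the same form, you can use regular-expression matching. The idea is to use groups to capture the substrings you are interested in.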

For example, you can use .*name is: (\w+) to capture Peter from the string. Likewise, you can apply this to the other tokens.
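As a minimal sketch of that single-capture idea, the pattern can be tried on its own. The class name `NameCapture` and the method `captureName` are made up for illustration, and the sample sentence is assumed to contain only one member:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameCapture {
    // Pattern from the text above; backslashes are doubled inside a Java string literal.
    private static final Pattern NAME = Pattern.compile(".*name is: (\\w+)");

    // Returns the word after "name is: ", or null if the phrase does not occur.
    static String captureName(String text) {
        Matcher m = NAME.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "we got a new member. he's name is: Peter. He is pretty nice.";
        System.out.println(captureName(text)); // prints "Peter"
    }
}
```

Group 0 is always the whole match, so the first pair of parentheses is read with `group(1)`.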

You can use a regular expression similar to this:

name[^:]*:\s*(\w+).*?group[^:]*:\s*(\w+).*?birthday[^:]*:\s*(\d+\.\d+\.\d+)

For the input string:

we got a new member. he's name is: Peter. He is pretty nice. he is a member of the group: devceloper. Also he's birthday is: 13.08.2001. As well we got a new member...

it will capture the following groups:

  • Peter
  • devceloper
  • 13.08.2001
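The `.*?` between the keywords is reluctant, which matters once the string contains more than one member. The sketch below is only an illustration: the class `GreedyVsReluctant` and the helper `firstMatch` are hypothetical names, and the two-member input is assumed to follow the same form as the string above. It compares the pattern with `.*?` against the same pattern with a greedy `.*`:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Assumed two-member input in the same form as the example string.
    static final String INPUT =
            "we got a new member. he's name is: Peter. He is pretty nice. he "
            + "is a member of the group: devceloper. Also he's birthday is: 13.08.2001."
            + " As well we got a new member she's name is: Sara. She is pretty nice. "
            + "she is a member of the group: customer. Also her birthday is: 21.01.1998";

    // Builds the pattern with the given "gap" between keywords and returns
    // the three captured groups of the first match, or null if nothing matches.
    static String firstMatch(String gap, String input) {
        Pattern p = Pattern.compile(
                "name[^:]*:\\s*(\\w+)" + gap
                + "group[^:]*:\\s*(\\w+)" + gap
                + "birthday[^:]*:\\s*(\\d+\\.\\d+\\.\\d+)");
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) + ", " + m.group(2) + ", " + m.group(3) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(".*?", INPUT)); // prints "Peter, devceloper, 13.08.2001"
        System.out.println(firstMatch(".*", INPUT));  // prints "Peter, customer, 21.01.1998"
    }
}
```

With the greedy gap, the single match starts at the first name and stretches to the last group and birthday, so the members get mixed up.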

Using a Matcher on the Pattern, you can iterate over all matches.

Sample code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "we got a new member. he's name is: Peter. He is pretty nice. he "
            + "is a member of the group: devceloper. Also he's birthday is: 13.08.2001."
            + " As well we got a new member she's name is: Sara. She is pretty nice. "
            + "she is a member of the group: customer. Also her birthday is: 21.01.1998";

// Backslashes must be doubled inside a Java string literal.
Pattern pattern = Pattern.compile("name[^:]*:\\s*(\\w+).*?group[^:]*:\\s*(\\w+).*?birthday[^:]*:\\s*(\\d+\\.\\d+\\.\\d+)");

Matcher matcher = pattern.matcher(input);

// Each find() advances to the next member; groups 1-3 hold name, group and birthday.
while (matcher.find()) {
    System.out.printf("Match found. name: %s, group: %s, birthday: %s %n", matcher.group(1), matcher.group(2), matcher.group(3));
}

Output:

Match found. name: Peter, group: devceloper, birthday: 13.08.2001 
Match found. name: Sara, group: customer, birthday: 21.01.1998
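To get from the printed matches to the 2D array asked for in the question, the captured groups can be collected into a list and converted at the end. This is only a sketch under the same assumption that every member follows the same form; `MemberExtractor` and `extract` are hypothetical names, and the values are kept exactly as they appear in the input (no lower-casing or typo correction):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MemberExtractor {
    private static final Pattern MEMBER = Pattern.compile(
            "name[^:]*:\\s*(\\w+).*?group[^:]*:\\s*(\\w+).*?birthday[^:]*:\\s*(\\d+\\.\\d+\\.\\d+)");

    // Returns one row per match: {name, group, birthday}.
    static String[][] extract(String input) {
        List<String[]> rows = new ArrayList<>();
        Matcher matcher = MEMBER.matcher(input);
        while (matcher.find()) {
            rows.add(new String[] {matcher.group(1), matcher.group(2), matcher.group(3)});
        }
        // The number of members is unknown up front, so convert the list at the end.
        return rows.toArray(new String[0][]);
    }

    public static void main(String[] args) {
        String input = "we got a new member. he's name is: Peter. He is pretty nice. he "
                + "is a member of the group: devceloper. Also he's birthday is: 13.08.2001."
                + " As well we got a new member she's name is: Sara. She is pretty nice. "
                + "she is a member of the group: customer. Also her birthday is: 21.01.1998";
        System.out.println(Arrays.deepToString(extract(input)));
        // prints [[Peter, devceloper, 13.08.2001], [Sara, customer, 21.01.1998]]
    }
}
```

A list is used while matching because a Java array needs its length when it is created.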