Extracting a sentence using a tokenizer
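I have some simple code that compares two strings and performs an action if the text contains a keyword. The problem is that once the keyword is detected in the text, I want to somehow extract the sentence that contains it. Here is the code: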
String keyword = "Keyword(S)";
StringTokenizer tokenizer = new StringTokenizer(text);
if (tokenizer.hasMoreTokens()) {
    tokenizer.nextToken();
    for (final String s : text.split(" ")) {
        if (keyword.equals(s)) {
            // get the whole sentence
        }
    }
}
Edit:
Here is an example.
Suppose we have the following text:
Text summarization is the process of extracting salient information from the source text and to present that
information to the user in the form of summary. It is very difficult for human beings to manually
summarize large documents of text. Automatic abstractive summarization provides the required solution
but it is a challenging task because it requires deeper analysis of text. In this paper, a survey on abstractive
text summarization methods has been presented. Abstractive summarization methods are classified into two
categories i.e. structured based approach and semantic based approach.
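Now we are looking for all sentences that contain the word abstractive, and we want to return those sentences. Maybe we should store a marker whenever we reach a ".", and then, whenever we find the keyword, use that marker to get the start of the sentence and continue until we reach the next "."?
Or does that sound unreasonable?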
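The marker idea described above could be sketched roughly as follows. This is only an illustration, not a tested solution for real text: the class name MarkerSentenceFinder and the helper indexOfIgnoreCase are made up for this example, "." is assumed to be the only sentence terminator, and the keyword is matched as a case-insensitive substring rather than as a whole word.

```java
import java.util.ArrayList;
import java.util.List;

class MarkerSentenceFinder {

    // Hypothetical helper: case-insensitive indexOf built on String.regionMatches.
    static int indexOfIgnoreCase(String text, String keyword, int from) {
        for (int i = from; i + keyword.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, keyword, 0, keyword.length())) {
                return i;
            }
        }
        return -1;
    }

    // Returns each sentence that contains the keyword, treating '.' as the only marker.
    static List<String> sentencesContaining(String text, String keyword) {
        List<String> result = new ArrayList<>();
        if (keyword.isEmpty()) {
            return result; // nothing sensible to search for
        }
        int from = 0;
        int hit;
        while ((hit = indexOfIgnoreCase(text, keyword, from)) >= 0) {
            // Marker: the character after the previous period (0 if there is none).
            int start = text.lastIndexOf('.', hit) + 1;
            // End: the next period, included in the result (or the end of the text).
            int end = text.indexOf('.', hit);
            end = (end < 0) ? text.length() : end + 1;
            result.add(text.substring(start, end).trim());
            from = end; // skip the rest of this sentence so it is reported once
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Cats sleep. Dogs bark loudly. Cats and dogs play.";
        for (String sentence : sentencesContaining(text, "dogs")) {
            System.out.println(sentence);
        }
    }
}
```

One caveat with this approach: because every "." counts as a marker, an abbreviation such as "i.e." in the sample text would cut a sentence short.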
I think you should create the tokens based on "." and then check each one for the keyword, as follows:
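import java.util.StringTokenizer;

String keyword = "summarization";
// StringTokenizer delimiters are plain characters, not a regex, so "." needs no escaping
// ("\." is not a valid escape sequence in a Java string literal and does not compile).
StringTokenizer tokenizer = new StringTokenizer(text, ".");
while (tokenizer.hasMoreTokens()) {
    String sentence = tokenizer.nextToken().trim();
    // Split on any whitespace so that line breaks inside the text are handled too.
    for (final String word : sentence.split("\\s+")) {
        if (keyword.equalsIgnoreCase(word)) {
            System.out.println(sentence);
            break; // print each matching sentence only once
        }
    }
}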
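As a possible alternative to splitting on "." by hand, the JDK's java.text.BreakIterator can locate sentence boundaries. The sketch below is only one way to do it: the class name KeywordSentences and the method find are invented for this example, and the keyword is assumed to be a plain word (letters or digits only) so that the \b word-boundary check behaves as expected. How abbreviations such as "i.e." are treated depends on the iterator's built-in rules, so it is worth checking against your own text.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

class KeywordSentences {

    // Returns the sentences of text that contain keyword as a whole word, ignoring case.
    static List<String> find(String text, String keyword) {
        // Pattern.quote makes the keyword literal; \b restricts the match to whole words.
        Pattern word = Pattern.compile(
                "\\b" + Pattern.quote(keyword) + "\\b", Pattern.CASE_INSENSITIVE);

        BreakIterator boundaries = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        boundaries.setText(text);

        List<String> result = new ArrayList<>();
        int start = boundaries.first();
        // Walk consecutive boundary pairs; each pair delimits one sentence.
        for (int end = boundaries.next();
             end != BreakIterator.DONE;
             start = end, end = boundaries.next()) {
            String sentence = text.substring(start, end).trim();
            if (word.matcher(sentence).find()) {
                result.add(sentence);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Cats sleep. Dogs bark loudly. Cats and dogs play.";
        for (String sentence : find(text, "dogs")) {
            System.out.println(sentence);
        }
    }
}
```

Unlike the StringTokenizer version, this keeps the trailing period on each sentence and matches "Abstractive" as well as "abstractive".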