Char count of each token in Tokenized String, Java
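I would like to know whether I can count the characters in each token and display that information. For example:
if "day" is tokenized, my output would be: "Day has 3 characters." and the program would continue doing this for each token.
My last loop, which should print the number of characters in each token, never prints anything: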
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        ArrayList<String> tokenizedInput = new ArrayList<>();
        String sentenceRetrieved;
        // getting the sentence from the user
        System.out.println("Please type a sentence containing at least 4 words, with a maximum of 8 words: ");
        sentenceRetrieved = sc.nextLine();
        StringTokenizer strTokenizer = new StringTokenizer(sentenceRetrieved);
        // checking to ensure the string has 4-8 words
        while (strTokenizer.hasMoreTokens()) {
            if (strTokenizer.countTokens() > 8) {
                System.out.println("Please re-enter a sentence with at least 4 words, and a maximum of 8");
                break;
            } else {
                while (strTokenizer.hasMoreTokens()) {
                    tokenizedInput.add(strTokenizer.nextToken());
                }
                System.out.println("Thank you.");
                break;
            }
        }
        // printing out the sentence
        System.out.println("You entered: ");
        System.out.println(sentenceRetrieved);
        // print out each word given
        System.out.println("Each word in your sentence is: " + tokenizedInput);
        // count the characters in each word
        // doesn't seem to run
        int totalLength = 0;
        while (strTokenizer.hasMoreTokens()) {
            String token;
            token = sentenceRetrieved;
            token = strTokenizer.nextToken();
            totalLength += token.length();
            System.out.println("Word: " + token + " Length:" + token.length());
        }
    }
}
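Console sample:
Please type a sentence containing at least 4 words, with a maximum of 8 words: 
Hello there this is a test
Thank you.
You entered: 
Hello there this is a test
Each word in your sentence is: [Hello, there, this, is, a, test]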
First, I added the necessary imports and wrapped this main method in a class. This should compile.
import java.util.ArrayList;
import java.util.Scanner;
import java.util.StringTokenizer;

public class SOQ_20200913_1
{
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        ArrayList<String> tokenizedInput = new ArrayList<>();
        String sentenceRetrieved;
        // getting the sentence from the user
        System.out.println("Please type a sentence containing at least 4 words, with a maximum of 8 words: ");
        sentenceRetrieved = sc.nextLine();
        StringTokenizer strTokenizer = new StringTokenizer(sentenceRetrieved);
        // checking to ensure the string has 4-8 words
        while (strTokenizer.hasMoreTokens()) {
            if (strTokenizer.countTokens() > 8) {
                System.out.println("Please re-enter a sentence with at least 4 words, and a maximum of 8");
                break;
            } else {
                while (strTokenizer.hasMoreTokens()) {
                    tokenizedInput.add(strTokenizer.nextToken());
                }
                System.out.println("Thank you.");
                break;
            }
        }
        // printing out the sentence
        System.out.println("You entered: ");
        System.out.println(sentenceRetrieved);
        // print out each word given
        System.out.println("Each word in your sentence is: " + tokenizedInput);
        // count the characters in each word
        // doesn't seem to run
        int totalLength = 0;
        while (strTokenizer.hasMoreTokens()) {
            String token;
            token = sentenceRetrieved;
            token = strTokenizer.nextToken();
            totalLength += token.length();
            System.out.println("Word: " + token + " Length:" + token.length());
        }
    }
}
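Next, let's look at this working example. Everything seems fine up until your final while loop (the one that counts the character lengths). But notice that the while loop before the last one keeps looping until it has no more tokens to fetch. So, after it has gathered every token and there are none left, your final while loop asks the same tokenizer for more tokens. Execution cannot reach that final loop until the tokenizer has already run out of tokens, so the loop condition is false from the start and the body never runs!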
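To make the exhaustion concrete, here is a minimal sketch, assuming the same six-word sample sentence as the console output. The class name `TokenizerExhaustionDemo` and the helper `drain` are hypothetical and only serve this illustration; they are not part of the original code.

```java
import java.util.StringTokenizer;

// Hypothetical demo class; the name is not from the original post.
class TokenizerExhaustionDemo {

    // Consumes every remaining token and returns how many were consumed.
    static int drain(StringTokenizer tokenizer) {
        int consumed = 0;
        while (tokenizer.hasMoreTokens()) {
            tokenizer.nextToken();
            consumed++;
        }
        return consumed;
    }

    public static void main(String[] args) {
        StringTokenizer tokenizer = new StringTokenizer("Hello there this is a test");

        // The first pass walks through all six words.
        System.out.println("First pass consumed: " + drain(tokenizer));

        // The second pass finds nothing: a StringTokenizer only moves forward
        // and has no method to rewind to the start of the string.
        System.out.println("Second pass consumed: " + drain(tokenizer));
    }
}
```

The second call is in the same position as the last loop in the question: `hasMoreTokens()` is already false, so the loop body is skipped.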
Finally, to fix this, you can simply iterate over the list that you filled in the second-to-last while loop and use that list for your last loop!
For example:
int totalLength = 0;
for (String each : tokenizedInput) {
    totalLength += each.length();
    System.out.println("Word: " + each + " Length:" + each.length());
}
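Putting the pieces together, a self-contained sketch might look like the following. Several things here are assumptions rather than part of the original code: the class name `TokenLengthReport` and its helper methods are hypothetical, a fixed sample sentence stands in for the `Scanner` input so the example runs without typing, the word-count check enforces both the lower and the upper bound mentioned in the prompt text, and the output uses the "has N characters" wording from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical class and method names, chosen only for this illustration.
class TokenLengthReport {

    // Copies every whitespace-separated token into a list, so the words
    // can be revisited after the tokenizer itself is exhausted.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(sentence);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    // Assumed reading of the "at least 4, at most 8 words" rule in the prompt.
    static boolean hasValidWordCount(List<String> tokens) {
        return tokens.size() >= 4 && tokens.size() <= 8;
    }

    // Builds one report line for a single token.
    static String describe(String token) {
        String unit = token.length() == 1 ? "character" : "characters";
        return token + " has " + token.length() + " " + unit + ".";
    }

    public static void main(String[] args) {
        // Assumption: a fixed sentence replaces sc.nextLine() from the original.
        String sentence = "Hello there this is a test";
        List<String> tokens = tokenize(sentence);

        if (!hasValidWordCount(tokens)) {
            System.out.println("Please re-enter a sentence with at least 4 words, and a maximum of 8");
            return;
        }

        int totalLength = 0;
        for (String token : tokens) {
            totalLength += token.length();
            System.out.println(describe(token));
        }
        System.out.println("Total characters: " + totalLength);
    }
}
```

Because the words live in a list, the report loop can run as many times as needed, independent of the tokenizer's position.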