How to ignore brackets, commas and full stops when reading a text file in Java?
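I have a text file containing two paragraphs. Some of the words end with a full stop or a comma, and when I read the file, that punctuation ends up attached to the words in the list I build.
This is the code that opens the file: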
public static Scanner openTextFile(String fileName) {
Scanner data;
try{
data = new Scanner(new File(fileName));
return data;
}
catch(FileNotFoundException e){
System.out.println(fileName + " did not read correctly");
}
data = null;
return data;
}
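But I want it to read only the words and ignore any commas, full stops, or brackets next to them.
How can I achieve this?
I tried the replaceAll method, but it doesn't work at all: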
public static void readOtherFile(Scanner data, int g[][], Key[] hashTable, int[] keyWordCounter, int modValue) {
int lineCounter = 0, wordCounter = 0;
String x;
String []y;
while(data.hasNextLine()){
x = data.nextLine();
/* skip entirely blank lines. Calling nextLine() a second time here would throw
 * NoSuchElementException when the blank line is the last line of the file.
 */
if(x.trim().isEmpty()) {
continue;
}
lineCounter += 1;
x = x.toLowerCase();
// the backslash must be doubled in a Java string literal; "\p{Punct}" does not compile
x = x.replaceAll("\\p{Punct}", "");
// split on runs of whitespace so consecutive spaces do not produce empty tokens
y = x.trim().split("\\s+");
wordCounter += y.length;
//method compares a token to a key word to see if they are identical.
checkForKeyWord(y, g, hashTable, keyWordCounter, modValue);
}
//method prints statistical results
printResults(lineCounter, wordCounter, hashTable, keyWordCounter);
}
Sample file:
sample file link
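To achieve this, read the file contents into a String and apply a regular expression to it. Return the data from the initial read first, and only then do the string manipulation. Doing the string manipulation inside the while loop is bad practice.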
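import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

// reads the whole file into a single String using the given encoding
static String readFile(String path, Charset encoding)
throws IOException
{
byte[] encoded = Files.readAllBytes(Paths.get(path));
return new String(encoded, encoding);
}
// d is the String returned by readFile(...); this pattern strips commas and full stops only
String data = d.replaceAll("[,.]", "");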
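For a complete picture, here is a minimal, self-contained sketch of the same idea, extended to cover brackets as the question asks. The class name WordCleaner and the helper extractWords are made up for illustration, and the temporary file only stands in for the asker's real file. It uses the \p{Punct} class from the question (with the backslash escaped) rather than "[,.]".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WordCleaner {

    // Hypothetical helper: lower-cases the text, strips ASCII punctuation
    // (commas, full stops, brackets, ...) and splits the rest on whitespace.
    public static List<String> extractWords(String text) {
        // "\\p{Punct}" is the regex \p{Punct}; the backslash is doubled in Java source.
        String cleaned = text.toLowerCase(Locale.ROOT).replaceAll("\\p{Punct}", "");
        List<String> words = new ArrayList<>();
        for (String token : cleaned.split("\\s+")) {
            // Blank lines and leading whitespace can yield empty tokens; drop them.
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary file so the example runs on its own.
        Path sample = Files.createTempFile("sample", ".txt");
        Files.write(sample,
                "Hello, world (again).\n\n[Second] paragraph.".getBytes(StandardCharsets.UTF_8));

        // Read the whole file first, then clean it in one pass.
        String content = new String(Files.readAllBytes(sample), StandardCharsets.UTF_8);
        System.out.println(extractWords(content)); // prints [hello, world, again, second, paragraph]

        Files.delete(sample);
    }
}
```

Note that \p{Punct} also removes apostrophes and hyphens, so "don't" becomes "dont". If that matters, a narrower character class such as "[,.()\\[\\]]" may be a better fit.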