How to read a single word (or line) from a text file in Java?
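As the title says, I am trying to write a program that can read individual words from a text file and store them in String variables. I know how to use a FileReader or FileInputStream to read a single char, but that does not work for what I am attempting. Once I have read the words in, I want to compare them with other String variables in my program using .equals, so it would be best if I could import them as Strings. I would also be fine with reading a whole line from the text file as a String, in which case I would just put one word on each line of the file. How do I read words from a text file and store them in String variables?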
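Edit:
OK, the duplicate is sort of helpful. It might work for me, but my question is a little different because the duplicate only explains how to read a line. I am trying to read the individual words in that line, so basically splitting the line string.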
You have to use StringTokenizer! Here is an example; also read the documentation for StringTokenizer.
    private BufferedReader innerReader;

    public void loadFile(Reader reader) throws IOException {
        if (reader == null) {
            throw new IllegalArgumentException("Reader not valid!");
        }
        this.innerReader = new BufferedReader(reader);
        String line;
        while ((line = innerReader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                throw new IllegalArgumentException("line empty");
            }
            // StringTokenizer splits the string on any of the delimiter characters.
            // Example line:
            //   Hello / bye , hi
            // One delimiter set is used throughout: nextToken(delim) switches
            // delimiters but leaves the previous delimiter character at the
            // start of the next token.
            StringTokenizer tokenizer = new StringTokenizer(line, "/,");
            if (tokenizer.countTokens() < 3) {
                throw new IllegalArgumentException("Token number not valid (< 3)");
            }
            // Reads up to "/"
            String hello = tokenizer.nextToken().trim();
            // Reads up to ","
            String bye = tokenizer.nextToken().trim();
            // Reads up to the end of the line
            String hi = tokenizer.nextToken().trim();
            // If you do not know whether another token follows, check first
            while (tokenizer.hasMoreTokens()) {
                String mayBe = tokenizer.nextToken().trim();
            }
        }
    }
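To check what the tokenizer returns for a line such as `Hello / bye , hi`, here is a small self-contained sketch. The class name `TokenizerDemo` and the helper `tokens` are made up for illustration, and it assumes the fields are separated by `/` or `,`. Note that the Javadoc describes `StringTokenizer` as a legacy class and suggests `String.split` or `java.util.regex` for new code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

final class TokenizerDemo {

    // Hypothetical helper: splits a line on any of the given delimiter
    // characters and trims surrounding whitespace from every token.
    static List<String> tokens(String line, String delimiters) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, delimiters);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken().trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [Hello, bye, hi]
        System.out.println(tokens("Hello / bye , hi", "/,"));
    }
}
```

Passing both delimiter characters to the constructor avoids switching delimiters between calls to `nextToken`.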
To read lines from a text file, you can use this (with try-with-resources):
    String line;
    try (
        InputStream fis = new FileInputStream("the_file_name");
        InputStreamReader isr = new InputStreamReader(fis, Charset.forName("UTF-8"));
        BufferedReader br = new BufferedReader(isr);
    ) {
        while ((line = br.readLine()) != null) {
            // Do your thing with line
        }
    }
A more compact, less readable version of the same thing:
    String line;
    try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream("the_file_name"), Charset.forName("UTF-8")))) {
        while ((line = br.readLine()) != null) {
            // Do your thing with line
        }
    }
To break a line into individual words, you can use String.split:
    while ((line = br.readLine()) != null) {
        String[] words = line.split(" ");
        // Now you have a String array containing each word in the current line
    }
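Since the goal in the question is to compare each word with `.equals`, here is a minimal sketch of the split step on its own. The class `SplitDemo` and its methods are hypothetical names. It assumes words are separated by runs of whitespace, so it splits on the regex `\\s+` rather than a single space, which would produce empty strings when words are separated by several spaces.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SplitDemo {

    // Splits a line into words on runs of whitespace (spaces, tabs).
    static List<String> words(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            // "".split(...) would return one empty string, so handle it here.
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Returns true if any word in the line equals the target exactly.
    static boolean containsWord(String line, String target) {
        for (String word : words(line)) {
            if (word.equals(target)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(words("  read   one word "));              // prints [read, one, word]
        System.out.println(containsWord("the quick fox", "quick"));   // prints true
    }
}
```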
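These are all quite complicated answers. I am sure they are all useful, but I prefer the elegantly simple Scanner:
    public static void main(String[] args) throws Exception {
        // try-with-resources closes the Scanner (and the underlying file) when done
        try (Scanner sc = new Scanner(new File("fileName.txt"))) {
            while (sc.hasNext()) {
                String s = sc.next(); // next whitespace-delimited word
                //.....
            }
        }
    }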
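For a runnable version of the Scanner approach, the sketch below writes a small temporary file and reads it back word by word. The class `ScannerDemo`, its method names, and the sample content are assumptions for illustration only. It also assumes the file is UTF-8 and that words are separated by whitespace, which is Scanner's default delimiter.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class ScannerDemo {

    // Reads every whitespace-delimited word from the file into a list.
    static List<String> readWords(Path file) {
        List<String> words = new ArrayList<>();
        try (Scanner sc = new Scanner(file, "UTF-8")) {
            while (sc.hasNext()) {
                words.add(sc.next());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    // Hypothetical helper for the demo: writes content to a temporary file,
    // reads the words back, then deletes the file.
    static List<String> writeThenRead(String content) {
        try {
            Path tmp = Files.createTempFile("words", ".txt");
            try {
                Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
                return readWords(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String word : writeThenRead("alpha beta\ngamma\n")) {
            if (word.equals("beta")) {
                System.out.println("found beta");
            }
        }
    }
}
```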
In Java 8 you can do something like this:
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;
    import java.util.stream.Collectors;

    public class Foo {
        public List<String> readFileIntoListOfWords() {
            try {
                return Files.readAllLines(Paths.get("somefile.txt"))
                    .stream()
                    .map(l -> l.split(" "))
                    .flatMap(Arrays::stream)
                    .collect(Collectors.toList());
            }
            catch (IOException e) {
                e.printStackTrace();
            }
            return Collections.emptyList();
        }
    }
Although I suspect the argument to split may need to change, for example to trim punctuation from the ends of words.
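One possible way to handle the punctuation is sketched below. It is only an assumption about what "a word" should mean: it splits on the regex `\\W+`, so anything other than ASCII letters, digits, and underscore counts as a separator, which also breaks words at apostrophes and at non-ASCII letters. The class `WordStreamDemo` and its methods are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class WordStreamDemo {

    // Assumed definition of a separator: one or more non-word characters.
    private static final Pattern SEPARATOR = Pattern.compile("\\W+");

    // Turns a stream of lines into a flat list of words without punctuation.
    static List<String> words(Stream<String> lines) {
        return lines
            .flatMap(SEPARATOR::splitAsStream)
            .filter(word -> !word.isEmpty()) // drop the empty leading token
            .collect(Collectors.toList());
    }

    // Reads a UTF-8 file lazily; the stream is closed by try-with-resources.
    static List<String> readWords(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return words(lines);
        }
    }

    public static void main(String[] args) {
        // prints [Hello, world, bye]
        System.out.println(words(Stream.of("Hello, world!", "  bye. ")));
    }
}
```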