FileInputStream only reads the first word in a file
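I want to read the words in file.txt token by token, add a part-of-speech tag to each word, and write the result to file2.txt. The contents of file.txt are already tokenized. Here is my code: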
    public class PoSTagging {
        @SuppressWarnings("resource")
        public static void PoStagMethod() throws IOException {
            FileInputStream fin = new FileInputStream("C:\\Users\\dell\\Desktop\\file.txt");
            DataInputStream in = new DataInputStream(fin);
            BufferedReader br = new BufferedReader(new InputStreamReader(in));
            String strline = br.readLine();
            System.out.println(strline + "first");
            try {
                POSModel model = new POSModelLoader().load(new File("en-pos-maxent.bin"));
                PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
                POSTaggerME tagger = new POSTaggerME(model);
                String input = strline;
                @SuppressWarnings("deprecation")
                ObjectStream<String> lineStream = new PlainTextByLineStream(new StringReader(input));
                perfMon.start();
                String line;
                while ((line = lineStream.read()) != null) {
                    String whitespaceTokenizerLine[] = WhitespaceTokenizer.INSTANCE.tokenize(line);
                    String[] tags = tagger.tag(whitespaceTokenizerLine);
                    POSSample sample = new POSSample(whitespaceTokenizerLine, tags);
                    System.out.println(sample.toString() + "second");
                    //String t=sample.toString();
                    FileOutputStream fout = new FileOutputStream("C:\\Users\\dell\\Desktop\\file2.txt");
                    //fout.write(t.getBytes());
                    perfMon.incrementCounter();
                    fout.close();
                }
                perfMon.stopAndPrintFinalResult();
            }
            catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
When PoStagMethod() is called from another class, only the first word of file.txt is written to file2.txt. Why doesn't it read the other words in the file? What is wrong with my code?
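Your code calls br.readLine() only once, so only the first line of file.txt is ever processed. Instead, read file.txt line by line with a BufferedReader, tag each line with the POSTaggerME built from your POSModel, and write the output to file2.txt with a BufferedWriter. The following snippet may help: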
    POSModel model = new POSModelLoader().load(new File("en-pos-maxent.bin"));
    PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
    POSTaggerME tagger = new POSTaggerME(model);
    // Backslashes in Windows paths must be escaped inside Java string literals.
    BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter("C:\\Users\\dell\\Desktop\\file2.txt"));
    BufferedReader bufferedReader = new BufferedReader(new FileReader("C:\\Users\\dell\\Desktop\\file.txt"));
    String line;
    // readLine() returns null at end of file, so this loop visits every line.
    while ((line = bufferedReader.readLine()) != null) {
        String whitespaceTokenizerLine[] = WhitespaceTokenizer.INSTANCE.tokenize(line);
        String[] tags = tagger.tag(whitespaceTokenizerLine);
        // Do your work with your tags and tokenized words.
        // As one possible output, write the tagged line the same way the question builds it:
        bufferedWriter.write(new POSSample(whitespaceTokenizerLine, tags).toString());
        // Start a new line in the output file; remove this call if you do not want line breaks.
        bufferedWriter.newLine();
    }
    // Do not forget to flush() and close() the streams after your job is done:
    bufferedWriter.flush();
    bufferedWriter.close();
    bufferedReader.close();
If you can, it is also worth replacing the old-style try-catch clause with try-with-resources, which was added in Java 1.7 and closes resources automatically.
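As a rough sketch of what try-with-resources looks like for this read-process-write loop, the example below leaves OpenNLP out so that it runs on its own. The class name LineCopyExample, the methods processFile and process, and the "_TAG" suffix are all made up for illustration; process is only a placeholder for the real tagging step, and the example uses temporary files rather than the paths from the question.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class LineCopyExample {

    // Hypothetical stand-in for the tagging step; real code would call tagger.tag(...) here.
    public static String process(String line) {
        return line.trim() + "_TAG";
    }

    // Reads every line of input, processes it, and writes one output line per input line.
    // Returns the number of lines written.
    public static int processFile(Path input, Path output) throws IOException {
        int count = 0;
        // Both resources are closed automatically when the block exits,
        // whether it finishes normally or an exception is thrown.
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(process(line));
                writer.newLine();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Temporary files keep the example independent of any particular machine.
        Path input = Files.createTempFile("file", ".txt");
        Path output = Files.createTempFile("file2", ".txt");
        Files.write(input, Arrays.asList("first", "second", "third"), StandardCharsets.UTF_8);

        int written = processFile(input, output);
        System.out.println(written + " lines written: "
                + Files.readAllLines(output, StandardCharsets.UTF_8));
    }
}
```

With this form there is no need for explicit flush() and close() calls: closing the BufferedWriter at the end of the try block flushes it first.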
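Also, if you need to write each word and its tag on a separate line, you will probably need an inner loop that writes to the file. It would look something like this: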
    POSModel model = new POSModelLoader().load(new File("en-pos-maxent.bin"));
    PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
    POSTaggerME tagger = new POSTaggerME(model);
    // Backslashes in Windows paths must be escaped inside Java string literals.
    BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter("C:\\Users\\dell\\Desktop\\file2.txt"));
    BufferedReader bufferedReader = new BufferedReader(new FileReader("C:\\Users\\dell\\Desktop\\file.txt"));
    String line;
    while ((line = bufferedReader.readLine()) != null) {
        String whitespaceTokenizerLine[] = WhitespaceTokenizer.INSTANCE.tokenize(line);
        String[] tags = tagger.tag(whitespaceTokenizerLine);
        // tags[i] is the tag for whitespaceTokenizerLine[i], so loop by index to pair them up.
        for (int i = 0; i < whitespaceTokenizerLine.length; i++) {
            // Do your work with your tags and tokenized words.
            // As one possible output format, write "word_tag":
            bufferedWriter.write(whitespaceTokenizerLine[i] + "_" + tags[i]);
            // Put each word and its tag on its own line:
            bufferedWriter.newLine();
        }
    }
    // Do not forget to flush() and close() the streams after your job is done:
    bufferedWriter.flush();
    bufferedWriter.close();
    bufferedReader.close();
Hope this helps,
good luck.