Trying to iterate 2 files in Rust
I am trying to read 2 files and compare every line in each file to see whether they are equal.
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let filename1 = "file1.txt";
    let filename2 = "file2.txt";
    // Open the files in read-only mode (ignoring errors).
    let file = File::open(filename1).unwrap();
    let reader = BufReader::new(file);
    let file2 = File::open(filename2).unwrap();
    let mut reader2 = BufReader::new(file2);
    // Read the file line by line using the lines() iterator from std::io::BufRead.
    for line1 in reader.lines() {
        let line1 = line1.unwrap(); // Ignore errors.
        for line2 in reader2.lines() {
            let line2 = line2.unwrap(); // Ignore errors.
            if line2 == line1 {
                println!("{}", line2)
            }
        }
    }
}
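However, this does not work. How can I nest one loop inside another when both read from buffered readers?
I did find a solution, but it is very slow. If anyone has a better way to find the lines that 2 files have in common, please let me know.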
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let vec2 = findvec("file1.txt".to_string());
    let vec3 = findvec("file2.txt".to_string());
    // Compare every line of the first file with every line of the second.
    for line in &vec2 {
        for line2 in &vec3 {
            if line == line2 {
                println!("{}", line);
            }
        }
    }
}

// Read all lines of a file into a vector.
fn findvec(filename: String) -> Vec<String> {
    // Open the file in read-only mode (ignoring errors).
    let file = File::open(filename).unwrap();
    let reader = BufReader::new(file);
    // Start with an empty vector.
    let mut myvec = Vec::new();
    // Read the file line by line using the lines() iterator from std::io::BufRead.
    for line in reader.lines() {
        let line = line.unwrap(); // Ignore errors.
        myvec.push(line);
    }
    myvec
}
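Your first problem is a duplicate of this question. TL;DR: lines takes the reader by value, so you need to call by_ref if you want to be able to reuse reader2 after calling its lines method (e.g. in the next loop iteration).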
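With that change your code will compile, but it still won't work: once you have processed the first line of the first file, you are at the end of the second file, so the second file will appear empty for every subsequent line. You can fix this by rewinding the second file for each line. The minimal set of changes that makes your code work is: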
use std::fs::File;
use std::io::{BufRead, BufReader, Read, Seek, SeekFrom};

fn main() {
    let filename1 = "file1.txt";
    let filename2 = "file2.txt";
    // Open the files in read-only mode (ignoring errors).
    let file = File::open(filename1).unwrap();
    let reader = BufReader::new(file);
    let file2 = File::open(filename2).unwrap();
    let mut reader2 = BufReader::new(file2);
    // Read the file line by line using the lines() iterator from std::io::BufRead.
    for line1 in reader.lines() {
        let line1 = line1.unwrap(); // Ignore errors.
        reader2.seek(SeekFrom::Start(0)).unwrap(); // <-- Add this line
        for line2 in reader2.by_ref().lines() { // <-- Use by_ref here
            let line2 = line2.unwrap(); // Ignore errors.
            if line2 == line1 {
                println!("{}", line2)
            }
        }
    }
}
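To see the effect of the rewind in isolation, here is a small sketch that uses an in-memory std::io::Cursor in place of a file. The helper name count_lines_per_pass and the sample text are made up for illustration and are not part of the code above. It makes two passes over the same reader, with and without seeking back to the start:

```rust
use std::io::{BufRead, Cursor, Read, Seek, SeekFrom};

/// Hypothetical helper: counts the lines seen on each of `passes` passes
/// over `reader`. If `rewind` is true, the reader is sought back to the
/// start before every pass.
fn count_lines_per_pass(reader: &mut Cursor<&str>, passes: usize, rewind: bool) -> Vec<usize> {
    let mut counts = Vec::new();
    for _ in 0..passes {
        if rewind {
            reader.seek(SeekFrom::Start(0)).unwrap();
        }
        // by_ref() borrows the reader, so lines() does not consume it.
        counts.push(reader.by_ref().lines().count());
    }
    counts
}

fn main() {
    // Without rewinding, the second pass starts at the end of the data.
    let mut no_rewind = Cursor::new("a\nb\nc\n");
    println!("{:?}", count_lines_per_pass(&mut no_rewind, 2, false));

    // With rewinding, every pass sees all lines again.
    let mut with_rewind = Cursor::new("a\nb\nc\n");
    println!("{:?}", count_lines_per_pass(&mut with_rewind, 2, true));
}
```

Without the seek, only the first pass yields any lines, which is why the inner loop in the question would match nothing after the first line of the first file.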
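But this nested-loop approach will be slow. You can make it faster by reading one of the files into a HashSet and checking whether each line of the other file is in the set: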
use std::collections::HashSet;
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let filename1 = "file1.txt";
    let filename2 = "file2.txt";
    // Open the files in read-only mode (ignoring errors).
    let file = File::open(filename1).unwrap();
    let reader = BufReader::new(file);
    let file2 = File::open(filename2).unwrap();
    let reader2 = BufReader::new(file2);
    // Collect every line of the second file into a set.
    let lines2 = reader2.lines().collect::<Result<HashSet<_>, _>>().unwrap();
    // Read the first file line by line using the lines() iterator from std::io::BufRead.
    for line1 in reader.lines() {
        let line1 = line1.unwrap(); // Ignore errors.
        if lines2.contains(&line1) {
            println!("{}", line1)
        }
    }
}
Finally, you can also read both files into HashSets and print the intersection:
use std::collections::HashSet;
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let filename1 = "file1.txt";
    let filename2 = "file2.txt";
    // Open the files in read-only mode (ignoring errors).
    let file = File::open(filename1).unwrap();
    let reader = BufReader::new(file);
    let lines1 = reader.lines().collect::<Result<HashSet<_>, _>>().unwrap();
    let file2 = File::open(filename2).unwrap();
    let reader2 = BufReader::new(file2);
    let lines2 = reader2.lines().collect::<Result<HashSet<_>, _>>().unwrap();
    for l in lines1.intersection(&lines2) {
        println!("{}", l)
    }
}
As a bonus, the last solution will remove duplicate lines. OTOH it won't preserve the order of the lines.
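If you want both properties, i.e. no duplicates and the order of the first file, one possible variation is sketched below. It is not one of the solutions above: the function name common_lines_in_order is made up, and std::io::Cursor stands in for the two files so that the example is self-contained. It keeps the second file in a HashSet and tracks the lines already printed in a second set:

```rust
use std::collections::HashSet;
use std::io::{BufRead, Cursor};

/// Hypothetical helper: returns the lines of `first` that also occur in
/// `second`, in the order they appear in `first`, each reported only once.
fn common_lines_in_order<A: BufRead, B: BufRead>(first: A, second: B) -> Vec<String> {
    // All lines of the second input, for fast membership tests (ignoring errors).
    let in_second = second.lines().collect::<Result<HashSet<_>, _>>().unwrap();
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for line in first.lines() {
        let line = line.unwrap(); // Ignore errors.
        // insert() returns false when the line has already been reported.
        if in_second.contains(&line) && seen.insert(line.clone()) {
            out.push(line);
        }
    }
    out
}

fn main() {
    // In-memory stand-ins for file1.txt and file2.txt.
    let first = Cursor::new("pear\napple\npear\nfig\n");
    let second = Cursor::new("fig\npear\nkiwi\n");
    for line in common_lines_in_order(first, second) {
        println!("{}", line);
    }
}
```

With real files you would pass two BufReader<File> values instead of the cursors. The extra set costs some memory, so this is only worth it when the output order matters.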