Java Text File Comparison - Excess and Missing
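I need to compare two text files (MasterCopy.txt and ClientCopy.txt). I want to get the list of strings that are missing from ClientCopy.txt. I also need the list of strings that are in excess.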
Contents of MasterCopy.txt:
- London
- Paris
- Rome
Contents of ClientCopy.txt:
- London
- Berlin
- Rome
- Amsterdam
These are the results I want:
Missing:
- Paris
Excess:
- Berlin
- Amsterdam
Two ideas come to mind. The first is to compute a diff of the two files:
https://code.google.com/p/java-diff-utils/
From their wiki:
Task 1: Compute the difference between two files and print its deltas
Solution:
import difflib.*;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.LinkedList;
import java.util.List;

public class BasicJavaApp_Task1 {
    // Helper method to read the file content into a list of lines
    private static List<String> fileToLines(String filename) {
        List<String> lines = new LinkedList<String>();
        String line;
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new FileReader(filename))) {
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> original = fileToLines("originalFile.txt");
        List<String> revised = fileToLines("revisedFile.txt");
        // Compute diff. Get the Patch object. Patch is the container for computed deltas.
        Patch patch = DiffUtils.diff(original, revised);
        for (Delta delta : patch.getDeltas()) {
            System.out.println(delta);
        }
    }
}
The second is to use a HashSet:
http://docs.oracle.com/javase/7/docs/api/java/util/HashSet.html
Here is @Nic's answer modified to use HashSet:
import java.io.File;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Scanner;

// Read each file into a set, one entry per line
// (nextLine() keeps lines that contain spaces in one piece)
Scanner s = new Scanner(new File("MasterCopy.txt"));
HashSet<String> masterlist = new HashSet<String>();
while (s.hasNextLine()) {
    masterlist.add(s.nextLine()); // HashSet has add(), not put()
}
s.close();
s = new Scanner(new File("ClientCopy.txt"));
HashSet<String> clientlist = new HashSet<String>();
while (s.hasNextLine()) {
    clientlist.add(s.nextLine());
}
s.close();

// Do the comparison
ArrayList<String> missing = new ArrayList<String>();
ArrayList<String> excess = new ArrayList<String>();

// Check for missing or excess; HashSet has contains(), not get()
for (String line : masterlist) {
    if (!clientlist.contains(line)) missing.add(line);
}
for (String line : clientlist) {
    if (!masterlist.contains(line)) excess.add(line);
}
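The fragment above can be turned into a complete program roughly as follows. This is only a sketch: the class name MissingExcessReport and the helpers readLines and notIn are made up for illustration, the files are assumed to be UTF-8, and trimming lines, skipping blank ones and sorting the output are my assumptions rather than anything the question asks for.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class name; not part of any library.
class MissingExcessReport {

    // Reads every non-blank line of a UTF-8 file into a set.
    // Assumption: surrounding whitespace is not significant, so lines are trimmed.
    static Set<String> readLines(Path file) throws IOException {
        Set<String> lines = new HashSet<String>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }

    // Returns the entries of 'from' that do not occur in 'other', sorted
    // alphabetically so the report is stable (a HashSet has no defined order).
    static List<String> notIn(Set<String> from, Set<String> other) {
        List<String> result = new ArrayList<String>();
        for (String line : from) {
            if (!other.contains(line)) {
                result.add(line);
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        Set<String> master = readLines(Paths.get("MasterCopy.txt"));
        Set<String> client = readLines(Paths.get("ClientCopy.txt"));
        System.out.println("Missing: " + notIn(master, client));
        System.out.println("Excess: " + notIn(client, master));
    }
}
```

Note that a set drops duplicate lines, so this reports which strings differ, not how many times each one occurs.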
If execution time is not a major concern, you can do it like this, assuming you are only comparing whole lines:
import java.io.File;
import java.util.HashSet;
import java.util.Scanner;

// Read each file into a set, one entry per line
Scanner s = new Scanner(new File("MasterCopy.txt"));
HashSet<String> masterlist = new HashSet<String>();
while (s.hasNextLine()) {
    masterlist.add(s.nextLine());
}
s.close();
s = new Scanner(new File("ClientCopy.txt"));
HashSet<String> clientlist = new HashSet<String>();
while (s.hasNextLine()) {
    clientlist.add(s.nextLine());
}
s.close();

// Do the comparison
HashSet<String> missing = new HashSet<String>();
HashSet<String> excess = new HashSet<String>();

// Check for missing or excess
// (the loop variable must not be called s, which is already the Scanner)
for (String line : masterlist) {
    if (!clientlist.contains(line)) missing.add(line);
}
for (String line : clientlist) {
    if (!masterlist.contains(line)) excess.add(line);
}
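The two loops above amount to a set difference, which the standard collections can also express with removeAll. Here is a minimal sketch using the sample data from the question held in memory. SetDifferenceDemo and difference are hypothetical names, and using LinkedHashSet to keep the original line order is my own choice, not something the answers above do.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class name; not part of any library.
class SetDifferenceDemo {

    // Returns the entries of 'left' that are absent from 'right'.
    // LinkedHashSet keeps the order in which entries first appear in 'left'.
    static Set<String> difference(List<String> left, List<String> right) {
        Set<String> result = new LinkedHashSet<String>(left);
        // Wrapping 'right' in a HashSet keeps each membership check cheap.
        result.removeAll(new HashSet<String>(right));
        return result;
    }

    public static void main(String[] args) {
        List<String> master = Arrays.asList("London", "Paris", "Rome");
        List<String> client = Arrays.asList("London", "Berlin", "Rome", "Amsterdam");

        System.out.println("Missing: " + difference(master, client)); // Missing: [Paris]
        System.out.println("Excess: " + difference(client, master));  // Excess: [Berlin, Amsterdam]
    }
}
```

To run it against the real files, the two lists could be filled from MasterCopy.txt and ClientCopy.txt instead of being hard-coded.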