Using BufferedReader to fill a two-dimensional array including a counter

Question

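I'm having trouble with some code I'm writing for school. I tried to keep it within my own logic (and have essentially failed). I'm just wondering if there are any tips for getting this to work: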

public static String[][] sortWords(BufferedReader in, int n) throws IOException{
    String line = "";
    int ctr = 0;
    String[][] words = new String[n][2];

    for(int m = 0; m < n; m++) {
        words[m][1] = "1"; 
    }

    while((line=in.readLine())!=null) {
        String a[]=line.split(" ");    
        for(int i = 0; i < a.length; i++) {
            a[i] = a[i].toUpperCase();
            for(int h = ctr; h < n; h++) {
                if (words[h][0].equals(a[i])) {
                    words[h][1] = "" + (Integer.parseInt(words[h][1])+1);
                } else{
                    words[ctr][0] = a[i];
                    ctr++;
                    break;
                }
            }
        } 
        line=in.readLine();
    }
    return words;
}

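What I'm trying to do is take a very large (70k words) txt file and dissect it. I thought this method could do the following:

- find all the words in the file
- count how many times each word appears
- store both values in a two-dimensional array for easy access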

If I'm way off base, I understand. Thanks in advance.

Answer 1

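All of the comments are correct, but I'll try to translate them into code. At each step I have commented out every line that was not modified, so the changes stand out.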

First, screw that 2D array. It is both limiting and cumbersome to use. Let's use a Map instead:

public static Map<String, Integer> sortWords(BufferedReader in) throws IOException{
//    String line = "";
    Map<String, Integer> wordsCount = new HashMap<>();
//
//    while((line=in.readLine())!=null) {
//        String a[]=line.split(" ");
//        for(int i = 0; i < a.length; i++) {
//            a[i] = a[i].toUpperCase();
            Integer count = wordsCount.get(a[i]); // Get current count for this word
            if (count == null) count = 0; // Initialize on first appearance
            count++; // Update counter
            wordsCount.put(a[i], count); // Save the updated value
//        }
//        line=in.readLine();
//    }
    return wordsCount;
//}

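No array initialization, no extra loop, no String-to-int conversion... just get the value associated with the word and update it. And since we no longer need to know the number of words in advance, the second parameter, int n, can safely be removed!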
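To see the get / null-check / put pattern in isolation, here is a minimal sketch that runs it over a hard-coded array instead of a file. The class name CountPatternDemo, the helper countWords and the sample words are made up for illustration; they are not part of the original code.

```java
import java.util.HashMap;
import java.util.Map;

class CountPatternDemo {
    // Hypothetical helper: counts how often each word occurs in the array.
    public static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> wordsCount = new HashMap<>();
        for (int i = 0; i < words.length; i++) {
            Integer count = wordsCount.get(words[i]); // null if the word was never seen
            if (count == null) count = 0;             // first appearance
            count++;
            wordsCount.put(words[i], count);          // store the updated count
        }
        return wordsCount;
    }

    public static void main(String[] args) {
        String[] sample = {"HI", "HELLO", "HI"}; // made-up sample data
        Map<String, Integer> counts = countWords(sample);
        System.out.println(counts.get("HI"));    // prints 2
        System.out.println(counts.get("HELLO")); // prints 1
        System.out.println(counts.get("HOWDY")); // prints null: word never seen
    }
}
```

Note that a word that never appeared comes back as null rather than 0, which matters as soon as the result is assigned to an int.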

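Now, I see you're using very basic, C-like, pre-2000 idioms (all those for(;;) loops, arrays and so on). That is perfectly valid, but you're missing out on more modern and more useful constructs. So how about using the enhanced for loop, available since 2004 (Java 5)?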

//public static Map<String, Integer> sortWords(BufferedReader in) throws IOException{
//    String line = "";
//    Map<String, Integer> wordsCount = new HashMap<>();
//
//    while((line=in.readLine())!=null) {
//        String a[]=line.split(" ");
        for(String word : a) {
            word = word.toUpperCase();
            Integer count = wordsCount.get(word); // Get current count for this word
//            if (count == null) count = 0; // Initialize on first appearance
//            count++; // Update counter
            wordsCount.put(word, count); // Save the updated value
//        }
//        line=in.readLine();
//    }
//    return wordsCount;
//}

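The syntax is cleaner, we know exactly what type of object we are handling in the loop... and best of all, it lets you inline some of your code to make it even cleaner. Like this: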

//public static Map<String, Integer> sortWords(BufferedReader in) throws IOException{
//    String line = "";
//    Map<String, Integer> wordsCount = new HashMap<>();
//
//    while((line=in.readLine())!=null) {
        for(String word : line.toUpperCase().split(" ")) {
//            Integer count = wordsCount.get(word); // Get current count for this word
//            if (count == null) count = 0; // Initialize on first appearance
//            count++; // Update counter
//            wordsCount.put(word, count); // Save the updated value
//        }
//        line=in.readLine();
//    }
//    return wordsCount;
//}

Now the toUpperCase() method is called only once per line instead of once per word, and we got rid of that String a[] that hurt everyone's eyes ;-P

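The last thing left to do is remove that redundant readLine() at the end of the loop body. The while condition already reads the next line, so the extra call silently throws away every other line. Remove it, and your code should now look like this: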

// Needs java.io.BufferedReader, java.io.IOException, java.util.HashMap and java.util.Map imported
public static Map<String, Integer> sortWords(BufferedReader in) throws IOException {
    String line = "";
    Map<String, Integer> wordsCount = new HashMap<>();

    while ((line = in.readLine()) != null) {
        for(String word : line.toUpperCase().split(" ")) {
            Integer count = wordsCount.get(word); // Get current count for this word
            if (count == null) count = 0; // Initialize on first appearance
            count++; // Update counter
            wordsCount.put(word, count); // Save the updated value
        }
    }
    return wordsCount;
}

Much better!
You can use the method like this:

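BufferedReader in = new BufferedReader(new FileReader("myWords.txt"));
Map<String, Integer> words = sortWords(in);
// Keys are stored upper-cased; getOrDefault (Java 8+) returns 0 for words that never appear
int numberOfHellos = words.getOrDefault("HELLO", 0);
int numberOfGreetings = numberOfHellos + words.getOrDefault("HI", 0) + words.getOrDefault("HOWDY", 0);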
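If you want to try the method without a file on disk, one option is to wrap a StringReader. The sketch below is a self-contained variant under a few assumptions of mine, not part of the answer's code: it assumes Java 8 or later, renames the method to countWords (nothing is actually sorted), splits on any run of whitespace instead of a single space, and replaces the get / null-check / put lines with Map.merge, which does the same update in one call.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

class WordCountDemo {
    // Hypothetical variant of sortWords: same counting idea, Java 8+ only.
    public static Map<String, Integer> countWords(BufferedReader in) throws IOException {
        Map<String, Integer> wordsCount = new HashMap<>();
        String line;
        while ((line = in.readLine()) != null) {
            // "\\s+" treats tabs and repeated spaces as a single separator
            for (String word : line.toUpperCase().split("\\s+")) {
                if (word.isEmpty()) continue; // blank lines or leading whitespace yield an empty token
                wordsCount.merge(word, 1, Integer::sum); // insert 1, or add 1 to the existing count
            }
        }
        return wordsCount;
    }

    public static void main(String[] args) throws IOException {
        String text = "Hello hi\nhowdy  HELLO\n"; // made-up input standing in for the file
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            Map<String, Integer> counts = countWords(in);
            System.out.println(counts.get("HELLO"));           // prints 2
            System.out.println(counts.getOrDefault("BYE", 0)); // prints 0
        }
    }
}
```

The try-with-resources block closes the reader when done; the same shape works with new FileReader("myWords.txt") in place of the StringReader.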
