Java: Most frequent numbers in a string

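I am trying to write a program that takes some command-line arguments, joins them into one string, and finds the numbers (digits) that occur most often in that string. It must ignore letters and other symbols and look only at digits. It should then print the string, the most frequent digit, and the number of times it occurs in the string. If two or more digits occur the same number of times, they should be printed in ascending order.

I have seen solutions to similar programs that use libraries and maps, but as a complete beginner I am nowhere near covering those yet, so I tried to solve the problem with the following code: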

import java.util.*;

public class trying5 {
    public static void main(String[] args) {

        //string of arguments

        String s = "";
        if (args.length != 0) {
            for (int i = 0; i < args.length; i++) {
                s = s + args[i] + " ";

            }
        }

        //I tried using an array

        /*char[] c = new char[s.length()];
        for (int i = 0; i < s.length(); i++) {
            c[i] = s.charAt(i);
        }*/

        //Decided to use an arraylist

        ArrayList<Character> maxTimesNum = new ArrayList<Character>();
        //char[] maxTimesNum = new char[s.length()];

        int maxTimesNumRepeats = 0; //the  repeats of the most frequent number
        int index = 0;


        for (int i = 0; i < s.length(); i++) {

            if (Character.isDigit(s.charAt(i))) {

                //variables for the possible number
                char x = s.charAt(i);
                int xRepeats = 0;

                for (int k = 0; k < s.length(); k++) {
                    if (s.charAt(k) == x) {
                        xRepeats = xRepeats + 1;
                    }
                }
                if (maxTimesNumRepeats < xRepeats) {
                    maxTimesNum.add(index, x);
                    maxTimesNumRepeats = xRepeats;

                } else if (maxTimesNumRepeats == xRepeats) {
                    if (!maxTimesNum.contains(x)){               
                    /*here I wanted it to check if we already have that number in the ArrayList so it doesn't print it multiple times*/
                            index = index + 1;
                            maxTimesNum.add(index, x);
                    }
                }
            }
        }
        if (maxTimesNumRepeats > 0) {
            System.out.print("'" + s.trim() + "'" + " -> ");
            for (int p = 0; p < maxTimesNum.size(); p++) {

                System.out.printf("%c ",maxTimesNum.get(p));

            }
            System.out.printf("(%d)\n", maxTimesNumRepeats);
        } else {
            System.out.println("The string " + "'" + s + "'" + " has no numbers.");
        }
    }
}

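Unfortunately my code is not correct: it does not print the numbers in ascending order, and I also cannot figure out why it works in some cases but not in others. To describe the problem better, here are some of the inputs I have tried: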

Input: +386 40 253 987

Desired output: '+386 40 253 987' -> 3 8 (2)

My output: '+386 40 253 987' -> 3 8 (2)

Input: hdch44 fg1t525 j6s99

Desired output: 'hdch44 fg1t525 j6s99' -> 4 5 9 (2)

My output: 'hdch44 fg1t525 j6s99' -> 4 5 9 (2)

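So these work, and so do the inputs (no numbers) and (1234321). But it all goes downhill when the most frequent numbers do not appear in the string in order from smallest to largest: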

Input: d8d 82 a1810y51

Desired output: 'd8d 82 a1810y51' -> 1 8 (3)

My output: 'd8d 82 a1810y51' -> 8 1 (3)

Or in this case, where I don't even understand why it printed both 2 and 4, since they do not occur the same number of times:

Input: This(4) cat(3) and(3) friend(6), in(2) thought(7) seem(7) to(3), be(2), present(5) instead(5) they(4), roam(6) around(2) not(5) consciously(2) knowing(7), that(2) they(5) are(5) there(2) like(4), they(2) are(4), day(2) dreaming(5) together(2) today(5)!

Desired output: 'This(4) cat(3) and(3) friend(6), in(2) thought(7) seem(7) to(3), be(2), present(5) instead(5) they(4), roam(6) around(2) not(5) consciously(2) knowing(7), that(2) they(5) are(5) there(2) like(4), they(2) are(4), day(2) dreaming(5) together(2) today(5)!' -> 2 (9)

My output: 'This(4) cat(3) and(3) friend(6), in(2) thought(7) seem(7) to(3), be(2), present(5) instead(5) they(4), roam(6) around(2) not(5) consciously(2) knowing(7), that(2) they(5) are(5) there(2) like(4), they(2) are(4), day(2) dreaming(5) together(2) today(5)!' -> 2 4 (9)

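I updated my code so that it counts the repetitions of every number in the string and saves them in the ArrayList numRepeats. I then tried to produce the result using a multidimensional array. My new code: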

import java.util.*;
public class trying7 {
    public static void main(String[] args) {
        //string of arguments
        String s = "";
        if (args.length != 0) {
            for (int i = 0; i < args.length; i++) {
                s = s + args[i] + " ";

            }
        }

        //counting repetitions
        ArrayList<Integer> numRepeats = new ArrayList<Integer>(); // to save repeats of each number

        //int index = 0;
        int indexInt = 0;

        //Counting and saving repetitions of numbers
        for (int i = 0; i < s.length(); i++) {

            if (Character.isDigit(s.charAt(i))) {
                char x = s.charAt(i);
                int xRepeats = 0;
                for (int k = 0; k < s.length(); k++) {
                    if (s.charAt(k) == x) {
                        xRepeats = xRepeats + 1;
                    }
                }
                numRepeats.add(indexInt, xRepeats);
                indexInt++;
            }
        }

        //2d array
        int [][] tab = new int[2][numRepeats.size()];
        //saving number of repetitions under column with index 0 and rows j
        for (int j = 0; j < numRepeats.size(); j++){
            tab [0][j] = numRepeats.get(j);
        }
        //saving numbers in a string under column with index 1
        for (int p = 0; p < s.length();p++){
            tab [1][p]= s.charAt(p);
        }

        for (int d = 0; d < tab[0].length; d++){
            int max = 0;
            for (int f= 0; f < tab.length; f++){
                //here I am attempting to print the numbers with most repetitions 
            }
        }


        }//main
    }//class

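What's wrong?

As I understand it, the algorithm is meant to take a string and return the character that occurs most often (or the characters, if there is a tie). The problem in the provided code is how those characters are actually determined to be the most frequent. In particular, in this (slightly reformatted) code: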

    int maxTimesNumRepeats = 0; //the  repeats of the most frequent number
    int index = 0;

    for (int i = 0; i < s.length(); i++) {
        if (Character.isDigit(s.charAt(i))) {
            //variables for the possible number
            char x = s.charAt(i);
            int xRepeats = 0;

            for (int k = 0; k < s.length(); k++) {
                if (s.charAt(k) == x) {
                    xRepeats++;
                }
            }
            if (maxTimesNumRepeats < xRepeats) {
                maxTimesNum.add(index, x);
                maxTimesNumRepeats = xRepeats;
            } 
            else if (maxTimesNumRepeats == xRepeats) {
                if (!maxTimesNum.contains(x)){               
                /*here I wanted it to check if we already have that number in the ArrayList so it doesn't print it multiple times*/
                    index = index + 1;
                    maxTimesNum.add(index, x);
                }
            }
        }
    }

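If we think about it from the program's point of view, on the first pass through the outer for loop, for (int i = 0; i < s.length(); i++), it does not yet know how many times the most frequent character occurs. It only checks whether the count of the character it is currently looking at is greater than or equal to maxTimesNumRepeats, which starts at 0. This means the first digit will always (rightly or wrongly) be recorded in maxTimesNum, because it occurs more than 0 times! And because the list is never cleared when a later digit turns out to have a higher count, that entry stays in the list. This explains why your output always contains the first digit that appears.

Keep in mind that this flaw causes more trouble than just wrongly including the first digit. For example, given the input "122333", the current program incorrectly outputs 1, 2 and 3, because each digit occurs more often than any digit it has seen so far.

Note: due to the nature of this algorithm, I use the terms character and number interchangeably; hopefully this clears up any confusion.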
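
To make the flaw described above concrete, here is a minimal sketch of how the original nested-loop approach could be repaired without maps: the list is cleared whenever a strictly higher count is found, and it is sorted once at the end. The class name MostFrequentDigitsFixed and the helper mostFrequentDigits are hypothetical names chosen for this illustration, not part of the question's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MostFrequentDigitsFixed {

    // Hypothetical helper: returns the most frequent digits of s in ascending order.
    static List<Character> mostFrequentDigits(String s) {
        List<Character> best = new ArrayList<>();
        int bestCount = 0;
        for (int i = 0; i < s.length(); i++) {
            char x = s.charAt(i);
            if (!Character.isDigit(x)) {
                continue; // ignore letters, spaces and symbols
            }
            // Count how often x occurs in the whole string.
            int count = 0;
            for (int k = 0; k < s.length(); k++) {
                if (s.charAt(k) == x) {
                    count++;
                }
            }
            if (count > bestCount) {
                best.clear();      // earlier candidates are no longer the most frequent
                best.add(x);
                bestCount = count;
            } else if (count == bestCount && !best.contains(x)) {
                best.add(x);       // a tie with the current maximum
            }
        }
        Collections.sort(best);    // ascending order for the output
        return best;
    }

    public static void main(String[] args) {
        System.out.println(mostFrequentDigits("122333"));          // [3]
        System.out.println(mostFrequentDigits("d8d 82 a1810y51")); // [1, 8]
    }
}
```

This keeps the quadratic counting of the question's code; it only changes what happens to the list when a new maximum or a tie is found.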

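Another approach:

There are a few hurdles to get over to carry out this particular task. The first, of course, is separating all the digits from the command-line arguments. This can be done with the String#replaceAll() method and a small regular expression (regex), for example:

Suppose we have a concatenated command-line string containing d8d 82 a1810y51: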

String cmdLine = "d8d 82 a1810y51";
String numbers = cmdLine.replaceAll("\\D|\\s+", "");

System.out.println(numbers);

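If run, this displays the following in the console window: 882181051. The regular expression tells the String#replaceAll() method to replace every non-digit character (\D) or any whitespace (|\s+) in the string it is processing with an empty string (""), which effectively removes those characters. What is left is a string of digits only. If there are no digits in the command line, an empty string ("") is obviously what remains.

The second hurdle is finding out how many times each digit occurs in the numbers string. You can use an integer array (int[]) and a for loop for this. Here is an example: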
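
As a small illustration of the filtering step, the sketch below wraps the call in a helper named digitsOnly (a hypothetical name, not from the answer). Inside a Java string literal each regex backslash has to be written twice, and because whitespace is itself a non-digit, the shorter pattern \D is used here as an equivalent of \D|\s+.

```java
class DigitFilterDemo {

    // Hypothetical helper: removes every character that is not one of 0-9.
    static String digitsOnly(String text) {
        // "\\D" in Java source is the regex \D (any non-digit character).
        return text.replaceAll("\\D", "");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("d8d 82 a1810y51"));          // 882181051
        System.out.println(digitsOnly("Has no numbers").isEmpty()); // true
    }
}
```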

int[] occurred = new int[10];
    
/* Integer variable to hold the highest number of times a
   particular digit is found to be in the 'numbers' string. */
int high = 0;
    
/* Check for digit occurrences within the 'numbers' string. */
int idx;
for (int i = 0; i < numbers.length(); i++) {
    idx = Integer.valueOf(numbers.substring(i, i+1));
    occurred[idx] += 1;
    if (occurred[idx] > high) {
        high = occurred[idx];
    }
}
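
The counting loop above converts each one-character substring to an int. A possible alternative, sketched here under the assumption that only the ASCII digits 0 to 9 matter, is plain char arithmetic: subtracting '0' from a digit character yields its numeric value, which can serve directly as the array index. The names DigitCountDemo and countDigits are hypothetical.

```java
import java.util.Arrays;

class DigitCountDemo {

    // Hypothetical helper: counts[d] is how often the digit d occurs in text.
    static int[] countDigits(String text) {
        int[] counts = new int[10];
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= '0' && c <= '9') {
                counts[c - '0']++; // '0' -> index 0, '1' -> index 1, ...
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countDigits("d8d 82 a1810y51");
        System.out.println(Arrays.toString(counts)); // [1, 3, 1, 0, 0, 1, 0, 0, 3, 0]
    }
}
```

Since this version skips non-digit characters itself, it can be applied to the unfiltered command-line string as well.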

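Now you have almost all the information needed to build the output string; you only need to find which digits share the same highest number of occurrences and print them to the console window. Below is a runnable that demonstrates the concepts discussed above. There really isn't much code. It is mostly comments for you to read and then delete: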

public class HighestCommandLineDigitDemo {

    
    public static void main(String[] args) {
        // A methodical approach to solving this task:
        
        /* Concatenate all Command-Line Arguments into a single String. */
        StringBuilder sb = new StringBuilder("");
        if (args.length != 0) {
            for (String arg : args) {
                if (sb.length() > 0) {
                    sb.append(" ");
                }
                sb.append(arg);
            }
        }
        else {
            System.err.println("Application Error! No Command-Line Argument(s) To Process!");
            return;
        }

        // Keep the args string that was created...
        String cmdLine = sb.toString();
        // Start the output string...
        String outputString = "'" + cmdLine + "' -> "; 
        
        /* Using the String#replaceAll() method and a small Regular
           Expression (regex), remove ALL non-numerical characters
           and whitespaces from the created args String. Note that
           each backslash is doubled inside a Java string literal. */
        String numbers = cmdLine.replaceAll("\\D|\\s+", "");

        /* If it's found that there are no numbers within the args 
           string then we finish the output with "(0)" and exit the
           main() method which will effectively end the application. */
        if (numbers.isEmpty()) {
            System.out.println(outputString + "(0)");
            return;
        }

        /* Create an int[] array (occurred) and initialize it
           to hold 10 elements, 1 for each digit in the base 10 
           numerical system which is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. 
           This array will hold the number of times each digit is 
           contained within the 'numbers' string.            */
        int[] occurred = new int[10];
        
        /* Integer variable to hold the highest number of times a
           particular digit is found to be in the 'numbers' string. */
        int high = 0;
        
        /* Check for digit occurrences within the 'numbers' string.
           Look at this loop very carefully! Can you see what it's
           doing and how it works? Read the comment for the next 
           `for` loop to get a hint.                          */
        int idx;
        for (int i = 0; i < numbers.length(); i++) {
            idx = Integer.valueOf(numbers.substring(i, i+1));
            occurred[idx] += 1;
            if (occurred[idx] > high) {
                high = occurred[idx];
            }
        }
        
        /* Reset the StringBuilder object we used earlier so that
           it's empty. We do this so we don't need to declare a 
           new StringBuilder object.                        */
        sb.setLength(0); 
        
        /* Now build a string holding every digit that occurs the
           highest number of times (all digits tied for that count
           are included). Because the occurred[] array has 10
           elements (indexes 0 to 9, one per base 10 digit), the
           loop variable 'i' is itself the digit whose count is
           stored in occurred[i]. An element holding 0 means that
           digit does not occur in the 'numbers' string and therefore
           not in the Command-Line arguments. Walking the indexes
           from 0 to 9 also yields the digits in ascending order.
           Read this 10 times if you have to :)                   */
        for (int i = 0; i < occurred.length; i++) {
            if (occurred[i] == high) {
                sb.append(i).append(" ");
            }
        }
        sb.append("(").append(high).append(")");
        outputString += sb.toString();
        System.out.println(outputString);  // Finish the output to User.
    }
}

Examples from running the code:

#1:
Concatenated Command-Line String:   12 34 5
Output:                             '12 34 5' -> 1 2 3 4 5 (1)

#2:
Concatenated Command-Line String:   hdch44 fg1t525 j6s99
Output:                             'hdch44 fg1t525 j6s99' -> 4 5 9 (2)

#3:
Concatenated Command-Line String:   +386 40 253 987
Output:                             '+386 40 253 987' -> 3 8 (2)

#4:
Concatenated Command-Line String:   Has no numbers
Output:                             'Has no numbers' -> (0)

#5:
Concatenated Command-Line String:   1234321
Output:                             '1234321' -> 1 2 3 (2)

#6:
Concatenated Command-Line String:   d8d 82 a1810y51
Output:                             'd8d 82 a1810y51' -> 1 8 (3)

#7:
Concatenated Command-Line String:   This(4) cat(3) and(3) friend(6), in(2) thought(7) seem(7) to(3), be(2), present(5) instead(5) they(4), roam(6) around(2) not(5) consciously(2) knowing(7), that(2) they(5) are(5) there(2) like(4), they(2) are(4), day(2) dreaming(5) together(2) today(5)!
Output:                             'This(4) cat(3) and(3) friend(6), in(2) thought(7) seem(7) to(3), be(2), present(5) instead(5) they(4), roam(6) around(2) not(5) consciously(2) knowing(7), that(2) they(5) are(5) there(2) like(4), they(2) are(4), day(2) dreaming(5) together(2) today(5)!' -> 2 (9)
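
For checking runs like these without retyping command-line arguments, the same idea can be folded into one method that returns the output line for a given string. This is only a sketch: the names MostFrequentDigitReport and describe are hypothetical, and it counts with char arithmetic instead of the regex step, assuming only the ASCII digits 0 to 9 are of interest.

```java
class MostFrequentDigitReport {

    // Hypothetical helper: builds an output line in the format of the sample runs.
    static String describe(String cmdLine) {
        int[] occurred = new int[10]; // occurred[d] = how often digit d occurs
        int high = 0;                 // highest count seen so far
        for (char c : cmdLine.toCharArray()) {
            if (c >= '0' && c <= '9') {
                int d = c - '0';
                occurred[d]++;
                high = Math.max(high, occurred[d]);
            }
        }
        StringBuilder sb = new StringBuilder("'" + cmdLine + "' -> ");
        if (high > 0) {
            // Indexes run from 0 to 9, so tied digits come out in ascending order.
            for (int d = 0; d < occurred.length; d++) {
                if (occurred[d] == high) {
                    sb.append(d).append(' ');
                }
            }
        }
        return sb.append('(').append(high).append(')').toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("+386 40 253 987")); // '+386 40 253 987' -> 3 8 (2)
        System.out.println(describe("Has no numbers"));  // 'Has no numbers' -> (0)
        System.out.println(describe("1234321"));         // '1234321' -> 1 2 3 (2)
    }
}
```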