Regex to get tab-separated values and their frequencies

A Java library I'm using outputs tab-separated values as a single string (one record per line), as shown below:

ID1 John
ID2 Jerry
ID3 John
ID4 Mary
ID5 John

I'm trying to get the names and their frequencies:

John  3
Jerry 1
Mary  1

Is there a way to achieve this using regex (match the substring, then take the frequency count)?

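This isn't really possible with a regex alone: a regex can match the names, but it can't count how often each one occurs. You can write a simple program of your own to solve it instead.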

Here is a simple piece of code that solves your problem:

import java.util.LinkedHashMap;
import java.util.Map;

public class Main {
    public static void main(String[] args) {
        String str = "ID1 John\n"
                + "ID2 Jerry\n"
                + "ID3 John\n"
                + "ID4 Mary\n"
                + "ID5 John";

        // Strip the leading "ID<number><whitespace>" part of every line
        // (\s also matches a tab), then split on \n.
        // Backslashes must be doubled inside a Java string literal.
        String[] names = str.replaceAll("ID\\d+\\s", "").split("\n");

        // names is now [John, Jerry, John, Mary, John]

        // Map each name (key) to its number of occurrences (value).
        // LinkedHashMap keeps the names in first-seen order when printing.
        Map<String, Integer> map = new LinkedHashMap<>();
        for (String s : names) {
            // Start at 1 for a new name, otherwise add 1 to the current count
            map.merge(s, 1, Integer::sum);
        }

        // Print each name with its count
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            System.out.println(entry.getKey() + " - " + entry.getValue());
        }
    }
}

Output

John - 3
Jerry - 1
Mary - 1
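
If you would rather have the regex do the extraction directly instead of stripping the ID prefix, a variant along these lines might work. It is only a sketch: the class name NameFrequency, the method countNames, and the assumed line layout (an ID token, then tabs or spaces, then the name) are illustrative and not taken from the library in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameFrequency {
    // Assumed layout per line: <id> <tabs or spaces> <name>.
    // MULTILINE makes ^ and $ match at the start and end of every line.
    private static final Pattern LINE =
            Pattern.compile("^\\S+[ \\t]+(.+?)[ \\t]*$", Pattern.MULTILINE);

    // Returns name -> occurrence count, in first-seen order.
    static Map<String, Integer> countNames(String input) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = LINE.matcher(input);
        while (m.find()) {
            // group(1) is the name column; the counting happens here, not in the regex
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String input = "ID1\tJohn\nID2\tJerry\nID3\tJohn\nID4\tMary\nID5\tJohn";
        countNames(input).forEach((name, n) -> System.out.println(name + " " + n));
    }
}
```

The regex only captures the name on each line; the Map still does the counting, which is the part a regex cannot do by itself.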