Merge multiple lists based on list priority

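I have several lists, each with a different priority. When I merge these lists, I want to see the "most common" items (those that appear in all of the lists) first, followed by the less common items, ordered by the priority of the list they come from.

Below is an example use case with three lists whose contents overlap.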

lStr    lStr2   lStr3

111             111
112     112
113     113     113
114
115
        118
        119     119
                120

The merged list should look like this:

113 -- this should come on top as it is common in all 3 
112 -- this should come next as it is common to lStr and lStr2
111 -- this should come next as it is common to lStr and lStr3
114 -- this is not common to any but has priority 1
115 -- this is not common to any but has priority 1
119 -- this is common with lstr3 and lstr2
118 -- this is not common but has priority 2
120 -- this is not common but has priority 3

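The sample code below matches this use case and builds the three input lists.

How can I merge these lists as described, based on list priority and on how often an item is repeated?

Note: please keep performance in mind; the size of the lists and the number of lists may also vary.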

import java.util.ArrayList;
import java.util.List;

public class ListMerge {

    public static void main(String args[]){
        List<String> lStr = new ArrayList<String>(); // has priority 1
        List<String> lStr2 = new ArrayList<String>(); // has priority 2
        List<String> lStr3 = new ArrayList<String>(); // has priority 3

        // find another use case with equal priority
        List<String> lStr4 = new ArrayList<String>(); // has priority 2 // has equal priority to lStr2
        List<String> lStr5 = new ArrayList<String>(); // has priority 1 // has equal priority to lStr

        lStr.add("111");
        lStr.add("112");
        lStr.add("113");// common
        lStr.add("114");
        lStr.add("115");

        System.out.println(lStr);

        lStr2.add("112");
        lStr2.add("113"); // common
        lStr2.add("118");
        lStr2.add("119");

        System.out.println(lStr2);      

        lStr3.add("113");// common
        lStr3.add("119");// common to lStr2
        lStr3.add("111");// common to lStr
        lStr3.add("120");// new         

        // when the merge happens the result list should look like the following
        // use case 1 with different priorities
        // sorted data should look similar to the following
        /*
            113 -- this should come on top as it is common in all 3 
            112 -- this should come next as it is common to lStr and lStr2
            111 -- this should come next as it is common to lStr and lStr3
            114 -- this is not common to any but has priority 1
            115 -- this is not common to any but has priority 1
            119 -- this is common to lStr2 and lStr3
            118 -- this is not common but has priority 2
            120 -- this is not common but has priority 3 (the lowest)
        */


        // use case 2 with some cases with similar priorities 
    }
}

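This produces the result you expect. It cannot handle more than 63 lists at a time. The algorithm is based on weights that are sums of powers of 2, with each list assigned a different power of 2. Thus, with n lists, an element from the first list has weight 2^(n-1) and outweighs any element that appears only in the lower-priority lists, even one that appears in all of them, because their weights 2^(n-2), ..., 2, 1 sum to only 2^(n-1) - 1.

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Pair implements Comparable<Pair> {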
    private String value;
    private long   weight;
    public Pair( String v, long w ){
        value = v;
        weight = w;
    }
    public void addWeight( long w ){
        weight += w;
    }
    public String getValue(){
        return value;
    }
    public long getWeight(){
        return weight;
    }
    public int compareTo(Pair other){
        // descending by weight: heavier pairs sort first
        return Long.compare(other.weight, this.weight);
    }
}

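// Place this method inside a class. @SafeVarargs silences the generic varargs warning.
@SafeVarargs
public static List<String> merge( List<String>... lists ){
    Map<String,Long> v2w = new HashMap<>();
    // combine the lists, adding the weights according to list priorities.
    // 1L (not 1) keeps the shift in long arithmetic, so up to 63 lists work.
    long w = 1L << (lists.length - 1);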
    for( List<String> list: lists ){
        for( String s: list ){
            Long weight = v2w.get(s);
            if( weight == null ){
                weight = w;
            } else {
                weight += w;
            }
            v2w.put( s, weight );
        }
        w = w >> 1;
    }
    // create the list of Pair values: String+weight
    List<Pair> pairs = new ArrayList<>();
    for( Map.Entry<String,Long> vw: v2w.entrySet() ){
        pairs.add( new Pair( vw.getKey(), vw.getValue() ) );
    }
    // sort
    Collections.sort( pairs );
    // extract result list
    List<String> res = new ArrayList<>();
    for( Pair pair: pairs ){
        res.add( pair.getValue() );
    }
    return res;
}

You can call it like this:

List<String> ml = merge( lStr, lStr2, lStr3 );
for( String s: ml ){
    System.out.println( s );
}
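
To make the weighting concrete, here is a minimal, self-contained sketch of the same idea applied to the three lists from the question. It is a variation, not the code above: the class name `WeightedMergeDemo` is made up for illustration, a `LinkedHashMap` replaces the `HashMap` so that ties (114 and 115 both weigh 4) keep first-seen order, and weights are combined with bitwise OR so that a value repeated within one list is not counted twice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; not part of the original answer.
class WeightedMergeDemo {

    // Lists are passed from highest to lowest priority.
    @SafeVarargs
    static List<String> merge(List<String>... lists) {
        if (lists.length > 63) {
            throw new IllegalArgumentException("at most 63 lists are supported");
        }
        // LinkedHashMap remembers first-seen order; the stable sort keeps it for ties.
        Map<String, Long> weights = new LinkedHashMap<>();
        long w = 1L << (lists.length - 1); // weight of the highest-priority list
        for (List<String> list : lists) {
            for (String s : list) {
                // OR instead of +: a duplicate inside one list sets the same bit only once.
                weights.merge(s, w, (a, b) -> a | b);
            }
            w >>= 1; // the next list gets the next lower power of 2
        }
        List<String> result = new ArrayList<>(weights.keySet());
        // Heaviest first; List.sort is stable, so equal weights keep first-seen order.
        result.sort((x, y) -> Long.compare(weights.get(y), weights.get(x)));
        return result;
    }

    public static void main(String[] args) {
        List<String> lStr  = Arrays.asList("111", "112", "113", "114", "115"); // priority 1
        List<String> lStr2 = Arrays.asList("112", "113", "118", "119");        // priority 2
        List<String> lStr3 = Arrays.asList("113", "119", "111", "120");        // priority 3
        // weights: 113=7, 112=6, 111=5, 114=4, 115=4, 119=3, 118=2, 120=1
        System.out.println(merge(lStr, lStr2, lStr3));
        // prints [113, 112, 111, 114, 115, 119, 118, 120]
    }
}
```

Sorting the keys directly also makes the `Pair` helper unnecessary; whether that trade is worthwhile depends on how large the merged set gets, since each comparison does two map lookups.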

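It is hard to put this algorithm into words. You maintain a result list and iterate over the input lists from lowest to highest priority; while filling the result list, you promote any items that you have already seen in a lower-priority list.

public static List<String> priorityMerge(List<String>... reversePriority) {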
    List<String> result = new ArrayList<>();
    for( List<String> list : reversePriority){
        List<String> intersection = intersection(result, list);
        result.removeAll(intersection);
        result.addAll(0,difference(list, intersection));
        result.addAll(0,intersection);
    }
    return result;
}

private static List<String> difference(List<String> list, List<String> list2) {
    List<String> result = new ArrayList<>();
    result.addAll(list);
    result.removeAll(list2);
    return result;
}

private static List<String> intersection(List<String> list, List<String> list2) {
    List<String> result = new ArrayList<>();
    result.addAll(list);
    result.retainAll(list2);
    return result;
}
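
Since the question asks to keep performance in mind: `removeAll` and `retainAll` with an `ArrayList` argument scan that list once per element. The following is a rough sketch of one way to get the same ordering with hash lookups instead. The class name `PriorityMergeDemo` and the local variable names are illustrative assumptions, not part of the answer above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; same ordering rule as priorityMerge above, with set lookups.
class PriorityMergeDemo {

    // Lists are passed from LOWEST to HIGHEST priority.
    @SafeVarargs
    static List<String> priorityMerge(List<String>... reversePriority) {
        List<String> result = new ArrayList<>();
        for (List<String> list : reversePriority) {
            Set<String> inList = new HashSet<>(list);
            Set<String> inResult = new HashSet<>(result);

            List<String> promoted = new ArrayList<>(); // already in result AND in this list
            List<String> rest = new ArrayList<>();     // already in result, not in this list
            for (String s : result) {
                if (inList.contains(s)) {
                    promoted.add(s);
                } else {
                    rest.add(s);
                }
            }
            List<String> fresh = new ArrayList<>();    // first seen in this list
            for (String s : list) {
                if (!inResult.contains(s)) {
                    fresh.add(s);
                }
            }
            // Promoted items go first, then this list's new items, then everything else.
            result = new ArrayList<>(promoted);
            result.addAll(fresh);
            result.addAll(rest);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lStr  = Arrays.asList("111", "112", "113", "114", "115"); // priority 1
        List<String> lStr2 = Arrays.asList("112", "113", "118", "119");        // priority 2
        List<String> lStr3 = Arrays.asList("113", "119", "111", "120");        // priority 3
        // Note the reversed argument order: lowest priority first.
        System.out.println(priorityMerge(lStr3, lStr2, lStr));
        // prints [113, 112, 111, 114, 115, 119, 118, 120]
    }
}
```

Each pass is linear in the size of the current result plus the current list, at the cost of building two temporary sets per input list.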

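It is not entirely clear what the rules for combining priorities are, but I suggest you try summing all the priorities an item has across the different lists, and then sorting the items by their combined priority.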

public static <T> List<T> priorize(Map<List<T>, Integer> listsToPriority) {
    Map<T, Integer> totalPriority = new HashMap<T, Integer>();
    List<T> allElements = new ArrayList<>();

    for (List<T> list : listsToPriority.keySet()) {
        int priority = listsToPriority.get(list);
        for (T x : list) {
            if (totalPriority.containsKey(x)) {
                totalPriority.put(x, totalPriority.get(x) + priority);
            } else {
                totalPriority.put(x, priority);
                allElements.add(x);
            }
        }
    }
    // sort by total priority in reverse order
    allElements.sort(new Comparator<T> () {
        public int compare(T x, T y) {
            return totalPriority.get(y).compareTo(totalPriority.get(x));    
        }
    });
    return allElements;
}

Usage:

Map<List<String>, Integer> priorizedLists = new HashMap<>();
priorizedLists.put(lStr,  3); // 3 == high priority
priorizedLists.put(lStr2, 2);
priorizedLists.put(lStr3, 1); // 1 == low priority
priorizedLists.put(lStr4, 2);
priorizedLists.put(lStr5, 3);

List<String> priorized = priorize(priorizedLists);
System.out.println(priorized);

Output:

[113, 112, 111, 119, 114, 115, 118, 120]

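Some notes:

  • With this approach, high-priority items should actually get a high priority number, i.e. the items that have priority 1 in your example have priority 3 in mine, but that should be easy to translate.
  • The output does not exactly match yours, but as I said, the rules for combining priorities are not 100% clear. You could replace + with * to compute the product of all priorities; this can actually give the result you want, but note that it leaves ties (113 and 112 both get 6, 119 and 118 both get 2) whose order then depends on the order in which the lists are iterated, and I don't know whether it makes sense in your setting.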
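
Two details of the approach above are worth keeping in mind: a `HashMap` keyed by lists treats two lists with equal contents (for example the empty `lStr4` and `lStr5`) as the same key, and its iteration order is unspecified, so tied items can come out in a different order from run to run. The sketch below is one possible way around both, under my own assumptions: lists and priorities are passed as parallel arguments in a fixed order, and an `IntBinaryOperator` chooses between sum and product. The names `CombinedPriorityDemo` and `prioritize` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntBinaryOperator;

// Hypothetical demo class; a deterministic variant of the combined-priority idea.
class CombinedPriorityDemo {

    // priorities[i] belongs to lists.get(i); a higher number means a higher priority.
    static <T> List<T> prioritize(List<List<T>> lists, int[] priorities, IntBinaryOperator combine) {
        // LinkedHashMap keeps first-seen order, so ties are resolved predictably.
        Map<T, Integer> total = new LinkedHashMap<>();
        for (int i = 0; i < lists.size(); i++) {
            final int p = priorities[i];
            for (T x : lists.get(i)) {
                total.merge(x, p, (a, b) -> combine.applyAsInt(a, b));
            }
        }
        List<T> result = new ArrayList<>(total.keySet());
        // Highest combined priority first; the sort is stable for equal totals.
        result.sort((x, y) -> Integer.compare(total.get(y), total.get(x)));
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
            Arrays.asList("111", "112", "113", "114", "115"),  // lStr
            Arrays.asList("112", "113", "118", "119"),         // lStr2
            Arrays.asList("113", "119", "111", "120"));        // lStr3
        int[] priorities = {3, 2, 1};

        // Sum: 113=6, 112=5, 111=4, 114=3, 115=3, 119=3, 118=2, 120=1
        System.out.println(prioritize(lists, priorities, Integer::sum));
        // prints [113, 112, 111, 114, 115, 119, 118, 120]

        // Product: 112=6, 113=6, 111=3, 114=3, 115=3, 118=2, 119=2, 120=1
        System.out.println(prioritize(lists, priorities, (a, b) -> a * b));
        // prints [112, 113, 111, 114, 115, 118, 119, 120]
    }
}
```

With this fixed iteration order the sum happens to reproduce the ordering asked for in the question, while the product puts 112 ahead of 113 because both score 6 and 112 is seen first. Either way the result for tied items is decided by encounter order, not by the combining rule itself.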