How to poll values from a priority queue based on a condition

Question

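I have a Map<String, PriorityQueue<Data>> in which each queue is ordered by score (descending). I populate the map from a list, using data.getGroup() as the key and the data object itself as the value.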

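My use case is as follows:

If the map has 3 or fewer keys, I just want to return the Data objects, so I simply take the top value (Data object) for each key.
If the map has more than 3 keys, I need to pick 3 values from the map based on score (1 value per key).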

For example:

// output should be just Data(17.0, "five", "D"), Data(4.0, "two", "A"), Data(3.0, "three", "B") though there will be only 4 keys (A,B,C,D) 
      ArrayList<Data> dataList = new ArrayList<Data>();
        dataList.add(new Data(1.0, "one", "A"));
        dataList.add(new Data(4.0, "two", "A"));
        dataList.add(new Data(3.0, "three", "B"));
        dataList.add(new Data(2.0, "four", "C"));
        dataList.add(new Data(7.0, "five", "D"));
        dataList.add(new Data(17.0, "five", "D"));
        
// output should be just Data(5.0, "six", "A"), Data(3.14, "two", "B"), Data(3.14, "three", "C") as there will be only 3 keys (A,B,C)
      ArrayList<Data> dataList2 = new ArrayList<Data>();
        dataList2.add(new Data(3.0, "one", "A"));
        dataList2.add(new Data(5.0, "six", "A"));
        dataList2.add(new Data(3.14, "two", "B"));
        dataList2.add(new Data(3.14, "three", "C"));

I tried the approach below, but is there a better/smarter (more optimized) way to do this in Java?

// n = 3
private static final Comparator<Data> comparator = Comparator.comparing(Data::getScore).reversed();

public List<Data> getTopN(final List<Data> dataList, final int n) {

    // Maps is Guava's com.google.common.collect.Maps
    Map<String, PriorityQueue<Data>> map = Maps.newHashMap();

    // Group the data by key; each queue keeps its highest score at the head
    for (Data data : dataList) {
        String key = data.getGroup();

        if (key != null) {
            if (!map.containsKey(key)) {
                map.put(key, new PriorityQueue<>(comparator));
            }
            map.get(key).add(data);
        }
    }

    if (map.size() <= n) {
        List<Data> result = new ArrayList<Data>();

        for (Map.Entry<String, PriorityQueue<Data>> entrySet : map.entrySet()) {
            PriorityQueue<Data> priorityQueue = entrySet.getValue();
            result.add(priorityQueue.peek());
        }
        return result;
    } else {
        List<Data> result = new ArrayList<Data>();

        for (Map.Entry<String, PriorityQueue<Data>> entrySet : map.entrySet()) {
            PriorityQueue<Data> priorityQueue = entrySet.getValue();
            result.add(priorityQueue.peek());
        }

        // Keep only the n best of the per-key maxima
        return result.stream()
                .sorted(Comparator.comparingDouble(Data::getScore).reversed())
                .limit(n)
                .collect(Collectors.toList());
    }
}

The Data object looks like this:

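public class Data {
    double score;
    String name;
    String group;

    // Constructor matching the new Data(score, name, group) calls in the examples above
    public Data(double score, String name, String group) {
        this.score = score;
        this.name = name;
        this.group = group;
    }

    public void setName(String name) {
        this.name = name;
    }

    public void setGroup(String group) {
        this.group = group;
    }

    public void setScore(double score) {
        this.score = score;
    }

    public String getName() {
        return name;
    }

    public String getGroup() {
        return group;
    }

    public double getScore() {
        return score;
    }
}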

Answer 1

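Since your starting point is a List<Data>, there is little point in adding the elements to a Map<String, PriorityQueue<Data>> when you are only interested in one value per key, namely the maximum. In that case, you can simply store the maximum value.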
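
As a complementary sketch, the "one maximum per key" idea can also be written with the Java 8 stream API. This is not the code from the answer: the Map.Entry<String, Double> pairs stand in for (group, score) pairs, and the class and method names are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MaxPerGroupDemo {
    // Each entry is a hypothetical (group, score) pair standing in for a Data object.
    static Map<String, Double> bestScorePerGroup(List<Map.Entry<String, Double>> rows) {
        return rows.stream()
                // Skip rows without a group, mirroring the null check in the question
                .filter(e -> e.getKey() != null)
                // On a key collision, keep the larger of the two scores
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, Double::max));
    }
}
```

The third argument of Collectors.toMap is the merge function, so no per-key queue is needed.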

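Further, it is worth considering the differences between the map methods keySet(), values(), and entrySet(). Using the latter is only useful when you are interested in both the key and the value within the loop body. Otherwise, use keySet() or values() to simplify the operation.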
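
To illustrate the distinction between the map views, here is a minimal sketch; the map contents and the class and method names are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class MapViewsDemo {
    // Only the values are needed, so iterate values() directly.
    static List<Double> scoresOnly(Map<String, Double> bestByGroup) {
        List<Double> scores = new ArrayList<>();
        for (Double score : bestByGroup.values()) {
            scores.add(score);
        }
        return scores;
    }

    // Both key and value are needed, so entrySet() is the appropriate view.
    static List<String> labelled(Map<String, Double> bestByGroup) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Double> entry : bestByGroup.entrySet()) {
            lines.add(entry.getKey() + "=" + entry.getValue());
        }
        return lines;
    }
}
```

In the question's loops only entrySet.getValue() is ever called, so iterating map.values() would express the same thing more directly.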

Only when trying to get the top n values from the map may using a PriorityQueue improve performance:

private static final Comparator<Data> BY_SCORE = Comparator.comparing(Data::getScore);
private static final BinaryOperator<Data> MAX = BinaryOperator.maxBy(BY_SCORE);

public List<Data> getTopN(List<Data> dataList, int n) {
    Map<String, Data> map = new HashMap<>();

    for(Data data: dataList) {
        String key = data.getGroup();
        if(key != null) map.merge(key, data, MAX);
    }

    if(map.size() <= n) {
        return new ArrayList<>(map.values());
    }
    else {
        PriorityQueue<Data> top = new PriorityQueue<>(n, BY_SCORE);
        for(Data d: map.values()) {
            top.add(d);
            if(top.size() > n) top.remove();
        }
        return new ArrayList<>(top);
    }
}

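Note that BinaryOperator.maxBy(…) uses ascending order as its basis, and the priority queue now also needs ascending order, because we remove the smallest elements so that the top n remain in the resulting queue. Therefore, reversed() has been removed from the Comparator here.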
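
The following sketch isolates that bounded-queue technique using plain Integer values instead of Data (the class and method names are hypothetical). It also sorts the survivors at the end, which the code above does not do: copying a PriorityQueue into a list gives no guaranteed order, so an explicit sort is needed if the caller expects descending scores.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class BoundedHeapDemo {
    static List<Integer> topN(List<Integer> values, int n) {
        // Ascending order: the head of the queue is the smallest retained element.
        // The initial capacity must be at least 1, hence Math.max.
        PriorityQueue<Integer> heap =
                new PriorityQueue<>(Math.max(1, n), Comparator.<Integer>naturalOrder());
        for (Integer value : values) {
            heap.add(value);
            if (heap.size() > n) {
                heap.remove(); // evicts the current minimum, keeping the n largest
            }
        }
        // The queue's iteration order is not sorted, so sort explicitly.
        List<Integer> result = new ArrayList<>(heap);
        result.sort(Comparator.<Integer>reverseOrder());
        return result;
    }
}
```

Each insertion costs O(log n) on a queue that never holds more than n + 1 elements, which is where the benefit for small n comes from.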

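Using a priority queue pays off if n is small, especially compared to the size of the map. If n is rather large or expected to be close to the map's size, it is likely more efficient to use: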

List<Data> top = new ArrayList<>(map.values());
top.sort(BY_SCORE.reversed());
top.subList(n, top.size()).clear();
return top;

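This sorts all of the map's values in descending order and removes the excess elements. It can be combined with the code that handles the map.size() <= n scenario: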

public List<Data> getTopN(List<Data> dataList, int n) {
    Map<String, Data> map = new HashMap<>();

    for(Data data: dataList) {
        String key = data.getGroup();
        if(key != null) map.merge(key, data, MAX);
    }

    List<Data> top = new ArrayList<>(map.values());
    if(top.size() > n) {
        top.sort(BY_SCORE.reversed());
        top.subList(n, top.size()).clear();
    }
    return top;
}
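
For a quick check, the combined method can be run against the two sample lists from the question. The sketch below is self-contained, so it repeats the method and declares a compact Data with a three-argument constructor; that constructor, the TopNDemo wrapper, and the sample and helper methods are assumptions added for the demo. It uses Comparator.comparingDouble rather than Comparator.comparing to avoid boxing, which orders the elements the same way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BinaryOperator;

class TopNDemo {
    // Compact stand-in for the question's Data class (constructor is assumed).
    static final class Data {
        final double score;
        final String name;
        final String group;

        Data(double score, String name, String group) {
            this.score = score;
            this.name = name;
            this.group = group;
        }

        double getScore() { return score; }
        String getName() { return name; }
        String getGroup() { return group; }
    }

    static final Comparator<Data> BY_SCORE = Comparator.comparingDouble(Data::getScore);
    static final BinaryOperator<Data> MAX = BinaryOperator.maxBy(BY_SCORE);

    static List<Data> getTopN(List<Data> dataList, int n) {
        // Keep only the highest-scoring element per group.
        Map<String, Data> map = new HashMap<>();
        for (Data data : dataList) {
            String key = data.getGroup();
            if (key != null) map.merge(key, data, MAX);
        }
        // Sort and truncate only when there are more groups than requested.
        List<Data> top = new ArrayList<>(map.values());
        if (top.size() > n) {
            top.sort(BY_SCORE.reversed());
            top.subList(n, top.size()).clear();
        }
        return top;
    }

    // First sample from the question: four groups (A, B, C, D).
    static List<Data> firstSample() {
        return Arrays.asList(
                new Data(1.0, "one", "A"), new Data(4.0, "two", "A"),
                new Data(3.0, "three", "B"), new Data(2.0, "four", "C"),
                new Data(7.0, "five", "D"), new Data(17.0, "five", "D"));
    }

    // Second sample from the question: three groups (A, B, C).
    static List<Data> secondSample() {
        return Arrays.asList(
                new Data(3.0, "one", "A"), new Data(5.0, "six", "A"),
                new Data(3.14, "two", "B"), new Data(3.14, "three", "C"));
    }

    static List<Double> scores(List<Data> list) {
        List<Double> result = new ArrayList<>();
        for (Data d : list) result.add(d.getScore());
        return result;
    }

    static Set<String> names(List<Data> list) {
        Set<String> result = new HashSet<>();
        for (Data d : list) result.add(d.getName());
        return result;
    }
}
```

With the first list, scores(getTopN(firstSample(), 3)) yields 17.0, 4.0, 3.0. With the second list all three maxima ("six", "two", "three") are returned, but in HashMap iteration order, which is unspecified; sort unconditionally if the caller relies on descending order in that case too.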

