在 HashMap 中 remove() 比 get() 快吗？

Question

我已经为 HashMap 的 get 和 remove 写了一个基准测试如下：

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class HashMapBenchmark {

  @State(Scope.Benchmark)
  public static class Mystate {
      HashMap<String,String> hashmapVar = new HashMap<String,String>();
      String key0 = "bye";

      @Setup(Level.Iteration)
      public void setup(){
          hashmapVar.put(key0,"bubye");
      }
  }

  @Benchmark
  public void hashmapGet(Mystate state ,Blackhole bh) {
      bh.consume(state.hashmapVar.get(state.key0));
  }

  @Benchmark
  public void hashmapRemove(Mystate state ,Blackhole bh) {
      bh.consume(state.hashmapVar.remove(state.key0));
  }
}

它产生这个结果：

Benchmark                             Mode  Samples  Score  Score error  Units
c.b.HashMapBenchmark.hashmapGet       avgt       60  6.348        0.320  ns/op
c.b.HashMapBenchmark.hashmapRemove    avgt       60  5.180        0.074  ns/op

根据结果，remove() 比 get() 稍快。即使要删除一个元素，它也必须首先检索该元素，不是吗？

如何remove()更快？还是我遗漏了什么？

更新使用最新的 JMH (1.11.3) 后，结果如下：

Benchmark                                 Mode  Cnt  Score   Error  Units
HashMapBenchmark.hashmapGet               avgt   60  9.713 ± 0.277  ns/op
HashMapBenchmark.hashmapRemove            avgt   60  7.677 ± 0.166  ns/op

Answer 1

当您在第一次迭代后调用 remove() 时，没有任何要删除的内容，您不必在任何地方复制结果（或者更确切地说是对结果的引用）（它只是简单地返回 null).但是当调用 get() 时，你必须在某处复制或存储返回的引用（没有查找 Blackhole 的实现），这需要内存分配，因此比简单地返回 null 更昂贵在几次迭代后可能会被 JIT 优化。

Answer 2

所以问题是，这些基准衡量的是不同的东西：get() 来自填充地图，remove() 来自（最终）空地图。比较没有意义，你可能会把基准扔掉。

您必须保证操作是针对同一个 HashMap 完成的。不幸的是，这需要使用 @Setup(Invocation)，这本身就很糟糕（阅读 Javadoc！），或者将 HashMap 构建成本吸收到基准测试本身中：

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class HashMapBenchmark {

    @Benchmark
    public String get() {
        HashMap<String, String> hm = createMap();
        return hm.get("bye");
    }

    @Benchmark
    public String remove() {
        HashMap<String, String> hm = createMap();
        return hm.remove("bye");
    }

    // extra protection from optimization
    @CompilerControl(CompilerControl.Mode.DONT_INLINE)
    private HashMap<String, String> createMap() {
        HashMap<String, String> hm = new HashMap<>();
        hm.put("bye", "bye");
        return hm;
    }
}

您可以 extra-careful 并将映射创建剥离到一个单独的 non-inlineable 方法中：今天的编译器不会跨调用进行优化。在我的 i7-4790K 上，4.0 GHz，Linux x86_64，JDK 8u66：

Benchmark                Mode  Cnt   Score   Error  Units
HashMapBenchmark.get     avgt   15  24.343 ± 0.351  ns/op
HashMapBenchmark.remove  avgt   15  24.611 ± 0.369  ns/op

没有太大区别。事实上，如果您使用 -prof perfasm 查看生成的代码，它会在其中产生一些可量化的差异。或者，您可以使用 -prof perfnorm.

快速表征这两种工作负载

请注意，这种情况不回答在真实地图上哪种方法更好。可以为两者提出论点：get 不修改映射，因此不会导致内存存储，remove 可能有助于加载因子，以便下一个 remove 会变得更快，等等。单一的基准和一段文字离任何富有成果的讨论还很遥远。

在 HashMap 中 remove() 比 get() 快吗？

Is remove() faster than get() in HashMap?

java

performance

benchmarking

hashmap

jmh