Java: Object pooling and hash sets

Let's assume the following class...

class Foo {

  private Bar1 bar1;
  private Bar2 bar2;

  // many other fields

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    Foo foo = (Foo) o;
    if (!bar1.equals(foo.getBar1())) return false;
    if (!bar2.equals(foo.getBar2())) return false;
    // etc...
    return true;
  }

  @Override
  public int hashCode() {
    int result = bar1.hashCode();
    result = 31 * result + bar2.hashCode();
    // etc...
    return result;
  }

  // setters & getters follow...
}
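
For reference, the same equals/hashCode contract can be written more compactly with java.util.Objects (a sketch covering only the two fields shown; the contract is identical, though Objects.hash produces different raw hash values than the manual 31-based version):

import java.util.Objects;

@Override
public boolean equals(Object o) {
  if (this == o) return true;
  if (o == null || getClass() != o.getClass()) return false;
  Foo foo = (Foo) o;
  // Objects.equals is also null-safe, unlike the bare bar1.equals(...) calls
  return Objects.equals(bar1, foo.bar1) && Objects.equals(bar2, foo.bar2);
}

@Override
public int hashCode() {
  return Objects.hash(bar1, bar2); // hashes the fields in order
}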

Thousands of Foo instances are created, processed, and then recycled in a pool every minute. The workflow looks like this:

Set<Foo> foos = new THashSet<>(); // THashSet: GNU Trove hash set
while (there-is-data) {

  String serializedDataFromApi = api.getData();
  Set<Foo> buffer = pool.deserializeAndCreate(serializedDataFromApi);
  foos.addAll(buffer);
}

processor.process(foos);
pool.recycle(foos);

The problem is that there can be duplicate foo objects (with the same values) across different buffers. They are materialized as distinct instances of Foo, but they are considered equal when foos.addAll(buffer) is called.

My questions are:

What happened with those "duplicate" instances? Are they "lost" and garbage collected?

Yes, those become eligible for GC as soon as the current iteration of while (there-is-data) completes.
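
To make the reachability explicit, here is the relevant part of the loop again, annotated (the comments are my reading of the code above, not part of the original):

Set<Foo> buffer = pool.deserializeAndCreate(serializedDataFromApi);
foos.addAll(buffer); // value-equal duplicates are rejected by the set,
                     // but remain reachable through buffer
// when the iteration ends, buffer goes out of scope, so the rejected
// instances lose their last reference and become eligible for GC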

If I wanted to keep those instances available in pool, what would be the most effective way to test for duplicates before inserting using addAll and recycling instances?

Set.add returns true if the element was inserted and false if it was a duplicate, so you can replace addAll with:
for (Foo f : buffer) {
  if (!foos.add(f)) {
    // handle duplicate
  }
}

This will not hurt performance, because addAll does the same thing: it iterates and adds the elements one by one.
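
Putting it together: assuming the pool also offers a single-instance recycle overload (hypothetical; the code above only shows pool.recycle(foos) taking a whole set), the spare instance can be handed back to the pool on the spot:

for (Foo f : buffer) {
  if (!foos.add(f)) {
    // f is value-equal to an element already in the set;
    // return the spare instance to the pool instead of losing it to GC
    pool.recycle(f); // hypothetical single-instance overload
  }
}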