如何保证 Set.removeAll 具有不同类型集合的行为?

How to guarantee the behavior of Set.removeAll with different type of sets?

我对 HashSet 和 TreeSet 操作有疑问。 这是解释我的问题的简单 JUnit 4 测试:

import java.util.HashSet;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicReference;

import org.junit.Assert;
import org.junit.Test;

public class TreeSetTest<T> {

    @Test
    public void test() {
        final HashSet<Object> set1 = new HashSet<>();
        final TreeSet<Object> set2 = new TreeSet<>((a, b) -> a.toString().compareTo(b.toString()));
        Assert.assertEquals(0, set1.size()); // OK
        Assert.assertEquals(0, set2.size()); // OK
        set1.add(new AtomicReference<>("A"));
        set1.add(new AtomicReference<>("B"));
        set1.add(new AtomicReference<>("C"));
        Assert.assertEquals(3, set1.size()); // OK
        Assert.assertEquals(0, set2.size()); // OK
        set1.removeAll(set2);
        Assert.assertEquals(3, set1.size()); // OK
        Assert.assertEquals(0, set2.size()); // OK
        set2.add(new AtomicReference<>("A"));
        set1.removeAll(set2);
        Assert.assertEquals(3, set1.size()); // OK Nothing has been removed
        Assert.assertEquals(1, set2.size());
        set2.add(new AtomicReference<>("B"));
        set1.removeAll(set2);
        Assert.assertEquals(3, set1.size()); // OK Nothing has been removed
        Assert.assertEquals(2, set2.size());
        set2.add(new AtomicReference<>("C"));
        set1.removeAll(set2);
        Assert.assertEquals(3, set1.size()); // Error Everything has been removed and size is now 0
        Assert.assertEquals(3, set2.size());
    }

}

set1 中删除 set2 的所有元素时,我希望使用 set1 的相等比较器,只要 set2 具有小于 set1 的大小,但如果 set2 的大小大于或等于 set1 的大小,则从 set2.[= 进行比较30=]

这对我来说非常糟糕,因为它使程序变得不可预测。

我认为它可以被视为 java 实现中的一个错误,但我担心的是: 如何在不重写所有内容的情况下保证预期的行为?

在@FedericoPeraltaSchaffner 评论后编辑 1:

AtomicReference 只是为了提供一个简单的例子。事实上,我使用的是库中的最终版本 class,因此我无法轻易改进它。 但即使考虑到有效的 class 正确实施 hashCodeequals,我的问题仍然存在。现在考虑一下:

package fr.ncenerar.so;

import java.util.HashSet;
import java.util.TreeSet;

import org.junit.Assert;
import org.junit.Test;

public class TreeSetTest<T> {

    public static class MyObj {
        private final int value1;
        private final int value2;

        public MyObj(final int v1, final int v2) {
            super();
            this.value1 = v1;
            this.value2 = v2;
        }

        public int getValue1() {
            return this.value1;
        }

        public int getValue2() {
            return this.value2;
        }

        @Override
        public int hashCode() {
            final int prime = 31;
            int result = 1;
            result = prime * result + this.value1;
            result = prime * result + this.value2;
            return result;
        }

        @Override
        public boolean equals(final Object obj) {
            if (this == obj) {
                return true;
            }
            if (obj == null) {
                return false;
            }
            if (this.getClass() != obj.getClass()) {
                return false;
            }
            final MyObj other = (MyObj) obj;
            if (this.value1 != other.value1) {
                return false;
            }
            if (this.value2 != other.value2) {
                return false;
            }
            return true;
        }
    }

    @Test
    public void test() {
        final HashSet<MyObj> set1 = new HashSet<>();
        final TreeSet<MyObj> set2 = new TreeSet<>((a, b) -> a.getValue1() - b.getValue1());
        Assert.assertEquals(0, set1.size()); // OK
        Assert.assertEquals(0, set2.size()); // OK
        set1.add(new MyObj(0, 0));
        set1.add(new MyObj(1, 1));
        set1.add(new MyObj(2, 2));
        Assert.assertEquals(3, set1.size()); // OK
        Assert.assertEquals(0, set2.size()); // OK
        set1.removeAll(set2);
        Assert.assertEquals(3, set1.size()); // OK
        Assert.assertEquals(0, set2.size()); // OK
        set2.add(new MyObj(0, 1));
        set1.removeAll(set2);
        Assert.assertEquals(3, set1.size()); // OK Nothing has been removed
        Assert.assertEquals(1, set2.size());
        set2.add(new MyObj(1, 2));
        set1.removeAll(set2);
        Assert.assertEquals(3, set1.size()); // OK Nothing has been removed
        Assert.assertEquals(2, set2.size());
        set2.add(new MyObj(2, 3));
        set1.removeAll(set2);
        Assert.assertEquals(3, set1.size()); // Error Everything has been removed
        Assert.assertEquals(3, set2.size());
    }

}

问题仍然存在,MyObj实现是正确的。问题来自于我从两个不同的方面使用对象这一事实。在一组中,我想根据每个对象的相等性(如对象的 equals 方法)保留每个对象的一个​​实例,而在另一组中,我想要第一组的一个子集,其中每个 value1, 我只想保留第一个插入的元素。

使用 TreeSet 似乎有效。

编辑 2:

糟糕,我错过了 TreeSet 文档的那一部分:

Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent withequals.) This is so because the Set interface is defined interms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.

如果我理解正确,我可以使用 TreeSet 来达到我的目的,但不能指望它会像我想要的那样表现。

谢谢大家的帮助。

 AtomicReference a = new AtomicReference<>("A");
 AtomicReference a2 = new AtomicReference<>("A");
 System.out.println(a==a2);

你的答案就在里面。

如果您使用自定义 class 而不是 Object 并且您覆盖 equals 方法,它将按预期工作。

让它发挥作用

class AtomicString{
private AtomicReference<String> s;

public AtomicString(String s) {
    this.s = new AtomicReference<>(s);
}

@Override public boolean equals(Object o) {
    if (this == o)
        return true;
    if (o == null || getClass() != o.getClass())
        return false;
    AtomicString that = (AtomicString) o;
    return this.s.get().equals(that.getS().get());
}

public AtomicReference<String> getS() {
    return s;
}

@Override public int hashCode() {
    return Objects.hash(s.get());
}

问题确实是 "equality" 逻辑不一致。 TreeSetHashSet 都继承了 AbstractSet#removeAll,后者遍历较小的集合,因此使用该集合的对象比较。事实证明,可以使用 TreeSet.

覆盖 "equality" 逻辑

这个问题可以通过选择两个 Set 实现之一来避免。如果您选择TreeSet,那么您也必须使用相同的比较器。

在这种情况下您不能真正使用 HashSet,因为 AtomicReference 没有适合您的 equals/hashCode 的实现。所以你唯一实际的选择是使用 TreeSet:

Comparator<Object> comparator = (a, b) -> a.toString().compareTo(b.toString());
final Set<Object> set1 = new TreeSet<>(comparator);
final Set<Object> set2 = new TreeSet<>(comparator);

这会破坏您当前的测试,但是因为元素现在被删除了(根据您的比较器的逻辑)。

在正确搜索和阅读有关 TreeSet 的文档后,结果是:

Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparableor Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.

这意味着示例中使用的TreeSet不能用作Set。因此,最简单的解决方案是通过替换每个

来为 removeAll 操作创建一个 HashSet
set1.removeAll(set2);

来自

set1.removeAll(new HashSet<>(set2));

从性能的角度来看,这可能不是最佳解决方案,但却是一个可行的解决方案。

谢谢大家!