TreeSet 给出错误的输出 - Java8
TreeSet giving incorrect output - Java8
在处理树集时,我发现了非常奇怪的行为。
根据我的理解,以下程序应该打印两行相同的内容:
public class TestSet {
static void test(String... args) {
Set<String> s = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
s.addAll(Arrays.asList("a", "b"));
s.removeAll(Arrays.asList(args));
System.out.println(s);
}
public static void main(String[] args) {
test("A");
test("A", "C");
}
}
但奇怪的是它打印:
[b]
[a, b]
我无法理解 - 为什么树集会这样?
好吧,这让我很惊讶,我不知道我是否正确,但是看看 AbstractSet
中的这个实现:
public boolean removeAll(Collection<?> c) {
Objects.requireNonNull(c);
boolean modified = false;
if (size() > c.size()) {
for (Iterator<?> i = c.iterator(); i.hasNext(); )
modified |= remove(i.next());
} else {
for (Iterator<?> i = iterator(); i.hasNext(); ) {
if (c.contains(i.next())) {
i.remove();
modified = true;
}
}
}
return modified;
}
基本上在您的示例中,set 的大小等于您要删除的参数的大小,因此将调用 else 条件。在这种情况下,会检查您的参数集合是否删除 contains
迭代器的当前元素,并且该检查区分大小写,因此它会检查 c.contains("a")
和 returns 是否为假,因为 c
包含 "A"
,而不是 "a"
,所以不会删除该元素。请注意,当您将元素添加到集合 s.addAll(Arrays.asList("a", "b", "d"));
时,它可以正常工作,因为 size() > c.size()
现在为真,因此不再需要 contains
检查。
这是因为一个SortedSet的Comparator用于排序,但是removeAll依赖于每个元素的equals
方法。来自 SortedSet documentation:
Note that the ordering maintained by a sorted set (whether or not an explicit comparator is provided) must be consistent with equals if the sorted set is to correctly implement the Set
interface. (See the Comparable
interface or Comparator
interface for a precise definition of consistent with equals.) This is so because the Set
interface is defined in terms of the equals
operation, but a sorted set performs all element comparisons using its compareTo
(or compare
) method, so two elements that are deemed equal by this method are, from the standpoint of the sorted set, equal. The behavior of a sorted set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set
interface.
中定义了“consistent with equals”的解释
The natural ordering for a class C
is said to be consistent with equals if and only if e1.compareTo(e2) == 0
has the same boolean value as e1.equals(e2)
for every e1
and e2
of class C
. Note that null
is not an instance of any class, and e.compareTo(null)
should throw a NullPointerException
even though e.equals(null)
returns false
.
It is strongly recommended (though not required) that natural orderings be consistent with equals. This is so because sorted sets (and sorted maps) without explicit comparators behave "strangely" when they are used with elements (or keys) whose natural ordering is inconsistent with equals. In particular, such a sorted set (or sorted map) violates the general contract for set (or map), which is defined in terms of the equals
method.
总而言之,您的 Set 的比较器的行为与元素的 equals
方法不同,导致异常(尽管可预测)行为。
这很有趣,所以这里有一些输出测试:
static void test(String... args) {
Set<String> s =new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
s.addAll(Arrays.asList( "a","b","c"));
s.removeAll(Arrays.asList(args));
System.out.println(s);
}
public static void main(String[] args) {
test("C"); output: [a, b]
test("C", "A"); output: [b]
test("C", "A","B"); output: [a, b, c]
test("B","C","A"); output: [a, b, c]
test("K","C"); output: [a, b]
test("C","K","M"); output: [a, b, c] !!
test("C","K","A"); output: [a, b, c] !!
}
现在没有比较器,它就像排序的一样工作 HashSet<String>()
:
static void test(String... args) {
Set<String> s = new TreeSet<String>();//
s.addAll(Arrays.asList( "a","b","c"));
s.removeAll(Arrays.asList(args));
System.out.println(s);
}
public static void main(String[] args) {
test("c"); output: [a, b]
test("c", "a"); output: [b]
test("c", "a","b"); output: []
test("b","c","a"); output: []
test("k","c"); output: [a, b]
test("c","k","m"); output: [a, b]
test("c","k","m"); output: [a, b]
}
现在来自文档:
public boolean removeAll(Collection c)
Removes from this set all of its elements that are contained in the
specified collection (optional operation). If the specified collection
is also a set, this operation effectively modifies this set so that
its value is the asymmetric set difference of the two sets.
This implementation determines which is the smaller of this set and
the specified collection, by invoking the size method on each. If this
set has fewer elements, then the implementation iterates over this
set, checking each element returned by the iterator in turn to see if
it is contained in the specified collection. If it is so contained, it
is removed from this set with the iterator's remove method. If the
specified collection has fewer elements, then the implementation
iterates over the specified collection, removing from this set each
element returned by the iterator, using this set's remove method.
添加一些信息,说明为什么 TreeSet
的 remove
实际上会在您的示例中不区分大小写地删除(并且前提是您遵循 if (size() > c.size())
路径,如答案中所述@Shadov):
这是TreeSet
中的remove
方法:
public boolean remove(Object o) {
return m.remove(o)==PRESENT;
}
它从其内部调用 remove
TreeMap
:
public V remove(Object key) {
Entry<K,V> p = getEntry(key);
if (p == null)
return null;
V oldValue = p.value;
deleteEntry(p);
return oldValue;
}
调用 getEntry
final Entry<K,V> getEntry(Object key) {
// Offload comparator-based version for sake of performance
if (comparator != null)
return getEntryUsingComparator(key);
if (key == null)
throw new NullPointerException();
@SuppressWarnings("unchecked")
Comparable<? super K> k = (Comparable<? super K>) key;
Entry<K,V> p = root;
while (p != null) {
int cmp = k.compareTo(p.key);
if (cmp < 0)
p = p.left;
else if (cmp > 0)
p = p.right;
else
return p;
}
return null;
}
如果有一个Comparator
(如你的例子),条目是根据这个Comparator
搜索的(这是由getEntryUsingComparator
完成的),这就是为什么它实际上是找到(然后删除),尽管大小写不同。
在处理树集时,我发现了非常奇怪的行为。
根据我的理解,以下程序应该打印两行相同的内容:
public class TestSet {
static void test(String... args) {
Set<String> s = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
s.addAll(Arrays.asList("a", "b"));
s.removeAll(Arrays.asList(args));
System.out.println(s);
}
public static void main(String[] args) {
test("A");
test("A", "C");
}
}
但奇怪的是它打印:
[b]
[a, b]
我无法理解 - 为什么树集会这样?
好吧,这让我很惊讶,我不知道我是否正确,但是看看 AbstractSet
中的这个实现:
public boolean removeAll(Collection<?> c) {
Objects.requireNonNull(c);
boolean modified = false;
if (size() > c.size()) {
for (Iterator<?> i = c.iterator(); i.hasNext(); )
modified |= remove(i.next());
} else {
for (Iterator<?> i = iterator(); i.hasNext(); ) {
if (c.contains(i.next())) {
i.remove();
modified = true;
}
}
}
return modified;
}
基本上在您的示例中,set 的大小等于您要删除的参数的大小,因此将调用 else 条件。在这种情况下,会检查您的参数集合是否删除 contains
迭代器的当前元素,并且该检查区分大小写,因此它会检查 c.contains("a")
和 returns 是否为假,因为 c
包含 "A"
,而不是 "a"
,所以不会删除该元素。请注意,当您将元素添加到集合 s.addAll(Arrays.asList("a", "b", "d"));
时,它可以正常工作,因为 size() > c.size()
现在为真,因此不再需要 contains
检查。
这是因为一个SortedSet的Comparator用于排序,但是removeAll依赖于每个元素的equals
方法。来自 SortedSet documentation:
中定义了“consistent with equals”的解释Note that the ordering maintained by a sorted set (whether or not an explicit comparator is provided) must be consistent with equals if the sorted set is to correctly implement the
Set
interface. (See theComparable
interface orComparator
interface for a precise definition of consistent with equals.) This is so because theSet
interface is defined in terms of theequals
operation, but a sorted set performs all element comparisons using itscompareTo
(orcompare
) method, so two elements that are deemed equal by this method are, from the standpoint of the sorted set, equal. The behavior of a sorted set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of theSet
interface.
The natural ordering for a class
C
is said to be consistent with equals if and only ife1.compareTo(e2) == 0
has the same boolean value ase1.equals(e2)
for everye1
ande2
of classC
. Note thatnull
is not an instance of any class, ande.compareTo(null)
should throw aNullPointerException
even thoughe.equals(null)
returnsfalse
.It is strongly recommended (though not required) that natural orderings be consistent with equals. This is so because sorted sets (and sorted maps) without explicit comparators behave "strangely" when they are used with elements (or keys) whose natural ordering is inconsistent with equals. In particular, such a sorted set (or sorted map) violates the general contract for set (or map), which is defined in terms of the
equals
method.
总而言之,您的 Set 的比较器的行为与元素的 equals
方法不同,导致异常(尽管可预测)行为。
这很有趣,所以这里有一些输出测试:
static void test(String... args) {
Set<String> s =new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
s.addAll(Arrays.asList( "a","b","c"));
s.removeAll(Arrays.asList(args));
System.out.println(s);
}
public static void main(String[] args) {
test("C"); output: [a, b]
test("C", "A"); output: [b]
test("C", "A","B"); output: [a, b, c]
test("B","C","A"); output: [a, b, c]
test("K","C"); output: [a, b]
test("C","K","M"); output: [a, b, c] !!
test("C","K","A"); output: [a, b, c] !!
}
现在没有比较器,它就像排序的一样工作 HashSet<String>()
:
static void test(String... args) {
Set<String> s = new TreeSet<String>();//
s.addAll(Arrays.asList( "a","b","c"));
s.removeAll(Arrays.asList(args));
System.out.println(s);
}
public static void main(String[] args) {
test("c"); output: [a, b]
test("c", "a"); output: [b]
test("c", "a","b"); output: []
test("b","c","a"); output: []
test("k","c"); output: [a, b]
test("c","k","m"); output: [a, b]
test("c","k","m"); output: [a, b]
}
现在来自文档:
public boolean removeAll(Collection c)
Removes from this set all of its elements that are contained in the specified collection (optional operation). If the specified collection is also a set, this operation effectively modifies this set so that its value is the asymmetric set difference of the two sets.
This implementation determines which is the smaller of this set and the specified collection, by invoking the size method on each. If this set has fewer elements, then the implementation iterates over this set, checking each element returned by the iterator in turn to see if it is contained in the specified collection. If it is so contained, it is removed from this set with the iterator's remove method. If the specified collection has fewer elements, then the implementation iterates over the specified collection, removing from this set each element returned by the iterator, using this set's remove method.
添加一些信息,说明为什么 TreeSet
的 remove
实际上会在您的示例中不区分大小写地删除(并且前提是您遵循 if (size() > c.size())
路径,如答案中所述@Shadov):
这是TreeSet
中的remove
方法:
public boolean remove(Object o) {
return m.remove(o)==PRESENT;
}
它从其内部调用 remove
TreeMap
:
public V remove(Object key) {
Entry<K,V> p = getEntry(key);
if (p == null)
return null;
V oldValue = p.value;
deleteEntry(p);
return oldValue;
}
调用 getEntry
final Entry<K,V> getEntry(Object key) {
// Offload comparator-based version for sake of performance
if (comparator != null)
return getEntryUsingComparator(key);
if (key == null)
throw new NullPointerException();
@SuppressWarnings("unchecked")
Comparable<? super K> k = (Comparable<? super K>) key;
Entry<K,V> p = root;
while (p != null) {
int cmp = k.compareTo(p.key);
if (cmp < 0)
p = p.left;
else if (cmp > 0)
p = p.right;
else
return p;
}
return null;
}
如果有一个Comparator
(如你的例子),条目是根据这个Comparator
搜索的(这是由getEntryUsingComparator
完成的),这就是为什么它实际上是找到(然后删除),尽管大小写不同。