为什么这个使用 Collections.sort 的程序只对大小为 32 或更大的列表失败?
Why does this program using Collections.sort only fail for lists of size 32 or more?
以下程序抛出以下异常:
java.lang.IllegalArgumentException: Comparison method violates its general contract!
我了解 Comparator
的问题。参见
我不明白为什么只有 List
大小为 32 或更大时才会失败。谁能解释一下?
class Experiment {
private static final class MyInteger {
private final Integer num;
MyInteger(Integer num) {
this.num = num;
}
}
private static final Comparator<MyInteger> COMPARATOR = (r1, r2) -> {
if (r1.num == null || r2.num == null)
return 0;
return Integer.compare(r1.num, r2.num);
};
public static void main(String[] args) {
MyInteger[] array = {new MyInteger(0), new MyInteger(1), new MyInteger(null)};
Random random = new Random();
for (int length = 0;; length++) {
for (int attempt = 0; attempt < 100000; attempt++) {
List<MyInteger> list = new ArrayList<>();
int[] arr = new int[length];
for (int k = 0; k < length; k++) {
int rand = random.nextInt(3);
arr[k] = rand;
list.add(array[rand]);
}
try {
Collections.sort(list, COMPARATOR);
} catch (Exception e) {
System.out.println(arr.length + " " + Arrays.toString(arr));
return;
}
}
}
}
}
Java 8 使用TimSort算法对输入进行排序。在 TimSort
中,当长度至少为 32 时会发生一个合并阶段。当长度小于 32 时,将使用一个简单的排序算法,该算法可能不会检测到合同违规。让TimSort.java
的源码注释不言而喻:
class TimSort<T> {
/**
* This is the minimum sized sequence that will be merged. Shorter
* sequences will be lengthened by calling binarySort. If the entire
* array is less than this length, no merges will be performed.
*
* This constant should be a power of two. It was 64 in Tim Peter's C
* implementation, but 32 was empirically determined to work better in
* this implementation. In the unlikely event that you set this constant
* to be a number that's not a power of two, you'll need to change the
* {@link #minRunLength} computation.
*
* If you decrease this constant, you must change the stackLen
* computation in the TimSort constructor, or you risk an
* ArrayOutOfBounds exception. See listsort.txt for a discussion
* of the minimum stack length required as a function of the length
* of the array being sorted and the minimum merge sequence length.
*/
private static final int MIN_MERGE = 32;
这取决于实现,但在 openjdk 8 中,数组的大小是根据 MIN_MERGE 检查的,它等于 32。这避免了对 mergeLo
/[ 的调用=12=] 抛出异常。
From JDK / jdk / openjdk / 7u40-b43 8-b132 7-b147 - 8-b132
/ java.util.TimSort:
static <T> void sort(T[] a, int lo, int hi, Comparator<? super T> c,
T[] work, int workBase, int workLen) {
assert c != null && a != null && lo >= 0 && lo <= hi && hi <= a.length;
int nRemaining = hi - lo;
if (nRemaining < 2)
return; // Arrays of size 0 and 1 are always sorted
// If array is small, do a "mini-TimSort" with no merges
if (nRemaining < MIN_MERGE) {
int initRunLen = countRunAndMakeAscending(a, lo, hi, c);
binarySort(a, lo, hi, lo + initRunLen, c);
return;
}
/**
* March over the array once, left to right, finding natural runs,
* extending short natural runs to minRun elements, and merging runs
* to maintain stack invariant.
*/
TimSort<T> ts = new TimSort<>(a, c, work, workBase, workLen);
int minRun = minRunLength(nRemaining);
do {
// Identify next run
int runLen = countRunAndMakeAscending(a, lo, hi, c);
// If run is short, extend to min(minRun, nRemaining)
if (runLen < minRun) {
int force = nRemaining <= minRun ? nRemaining : minRun;
binarySort(a, lo, lo + force, lo + runLen, c);
runLen = force;
}
// Push run onto pending-run stack, and maybe merge
ts.pushRun(lo, runLen);
ts.mergeCollapse();
// Advance to find next run
lo += runLen;
nRemaining -= runLen;
} while (nRemaining != 0);
// Merge all remaining runs to complete sort
assert lo == hi;
ts.mergeForceCollapse();
assert ts.stackSize == 1;
}
以下程序抛出以下异常:
java.lang.IllegalArgumentException: Comparison method violates its general contract!
我了解 Comparator
的问题。参见
我不明白为什么只有 List
大小为 32 或更大时才会失败。谁能解释一下?
class Experiment {
private static final class MyInteger {
private final Integer num;
MyInteger(Integer num) {
this.num = num;
}
}
private static final Comparator<MyInteger> COMPARATOR = (r1, r2) -> {
if (r1.num == null || r2.num == null)
return 0;
return Integer.compare(r1.num, r2.num);
};
public static void main(String[] args) {
MyInteger[] array = {new MyInteger(0), new MyInteger(1), new MyInteger(null)};
Random random = new Random();
for (int length = 0;; length++) {
for (int attempt = 0; attempt < 100000; attempt++) {
List<MyInteger> list = new ArrayList<>();
int[] arr = new int[length];
for (int k = 0; k < length; k++) {
int rand = random.nextInt(3);
arr[k] = rand;
list.add(array[rand]);
}
try {
Collections.sort(list, COMPARATOR);
} catch (Exception e) {
System.out.println(arr.length + " " + Arrays.toString(arr));
return;
}
}
}
}
}
Java 8 使用TimSort算法对输入进行排序。在 TimSort
中,当长度至少为 32 时会发生一个合并阶段。当长度小于 32 时,将使用一个简单的排序算法,该算法可能不会检测到合同违规。让TimSort.java
的源码注释不言而喻:
class TimSort<T> {
/**
* This is the minimum sized sequence that will be merged. Shorter
* sequences will be lengthened by calling binarySort. If the entire
* array is less than this length, no merges will be performed.
*
* This constant should be a power of two. It was 64 in Tim Peter's C
* implementation, but 32 was empirically determined to work better in
* this implementation. In the unlikely event that you set this constant
* to be a number that's not a power of two, you'll need to change the
* {@link #minRunLength} computation.
*
* If you decrease this constant, you must change the stackLen
* computation in the TimSort constructor, or you risk an
* ArrayOutOfBounds exception. See listsort.txt for a discussion
* of the minimum stack length required as a function of the length
* of the array being sorted and the minimum merge sequence length.
*/
private static final int MIN_MERGE = 32;
这取决于实现,但在 openjdk 8 中,数组的大小是根据 MIN_MERGE 检查的,它等于 32。这避免了对 mergeLo
/[ 的调用=12=] 抛出异常。
From JDK / jdk / openjdk / 7u40-b43 8-b132 7-b147 - 8-b132 / java.util.TimSort:
static <T> void sort(T[] a, int lo, int hi, Comparator<? super T> c, T[] work, int workBase, int workLen) { assert c != null && a != null && lo >= 0 && lo <= hi && hi <= a.length; int nRemaining = hi - lo; if (nRemaining < 2) return; // Arrays of size 0 and 1 are always sorted // If array is small, do a "mini-TimSort" with no merges if (nRemaining < MIN_MERGE) { int initRunLen = countRunAndMakeAscending(a, lo, hi, c); binarySort(a, lo, hi, lo + initRunLen, c); return; } /** * March over the array once, left to right, finding natural runs, * extending short natural runs to minRun elements, and merging runs * to maintain stack invariant. */ TimSort<T> ts = new TimSort<>(a, c, work, workBase, workLen); int minRun = minRunLength(nRemaining); do { // Identify next run int runLen = countRunAndMakeAscending(a, lo, hi, c); // If run is short, extend to min(minRun, nRemaining) if (runLen < minRun) { int force = nRemaining <= minRun ? nRemaining : minRun; binarySort(a, lo, lo + force, lo + runLen, c); runLen = force; } // Push run onto pending-run stack, and maybe merge ts.pushRun(lo, runLen); ts.mergeCollapse(); // Advance to find next run lo += runLen; nRemaining -= runLen; } while (nRemaining != 0); // Merge all remaining runs to complete sort assert lo == hi; ts.mergeForceCollapse(); assert ts.stackSize == 1; }