Java UUID compareTo not working correctly for Type 1 UUIDs
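While working on a use case that sorts data by UUID, where all the UUIDs are Type 1 (time-based) and generated with the DataStax Cassandra Java driver (UUIDs.timeBased()), I found that UUID.compareTo does not order some of them correctly.
The logic in compareTo is: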
/**
 * Compares this UUID with the specified UUID.
 *
 * <p> The first of two UUIDs is greater than the second if the most
 * significant field in which the UUIDs differ is greater for the first
 * UUID.
 *
 * @param  val
 *         {@code UUID} to which this {@code UUID} is to be compared
 *
 * @return  -1, 0 or 1 as this {@code UUID} is less than, equal to, or
 *          greater than {@code val}
 *
 */
public int compareTo(UUID val) {
    // The ordering is intentionally set up so that the UUIDs
    // can simply be numerically compared as two numbers
    return (this.mostSigBits < val.mostSigBits ? -1 :
            (this.mostSigBits > val.mostSigBits ? 1 :
             (this.leastSigBits < val.leastSigBits ? -1 :
              (this.leastSigBits > val.leastSigBits ? 1 :
               0))));
}
I generated the following two UUIDs with the DataStax Cassandra driver for Java:
UUID uuid1 = java.util.UUID.fromString("7fff5ab0-43be-11ea-8fba-0f6f28968a17");
UUID uuid2 = java.util.UUID.fromString("80004510-43be-11ea-8fba-0f6f28968a17");
uuid1.timestamp(); // 137997224058510000
uuid2.timestamp(); // 137997224058570000
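As the timestamps show, uuid1 is earlier than uuid2, yet comparing them with UUID.compareTo gives the opposite result. We expect -1, since uuid1 should be the smaller one, but we get 1, which says uuid1 is greater than uuid2.
uuid1.compareTo(uuid2); // returns 1
On further analysis, the most significant bits (msb) of uuid2 come out as a negative long, while the msb of uuid1 are positive. That is why the logic in compareTo returns 1 instead of -1.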
u_7fff5ab0 = {UUID@2623} "7fff5ab0-43be-11ea-8fba-0f6f28968a17"
mostSigBits = 9223190274975338986
leastSigBits = -8090136810520933865
u_80004510 = {UUID@2622} "80004510-43be-11ea-8fba-0f6f28968a17"
mostSigBits = -9223296100696452630
leastSigBits = -8090136810520933865
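The sign flip can be reproduced with the JDK alone. The following is a minimal sketch; the class name SignedMsbDemo and its helper methods are hypothetical and exist only for illustration. It compares the two most-significant-bit values first as signed longs, which is what compareTo does, and then as unsigned longs.

```java
import java.util.UUID;

// Hypothetical demo class; not part of the JDK or the DataStax driver.
class SignedMsbDemo {

    // Compares the most significant 64 bits as signed longs,
    // mirroring the first step of UUID.compareTo shown above.
    static int signedMsbCompare(UUID a, UUID b) {
        return Long.compare(a.getMostSignificantBits(), b.getMostSignificantBits());
    }

    // Compares the same 64 bits as unsigned values.
    static int unsignedMsbCompare(UUID a, UUID b) {
        return Long.compareUnsigned(a.getMostSignificantBits(), b.getMostSignificantBits());
    }

    public static void main(String[] args) {
        UUID uuid1 = UUID.fromString("7fff5ab0-43be-11ea-8fba-0f6f28968a17");
        UUID uuid2 = UUID.fromString("80004510-43be-11ea-8fba-0f6f28968a17");

        // 0x7fff... has the top bit clear (positive long),
        // 0x8000... has the top bit set (negative long).
        System.out.println(signedMsbCompare(uuid1, uuid2) > 0);   // true: uuid1 looks "greater"
        System.out.println(unsignedMsbCompare(uuid1, uuid2) < 0); // true: uuid1 is smaller as unsigned
    }
}
```

Note that an unsigned comparison of the raw bits only removes the sign problem. It is still not a chronological ordering in general, because a Type 1 UUID stores the low 32 bits of the timestamp in the leading field, ahead of the middle and high bits.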
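Is this behavior normal for UUIDs and for comparing them with each other?
If so, how should we handle sorting of such time-based UUIDs?
Thanks
Note that comparing time-based UUIDs requires special care. From the docs: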
Lastly, please note that Cassandra's timeuuid sorting is not compatible with UUID.compareTo(java.util.UUID) and hence the UUID created by this method are not necessarily lower bound for that latter method.
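Time-based UUIDs should not be compared with java.util.UUID#compareTo. To compare two time-based UUIDs, compare the timestamps they contain. You can write a custom utility method, or simply compare the two timestamps directly. Here is an example: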
// Both arguments must be time-based (version 1) UUIDs
int compareTo(UUID a, UUID b) {
    return Long.compare(UUIDs.unixTimestamp(a), UUIDs.unixTimestamp(b));
}
To learn more, read the docs.
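For sorting a whole collection, the same idea can be expressed as a Comparator. The sketch below is one possible approach and relies only on the JDK's UUID.timestamp(), which returns the embedded 60-bit timestamp in 100-nanosecond units. The class name TimeUuidSort, the constant BY_TIMESTAMP and the method sortByTime are hypothetical names chosen for illustration. Falling back to UUID.compareTo when timestamps are equal is an assumption made here so that distinct UUIDs never compare as equal. It is not Cassandra's own timeuuid ordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

// Hypothetical helper class; not part of the JDK or the DataStax driver.
class TimeUuidSort {

    // Orders version 1 UUIDs by their embedded timestamp.
    // UUID.timestamp() throws UnsupportedOperationException for other versions.
    // Ties are broken with UUID's natural order (an assumption for this sketch).
    static final Comparator<UUID> BY_TIMESTAMP =
            Comparator.<UUID>comparingLong(UUID::timestamp)
                      .thenComparing(Comparator.naturalOrder());

    // Returns a new list sorted chronologically; the input list is left unchanged.
    static List<UUID> sortByTime(List<UUID> uuids) {
        List<UUID> copy = new ArrayList<>(uuids);
        copy.sort(BY_TIMESTAMP);
        return copy;
    }

    public static void main(String[] args) {
        UUID uuid1 = UUID.fromString("7fff5ab0-43be-11ea-8fba-0f6f28968a17");
        UUID uuid2 = UUID.fromString("80004510-43be-11ea-8fba-0f6f28968a17");

        // uuid1 has the earlier timestamp, so it sorts first.
        System.out.println(sortByTime(List.of(uuid2, uuid1)));
    }
}
```

Compared with UUIDs.unixTimestamp, which works in milliseconds, comparing the raw timestamp() values keeps the full 100-nanosecond resolution, so UUIDs generated within the same millisecond are still ordered by time.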