Is there any way to generate a UUID in Java that is identical to that of the one generated in C#?

Question

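I'm porting a C# script to Spark (Scala), and I've run into a problem with UUID generation in Scala versus GUID generation in C#.

Is there any way to generate a UUID in Java that is identical to the one generated in C#?

I'm generating the primary key for a database by creating a Guid from the MD5 hash of a string. Ultimately, I'd like to generate UUIDs in Java/Scala that match the ones from the C# script, so that the existing data in the database that was hashed with the C# implementation doesn't need to be rehashed.

C# to port: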

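using System;
using System.Security.Cryptography;
using System.Text;

String ex = "Hello World";
Console.WriteLine("String to Hash: {0}", ex);
byte[] md5 = GetMD5Hash(ex);
Console.WriteLine("Hash: {0}", BitConverter.ToString(md5));
Guid guid = new Guid(md5); // reads the first three fields as little-endian
Console.WriteLine("Guid: {0}", guid);

private static byte[] GetMD5Hash(params object[] values) {
  // `s` was undefined in the original snippet; concatenating the values is an assumption.
  string s = string.Concat(values);
  using (MD5 md5 = MD5.Create())
    return md5.ComputeHash(Encoding.UTF8.GetBytes(s));
}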

Ported Scala code:

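import java.security.MessageDigest
import java.util.UUID

val to_encode = "Hello World"
val md5hash = MessageDigest.getInstance("MD5")
  .digest(to_encode.trim().getBytes())
// "%02x" with no separator matches the MD5 line in the Scala results
val md5string = md5hash.map("%02x".format(_)).mkString
// nameUUIDFromBytes runs MD5 on its input itself, so it takes the raw string bytes
val uuid_bytes = UUID.nameUUIDFromBytes(to_encode.trim().getBytes())
printf("String to encode: %s\n", to_encode)
printf("MD5: %s\n", md5string)
printf("UUID: %s\n", uuid_bytes.toString)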

Results from C#

String to Hash: Hello World
MD5: B1-0A-8D-B1-64-E0-75-41-05-B7-A9-9B-E7-2E-3F-E5
Guid: b18d0ab1-e064-4175-05b7-a99be72e3fe5

Results from Scala

String to Hash: Hello World
MD5: b10a8db164e0754105b7a99be72e3fe5
UUID: b10a8db1-64e0-3541-85b7-a99be72e3fe5

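What works:

The MD5 hash (which both the GUID and the UUID are based on) matches.

What doesn't:

The first three fields have their byte order switched in C# (orange)
- C#'s GUID uses native byte order (little-endian in this case) for the first three fields (4, 2, 2) and big-endian for the last field (8), whereas Java's UUID uses big-endian ordering for all four fields; this explains the byte order of the first three fields in C#.
The fourth and fifth bytes differ (red)
- Java switches bits 6-7 to indicate the version and variant of the UUID, which may explain the difference in bytes 4 and 5. This appears to be the roadblock.
I understand that Java uses signed bytes while C# uses unsigned bytes; this may be relevant as well.

Short of manipulating the bytes, is there any other way to fix this?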
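
The version and variant bits mentioned above can be checked directly. The sketch below is only an illustration: the class name `NameUuidBits` and its helpers are invented here, and the two masks are an assumption based on how a version-3 (MD5, name-based) UUID is laid out. It hashes the string, overwrites the high nibble of byte 6 and the top two bits of byte 8 (counting from zero), and prints the result next to `UUID.nameUUIDFromBytes` for comparison.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

final class NameUuidBits {
    // MD5 of the UTF-8 bytes of s (same input the C# snippet hashes).
    static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Lower-case hex; the & 0xff mask only changes how a signed Java byte is printed.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Assumption: version 3 goes in the high nibble of byte 6,
    // the IETF variant in the top two bits of byte 8.
    static byte[] withVersion3Bits(byte[] hash) {
        byte[] out = hash.clone();
        out[6] = (byte) ((out[6] & 0x0f) | 0x30);
        out[8] = (byte) ((out[8] & 0x3f) | 0x80);
        return out;
    }

    public static void main(String[] args) {
        String input = "Hello World";
        System.out.println("raw MD5:      " + hex(md5(input)));
        System.out.println("bits applied: " + hex(withVersion3Bits(md5(input))));
        System.out.println("nameUUID:     "
                + UUID.nameUUIDFromBytes(input.getBytes(StandardCharsets.UTF_8)));
    }
}
```

For "Hello World" the only bytes that change are 0x75 to 0x35 and 0x05 to 0x85, which matches the Scala output in the question. The signed/unsigned difference does not alter any bits; it only matters when a byte is widened for printing.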

Answer 1

TL;DR

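If you want your C# and Java to behave exactly the same way (and you are happy with the existing C# behaviour), you'll need to manually re-order the bytes that uuid_bytes is built from (i.e. swap the entries you identified as out of order).

Additionally, you should not use the following, because it hashes its input itself and then overwrites some bits of the hash with the UUID version and variant:

UUID.nameUUIDFromBytes(to_encode.trim().getBytes())

but instead use: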

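import java.nio.ByteBuffer;
import java.util.UUID;

// Builds a UUID straight from 16 raw bytes (big-endian) without touching version or variant bits.
public static String getGuidFromByteArray(byte[] bytes) {
    ByteBuffer bb = ByteBuffer.wrap(bytes);
    long high = bb.getLong(); // bytes 0-7
    long low = bb.getLong();  // bytes 8-15
    UUID uuid = new UUID(high, low);
    return uuid.toString();
}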

Shamelessly stolen from elsewhere :)
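
Putting the two pieces together (the byte swap and the raw `new UUID(high, low)` construction), a minimal Java sketch might look like this. The class name `DotNetGuid` and its method names are made up for illustration, and the swap assumes the layout that C#'s `new Guid(byte[])` is documented to use: the first three fields (4, 2, 2 bytes) little-endian, the remaining 8 bytes in order.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

final class DotNetGuid {
    // Interprets 16 bytes the way C#'s `new Guid(byte[])` is assumed to:
    // bytes 0-3, 4-5 and 6-7 are each reversed, bytes 8-15 are kept as-is.
    static UUID fromDotNetBytes(byte[] b) {
        if (b.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + b.length);
        }
        byte[] r = b.clone();
        r[0] = b[3]; r[1] = b[2]; r[2] = b[1]; r[3] = b[0]; // 4-byte field
        r[4] = b[5]; r[5] = b[4];                           // first 2-byte field
        r[6] = b[7]; r[7] = b[6];                           // second 2-byte field
        ByteBuffer bb = ByteBuffer.wrap(r);                 // big-endian by default
        long high = bb.getLong();
        long low = bb.getLong();
        return new UUID(high, low);
    }

    // MD5 of the UTF-8 bytes of the name, then the C#-style interpretation.
    static UUID fromName(String name) {
        try {
            byte[] hash = MessageDigest.getInstance("MD5")
                    .digest(name.getBytes(StandardCharsets.UTF_8));
            return fromDotNetBytes(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromName("Hello World"));
    }
}
```

For "Hello World" this should print b18d0ab1-e064-4175-05b7-a99be72e3fe5, the same value as the C# output in the question. Because the version and variant bits are left untouched, `version()` and `variant()` on the resulting `UUID` carry no meaning; the object is only a 128-bit container here.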

Additional Background

In case you weren't aware, when dealing with C#'s GUIDs:

Note that the order of bytes in the returned byte array is different from the string representation of a Guid value. The order of the beginning four-byte group and the next two two-byte groups is reversed, whereas the order of the last two-byte group and the closing six-byte group is the same. The example provides an illustration.

And:

The order of hexadecimal strings returned by the ToString method depends on whether the computer architecture is little-endian or big-endian.

In your C#, rather than using:

Console.WriteLine("Guid: {0}", guid);

you may want to consider using:

Console.WriteLine(BitConverter.ToString(guid.ToByteArray()));

Your existing code calls ToString behind the scenes. Alas, ToString and ToByteArray do not return the bytes in the same order.
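
To see the two orders side by side from the Java side, the following sketch goes in the opposite direction: it takes a UUID whose string form matches what C# printed and lays its bytes out the way `Guid.ToByteArray()` is described in the quoted documentation. `GuidByteOrder` and its methods are hypothetical names, and `dashedHex` only imitates the `BitConverter.ToString` format.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class GuidByteOrder {
    // Bytes in the same order as the hex digits of the string form.
    static byte[] toStringOrderBytes(UUID u) {
        return ByteBuffer.allocate(16)
                .putLong(u.getMostSignificantBits())
                .putLong(u.getLeastSignificantBits())
                .array();
    }

    // Bytes in the order assumed for .NET's Guid.ToByteArray():
    // first three fields reversed, last eight bytes unchanged.
    static byte[] toDotNetBytes(UUID u) {
        byte[] b = toStringOrderBytes(u);
        byte[] r = b.clone();
        r[0] = b[3]; r[1] = b[2]; r[2] = b[1]; r[3] = b[0];
        r[4] = b[5]; r[5] = b[4];
        r[6] = b[7]; r[7] = b[6];
        return r;
    }

    // Upper-case hex pairs joined by dashes, e.g. "B1-0A-8D".
    static String dashedHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append('-');
            }
            sb.append(String.format("%02X", bytes[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        UUID guid = UUID.fromString("b18d0ab1-e064-4175-05b7-a99be72e3fe5");
        System.out.println("string order:  " + dashedHex(toStringOrderBytes(guid)));
        System.out.println(".NET order:    " + dashedHex(toDotNetBytes(guid)));
    }
}
```

The ".NET order" line should reproduce the MD5 bytes from the question (B1-0A-8D-B1-64-E0-...), while the "string order" line follows the digits of the printed Guid.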
