BigQuery FARM_FINGERPRINT collision case
The FARM_FINGERPRINT value in BigQuery is the same for two different strings. Any idea why? It returns -2660876244907183769 for both inputs:
SELECT id1, id2, id1 = id2 AS is_equal
FROM (
  SELECT
    FARM_FINGERPRINT(TO_JSON_STRING(STRUCT('19BD0AF0854E2B90E10080000A802438','599D7E2A47B31E20E10080000A7824B8','001','020','100'))) AS id1,
    FARM_FINGERPRINT(TO_JSON_STRING(STRUCT('DCE500729B5800F0E10080010A7824BA','5AF0A97293195320E10080010A782421','001','001','110'))) AS id2
)
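In general, collisions are easy to find in any 64-bit hash, so no 64-bit hash can guarantee uniqueness when indexing a large number of values. FARM_FINGERPRINT uses the Fingerprint64 function from the FarmHash library, which is a 64-bit hash algorithm. You might as well use a different hash function such as MD5, SHA256, or SHA512, which are more standardized and produce longer digests (128, 256, and 512 bits respectively). See the BigQuery documentation on hash functions for more options.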
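As a rough sketch of that suggestion, the query from the question can be rerun with SHA256 in place of FARM_FINGERPRINT. The two STRUCT values are copied from the question. Wrapping the result in TO_HEX is a choice made here for readability, not something the original answer prescribes. SHA256 returns BYTES, so the resulting IDs are no longer INT64 values.

```sql
-- Same two inputs as in the question, hashed with SHA256 (256-bit digest)
-- instead of FARM_FINGERPRINT (64-bit digest).
-- SHA256 returns BYTES; TO_HEX renders the digest as a hexadecimal STRING.
SELECT id1, id2, id1 = id2 AS is_equal
FROM (
  SELECT
    TO_HEX(SHA256(TO_JSON_STRING(STRUCT('19BD0AF0854E2B90E10080000A802438','599D7E2A47B31E20E10080000A7824B8','001','020','100')))) AS id1,
    TO_HEX(SHA256(TO_JSON_STRING(STRUCT('DCE500729B5800F0E10080010A7824BA','5AF0A97293195320E10080010A782421','001','001','110')))) AS id2
)
```

The trade-off is storage and comparison cost: a 32-byte digest (or a 64-character hex string) per key instead of an 8-byte integer.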
A public issue tracker entry was also opened for a similar issue, but it was eventually closed, since collisions are bound to happen with any hash algorithm. With a longer hash, though, it may take a very long time before one occurs. See https://crypto.stackexchange.com/questions/47809/why-havent-any-sha-256-collisions-been-found-yet
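To put rough numbers on "a very long time", the standard birthday approximation can be used. This is only an estimate, and it assumes the hash behaves like a uniformly random function over b-bit outputs, which is an idealization rather than a documented property of FARM_FINGERPRINT. Here n is the number of distinct values hashed and p(n) is the probability that at least two of them collide.

```latex
% Birthday approximation for a b-bit hash over n distinct inputs
p(n) \approx 1 - \exp\!\left(-\frac{n(n-1)}{2 \cdot 2^{b}}\right)
     \approx \frac{n^{2}}{2^{\,b+1}} \quad \text{for } n \ll 2^{b/2}

% 64-bit hash (b = 64):
%   n = 10^{6}:  p \approx 10^{12} / 2^{65} \approx 2.7 \times 10^{-8}
%   n = 10^{9}:  p \approx 10^{18} / 2^{65} \approx 2.7 \times 10^{-2}
% 50% collision probability is reached near
n_{1/2} \approx \sqrt{2 \ln 2 \cdot 2^{b}} \approx 1.18 \cdot 2^{b/2}
%   b = 64:   n_{1/2} \approx 5.1 \times 10^{9}
%   b = 256:  n_{1/2} \approx 4.0 \times 10^{38}
```

Under this model a table with a few billion keys has a substantial chance of containing at least one 64-bit collision, while a 256-bit digest pushes the same risk far beyond any realistic row count.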