Guava Hasher sometimes gives different results for the same object

I need to create a hash code from a Map whose keys are custom objects and whose values are Sets of custom objects. I am using Guava 18.0.

@Getter
public final class StockKey {

    @ValidIsin
    private final String isin;

    @ValidExchangeId
    private final Integer exchangeId;

    @ValidCurrency
    private final String currency;
}


@EqualsAndHashCode
public final class ClientAssetPosition {

    public static final double EPSILON = 0.0001;

    @NotNull
    private final PositionType type;

    @NotNull
    private final Double quantity;

    @Nullable
    @Getter
    private Double coveredOptions;

    @Nullable
    @Getter
    private Double blockedCoveringUnderlyings;

    @Getter
    @Setter
    private Boolean excluded;
}

So I have a function that creates the hash code:

public static HashCode getHashCodeWithSha256(Map<StockKey, Set<ClientAssetPosition>> positions) {
    final Hasher hasher = Hashing.sha256().newHasher();
    for (Map.Entry<StockKey, Set<ClientAssetPosition>> positionsEntry : positions.entrySet()) {
        hasher.putObject(positionsEntry.getKey(), STOCK_KEY_FUNNEL);
        for (ClientAssetPosition asset : positionsEntry.getValue()) {
            hasher.putObject(asset, CLIENT_ASSET_POSITION_FUNNEL);
        }
    }
    return hasher.hash();
}

These are the funnels I use:

public static final Funnel<StockKey> STOCK_KEY_FUNNEL = new Funnel<StockKey>() {
    @Override
    public void funnel(StockKey from, PrimitiveSink into) {
        into.putString(from.getIsin()).putString(from.getCurrency()).putInt(from.getExchangeId());
    }
};
public static final Funnel<ClientAssetPosition> CLIENT_ASSET_POSITION_FUNNEL = new Funnel<ClientAssetPosition>() {
    @Override
    public void funnel(ClientAssetPosition from, PrimitiveSink into) {
        into.putDouble(from.getQuantity()).putString(from.getType().name());
    }
};

For the same Map, this function sometimes returns a different HashCode. I found this with the unit test below. When I run it from Maven the test fails, but not every time.

@Test
    public void testSamePortfolioSameHAshCodeOrdersASC(){
        Map<StockKey, Set<ClientAssetPosition>> positions = new HashMap<>();
        positions.put(PredefinedStockKeys.UBS, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_B101));
        positions.put(PredefinedStockKeys.UBS_FEB_12_17_C, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_B101));
        positions.put(PredefinedStockKeys.UBSN_MAR12_12_5_C, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_M10));
        positions.put(PredefinedStockKeys.UBSN_MAR12_12_5_P, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_B101));
        positions.put(PredefinedStockKeys.UBSN_MAR12_13_C, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_B101));
        positions.put(PredefinedStockKeys.UBSN_MAY12_13_C, Sets.newHashSet(PredefinedAssetPosition.C_ORD_M1, PredefinedAssetPosition.ORD_M10));
        positions.put(PredefinedStockKeys.UBSN_MAR12_13_P, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_B101));
        positions.put(PredefinedStockKeys.UBS_JAN_12_17_C, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.C_ORD_M1));
        positions.put(PredefinedStockKeys.UBS_JAN_12_17_P, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_B101));
        positions.put(PredefinedStockKeys.UBSH_APR12, Sets.newHashSet(PredefinedAssetPosition.C_ORD_M1, PredefinedAssetPosition.ORD_B101));
        positions.put(PredefinedStockKeys.UBSH_MAR12, Sets.newHashSet(PredefinedAssetPosition.C_ORD_M1, PredefinedAssetPosition.ORD_B101));
        positions.put(PredefinedStockKeys.UBSH_MAY12, Sets.newHashSet(PredefinedAssetPosition.C_ORD_M1, PredefinedAssetPosition.ORD_B101));

        Map<StockKey, Set<ClientAssetPosition>> positionsv2 = new HashMap<>();
        positionsv2.put(PredefinedStockKeys.UBS, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_B101));
        positionsv2.put(PredefinedStockKeys.UBSN_MAR12_12_5_C, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_M10));
        positionsv2.put(PredefinedStockKeys.UBSN_MAR12_12_5_P, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_B101));
        positionsv2.put(PredefinedStockKeys.UBSH_APR12, Sets.newHashSet(PredefinedAssetPosition.C_ORD_M1, PredefinedAssetPosition.ORD_B101));
        positionsv2.put(PredefinedStockKeys.UBSN_MAY12_13_C, Sets.newHashSet(PredefinedAssetPosition.C_ORD_M1, PredefinedAssetPosition.ORD_M10));
        positionsv2.put(PredefinedStockKeys.UBSN_MAR12_13_P, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_B101));
        positionsv2.put(PredefinedStockKeys.UBS_JAN_12_17_C, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.C_ORD_M1));
        positionsv2.put(PredefinedStockKeys.UBS_JAN_12_17_P, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_B101));
        positionsv2.put(PredefinedStockKeys.UBS_FEB_12_17_C, Sets.newHashSet(PredefinedAssetPosition.OWN_1, PredefinedAssetPosition.ORD_B101));
        positionsv2.put(PredefinedStockKeys.UBSH_MAR12, Sets.newHashSet(PredefinedAssetPosition.C_ORD_M1, PredefinedAssetPosition.ORD_B101));
        positionsv2.put(PredefinedStockKeys.UBSN_MAR12_13_C, Sets.newHashSet(PredefinedAssetPosition.ORD_B101, PredefinedAssetPosition.OWN_1));
        positionsv2.put(PredefinedStockKeys.UBSH_MAY12, Sets.newHashSet(PredefinedAssetPosition.C_ORD_M1, PredefinedAssetPosition.ORD_B101));
        HashCode hashCodeWithSha256Expected = HashHelper.getHashCodeWithSha256(positions);
        HashCode hashCodeWithSha256Exist = HashHelper.getHashCodeWithSha256(positionsv2);
        Assert.assertArrayEquals(hashCodeWithSha256Expected.asBytes(), hashCodeWithSha256Exist.asBytes());
    }

Can anyone explain what I am doing wrong?

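I believe the problem is with ordering. If you put the same key/value pairs into a HashMap, or the same values into a HashSet, you have no guarantee that the iteration order of the entries will be the same from one invocation to the next. Much less across JVM runs, of course.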
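To make this concrete, here is a minimal sketch using only the JDK. The class IterationOrderDemo and its helpers are made up for illustration. It relies on the fact that "Aa" and "BB" have the same String.hashCode(), so both land in the same bucket. On the OpenJDK implementations I am aware of, colliding entries are then iterated in an order that depends on insertion order; the specification promises nothing either way.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; not part of the code in the question.
final class IterationOrderDemo {

    // Builds a HashSet by adding the two elements in the given order.
    static Set<String> setOf(String first, String second) {
        Set<String> set = new HashSet<>();
        set.add(first);
        set.add(second);
        return set;
    }

    // Copies the elements into a list, preserving the set's iteration order.
    static List<String> iterationOrder(Set<String> set) {
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // "Aa" and "BB" share a hash code, so they collide in one bucket.
        Set<String> a = setOf("Aa", "BB");
        Set<String> b = setOf("BB", "Aa");

        System.out.println(a.equals(b));        // true: equal as sets
        System.out.println(iterationOrder(a));  // iteration order of a
        System.out.println(iterationOrder(b));  // may differ from a's order
    }
}
```

This may also be why the test fails only some of the time. Assuming the posted classes are complete, StockKey does not override hashCode() (Lombok's @Getter does not generate one), and Enum.hashCode() is identity-based, so which entries collide can change from one JVM run to the next.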

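So you need to rewrite your hash calculation method so that it sorts the entries before computing the hash...

Since you use Guava, this is easy: use Ordering.natural().sortedCopy() (sortedCopy() is an instance method of Ordering, so you need an Ordering instance first).

The code below should do it; note, however, that it assumes your StockKey and ClientAssetPosition classes implement Comparable: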

public static HashCode getHashCodeWithSha256(Map<StockKey, Set<ClientAssetPosition>> positions)
{
    final Hasher hasher = Hashing.sha256().newHasher();

    // Ordering.natural() requires the elements to implement Comparable
    final Iterable<StockKey> orderedKeys
        = Ordering.natural().sortedCopy(positions.keySet());

    Iterable<ClientAssetPosition> orderedAssets;

    for (final StockKey key: orderedKeys) {
        hasher.putObject(key, STOCK_KEY_FUNNEL);

        // Sort the values too, since HashSet iteration order is not fixed either
        orderedAssets = Ordering.natural().sortedCopy(positions.get(key));

        for (final ClientAssetPosition asset: orderedAssets)
            hasher.putObject(asset, CLIENT_ASSET_POSITION_FUNNEL);
    }
    return hasher.hash();
}
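If you cannot or do not want to make the classes implement Comparable, you can sort with an explicit Comparator instead. Below is a rough sketch of that idea using only the JDK. Note the substitutions: java.security.MessageDigest stands in for Guava's Hasher, and Key, KEY_ORDER, feed and the Set<String> values are simplified stand-ins I made up, not the classes from the question.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical JDK-only sketch; MessageDigest replaces Guava's Hasher here.
final class SortedHashDemo {

    // Simplified stand-in for StockKey (assumption: three identifying fields).
    static final class Key {
        final String isin;
        final String currency;
        final int exchangeId;

        Key(String isin, String currency, int exchangeId) {
            this.isin = isin;
            this.currency = currency;
            this.exchangeId = exchangeId;
        }
    }

    // Explicit ordering, so Key does not have to implement Comparable.
    static final Comparator<Key> KEY_ORDER = Comparator
            .comparing((Key k) -> k.isin)
            .thenComparing(k -> k.currency)
            .thenComparingInt(k -> k.exchangeId);

    // Hashes the map after sorting keys and values, so insertion order is irrelevant.
    static byte[] sha256(Map<Key, Set<String>> positions) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        List<Key> keys = new ArrayList<>(positions.keySet());
        keys.sort(KEY_ORDER);
        for (Key key : keys) {
            feed(digest, key.isin);
            feed(digest, key.currency);
            feed(digest, Integer.toString(key.exchangeId));
            List<String> values = new ArrayList<>(positions.get(key));
            Collections.sort(values);
            for (String value : values) {
                feed(digest, value);
            }
        }
        return digest.digest();
    }

    // Feeds a string plus a NUL separator (assumes the inputs contain no NUL).
    static void feed(MessageDigest digest, String s) {
        digest.update(s.getBytes(StandardCharsets.UTF_8));
        digest.update((byte) 0);
    }
}
```

With Guava itself, the same comparator can be wrapped via Ordering.from(KEY_ORDER) and used with sortedCopy() in the method above.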

HOWEVER: really do consider switching to a Multimap. Recall that it also has an .asMap() method if you need backward compatibility at some point.

If you look at the Hasher documentation, you will find:

Warning: Chunks of data that are put into the Hasher are not delimited. The resulting HashCode is dependent only on the bytes inserted, and the order in which they were inserted...

Because order matters, hash maps that iterate their keys and values in a different order will produce different hashes.
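The first half of the quoted warning, that chunks are not delimited, is a separate pitfall from ordering, and it may matter for funnels that put several strings back to back. Here is a small JDK-only sketch of the effect. MessageDigest is used in place of Guava's Hasher, and the class and method names are invented for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical demo; MessageDigest stands in for Guava's Hasher.
final class UndelimitedChunksDemo {

    static MessageDigest newSha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Feeds the chunks one after another with nothing in between.
    static byte[] hashPlain(String... chunks) {
        MessageDigest digest = newSha256();
        for (String chunk : chunks) {
            digest.update(chunk.getBytes(StandardCharsets.UTF_8));
        }
        return digest.digest();
    }

    // Feeds each chunk preceded by its byte length, so boundaries are preserved.
    static byte[] hashLengthPrefixed(String... chunks) {
        MessageDigest digest = newSha256();
        for (String chunk : chunks) {
            byte[] bytes = chunk.getBytes(StandardCharsets.UTF_8);
            digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
            digest.update(bytes);
        }
        return digest.digest();
    }

    public static void main(String[] args) {
        // Both calls feed the same byte stream "abc", so the digests match.
        System.out.println(Arrays.equals(
                hashPlain("ab", "c"), hashPlain("a", "bc")));                   // true
        // With length prefixes the byte streams differ, and so do the digests.
        System.out.println(Arrays.equals(
                hashLengthPrefixed("ab", "c"), hashLengthPrefixed("a", "bc"))); // false
    }
}
```

In a Guava funnel, one way to get the same effect would be to put each string's length before the string itself, though whether that is needed depends on how fixed the field formats are.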

You can solve this either by using an ordered/sorted structure in getHashCodeWithSha256, or by preprocessing the data before hashing.
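As one possible reading of "preprocessing": hash every element on its own, then combine the per-element hashes in a way that does not depend on order. Guava offers Hashing.combineUnordered for this. The sketch below is a JDK-only approximation of the idea, not Guava's algorithm: it sorts the per-element digests and hashes their concatenation. All names in it are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of an order-independent hash; not Guava's implementation.
final class UnorderedCombineDemo {

    static byte[] sha256(byte[] input) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Renders bytes as lowercase hex, two characters per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Hashes each element separately, sorts the digests, then hashes them together.
    static byte[] hashUnordered(Collection<String> elements) {
        List<String> digests = new ArrayList<>();
        for (String element : elements) {
            digests.add(toHex(sha256(element.getBytes(StandardCharsets.UTF_8))));
        }
        Collections.sort(digests);
        StringBuilder combined = new StringBuilder();
        for (String digest : digests) {
            combined.append(digest); // fixed-length hex, so no separator is needed
        }
        return sha256(combined.toString().getBytes(StandardCharsets.UTF_8));
    }
}
```

The advantage over sorting the elements themselves is that no Comparable or Comparator is needed for the element type; the cost is one extra hash per element.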