How to serialise/deserialise long[] value with get/set on random indices using Chronicle Map?

Question

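I am new to Chronicle Map. I am trying to model an off-heap map with chronicle-map where the key is a primitive short and the value is a primitive long array. The maximum size of the long array value is known for a given map. However, I will have several such maps, and each may have a different maximum size for its long array value. My question is about serialisation/deserialisation of the key and the value.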

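From reading the documentation, I understand that for the key I can use the value type ShortValue and reuse an instance of the implementation of that interface. For the value, I found the page that discusses DataAccess and SizedReader, which gives an example for byte[], but I am not sure how to adapt it to long[]. One additional requirement is that I need to get and set values at arbitrary indices in the long array without paying the cost of a full serialisation/deserialisation of the entire value each time.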

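So my question is: how can I model the value type when constructing the map, given that the maximum size is known per map and I need to read and write random indices without serialising/deserialising the entire value payload each time? Ideally the long[] would be encoded/decoded directly to/from off-heap memory, without an intermediate on-heap conversion to byte[], and the chronicle-map code would not allocate at runtime. Thank you.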

Answer 1

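First, I recommend using some kind of LongList interface abstraction instead of long[]. It makes it easier to deal with size variability, to provide alternative flyweight implementations, and so on.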
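As a minimal sketch of what such an abstraction might look like: the names LongList and ArrayLongList below are hypothetical (they are not part of Chronicle Map), and the on-heap implementation is only one possible backing. A flyweight implementation over off-heap bytes could implement the same interface.

```java
/** Hypothetical abstraction over a fixed-size sequence of longs; not a Chronicle Map type. */
interface LongList {
    int size();

    long get(int index);

    void set(int index, long element);
}

/** Simple on-heap implementation backed by a long[] (illustrative assumption). */
final class ArrayLongList implements LongList {
    private final long[] elements;

    ArrayLongList(int size) {
        this.elements = new long[size];
    }

    @Override public int size() {
        return elements.length;
    }

    @Override public long get(int index) {
        return elements[index]; // the array access performs the bounds check
    }

    @Override public void set(int index, long element) {
        elements[index] = element;
    }
}
```

Code written against LongList does not need to change if the backing storage is later swapped for an off-heap view.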

If you only want to read/write single elements in large lists, you should use the advanced contexts API:

/** This method is entirely garbage-free, deserialization-free, and thread-safe. */
void putOneValue(ChronicleMap<ShortValue, LongList> map, ShortValue key, int index,
        long element) {
    if (index < 0) throw new IndexOutOfBoundsException(...);
    try (ExternalMapQueryContext<ShortValue, LongList, ?> c = map.queryContext(key)) {
        c.writeLock().lock(); // (1)
        MapEntry<ShortValue, LongList> entry = c.entry();
        if (entry != null) {
            Data<LongList> value = entry.value();
            BytesStore valueBytes = (BytesStore) value.bytes(); // (2)
            long valueBytesOffset = value.offset();
            long valueBytesSize = value.size();
            int valueListSize = (int) (valueBytesSize / Long.BYTES); // (3)
            if (index >= valueListSize) throw new IndexOutOfBoundsException(...);
            valueBytes.writeLong(valueBytesOffset + ((long) index) * Long.BYTES,
                element);
            ((ChecksumEntry) entry).updateChecksum(); // (4)
        } else {
            // there is no entry for the given key
            throw ...
        }
    }
}

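Remarks:

(1) You must acquire writeLock() from the beginning. Otherwise readLock() is acquired automatically when you call the context.entry() method, and you cannot upgrade the read lock to a write lock later. Please read the HashQueryContext javadoc carefully.
(2) Data.bytes() formally returns RandomDataInput, but you can be sure (it is specified in the Data.bytes() javadoc) that it is actually an instance of BytesStore (which is a combination of RandomDataInput and RandomDataOutput).
(3) This assumes that a proper SizedReader and SizedWriter (or DataAccess) are provided. Note that the "bytes/element joint size" technique is used, the same as in the example given in the SizedReader and SizedWriter doc section, PointListSizeMarshaller. You could base your LongListMarshaller on that example class.
(4) This cast is specified; see the ChecksumEntry javadoc and the section about checksums in the doc. If you have a purely in-memory (not persisted) Chronicle Map, or have turned checksums off, this call can be omitted.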
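To illustrate the "bytes/element joint size" idea from remark (3), here is a small sketch that uses only java.nio.ByteBuffer as a stand-in for Chronicle's bytes types. The class name LongArrayJointSizeCodec and its methods are hypothetical and are not the SizedReader/SizedWriter interfaces themselves. The point is that no separate length prefix is stored: the element count is recovered from the serialised size in bytes.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper; stands in for a SizedWriter/SizedReader pair for illustration only. */
final class LongArrayJointSizeCodec {
    /** Serialised size in bytes; it also encodes the element count. */
    static long sizeInBytes(long[] values) {
        return (long) values.length * Long.BYTES;
    }

    /** Writes the elements back to back, starting at the given absolute offset. */
    static void write(long[] values, ByteBuffer out, int offset) {
        for (int i = 0; i < values.length; i++) {
            out.putLong(offset + i * Long.BYTES, values[i]);
        }
    }

    /** Reads the elements back; the count is derived from the size in bytes. */
    static long[] read(ByteBuffer in, int offset, long sizeInBytes) {
        if (sizeInBytes % Long.BYTES != 0) {
            throw new IllegalArgumentException("size is not a multiple of 8: " + sizeInBytes);
        }
        int count = (int) (sizeInBytes / Long.BYTES);
        long[] values = new long[count];
        for (int i = 0; i < count; i++) {
            values[i] = in.getLong(offset + i * Long.BYTES);
        }
        return values;
    }
}
```

This is why the count in putOneValue() can be computed as the value size divided by Long.BYTES.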

A single-element read is implemented similarly.
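The offset arithmetic for such a read can be sketched with a direct ByteBuffer standing in for the value's BytesStore. The class OffHeapLongAccess and its parameters are hypothetical; in a real Chronicle Map version the offset and size would come from Data.offset() and Data.size() inside a query context. Presumably a read lock would be enough there, and no checksum update would be needed because nothing is modified.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper showing only the index-to-offset arithmetic of a single-element read. */
final class OffHeapLongAccess {
    /**
     * Reads element {@code index} of a long list stored in
     * [valueOffset, valueOffset + valueSize) without decoding the other elements.
     */
    static long readOneValue(ByteBuffer valueBytes, int valueOffset, int valueSize, int index) {
        int valueListSize = valueSize / Long.BYTES; // joint size: count derived from bytes
        if (index < 0 || index >= valueListSize) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + valueListSize);
        }
        return valueBytes.getLong(valueOffset + index * Long.BYTES);
    }
}
```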

Answer 2

Answering the extra questions:

I've implemented a SizedReader+Writer. Do I need DataAccess or is SizedWriter fast enough for primitive arrays? I looked at the ByteArrayDataAccess but it's not clear how to port it for long arrays given that the internal HeapBytesStore is so specific to byte[]/ByteBuffers?

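Using DataAccess instead of SizedWriter saves one copy of the value data on Map.put(key, value). However, if putOneValue() (as in the example above) is the dominant query type in your use case, it won't make much difference. If Map.put(key, value) (and replace(), etc., i.e. any "full value write" operation) is important, it is still possible to implement DataAccess for LongList. It would look like this: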

class LongListDataAccess implements DataAccess<LongList>, Data<LongList>,
        StatefulCopyable<LongListDataAccess> {
    transient BytesStore cachedBytes;
    transient boolean cachedBytesInitialized;
    transient LongList list;

    @Override public Data<LongList> getData(LongList list) {
        this.list = list;
        this.cachedBytesInitialized = false;
        return this;
    }

    @Override public long size() {
        return ((long) list.size()) * Long.BYTES;
    }

    @Override public void writeTo(RandomDataOutput target, long targetOffset) {
        for (int i = 0; i < list.size(); i++) {
            target.writeLong(targetOffset + ((long) i) * Long.BYTES, list.get(i));
        }
    }

    ...
}

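For efficiency, the methods size() and writeTo() are key. But it is also important to implement all the other methods (which I didn't write here) correctly. Read the DataAccess, Data and StatefulCopyable javadocs very carefully, and pay close attention to the tutorial sections Understanding StatefulCopyable, DataAccess and SizedReader, and Custom serialization checklist.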

Does the read/write locking mediate across multiple process reading and writing on same machine or just within a single process?

It is safe across processes; note that the interface is called InterProcessReadWriteUpdateLock.

When storing objects, with a variable size not known in advance, as values will that cause fragmentation off heap and in the persisted file?

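Storing a value for a key once and not changing the size of the value afterwards (and not removing keys) won't cause external fragmentation. Changing the size of a value or removing keys can cause external fragmentation. The ChronicleMapBuilder.actualChunkSize() configuration lets you trade off between external and internal fragmentation: the bigger the chunk, the less external fragmentation but the more internal fragmentation. If your values are significantly bigger than the page size (4 KB), you can set an absurdly big chunk size and internal fragmentation will still be bounded by the page size, because Chronicle Map is able to exploit the lazy page allocation feature in Linux.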
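The trade-off can be made concrete with a little arithmetic. The sketch below assumes a simplified model in which an entry occupies a whole number of chunks and the entry size is just the payload; real entries also carry key bytes, size fields and alignment, so the numbers are only indicative. The class name ChunkFragmentation is hypothetical.

```java
/** Simplified, hypothetical model of chunk-based allocation; ignores per-entry metadata. */
final class ChunkFragmentation {
    /** Number of chunks needed to hold an entry (ceiling division). */
    static long chunksNeeded(long entrySize, long chunkSize) {
        return (entrySize + chunkSize - 1) / chunkSize;
    }

    /** Bytes allocated to the entry but not used by it (internal fragmentation). */
    static long internalWaste(long entrySize, long chunkSize) {
        return chunksNeeded(entrySize, chunkSize) * chunkSize - entrySize;
    }
}
```

For a 100-byte entry, 64-byte chunks waste 28 bytes in this model, while 4096-byte chunks waste 3996 bytes of address space; with lazy page allocation, pages that are never touched need not be backed by physical memory.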

java

low-latency

chronicle

chronicle-map

off-heap