How do I get the original i, j, k location in a blockwise operation?

Suppose I run something like dask_array_object.blocks.ravel() and iterate over the resulting blocks:

 dask.array<blocks, shape=(156, 156, 2126), dtype=float32, chunksize=(156, 156, 2126), chunktype=numpy.ndarray>,
 dask.array<blocks, shape=(156, 156, 2126), dtype=float32, chunksize=(156, 156, 2126), chunktype=numpy.ndarray>,
 dask.array<blocks, shape=(156, 156, 2126), dtype=float32, chunksize=(156, 156, 2126), chunktype=numpy.ndarray>,
 dask.array<blocks, shape=(156, 156, 2126), dtype=float32, chunksize=(156, 156, 2126), chunktype=numpy.ndarray>,
 dask.array<blocks, shape=(156, 156, 2126), dtype=float32, chunksize=(156, 156, 2126), chunktype=numpy.ndarray>,
 dask.array<blocks, shape=(156, 156, 2126), dtype=float32, chunksize=(156, 156, 2126), chunktype=numpy.ndarray>...

How do I get the original i, j, k location of each piece inside a function that is executed blockwise?

block_list = dask_array_object.blocks.ravel()

def store_data_into_binary(block_data, location_of_bin):
    # These are just functions that are specific to an internal library
    binary = open_binary(location_of_bin)
    block_data = block_data.compute()
    # How do I get these i, j, k locations?
    binary.put(block_data, (i0, j0, k0), (i1, j1, k1))
    # Attempt to remove the data from memory?
    block_data = None

for block in block_list:
    store_data_into_binary(block, "./location/of/file.bin")

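Also, I noticed there is an option to run blockwise operations with the function dask.array.blockwise(), and the same question applies: how do you know the original i, j, k location of each piece of the dask array inside a function that is executed blockwise?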

I tried using map_blocks, but because the function run by map_blocks has to return the block, my memory blew up.

Like this:

import dask.array as da

x = da.random.randint(100, size=(2000, 2000, 2000))

def func(x, block_id=None, block_info=None):
    # Grab the values of the 3D cube from Zarr disk store
    block_data = x.compute()
    # Function that writes the actual values to disk
    write_value_to_binary(block_data, "./file/datafile.bin")
    # Attempt to release the memory?
    x.close()

    return x

da.map_blocks(func, x).compute()

Is there a way to return some empty value instead of the actual numpy values?

One simple approach is to use map_blocks as you did, but return the smallest array possible from each function call:

    return np.array([0])
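To make that idea concrete, here is a minimal sketch of how it might look when combined with the block_info keyword that map_blocks passes to the function. The names write_block and written are hypothetical stand-ins for your own binary writer, and the small array is only for illustration. The function records where each block sits in the full array and returns a one-element placeholder, so nothing large is kept in the result.

```python
import dask.array as da
import numpy as np

# Hypothetical stand-in for the binary file: records what would be written.
written = []

def write_block(block, block_info=None):
    # block is already a plain numpy array here, so no .compute() is needed.
    placeholder = np.zeros((1,) * block.ndim, dtype=np.uint8)
    if block_info is None:
        # Guard for calls made without location information.
        return placeholder
    # block_info[0] describes the first array argument;
    # "array-location" holds one (start, stop) pair per axis.
    location = block_info[0]["array-location"]
    starts = tuple(int(lo) for lo, hi in location)
    stops = tuple(int(hi) for lo, hi in location)
    # A real writer would do something like binary.put(block, starts, stops).
    written.append((starts, stops, block.shape))
    return placeholder

x = da.ones((4, 6, 8), chunks=(2, 3, 4), dtype="float32")

# Each output block has shape (1, 1, 1), so the result is tiny.
status = x.map_blocks(
    write_block,
    chunks=(1, 1, 1),
    dtype=np.uint8,
    meta=np.empty((0, 0, 0), dtype=np.uint8),
)
result = status.compute(scheduler="synchronous")
```

The result has one element per block, so its shape matches x.numblocks rather than x.shape. The synchronous scheduler is used only to keep this sketch deterministic.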

However, blocks.ravel() yields the blocks in exactly the order you would expect, and you can simply follow its implementation:

# Iterate over block indices (x.numblocks), not element indices (x.shape)
for i, j, k in np.ndindex(x.numblocks):
    block = x.blocks[i, j, k]
    # do something with i, j, k, block
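The i, j, k in the loop above are block indices, not element offsets. If you need the element range that each block covers, one way to derive it is from the cumulative sums of x.chunks. The sketch below assumes a small example array, and block_bounds is a hypothetical helper name, not part of dask.

```python
import dask.array as da
import numpy as np

def block_bounds(arr):
    """Yield (block_index, starts, stops) for every block of a dask array."""
    # Cumulative chunk sizes give the start offset of each block along an axis.
    offsets = [np.concatenate(([0], np.cumsum(c))) for c in arr.chunks]
    for idx in np.ndindex(*arr.numblocks):
        starts = tuple(int(offsets[ax][i]) for ax, i in enumerate(idx))
        stops = tuple(int(offsets[ax][i + 1]) for ax, i in enumerate(idx))
        yield idx, starts, stops

x = da.ones((4, 6, 8), chunks=(2, 3, 4), dtype="float32")

bounds = list(block_bounds(x))
shapes = []
for idx, starts, stops in bounds:
    # Only one block is held in memory at a time.
    block = x.blocks[idx].compute()
    shapes.append(block.shape)
    # A real writer would do something like binary.put(block, starts, stops).
```

Since np.ndindex walks the indices in the same row-major order as blocks.ravel(), the bounds line up with the blocks from the original loop.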