How do I get the original i, j, k location in a blockwise operation?
If I run something like dask_array_object.blocks.ravel() and iterate over the resulting blocks:
dask.array<blocks, shape=(156, 156, 2126), dtype=float32, chunksize=(156, 156, 2126), chunktype=numpy.ndarray>,
dask.array<blocks, shape=(156, 156, 2126), dtype=float32, chunksize=(156, 156, 2126), chunktype=numpy.ndarray>,
dask.array<blocks, shape=(156, 156, 2126), dtype=float32, chunksize=(156, 156, 2126), chunktype=numpy.ndarray>,
dask.array<blocks, shape=(156, 156, 2126), dtype=float32, chunksize=(156, 156, 2126), chunktype=numpy.ndarray>,
dask.array<blocks, shape=(156, 156, 2126), dtype=float32, chunksize=(156, 156, 2126), chunktype=numpy.ndarray>,
dask.array<blocks, shape=(156, 156, 2126), dtype=float32, chunksize=(156, 156, 2126), chunktype=numpy.ndarray>...
How do I get the original i, j, k location of each block inside a function that is executed blockwise?
block_list = dask_array_object.blocks.ravel()

def store_data_into_binary(block_data, location_of_bin):
    # These are just functions that are specific to an internal library
    binary = open_binary(location_of_bin)
    block_data = block_data.compute()
    # How do I get these i, j, k locations?
    binary.put(block_data, (i0, j0, k0), (i1, j1, k1))
    # Attempt to remove the data from memory?
    block_data = None

for block in block_list:
    store_data_into_binary(block, "./location/of/file.bin")
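I also noticed that there is an option to operate blockwise with the function dask.array.blockwise(), and the same question applies: how do you know the original i, j, k location of a block of the dask array inside a function that is executed blockwise?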
I tried using map_blocks, but because map_blocks returns the blocks produced by the executed function, my memory blew up. Something like this:
import dask.array as da

x = da.random.randint(100, size=(2000, 2000, 2000))

def func(x, block_id=None, block_info=None):
    # Grab the values of the 3D cube from Zarr disk store
    block_data = x.compute()
    # Function that writes the actual values to disk
    write_value_to_binary(block_data, "./file/datafile.bin")
    # Attempt to release the memory?
    x.close()
    return x

da.map_blocks(func, x).compute()
Is there a way to avoid returning the actual NumPy values and return some empty value instead?
A simple approach might be to use map_blocks as you did, but have each function call return the smallest array it can:
return np.array([0])
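As a rough sketch of that idea, the function below reads its own position from the block_info argument that map_blocks passes in, records it, and returns a one-element placeholder so that no real data is gathered back. The array x, the list written, and the function write_block are made up for illustration; written stands in for the internal binary writer from the question, which is not shown here.

```python
import numpy as np
import dask.array as da

# Small illustrative array: 2 x 2 x 2 = 8 blocks of shape (2, 3, 4)
x = da.ones((4, 6, 8), chunks=(2, 3, 4), dtype="float32")

written = []  # hypothetical stand-in for the real binary writer

def write_block(block, block_info=None):
    # Inside map_blocks, `block` is already a NumPy array, so no .compute() here.
    # block_info[0] describes the first input array; "array-location" holds one
    # (start, stop) pair per axis for this block.
    location = block_info[0]["array-location"]
    start = tuple(lo for lo, hi in location)
    stop = tuple(hi for lo, hi in location)
    written.append((start, stop, block.shape))
    # Return a tiny placeholder instead of the data itself
    return np.zeros((1, 1, 1), dtype=np.uint8)

# Passing meta explicitly keeps dask from calling write_block on dummy
# inputs while it builds the graph.
markers = da.map_blocks(
    write_block,
    x,
    chunks=(1, 1, 1),
    dtype=np.uint8,
    meta=np.array((), dtype=np.uint8),
)
markers.compute()
```

The computed result is only one byte per block, so memory use is bounded by the blocks being processed at any moment rather than by the full array.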
However, blocks.ravel() yields the blocks in exactly the order you would expect, so you can simply follow its implementation:
for i, j, k in np.ndindex(x.numblocks):
    block = x.blocks[i, j, k]
    # do something with i, j, k, block
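The loop indices here are block indices, not element offsets. One possible way to turn them into the (i0, j0, k0) and (i1, j1, k1) corners asked about in the question is to take cumulative sums over x.chunks. This is a minimal sketch on a small made-up array; the names offsets and locations are illustrative only.

```python
import numpy as np
import dask.array as da

# Small illustrative array: 2 x 2 x 2 = 8 blocks of shape (2, 3, 4)
x = da.ones((4, 6, 8), chunks=(2, 3, 4), dtype="float32")

# Per axis, the element offset at which each chunk starts, plus the final end.
# For chunk sizes (2, 2) this gives [0, 2, 4].
offsets = [np.concatenate(([0], np.cumsum(c))) for c in x.chunks]

locations = []
for idx in np.ndindex(*x.numblocks):
    block = x.blocks[idx]
    # Element-space corners of this block in the original array
    start = tuple(int(offsets[axis][b]) for axis, b in enumerate(idx))
    stop = tuple(int(offsets[axis][b + 1]) for axis, b in enumerate(idx))
    data = block.compute()  # only this one block is held in memory
    assert data.shape == tuple(e - s for s, e in zip(start, stop))
    locations.append((idx, start, stop))
```

Because each block is computed and dropped inside the loop, only one block's worth of data is alive at a time, at the cost of running the blocks one after another.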