Using joblib.Memory to cache data in AWS S3
Is it possible to use joblib.Memory to cache function output in AWS S3, for example by passing a remote link as the cachedir argument?
For example:
from joblib import Memory

s3_url = 'https://foo.s3..../folder/cache_folder/project_name/joblib'
memory = Memory(s3_url, verbose=0)

@memory.cache
def my_function(x):
    return x
Try this library:
https://github.com/aabadie/joblib-s3
From their documentation:
Getting the latest code
To get the latest code, use git:
git clone https://github.com/aabadie/joblib-s3.git
Installing joblibs3
Simply use pip:
$ cd joblib-s3
$ pip install -r requirements.txt .
Using joblibs3 to cache computation results in AWS S3
See the following example:
import numpy as np
from joblib import Memory
from joblibs3 import register_s3_store_backend

if __name__ == '__main__':
    # Make the 's3' backend name known to joblib.Memory.
    register_s3_store_backend()

    # We assume your S3 credentials are stored in ~/.aws/credentials, so
    # there is no need to pass them to the Memory constructor.
    mem = Memory('joblib_cache', backend='s3', verbose=100, compress=True,
                 backend_options=dict(bucket="joblib-example"))

    # Wrap np.multiply so that its results are stored in the S3 bucket.
    multiply = mem.cache(np.multiply)

    array1 = np.arange(10000)
    array2 = np.arange(10000)

    result = multiply(array1, array2)
    print(result)
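Before pointing the cache at a bucket, it may help to confirm the caching behaviour with the default local backend. The following is only a minimal sketch: the function name square, the calls list and the temporary directory are made up for illustration and are not part of joblib-s3. It assumes joblib is installed and uses a throwaway local directory instead of S3.

```python
import tempfile

from joblib import Memory

# Records every real execution of the function (illustrative helper).
calls = []


def square(x):
    """Toy function standing in for an expensive computation."""
    calls.append(x)
    return x * x


# A temporary local directory stands in for the S3 bucket here.
with tempfile.TemporaryDirectory() as cache_dir:
    memory = Memory(cache_dir, verbose=0)
    cached_square = memory.cache(square)

    first = cached_square(4)   # computed and written to the cache
    second = cached_square(4)  # read back from the cache, square() not run

    print(first, second, len(calls))
```

If the function body runs only once for the two identical calls, the same wrapping pattern should carry over when Memory is created with backend='s3' as in the example above, provided the backend has been registered first.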