How to use a pretrained model from s3 to predict some data?
I have trained a semantic segmentation model with SageMaker, and the output has been saved to an S3 bucket. I want to load this model from S3 to predict some images in SageMaker.
I know how to predict if I leave the notebook instance running after training, since that is just a simple deploy, but that doesn't really help when I want to use an older model.
I looked at these resources and was able to come up with something myself, but it doesn't work, hence I am here:
https://course.fast.ai/deployment_amzn_sagemaker.html#deploy-to-sagemaker
https://aws.amazon.com/getting-started/tutorials/build-train-deploy-machine-learning-model-sagemaker/
https://sagemaker.readthedocs.io/en/stable/pipeline.html
My code looks like this:
from sagemaker.pipeline import PipelineModel
from sagemaker.model import Model

s3_model_bucket = 'bucket'
s3_model_key_prefix = 'prefix'
data = 's3://{}/{}/{}'.format(s3_model_bucket, s3_model_key_prefix, 'model.tar.gz')

models = ss_model.create_model()  # ss_model is my sagemaker.estimator
model = PipelineModel(name=data, role=role, models=[models])
ss_predictor = model.deploy(initial_instance_count=1, instance_type='ml.c4.xlarge')
You can actually instantiate a Python SDK Model object from existing artifacts and deploy it to an endpoint. This lets you deploy a model from trained artifacts without retraining in the notebook. For example, for the semantic segmentation model:
trainedmodel = sagemaker.model.Model(
    model_data='s3://...model path here../model.tar.gz',
    image='685385470294.dkr.ecr.eu-west-1.amazonaws.com/semantic-segmentation:latest',  # example image for semantic segmentation in eu-west-1
    role=role)  # your role here; could be a different name

trainedmodel.deploy(initial_instance_count=1, instance_type='ml.c4.xlarge')
Similarly, you can instantiate a predictor object against an already-deployed endpoint from any authenticated client that supports the SDK:
predictor = sagemaker.predictor.RealTimePredictor(
    endpoint='endpoint name here',
    content_type='image/jpeg',
    accept='image/png')
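A minimal sketch of actually invoking such a predictor (the helper name `segment_image` and the file-based payload are my own illustration; this assumes a live endpoint configured as above, which returns the label mask as PNG bytes):

```python
def segment_image(predictor, image_path):
    # Read the raw JPEG bytes; with content_type='image/jpeg' the
    # request body is simply the image file contents.
    with open(image_path, 'rb') as f:
        payload = f.read()
    # With accept='image/png', the response body is the PNG-encoded
    # segmentation mask as bytes, ready to write to disk or decode.
    return predictor.predict(payload)
```

The returned bytes can be written straight to a `.png` file or decoded with an image library.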
More information on these abstractions:
input_features_data is a DataFrame
import numpy as np
import sagemaker
from sagemaker.predictor import csv_serializer, json_deserializer

predictor = sagemaker.predictor.RealTimePredictor(
    endpoint=PREDICTOR_ENDPOINT_NAME,
    sagemaker_session=sagemaker.Session(),
    serializer=csv_serializer,
    deserializer=json_deserializer,
    content_type='text/csv',
)

def predict_all(input_features_data, test_batch_size=5):
    # ceiling division: number of batches needed to cover all rows
    num_batches = -(-len(input_features_data) // test_batch_size)
    predicted_values = []
    for i in range(num_batches):
        predicted_values += [predictor.predict(x) for x in
                             input_features_data[i * test_batch_size:(i + 1) * test_batch_size]]
    return np.asarray(predicted_values)
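The `num_batches = -(-len(...) // test_batch_size)` expression above is the negated-floor-division idiom for ceiling division. A small self-contained illustration (the helper names here are mine):

```python
def ceil_div(n, d):
    # -(-n // d) rounds up because Python's // floors toward
    # negative infinity: -(-12 // 5) == -(-3) == 3.
    return -(-n // d)

def batch_slices(n_items, batch_size):
    # Yield (start, stop) index pairs covering n_items,
    # with a possibly smaller final batch.
    for i in range(ceil_div(n_items, batch_size)):
        yield i * batch_size, min((i + 1) * batch_size, n_items)
```

For 12 items and a batch size of 5 this yields (0, 5), (5, 10), (10, 12), so the final short batch is not dropped.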