Too slow first run of a TorchScript model and its implementation in Flask
I am trying to deploy a TorchScripted model in Python and Flask. As I understand (at least as mentioned here), scripted models need to be "warmed up" before use, so the first run of such a model takes much longer than subsequent runs. My question is: is there any way to load a TorchScripted model in a Flask route and run predictions without paying the "warm-up" cost? Can I store the "warmed-up" model somewhere to avoid warming up on every request?
I wrote simple code to reproduce the "warm-up" pass:
import torchvision, torch, time

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model = torch.jit.script(model)
model.eval()
x = [torch.randn((3, 224, 224))]

for i in range(3):
    start = time.time()
    model(x)
    print('Time elapsed: {}'.format(time.time() - start))
Output:
Time elapsed: 38.29
Time elapsed: 6.65
Time elapsed: 6.65
And the Flask code:
import torch, torchvision, os, time
from flask import Flask

app = Flask(__name__)

@app.route('/')
def test_scripted_model(path='/tmp/scripted_model.pth'):
    if os.path.exists(path):
        model = torch.jit.load(path, map_location='cpu')
    else:
        model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
        model = torch.jit.script(model)
        torch.jit.save(model, path)
    model.eval()
    x = [torch.randn((3, 224, 224))]
    out = ''
    for i in range(3):
        start = time.time()
        model(x)
        out += 'Run {} time: {};\t'.format(i + 1, round((time.time() - start), 2))
    return out

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=False)
Output:
Run 1 time: 46.01; Run 2 time: 8.76; Run 3 time: 8.55;
OS: Ubuntu 18.04 & Windows 10
Python: 3.6.9
Flask: 1.1.1
Torch: 1.4.0
Torchvision: 0.5.0
Update:
Solved the "warm-up" problem with:
with torch.jit.optimized_execution(False):
    model(x)
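For reference, here is a minimal sketch that applies this fix to the benchmark from the question (same model and dummy input as above); with the JIT optimizer disabled, the first run should no longer dominate the timings:

import torchvision, torch, time

# Same setup as the benchmark above.
model = torch.jit.script(
    torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
)
model.eval()
x = [torch.randn((3, 224, 224))]

# Disabling the JIT optimizer skips the expensive first-pass optimization,
# trading some steady-state speed for a fast first run.
with torch.jit.optimized_execution(False):
    for i in range(3):
        start = time.time()
        model(x)
        print('Time elapsed: {:.2f}'.format(time.time() - start))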
Update 2:
Solved the Flask problem by creating a global Python model object before the server starts and warming it up there (as described below). The model is then ready to use on every request.
# Load and warm up the model once, before the server starts.
model = torch.jit.load(path, map_location='cpu').eval()
model(x)

app = Flask(__name__)
Then, in the @app.route handler:
@app.route('/')
def test_scripted_model():
    global model
    ...
    ...
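Putting the pieces together, a minimal end-to-end sketch of the warmed-up Flask app might look like the following (the cache path and dummy input come from the question; the response string is illustrative):

import os, time, torch, torchvision
from flask import Flask

path = '/tmp/scripted_model.pth'
x = [torch.randn((3, 224, 224))]

# Load (or script and cache) the model once, at import time,
# before the server starts accepting requests.
if os.path.exists(path):
    model = torch.jit.load(path, map_location='cpu').eval()
else:
    model = torch.jit.script(
        torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
    ).eval()
    torch.jit.save(model, path)

# One warm-up pass so no request pays the slow first run.
model(x)

app = Flask(__name__)

@app.route('/')
def test_scripted_model():
    global model
    start = time.time()
    model(x)
    return 'Inference time: {}s'.format(round(time.time() - start, 2))

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=False)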
Can I store the "warmed-up" model somewhere to avoid warming up on every request?
Yes, simply instantiate your model outside the test_scripted_model function and reference it inside the function.