Updating Django Model After Serialization from DRF
I currently have an API endpoint that accepts data from a client and kicks off a Scrapy crawl job.
The problem is that I need to create the Job model instance, start the Scrapy job, and then update the model with the task_id returned by the Scrapy job. The model is updated successfully, but the serialized data that DRF returns does not contain the updated value.
I need to create the model instance before starting the job so that the Scrapy job has the job's primary key, which it uses to update its status and add data when it finishes.
I know why the JSON response is missing my new data: I am updating the model in the view after DRF has already done its work, and once .save() has been called on the serializer instance I can no longer edit the serialized data.
views.py
class StartJobView(views.APIView):
    def post(self, request):
        # map incoming 'id' field to 'client_id'
        request.data['client_id'] = request.data['id']

        serializer = JobSerializer(data=request.data)
        if serializer.is_valid():
            # create job entry
            serializer.save()
            id = serializer.data.get('id')

            # get pk to pass to spider
            settings = {
                'id': id,
            }
            task_id = scrapyd.schedule('default', 'tester', settings=settings)

            Job.objects.filter(id=id).update(task_id=task_id)

            return Response(serializer.data, status=status.HTTP_201_CREATED)
        return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)
JSON response:
{
    "id": "5f05f555-3214-41e4-81d1-b3915ae3f448",
    "client_id": "8923356a-bc6e-4f17-bbea-bbc8699d308e",
    "task_id": null,
    "created": "2019-08-10T19:01:17.541873Z",
    "status": "not_started",
    "url": "http://brenden.info"
}
How can I make the serializer class aware of the model update after .save() has been called on the serializer instance?
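For context, here is a rough sketch of why the response goes stale, assuming JobSerializer is an ordinary ModelSerializer (the question does not show its definition): the representation is built from the in-memory instance that .save() created, and a queryset .update() bypasses that instance, so the already-rendered serializer.data never sees the change. Re-serializing a freshly loaded instance is the usual workaround.

# Sketch only - assumes JobSerializer is a plain ModelSerializer for Job
serializer = JobSerializer(data=request.data)
serializer.is_valid(raise_exception=True)
job = serializer.save()                        # in-memory Job instance

# Queryset .update() writes straight to the DB and skips the instance above
Job.objects.filter(pk=job.pk).update(task_id="some-task-id")

serializer.data["task_id"]                     # still None (rendered from the stale instance)

job.refresh_from_db()
JobSerializer(job).data["task_id"]             # "some-task-id"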
The problem is that Scrapy does not wait for the job to finish. The job is only scheduled and will complete in another thread. I'm not sure whether Scrapy itself has a built-in callback for this, but you can check that yourself.
You can wait until the status changes before doing the rest of the serialization.
Example:
import time

def job_result_check(job_id):
    # Limit the total number of checks:
    # 10 checks * 0.2 s wait = 2 seconds at most
    check_limit = 10
    wait_time = 0.2
    while check_limit > 0:
        # Give the job some time before the check
        time.sleep(wait_time)
        job = Job.objects.get(pk=job_id)
        if job.status == "done":
            break
        check_limit -= 1
    return job
Call this function in your view before returning the response:
class StartJobView(views.APIView):
    def post(self, request):
        # map incoming 'id' field to 'client_id'
        request.data['client_id'] = request.data['id']

        serializer = JobSerializer(data=request.data)
        if serializer.is_valid():
            # create job entry
            serializer.save()
            id = serializer.data.get('id')

            # get pk to pass to spider
            settings = {
                'id': id,
            }
            task_id = scrapyd.schedule('default', 'tester', settings=settings)

            # .update() returns the number of rows affected, not the instance,
            # so poll the job by its primary key instead
            Job.objects.filter(id=id).update(task_id=task_id)
            job = job_result_check(id)
            if job.status == "done":
                # Job is finished
                # You need to re-serialize the new instance
                serializer = JobSerializer(job)
            return Response(serializer.data, status=status.HTTP_201_CREATED)
        return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)
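As a side note on the "built-in callback" question above: if the scrapyd object in the view is a python-scrapyd-api ScrapydAPI client (an assumption here, since the question doesn't show how it is created), that client also exposes a job_status() call, so the same polling idea could be pointed at scrapyd itself instead of the Job row. A rough sketch under that assumption:

import time

from scrapyd_api import ScrapydAPI

# Assumption: the same kind of client the view uses for scrapyd.schedule()
scrapyd = ScrapydAPI('http://localhost:6800')

def wait_for_scrapyd(task_id, checks=10, wait_time=0.2):
    """Poll scrapyd for the task state instead of the Job table (sketch)."""
    status = ''
    for _ in range(checks):
        time.sleep(wait_time)
        # job_status returns '', 'pending', 'running' or 'finished'
        status = scrapyd.job_status('default', task_id)
        if status == 'finished':
            break
    return status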