在 Django 后端使用 Clamav 设置文件上传流扫描

Setting up a file upload stream scan using Clamav in a Django back-end

正在开发 React/Django 应用程序。我有用户通过 React 前端上传的文件,这些文件最终出现在 Django/DRF 后端。我们一直在服务器上安装防病毒 (AV) 运行ning,但我们想在将其写入磁盘之前添加流扫描。

关于如何设置它有点让我头疼。这是我正在查看的一些来源。

How do you virus scan a file being uploaded to your java webapp as it streams?

虽然已接受的最佳答案描述它“...非常容易”设置,但我正在努力。

我显然需要 cat testfile | clamscan - 根据 post 和相应的文档:

How do you virus scan a file being uploaded to your java webapp as it streams?

因此,如果我的后端如下所示:

class SaveDocumentAPIView(APIView):
    permission_classes = [IsAuthenticated]

    def post(self, request, *args, **kwargs):

        # this is for handling the files we do want
        # it writes the files to disk and writes them to the database
        for f in request.FILES.getlist('file'):
            max_id = Uploads.objects.all().aggregate(Max('id'))
            if max_id['id__max'] == None:
                max_id = 1
            else:    
                max_id = max_id['id__max'] + 1
            data = {
                'user_id': request.user.id,
                'sur_id': kwargs.get('sur_id'),
                'co': User.objects.get(id=request.user.id).co,
                'date_uploaded': datetime.datetime.now(),
                'size': f.size
            }
            filename = str(data['co']) + '_' + \
                    str(data['sur_id']) + '_' + \
                    str(max_id) + '_' + \
                    f.name
            data['doc_path'] = filename
            self.save_file(f, filename)
            serializer = SaveDocumentSerializer(data=data)
            if serializer.is_valid(raise_exception=True):
                serializer.save()
        return Response(status=HTTP_200_OK)

    # Handling the document
    def save_file(self, file, filename):
        with open('fileupload/' + filename, 'wb+') as destination:
            for chunk in file.chunks():
                destination.write(chunk)

我想我需要在 save_file 方法中添加一些东西,例如:

for chunk in file.chunks():
    # run bash comman from python
    cat chunk | clamscan -
    if passes_clamscan:
        destination.write(chunk)
        return HttpResponse('It passed')
    else:
        return HttpResponse('Virus detected')

所以我的问题是:

1) 如何 运行 来自 Python 的 Bash?

2) 如何从扫描中接收结果响应,以便可以将其发送回用户以及可以在后端使用响应来完成其他事情? (比如创建逻辑向用户和管理员发送一封电子邮件,告知他们的文件有病毒)。

我一直在玩这个,但运气不佳。

Running Bash commands in Python

此外,还有 Github 声称将 Clamav 与 Django 完美结合的存储库,但它们要么多年未更新,要么现有文档非常糟糕。请参阅以下内容:

https://github.com/vstoykov/django-clamd

https://github.com/musashiXXX/django-clamav-upload

https://github.com/QueraTeam/django-clamav

好的,使用 clamd 就可以了。我将 SaveDocumentAPIView 修改为以下内容。这会在将文件写入磁盘之前对其进行扫描,并在文件被感染时阻止它们被写入。仍然允许未受感染的文件通过,因此用户不必重新上传它们。

class SaveDocumentAPIView(APIView):
    permission_classes = [IsAuthenticated]

    def post(self, request, *args, **kwargs):

        # create array for files if infected
        infected_files = []

        # setup unix socket to scan stream
        cd = clamd.ClamdUnixSocket()

        # this is for handling the files we do want
        # it writes the files to disk and writes them to the database
        for f in request.FILES.getlist('file'):
            # scan stream
            scan_results = cd.instream(f)

            if (scan_results['stream'][0] == 'OK'):    
                # start to create the file name
                max_id = Uploads.objects.all().aggregate(Max('id'))
                if max_id['id__max'] == None:
                    max_id = 1
                else:    
                    max_id = max_id['id__max'] + 1
                data = {
                    'user_id': request.user.id,
                    'sur_id': kwargs.get('sur_id'),
                    'co': User.objects.get(id=request.user.id).co,
                    'date_uploaded': datetime.datetime.now(),
                    'size': f.size
                }
                filename = str(data['co']) + '_' + \
                        str(data['sur_id']) + '_' + \
                        str(max_id) + '_' + \
                        f.name
                data['doc_path'] = filename
                self.save_file(f, filename)
                serializer = SaveDocumentSerializer(data=data)
                if serializer.is_valid(raise_exception=True):
                    serializer.save()

            elif (scan_results['stream'][0] == 'FOUND'):
                send_mail(
                    'Virus Found in Submitted File',
                    'The user %s %s with email %s has submitted the following file ' \
                    'flagged as containing a virus: \n\n %s' % \
                    (
                        user_obj.first_name, 
                        user_obj.last_name, 
                        user_obj.email, 
                        f.name
                    ),
                    'The Company <no-reply@company.com>',
                    ['admin@company.com']
                )
                infected_files.append(f.name)

        return Response({'filename': infected_files}, status=HTTP_200_OK)

    # Handling the document
    def save_file(self, file, filename):
        with open('fileupload/' + filename, 'wb+') as destination:
            for chunk in file.chunks():
                destination.write(chunk)