Django: POST a Pandas DataFrame through an API using Pickle, or how to "hello world" a serializer and view for binary data within Django REST Framework

This question builds on an earlier one.

Below is my attempt, which does not work.

import pandas as pd
df = pd.DataFrame({'a': [0, 1, 2, 3]})

import pickle
pickled = pickle.dumps(df)

import base64
pickled_b64 = base64.b64encode(pickled)
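
The encoding steps above can be checked locally before any server is involved. The following is a minimal sketch, assuming the hypothetical helper names `encode_for_post` and `decode_from_post` (they are not part of the original code). It uses a plain dict in place of the DataFrame so that it runs without pandas; a DataFrame goes through `pickle.dumps` the same way.

```python
import base64
import pickle


def encode_for_post(obj):
    """Pickle obj and base64-encode the result (returns ASCII-only bytes)."""
    return base64.b64encode(pickle.dumps(obj))


def decode_from_post(payload):
    """Reverse encode_for_post: base64-decode, then unpickle.

    Only unpickle data that comes from a source you trust.
    """
    return pickle.loads(base64.b64decode(payload))


# Stand-in for the DataFrame (assumption: any picklable object behaves alike).
original = {'a': [0, 1, 2, 3]}
encoded = encode_for_post(original)
restored = decode_from_post(encoded)
print(restored == original)
```

If the round trip succeeds locally, any data loss seen later is happening on the server side rather than in the encoding.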

I want to POST the pickled_b64 object through an API to the destination (www.foreignserver.com):

import requests
r = requests.post('http://www.foreignserver.com/api/DataFrame/', data = {'df_object':pickled_b64})
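
A dict passed as `data=` is sent as a form-encoded body, so the server receives the base64 payload as text rather than as bytes. The sketch below imitates that with the standard library's `urllib.parse` only; it approximates the encoding and parsing and is not the actual code path of requests or Django.

```python
import base64
import pickle
from urllib.parse import parse_qs, urlencode

# Stand-in payload (assumption: a dict instead of the DataFrame).
payload = base64.b64encode(pickle.dumps({'a': [0, 1, 2, 3]}))

# Form-encode the field; '+', '/' and '=' in the base64 text are percent-escaped.
body = urlencode({'df_object': payload})

# Parse the body back the way a form parser would; values come out as str.
received = parse_qs(body)['df_object'][0]

print(type(received).__name__)
print(received.encode('ascii') == payload)
```

This is why the receiving side has to base64-decode a string before it gets back to the pickled bytes.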

On the server, www.foreignserver.com, using Django REST Framework v. 3.9.4:

models.py

class DataFrame(models.Model):
    df_object = models.BinaryField(blank=True, null=True)

serializers.py

class DataFrameSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = DataFrame
        fields = ('__all__')

views.py

class DataFrameView(viewsets.ModelViewSet):
    queryset = DataFrame.objects.all()
    serializer_class = DataFrameSerializer

Result:

To check whether the memoryview really was empty, I retrieved the supposedly stored object and inspected it with:

print(list(object))
len(object)

Both came back empty: the list had no elements and the length was 0.

After some digging I found:

"BinaryField are not supported by Django REST framework. You'll need to write a serializer field class and declare it in a mapping to make this work." ref:

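What does your serializer look like? The binary may be getting saved (the response is 200), but your serializer does not know how to stringify the binary field. Please confirm that the binary is being stored by inspecting the row in the database directly.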
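
Depending on the database backend, Django may return a BinaryField value as `bytes` or as a `memoryview`, so a size check should handle both. A minimal sketch, assuming a hypothetical helper name `stored_size`; in a Django shell you would pass it `DataFrame.objects.first().df_object`, while here literal values stand in for the database row.

```python
def stored_size(value):
    """Return the number of bytes held by a BinaryField value.

    Accepts None, bytes, bytearray or memoryview.
    """
    if value is None:
        return 0
    return len(bytes(value))


# Literal stand-ins for what the database might return.
print(stored_size(memoryview(b'')))          # empty field
print(stored_size(memoryview(b'\x80\x04')))  # two stored bytes
print(stored_size(None))                     # NULL column
```

A result of 0 for a row that was just POSTed means the data never reached the column, which points at the serializer rather than at the response rendering.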

It looks like DRF cannot handle BinaryField out of the box. See "How to use custom serializers fields in my HyperlinkedModelSerializer".

Try:

# serializers.py
import base64

from django.db import models
from rest_framework import serializers


class MyBinaryField(serializers.Field):
    def to_representation(self, obj):
        # Outgoing: raw bytes from the database -> base64 text
        return base64.b64encode(obj).decode('ascii')

    def to_internal_value(self, data):
        # Incoming: base64 text from the request -> raw bytes
        return base64.b64decode(data)


class DataFrameSerializer(serializers.HyperlinkedModelSerializer):
    # Copy the default mapping before adding the BinaryField entry
    serializer_field_mapping = (
        serializers.ModelSerializer.serializer_field_mapping.copy()
    )
    serializer_field_mapping[models.BinaryField] = MyBinaryField

    class Meta:
        model = DataFrame
        fields = '__all__'

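After some work I was able to answer my own question. The interpretation and analysis below are my own, and I may have misunderstood what actually happens in some of the steps. Nevertheless, the code below works as expected and is a complete answer to the question.

1. Since DRF has no native (model) support for BinaryField, the first step is to build your own field: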

import base64

from rest_framework import serializers


class MyBinaryField(serializers.Field):
    def to_internal_value(self, obj):
        # Request -> database. DRF seems to see the POSTed value as a
        # string (despite it being sent as base64-encoded bytes), so
        # decoding it recovers the underlying bytes from pickle.dumps.
        return base64.b64decode(obj)

    def to_representation(self, value):
        # Database -> response. This is the visual feedback: the raw
        # bytes are base64-encoded so that they can be displayed.
        return base64.b64encode(value)
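
The two directions of the field can be exercised without Django at all. The sketch below is a framework-free stand-in, assuming a hypothetical class name `BinaryFieldLogic`. It adds two things the field above does not have: strict base64 validation, and conversion of a possible `memoryview` to `bytes` before encoding. In a real DRF field the error would more likely be raised as `serializers.ValidationError`; a plain `ValueError` is used here so the example stays self-contained.

```python
import base64
import binascii


class BinaryFieldLogic:
    """Framework-free stand-in for the two methods of MyBinaryField."""

    def to_internal_value(self, data):
        # Incoming: base64 text (str or bytes) -> raw bytes for the database.
        try:
            return base64.b64decode(data, validate=True)
        except (binascii.Error, ValueError) as exc:
            raise ValueError('df_object is not valid base64') from exc

    def to_representation(self, value):
        # Outgoing: raw bytes or memoryview -> base64 text for the response.
        return base64.b64encode(bytes(value)).decode('ascii')


field = BinaryFieldLogic()
raw = field.to_internal_value('aGVsbG8=')
print(raw)                                       # b'hello'
print(field.to_representation(memoryview(raw)))  # aGVsbG8=
```

Returning a `str` from `to_representation` keeps the output independent of how the renderer treats `bytes`.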

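2. Once you have the new field, use it in the serializer and define the create and update methods. Note that serializers.ModelSerializer did not work for me, so you need to use serializers.Serializer: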

class DataFrameSerializer(serializers.Serializer):
    # Mapping carried over from the ModelSerializer attempt; the field
    # itself is declared explicitly below.
    serializer_field_mapping = (
        serializers.ModelSerializer.serializer_field_mapping.copy()
    )
    serializer_field_mapping[models.BinaryField] = MyBinaryField

    df_object = MyBinaryField()

    def create(self, validated_data):
        """
        Create and return a new `DataFrame` instance, given the validated data.
        """
        return DataFrame.objects.create(**validated_data)

    def update(self, instance, validated_data):
        """
        Update and return an existing `DataFrame` instance, given the validated data.
        """
        instance.df_object = validated_data.get('df_object', instance.df_object)
        instance.save()
        return instance

3. Finally, define your view:

class DataFrameView(viewsets.ModelViewSet):
    queryset = DataFrame.objects.all()
    serializer_class = DataFrameSerializer

4. You can then access the API and POST data:

import pickle
import requests
import base64
import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2, 3]})
pickbytes = pickle.dumps(df)
b64_pickbytes = base64.b64encode(pickbytes)

url = 'http://localhost:8000/api/DataFrame/'
payload = {'df_object':b64_pickbytes}
r = requests.post(url=url, data=payload)

5. Retrieve the data and recreate the DataFrame:

 >>> new = DataFrame.objects.first()
 >>> byt = new.df_object
 >>> s = pickle.loads(byt)
 >>> s
    a
 0  0
 1  1
 2  2
 3  3

Useful posts and documentation related to the question:

[1] 
[2] 
[3] https://docs.python.org/3/library/stdtypes.html#memoryview
[4] https://www.django-rest-framework.org/api-guide/fields/#custom-fields
[5] https://www.django-rest-framework.org/tutorial/1-serialization/