Why is this datum not an example of the Avro schema in Python?

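I'm having trouble decoding Avro messages from Kafka in Python with kafka-python. To narrow the problem down, I'm focusing only on decoding the message with the avro package. I wrote a test using the schema and example from the official Avro documentation: https://avro.apache.org/docs/current/gettingstartedpython.html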

from avro.io import DatumWriter, DatumReader, BinaryEncoder, BinaryDecoder
import avro.schema
from io import BytesIO

schema = avro.schema.parse("""
    {
        "type": "record",
        "name": "User",
        "namespace": "example.avro",
        "fields": [
            {
                "name": "name",
                "type": "string"
            },
            {
                "name": "favorite_number",
                "type": [
                    "int",
                    "null"
                ]
            },
            {
                "name": "favorite_color",
                "type": [
                    "string",
                    "null"
                ]
            }
        ]
    }
""")

wb = BytesIO()
encoder = BinaryEncoder(wb)
writer = DatumWriter(schema)
writer.write('{"name":"Alyssa","favorite_number":256,"favorite_color":"blue"}', encoder)

rb = BytesIO(wb.getvalue())
decoder = BinaryDecoder(rb)
reader = DatumReader(schema)
msg = reader.read(decoder)

print(msg)

I get the error "The datum {"name":"Alyssa","favorite_number":256,"favorite_color":"blue"} is not an example of the schema". Given that this schema and datum come straight from the official Avro documentation for Python, what am I doing wrong?

Traceback (most recent call last):
  File "main.py", line 36, in <module>
    writer.write('{"name":"Alyssa","favorite_number":256,"favorite_color":"blue"}', encoder)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/avro/io.py", line 979, in write
    raise AvroTypeException(self.writers_schema, datum)
avro.io.AvroTypeException: The datum {"name":"Alyssa","favorite_number":256,"favorite_color":"blue"} is not an example of the schema {
  "type": "record",
  "name": "User",
  "namespace": "example.avro",
  "fields": [
    {
      "type": "string",
      "name": "name"
    },
    {
      "type": [
        "int",
        "null"
      ],
      "name": "favorite_number"
    },
    {
      "type": [
        "string",
        "null"
      ],
      "name": "favorite_color"
    }
  ]
}

You currently have

writer.write('{"name":"Alyssa","favorite_number":256,"favorite_color":"blue"}', encoder)

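so the datum you are passing is a string (note the surrounding quotes), while a record schema expects a dict. If you change it to a dict like this: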

writer.write({"name":"Alyssa","favorite_number":256,"favorite_color":"blue"}, encoder)

then it works.
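If the payload really does arrive as JSON text (for example, read from a file or another system), one option is to parse it with the standard library before handing it to the writer. The following is a minimal sketch of that conversion only; the names `raw` and `datum` are illustrative, and the final `writer.write` call is left as a comment because it assumes the `writer` and `encoder` objects from the question.

```python
import json

# The same payload as in the question, still as JSON text (a str).
raw = '{"name":"Alyssa","favorite_number":256,"favorite_color":"blue"}'

# json.loads turns the JSON text into a Python dict, which is the
# shape a record schema expects for its datum.
datum = json.loads(raw)

print(type(raw).__name__)    # str
print(type(datum).__name__)  # dict

# With the writer and encoder from the question, the call would then be:
# writer.write(datum, encoder)
```

Parsing first keeps the string-versus-dict distinction explicit, which is exactly what the original call got wrong.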