Avro deserialization error on schema evolution
I have two schemas:
schema1 (old schema):
{
  "namespace": "com.org.package",
  "type": "record",
  "name": "EventModel",
  "fields": [
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "id",
      "type": "long"
    }
  ]
}
I updated the schema by adding a boolean field:
schema2 (new schema):
{
  "namespace": "com.org.package",
  "type": "record",
  "name": "EventModel",
  "fields": [
    {
      "name": "name",
      "type": "string"
    },
    {
      "name": "id",
      "type": "long"
    },
    {
      "name": "isActive",
      "type": "boolean",
      "default": false
    }
  ]
}
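The Kafka topic contains messages written with the old schema (schema1). After updating the consumer to the new schema, the consumer can't deserialize the old-schema messages, even though the added field has a default value.
According to the Avro documentation: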
if the reader's record schema has a field that contains a default value, and writer's schema does not have a field with the same name, then the reader should use the default value from its field.
if the reader's record schema has a field with no default value, and writer's schema does not have a field with the same name, an error is signalled.
Deserialization fails with the following error:
java.io.EOFException: null
at org.apache.avro.io.BinaryDecoder.readBoolean(BinaryDecoder.java:140) ~[avro-1.9.1.jar!/:1.9.1]
at org.apache.avro.io.ValidatingDecoder.readBoolean(ValidatingDecoder.java:77) ~[avro-1.9.1.jar!/:1.9.1]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:194) ~[avro-1.9.1.jar!/:1.9.1]
at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:136) ~[avro-1.9.1.jar!/:1.9.1]
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:237) ~[avro-1.9.1.jar!/:1.9.1]
at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123) ~[avro-1.9.1.jar!/:1.9.1]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:170) ~[avro-1.9.1.jar!/:1.9.1]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151) ~[avro-1.9.1.jar!/:1.9.1]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:144) ~[avro-1.9.1.jar!/:1.9.1]
Why isn't the default value applied on the consumer side when the record lacks the field?
Any help is much appreciated. Thanks in advance!
Try changing the type of isActive to a union of boolean and null, rather than just boolean. Something like:
{
  "name": "isActive",
  "type": ["boolean", "null"],
  "default": false
}
This should make the schema backward compatible.
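Separately from the union change, it may help to see why the default was never consulted. Avro's binary encoding writes a record's fields back to back, with no field names or tags, so a reader can only fall back to a default if it also knows the schema the bytes were written with. The stack trace ends in BinaryDecoder.readBoolean hitting end of input, which suggests (an assumption, since the consumer code isn't shown) that the old bytes were decoded as if they had been written with schema2. The sketch below is a toy model in plain Java, not the Avro library: `SchemaResolutionSketch`, `Field`, `encode` and `decode` are made-up names, and only string, long and boolean are handled.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of Avro's binary record layout and default handling (hypothetical, not the Avro API). */
final class SchemaResolutionSketch {

    /** One record field: name, primitive type name, and an optional default (null = no default). */
    static final class Field {
        final String name; final String type; final Object defaultValue;
        Field(String name, String type, Object defaultValue) {
            this.name = name; this.type = type; this.defaultValue = defaultValue;
        }
    }

    static final List<Field> SCHEMA1 = Arrays.asList(
            new Field("name", "string", null), new Field("id", "long", null));
    static final List<Field> SCHEMA2 = Arrays.asList(
            new Field("name", "string", null), new Field("id", "long", null),
            new Field("isActive", "boolean", Boolean.FALSE));

    /** Longs are zigzag-encoded, then written as a variable-length integer. */
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) { out.write((int) ((z & 0x7F) | 0x80)); z >>>= 7; }
        out.write((int) z);
    }

    /** Fields are written in schema order; nothing on the wire says which field is which. */
    static byte[] encode(List<Field> schema, Map<String, Object> rec) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Field f : schema) {
            Object v = rec.get(f.name);
            switch (f.type) {
                case "string":
                    byte[] utf8 = ((String) v).getBytes(StandardCharsets.UTF_8);
                    writeLong(out, utf8.length);
                    out.write(utf8, 0, utf8.length);
                    break;
                case "long": writeLong(out, (Long) v); break;
                case "boolean": out.write(((Boolean) v) ? 1 : 0); break;
                default: throw new IllegalArgumentException(f.type);
            }
        }
        return out.toByteArray();
    }

    /** Minimal read cursor that fails with EOFException when the bytes run out. */
    static final class Cursor {
        final byte[] buf; int pos;
        Cursor(byte[] buf) { this.buf = buf; }
        int next() throws EOFException {
            if (pos >= buf.length) throw new EOFException();
            return buf[pos++] & 0xFF;
        }
        long readLong() throws EOFException {
            long z = 0; int shift = 0; int b;
            do { b = next(); z |= (long) (b & 0x7F) << shift; shift += 7; } while ((b & 0x80) != 0);
            return (z >>> 1) ^ -(z & 1);
        }
        Object read(String type) throws EOFException {
            switch (type) {
                case "string":
                    int len = (int) readLong();
                    if (pos + len > buf.length) throw new EOFException();
                    String s = new String(buf, pos, len, StandardCharsets.UTF_8);
                    pos += len;
                    return s;
                case "long": return readLong();
                case "boolean": return next() != 0;
                default: throw new IllegalArgumentException(type);
            }
        }
    }

    /** Parse the bytes with the writer schema, then fill reader-only fields from their defaults. */
    static Map<String, Object> decode(byte[] data, List<Field> writer, List<Field> reader)
            throws EOFException {
        Cursor in = new Cursor(data);
        Map<String, Object> onWire = new LinkedHashMap<>();
        for (Field f : writer) onWire.put(f.name, in.read(f.type));
        Map<String, Object> result = new LinkedHashMap<>();
        for (Field f : reader) {
            if (onWire.containsKey(f.name)) result.put(f.name, onWire.get(f.name));
            else if (f.defaultValue != null) result.put(f.name, f.defaultValue);
            else throw new IllegalStateException("no default for field " + f.name);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Object> old = new LinkedHashMap<>();
        old.put("name", "a");
        old.put("id", 1L);
        byte[] bytes = encode(SCHEMA1, old);

        // Writer = schema1, reader = schema2: isActive comes from the default.
        System.out.println(decode(bytes, SCHEMA1, SCHEMA2));

        // Pretending the bytes were written with schema2: the decoder looks for a boolean byte that isn't there.
        try {
            decode(bytes, SCHEMA2, SCHEMA2);
        } catch (EOFException e) {
            System.out.println("EOFException while reading isActive");
        }
    }
}
```

In the real library, the corresponding choice is which schemas the datum reader is built with: GenericDatumReader and SpecificDatumReader both have a constructor taking a writer schema and a reader schema. If the consumer constructs its reader from schema2 alone, that would match the failure above. With Kafka, a schema-registry-aware deserializer usually looks up the writer schema per message; whether that applies here depends on the setup, which the question doesn't show.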