Avro array using Apache Avro
Actually, my question is similar to the answer shown in the link below (which demonstrates serializing/deserializing with the avsc JavaScript library), but I need a solution that serializes to Avro and deserializes it back in Java using Apache Avro instead...
Avro schema for Json array
https://avro.apache.org/docs/current/gettingstartedjava.html
Data
[
  {"id":1,"text":"some text","user_id":1},
  {"id":1,"text":"some text","user_id":2},
  ...
]
Schema
{
  "name": "Name",
  "type": "array",
  "namespace": "com.hi.avro.model",
  "items": {
    "name": "NameDetails",
    "type": "record",
    "fields": [
      {
        "name": "id",
        "type": "int"
      },
      {
        "name": "text",
        "type": "string"
      },
      {
        "name": "user_id",
        "type": "int"
      }
    ]
  }
}
Any help is appreciated...
I have read through the API and experimented with it in my code...
When I first wrote this post I was trying to assign the whole thing to a GenericRecord, which did not work; that is why I posted the question, as I was not clear on it.
In the end, instead of assigning the whole array to a GenericRecord, I went with a GenericArray and added a GenericRecord to it for each element.
Below are the code snippets.
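Both snippets assume an already-parsed org.apache.avro.Schema for the array schema shown above. A minimal sketch of obtaining it, assuming the schema is saved in a file named name-details.avsc (the file name is only an assumption):

import org.apache.avro.Schema;
import java.io.File;

// Parse the .avsc file; the top-level type is ARRAY, and
// schema.getElementType() returns the NameDetails record schema.
Schema schema = new Schema.Parser().parse(new File("name-details.avsc"));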
// To JSON: read the binary-encoded array back and re-encode it as JSON
GenericArray<GenericRecord> record =
        new GenericDatumReader<GenericArray<GenericRecord>>(schema).read(null, binaryDecoder);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
// NoWrappingJsonEncoder appears to be a custom JsonEncoder variant (not part of
// the stock Avro API) used to avoid the default encoder's union wrapping
NoWrappingJsonEncoder jsonEncoder = new NoWrappingJsonEncoder(record.getSchema(), outputStream);
DatumWriter<GenericArray<GenericRecord>> writer = record instanceof SpecificRecord
        ? new SpecificDatumWriter<>(record.getSchema())
        : new GenericDatumWriter<>(record.getSchema());
writer.write(record, jsonEncoder);
jsonEncoder.flush();
byte[] result = outputStream.toByteArray();
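The binaryDecoder used above is not defined in the snippet; it can be obtained from Avro's DecoderFactory. A sketch, assuming avroBytes holds the binary output produced by the "to Avro" snippet below:

import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DecoderFactory;

// avroBytes: byte[] holding binary-encoded Avro data (the "result" of the "to Avro" step)
BinaryDecoder binaryDecoder = DecoderFactory.get().binaryDecoder(avroBytes, null);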
// To Avro: parse the JSON array with Jackson and binary-encode it against the array schema
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(outputStream, null);
GenericDatumWriter<Object> writer = new GenericDatumWriter<>(schema, GenericData.get());
byte[] data;                                    // the JSON array shown above, as UTF-8 bytes
List<Map<String, Object>> list = mapper.readValue(data, List.class);  // Jackson ObjectMapper
List<GenericData.Record> array = new ArrayList<>();
list.forEach(entry -> {
    // build one GenericRecord per JSON object, using the element (record) schema
    GenericData.Record genericRecord = new GenericRecordBuilder(schema.getElementType())
            .set("id", entry.get("id"))
            .set("text", entry.get("text"))
            .set("user_id", entry.get("user_id"))
            .build();
    array.add(genericRecord);
});
writer.write(array, encoder);
encoder.flush();
byte[] result = outputStream.toByteArray();
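As a variant, the plain ArrayList above can be replaced with an actual GenericData.Array (which implements GenericArray), matching the approach described earlier. A sketch under the same assumptions (schema, list, writer, and encoder as above):

// Build a GenericData.Array bound to the array schema instead of a plain List
GenericData.Array<GenericRecord> avroArray = new GenericData.Array<>(list.size(), schema);
list.forEach(entry -> avroArray.add(
        new GenericRecordBuilder(schema.getElementType())
                .set("id", entry.get("id"))
                .set("text", entry.get("text"))
                .set("user_id", entry.get("user_id"))
                .build()));
writer.write(avroArray, encoder);
encoder.flush();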