Apache Thrift: difference between byte and binary types

Question

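I want to send 1024 bytes of data using Thrift. It has to be exactly 1024 bytes, because this is a benchmark comparing Thrift against other frameworks.

Thrift has two types for representing bytes, 'byte' and 'binary', but I don't know how to use these types. The 'binary' type is mapped to std::string, which is quite strange (I don't understand why, or how to use it). The 'byte' type is mapped to an 8-bit integer, which seems more logical to me.

To represent the 1024 bytes of data, I use: list<byte> byteSequence, with a size of 1024.

But a compile warning advises me to use binary instead of list<byte>. Why? And how?

I assume 'binary' would give better performance, because list<byte> is very slow for a 1024-byte sequence.

Thanks.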

Answer 1

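This may depend on the language you compile your thrift file to, but binary tells thrift directly that you really want to transfer a sequence of raw, unencoded bytes.

It probably doesn't change much in terms of size on the transport layer, but you may be surprised when you instantiate/deserialize the objects in your language of choice. For example, in Java a binary field is represented as a byte[], whereas list<byte> gives you a List<Byte>, which is a much less efficient representation of the same thing.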
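To put a rough number on the "not much change in size" claim, here is a small sketch that computes the encoded size of the value itself, assuming the default TBinaryProtocol (a 4-byte length prefix for binary; a 1-byte element type plus a 4-byte count for a list, then one byte per i8 element). The function names are made up for illustration, and the per-field header is left out because it is the same size in both cases.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: TBinaryProtocol encoding. Other protocols (e.g. compact)
// use different headers, so the exact numbers would differ.

// Value size of a `binary` field: 4-byte length prefix + raw bytes.
constexpr std::size_t binaryValueSize(std::size_t n) {
    return sizeof(std::int32_t) + n;
}

// Value size of a `list<byte>` field: 1-byte element type,
// 4-byte element count, then one byte per i8 element.
constexpr std::size_t byteListValueSize(std::size_t n) {
    return 1 + sizeof(std::int32_t) + n;
}
```

For n = 1024 the two differ by a single byte on the wire under these assumptions, so any speed difference comes from how the data is represented and copied in the target language (one bulk write versus one write per element), not from the encoded size.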

According to the thrift doc, Java might be the only reason binary exists:

binary: a sequence of unencoded bytes

N.B.: This is currently a specialized form of the string type above, added to provide better interoperability with Java. The current plan-of-record is to elevate this to a base type at some point.

Answer 2

But a compile warning advises me to use binary instead of list<byte>, but why ? and how ?

'byte' type is mapped to a 8 bits integer which seems more logical to me.

That is exactly the reason for the warning. It seems logical, but it is the worst choice. In addition, byte in Thrift is actually i8, a signed type.
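The signedness matters in practice. In C++ a list<byte> shows up as std::vector<int8_t>, so any octet above 0x7F reads back as a negative number unless you cast it. A minimal sketch of the conversion you would need (the helper name toUnsigned is hypothetical, not part of Thrift):

```cpp
#include <cstdint>
#include <vector>

// Reinterpret signed Thrift i8 values as the raw octets they carry.
// Hypothetical helper, shown only to illustrate the signedness issue.
std::vector<std::uint8_t> toUnsigned(const std::vector<std::int8_t>& in) {
    std::vector<std::uint8_t> out;
    out.reserve(in.size());
    for (std::int8_t v : in) {
        // The cast maps -1 to 255, -128 to 128, and so on.
        out.push_back(static_cast<std::uint8_t>(v));
    }
    return out;
}
```

With binary this extra pass over the data is not needed, since the bytes are handed over as one opaque buffer.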

'binary' type is mapped to std::string which is quite strange (I don't understand why).

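Don't worry about it. That's a historical thing. The binary type was added later and is, in some respects, implemented like string in order to reduce compatibility friction with older versions. It is just an implementation detail.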
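On the C++ side, std::string works fine as a plain byte buffer as long as you always go through its length and never treat it as a C string. A small sketch of building the 1024-byte payload from the question (makePayload is a made-up helper name, and the fill pattern is arbitrary):

```cpp
#include <cstddef>
#include <string>

// Build an n-byte payload in a std::string. Embedded zero bytes are
// stored without problems because the length is tracked explicitly.
std::string makePayload(std::size_t n) {
    std::string payload(n, '\0');
    for (std::size_t i = 0; i < n; ++i) {
        // Arbitrary test pattern: 0x00, 0x01, ..., 0xFF, repeating.
        payload[i] = static_cast<char>(i & 0xFF);
    }
    return payload;
}
```

The result can then be assigned to the binary field of the generated struct. The thing to avoid is constructing the string from a bare const char* without a length, because that stops at the first zero byte.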

but I don't know how to use these types.

Like any other type:

 struct wtf {
   1 : binary foo
   2 : string bar
   3 : byte baz     // i8 is replacing byte to indicate the signedness
   4 : list<byte> qux   // not recommended, but nevertheless works
 }
