Will changing the ProtoBuffer Varint type from a bool type to an enum type representing all bit-mask values be forward compatible?

Question

I want to make the following ProtoBuffer message forward compatible.

The current Storage message defines a state field as a bool type:

message Storage {
    bool state = 1;
}

In the Protobuf encoding, varint types such as bool and enum are encoded in the following format:

|1-bit MSB (continuation flag)|4-bit field number|3-bit wire type|n-bit payload|

For varint types, the wire type value is 000:

|X|XXXX|000|XXXX...|

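Since the Storage message contains only one field, with field number 1, the field number bits are 0001, and the MSB is 0 because the key does not continue into a further byte. The above format therefore becomes: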

|0|0001|000|XXXX...|
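
As a minimal sketch of the key layout described above (Python is used purely for illustration, and the helper name `make_key` is hypothetical, not part of any Protobuf library), the single-byte key for field number 1 with wire type 0 can be computed like this:

```python
WIRE_TYPE_VARINT = 0  # wire type shared by bool, enum, int32, ...


def make_key(field_number: int, wire_type: int) -> int:
    """Pack a field number and a wire type into a single-byte key.

    Only field numbers 1..15 fit in one byte; larger numbers spill
    into further bytes and set the continuation bit (MSB).
    """
    if not 1 <= field_number <= 15:
        raise ValueError("single-byte keys only cover field numbers 1..15")
    if not 0 <= wire_type <= 7:
        raise ValueError("wire type must fit in 3 bits")
    return (field_number << 3) | wire_type


key = make_key(1, WIRE_TYPE_VARINT)
print(f"{key:08b} = 0x{key:02X}")  # 00001000 = 0x08
```

The shift by 3 leaves the low three bits for the wire type, which is why field 1 with wire type 0 gives 0x08.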

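Now, if Storage.state = 0 is set, it is stored as follows:

|0|0001|000|00000000|

The Protobuf bytes of the Storage message will be 0x8 0x0 when the field is explicitly written (proto2); proto3 omits a field that holds its default value, so nothing is written for it at all.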

If Storage.state = 1 is set, it is stored as follows:

|0|0001|000|00000001|

The Protobuf bytes of the Storage message will be 0x8 0x1.

Now, I want to change the above Storage.state definition from the bool type to an enum type, as follows:

// BIT7 | BIT6 | BIT5 | BIT4 | BIT3 | BIT2 | BIT1 | BIT0 |
//-------------------------------------------------------
//  0   |  0   |  0   |  0   |  0   |  0   |  0   |  0   | = STATE0 (0)
//  0   |  0   |  0   |  0   |  0   |  0   |  0   |  1   | = STATE1 (1)
//  0   |  0   |  0   |  0   |  0   |  0   |  1   |  0   | = STATE2 (2)
//  0   |  0   |  0   |  0   |  0   |  0   |  1   |  1   | = STATE3 (3)
// ... and so on
//  1   |  1   |  1   |  1   |  1   |  1   |  1   |  0   | = STATE254 (254)
//  1   |  1   |  1   |  1   |  1   |  1   |  1   |  1   | = STATE255 (255)
enum State {
    STATE0 = 0;
    STATE1 = 1;
    STATE2 = 2;
    STATE3 = 3;
    // ... and so on
    STATE254 = 254;
    STATE255 = 255;
}

message Storage {
    State state = 1;
}

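So now, in the Protobuf encoding:

If Storage.state = State.STATE0 is set, it is stored as follows:

|0|0001|000|00000000|

The Protobuf bytes of the Storage message will be 0x8 0x0 when the field is explicitly written (proto2); proto3 omits a field that holds its default value, so nothing is written for it at all.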

If Storage.state = State.STATE1 is set, it is stored as follows:

|0|0001|000|00000001|

The Protobuf bytes of the Storage message will be 0x8 0x1.

If Storage.state = State.STATE2 is set, it is stored as follows:

|0|0001|000|00000010|

The Protobuf bytes of the Storage message will be 0x8 0x2.

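If Storage.state = State.STATE255 is set, it is stored as follows (255 does not fit in the 7 payload bits of a single varint byte, so a second byte is needed):

|0|0001|000|11111111|00000001|

The Protobuf bytes of the Storage message will be 0x8 0xFF 0x1.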

Will this change still be forward compatible for proto2 and proto3, and for C and Java?

My question is based on the following reference: google protocol buffer -- the coding principle of protobuf II

Answer 1

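I assume that what you actually want to store here is a bitwise state value; in C# that would be a [Flags] enum (mentioned purely to set context).

Honestly, declaring an enum with a value for every bit combination is not a good idea; it escalates quickly and is unintuitive to use. It also leaves room for silly errors when copy/pasting a large number of lines: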

// omitted... 212 lines - but would you spot the error?
STATE213 = 213;
STATE214 = 214;
STATE215 = 214;
STATE216 = 216;
STATE217 = 217;
// ... etc

(OK, that particular error would only compile with the allow_alias option enabled, but: you get the idea)
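
A rough way to catch this kind of slip mechanically is to scan the enum body for numbers that appear more than once. The following is only a sketch: the function name `find_duplicate_enum_numbers` is hypothetical, and the regular expression handles simple `NAME = N;` lines, not the full .proto grammar.

```python
import re
from collections import defaultdict


def find_duplicate_enum_numbers(enum_body: str) -> dict[int, list[str]]:
    """Return each enum number used more than once, with the names sharing it."""
    names_by_number = defaultdict(list)
    # Match simple "NAME = 123;" entries; anything else is ignored.
    for name, number in re.findall(r"(\w+)\s*=\s*(-?\d+)\s*;", enum_body):
        names_by_number[int(number)].append(name)
    return {n: names for n, names in names_by_number.items() if len(names) > 1}


body = """
STATE213 = 213;
STATE214 = 214;
STATE215 = 214;
STATE216 = 216;
STATE217 = 217;
"""
print(find_duplicate_enum_numbers(body))  # {214: ['STATE214', 'STATE215']}
```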

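In proto2, enums are expected to be recognized; when an unexpected enum value is encountered, things get a little... vague, with any of the following possible:

1. a parse failure
2. the value being treated as an unknown field (which needs to be accessed through a separate API)
3. the value being silently handled and parsed via its integer value (which has the effect of preserving the bit flags)

Without an enum definition for every flag combination, you would want option 3 here, but that is not guaranteed in all implementations.

In proto3, the framework leans as far as possible towards option 3, explicitly in the language specification, storing and retrieving the integer value (which has the effect of preserving the bit flags), but it also explicitly notes that some platforms do not allow open enum types, for example Java.

Because of this limitation, and since you mention java in the tags, I would suggest simply using an integer. It will at least work in a similar way on all implementations. Compared with your proposed solution, it is at least as usable, and usually more usable; consider how it would work as an enum: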

obj.state = State.State217;

versus as an integer:

obj.state = 217;

This would also allow bitwise combination/test/etc. operations to be used on the value, which is not the case with closed enum types.
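
To illustrate, here is a small sketch of those bitwise operations on a plain integer state. The flag names (`FLAG_A`, `FLAG_B`, `FLAG_H`) are made up for this example; the original post does not say what each bit means.

```python
# Hypothetical flag names, one per bit position.
FLAG_A = 1 << 0  # BIT0
FLAG_B = 1 << 1  # BIT1
FLAG_H = 1 << 7  # BIT7

# Combine: set BIT0 and BIT7 in one value.
combined = FLAG_A | FLAG_H

# Test: check whether individual bits are set.
has_a = bool(combined & FLAG_A)
has_b = bool(combined & FLAG_B)

# Clear BIT0, then toggle BIT1.
cleared = combined & ~FLAG_A
toggled = cleared ^ FLAG_B

print(combined, has_a, has_b, cleared, toggled)  # 129 True False 128 130
```

With a closed enum, each of these results would have to be mapped back to a named member such as State.STATE129 before it could be stored.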

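As for whether bool, enum and int32/uint32/sint32 (and their 64-bit counterparts) are technically interchangeable (scale permitting): yes; they are all encoded as varints. One caveat: sint32/sint64 additionally apply ZigZag encoding, so although they share the wire type, their decoded values differ from those of the other types.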
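
The shared wire format can be checked with a hand-rolled encoder. This is a minimal sketch written from the varint rules, not the Protobuf library's API; `encode_varint` and `encode_state_field` are hypothetical helper names, and only non-negative values are handled.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow: set the MSB
        else:
            out.append(low7)         # last byte: MSB stays clear
            return bytes(out)


def encode_state_field(value: int) -> bytes:
    """Encode field number 1 with wire type 0 (varint) and the given value."""
    key = (1 << 3) | 0
    return encode_varint(key) + encode_varint(value)


# A bool true, an enum member numbered 1 and an int32 1 all share these bytes.
for v in (1, 2, 255):
    print(v, encode_state_field(v).hex(" "))
# 1 08 01
# 2 08 02
# 255 08 ff 01
```

Values of 128 or more need a second varint byte, so 255 is ff 01 on the wire. A real serializer may also omit the field entirely when it holds its default value of 0, as proto3 does for fields without explicit presence.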

Tags: protocol-buffers, proto, protobuf-c, protobuf-java