Is C++ abstraction endian neutral?

Question

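Suppose I have a client and a server that exchange 16-bit numbers over some network protocol, say ModbusTCP, although the protocol is not relevant here.

I know that the client is little-endian (my PC) and the server is big-endian (some PLC). The client is written entirely in C++ with Boost Asio sockets. With this setup, I thought I had to swap the bytes received from the server to store the number correctly in a uint16_t variable, but that turned out to be wrong, because I then read incorrect values.

My understanding so far is that my C++ abstraction stores the values correctly in the variables without my having to care about swapping or endianness. Consider this snippet: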

// received 0x0201  (513 in big endian)
uint8_t high { 0x02 };  // first byte
uint8_t low { 0x01 };   // second byte
// merge into 16 bit value (no swap)
uint16_t val = (static_cast<uint16_t>(high)<< 8) | (static_cast<uint16_t>(low));
std::cout<<val;   //correctly prints 513

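This surprised me somewhat, also because if I look at the memory representation through pointers, I find that the bytes are in fact stored little-endian on the client: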

// take the address of val, convert it to a uint8_t pointer
auto addr = reinterpret_cast<uint8_t*>(&val);
// take the first and second bytes and print them
printf ("%d ", (int)addr[0]);   // prints 1
printf ("%d", (int)addr[1]);    // prints 2

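So the question is:

As long as I don't mess with memory addresses and pointers, C++ can guarantee that the values I read from the network are correct, regardless of the server's endianness, correct? Or am I missing something here?

EDIT: Thanks for the answers. I would like to add that I am currently using boost::asio::write(socket, boost::asio::buffer(data)) to send data from the client to the server, where data is a std::vector<uint8_t>. So my understanding is that as long as I fill data in network order, I should not have to care about the endianness of my system (or even of the server, for 16-bit data), because I am operating on "values" rather than reading bytes directly from memory, right?

To use the htons family of functions I would have to change my underlying TCP layer to use memcpy or similar and a uint8_t* data buffer, which is more C-style than C++-ish. Why should I do that? Is there an advantage I am not seeing?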

Answer 1

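(static_cast<uint16_t>(high)<< 8) | (static_cast<uint16_t>(low)) behaves the same regardless of endianness. The "left" end of a number always holds the most significant bits; endianness only changes whether those bits are stored in the first or the last byte.

For example: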

uint16_t input = 0x0201;
uint8_t leftByte = input >> 8; // same result regardless of endianness
uint8_t rightByte = input & 0xFF; // same result regardless of endianness
uint8_t data[2];
memcpy(data, &input, sizeof(input)); // data will be {0x02, 0x01} or {0x01, 0x02} depending on endianness

The same applies in the other direction:

uint8_t data[] = {0x02, 0x01};
uint16_t output1;
memcpy(&output1, data, sizeof(output1)); // will be 0x0102 or 0x0201 depending on endianness
uint16_t output2 = data[0] << 8 | data[1]; // will be 0x0201 regardless of endianness
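
Applied to the std::vector<uint8_t> buffer mentioned in the question, the shift-based approach might look like the sketch below. The helper names append_be16 and read_be16 are hypothetical (they are not part of Boost.Asio or the standard library), and the sketch assumes that 16-bit values travel most significant byte first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append a 16-bit value, most significant byte first.
inline void append_be16(std::vector<std::uint8_t>& buf, std::uint16_t value) {
    buf.push_back(static_cast<std::uint8_t>(value >> 8));    // high byte
    buf.push_back(static_cast<std::uint8_t>(value & 0xFF));  // low byte
}

// Hypothetical helper: read a big-endian 16-bit value starting at offset.
inline std::uint16_t read_be16(const std::vector<std::uint8_t>& buf, std::size_t offset) {
    assert(offset + 2 <= buf.size());  // the caller must provide two bytes
    return static_cast<std::uint16_t>((buf[offset] << 8) | buf[offset + 1]);
}
```

After append_be16(data, 0x0201) the vector holds {0x02, 0x01} on any host, and read_be16(data, 0) returns 0x0201 again. No memcpy or pointer cast is involved, so the host byte order never enters the picture.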

To make sure your code works on all platforms, it is best to use the htons and ntohs family of functions:

uint16_t input = 0x0201; // input is in host order
uint16_t networkInput = htons(input);
uint8_t data[2];
memcpy(data, &networkInput , sizeof(networkInput));
// data is big endian or "network" order
uint16_t networkOutput;
memcpy(&networkOutput, data, sizeof(networkOutput));
uint16_t output = ntohs(networkOutput);  // output is in host order

Answer 2

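The first snippet of your code works correctly because you are not working with byte addresses directly. Because of how the C++ language defines the operators '<<' and '|', such code is compiled to produce the correct result independently of your platform's endianness.

The second snippet of your code demonstrates this by showing the actual values of the individual bytes on your little-endian system.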

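TCP/IP networking standardizes on big-endian format and provides the following utilities:

before sending a multi-byte numeric value, use the standard functions htonl ("host-to-network-long") and htons ("host-to-network-short") to convert your value to the network representation,
after receiving a multi-byte numeric value, use the standard functions ntohl ("network-to-host-long") and ntohs ("network-to-host-short") to convert your value to your platform-specific representation.

(In fact, these 4 utilities only perform a conversion on little-endian platforms and do nothing on big-endian platforms. But always using them makes your code platform-independent.)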
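
To illustrate why the conversion is a no-op on big-endian hosts, here is a rough sketch of what a 16-bit host-to-network conversion has to achieve. The names to_network16 and from_network16 are made up for this example; real code would normally call htons and ntohs, whose actual implementations are platform-specific and will differ from this.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for htons: returns a value whose in-memory bytes are
// {high, low}, i.e. network order, whatever the host byte order is.
inline std::uint16_t to_network16(std::uint16_t host) {
    const std::uint8_t bytes[2] = {
        static_cast<std::uint8_t>(host >> 8),    // most significant byte first
        static_cast<std::uint8_t>(host & 0xFF)
    };
    std::uint16_t net;
    std::memcpy(&net, bytes, sizeof(net));
    return net;
}

// Hypothetical stand-in for ntohs: interprets the in-memory bytes as big-endian.
inline std::uint16_t from_network16(std::uint16_t net) {
    std::uint8_t bytes[2];
    std::memcpy(bytes, &net, sizeof(net));
    return static_cast<std::uint16_t>((bytes[0] << 8) | bytes[1]);
}
```

On a big-endian host the bytes are already laid out as {high, low}, so to_network16 returns its argument unchanged; on a little-endian host the returned value has its two bytes swapped.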

With ASIO, you can get access to these utilities with: #include <boost/asio.hpp>

You can read more by searching Google for 'man htonl' or 'msdn htonl'.

Answer 3

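About Modbus:

For 16-bit words, Modbus sends the most significant byte first, which means it uses big-endian. So if the client or the server uses little-endian, it has to swap the bytes when sending or receiving.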

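Another issue is that Modbus does not define the order in which the 16-bit registers of a 32-bit type are sent.

Some Modbus server devices send the most significant 16-bit register first, while others do the opposite. The only solution is to offer an option in the client configuration to swap the 16-bit registers.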
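
Such a configuration option could be sketched roughly as follows. The WordOrder enum and the combine_registers function are hypothetical names invented for this example, and the sketch assumes that each 16-bit register has already been decoded into a host value.

```cpp
#include <cstdint>

// Hypothetical client setting: which 16-bit register carries the upper half.
enum class WordOrder { HighWordFirst, LowWordFirst };

// Combine two already-decoded 16-bit registers into one 32-bit value.
inline std::uint32_t combine_registers(std::uint16_t first,
                                       std::uint16_t second,
                                       WordOrder order) {
    const std::uint32_t a = first;   // widen before shifting
    const std::uint32_t b = second;
    return order == WordOrder::HighWordFirst ? (a << 16) | b
                                             : (b << 16) | a;
}
```

With registers 0x1234 and 0x5678 received in that order, HighWordFirst yields 0x12345678 and LowWordFirst yields 0x56781234, so the setting has to match the device in use.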

A similar problem occurs when transmitting strings: some servers send badcfe instead of abcdef.
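
One possible way for a client to undo that, sketched under the assumption that the device swaps the two characters inside each 16-bit register (swap_byte_pairs is a hypothetical helper name):

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: swap each pair of adjacent characters.
// A trailing unpaired character, if any, is left where it is.
inline std::string swap_byte_pairs(std::string text) {
    for (std::size_t i = 0; i + 1 < text.size(); i += 2) {
        std::swap(text[i], text[i + 1]);
    }
    return text;
}
```

Calling swap_byte_pairs("badcfe") gives back "abcdef". As with the register order, whether this step is needed depends on the device, so it is best kept behind a configuration switch.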

c++

modbus

endianness

modbus-tcp