C# string to C++ wstring using Encoding.Unicode.GetBytes()

Question

So the problem is that in C# a char is 4 bytes, so "ABC" is (65 0 66 0 67 0).

When I send these bytes over a socket and read them into a wstring in C++, the only output I get is: A

How can I convert a string like this into a C++ string?

Answer 1

It sounds like you need ASCII or UTF-8 encoding rather than Unicode (which in .NET means UTF-16, little-endian).

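65 0 66 0 67 0 will only get you A, because when the bytes are treated as a narrow char string, the zero right after 65 is interpreted as the null terminator in C++.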
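To make the truncation concrete, here is a minimal sketch. The names kUtf16Bytes, length_as_c_string and length_with_explicit_size are made up for illustration and are not from the question's code.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// The six UTF-16LE bytes from the question.
static const char kUtf16Bytes[] = { 65, 0, 66, 0, 67, 0 };

// Length as seen by C-string functions: counting stops at the first zero byte.
std::size_t length_as_c_string() {
    return std::strlen(kUtf16Bytes);
}

// Length when the byte count is passed explicitly: all six bytes are kept.
std::size_t length_with_explicit_size() {
    std::string raw(kUtf16Bytes, sizeof(kUtf16Bytes));
    return raw.size();
}
```

The first function reports 1 and the second reports 6, so any code path that relies on null termination will see only "A".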

Strategies for converting Unicode to ASCII can be found here.

Answer 2

"using c# the char is 4 bytes"

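No. In C#, strings are encoded as UTF-16. A UTF-16 code unit is two bytes, and a code point takes one or two code units. For simple characters, a single code unit represents the whole code point (e.g. 65 0).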
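As a rough illustration of "one or two code units", the following sketch encodes a single code point as UTF-16. The function name encode_utf16 is hypothetical, and the sketch assumes the input is a valid code point (no range or surrogate checks).

```cpp
#include <vector>

// Encode one Unicode code point as UTF-16 code units.
// Assumes cp is valid: at most 0x10FFFF and not itself a surrogate.
std::vector<char16_t> encode_utf16(char32_t cp) {
    if (cp < 0x10000) {
        // Basic Multilingual Plane: one 2-byte code unit.
        return { static_cast<char16_t>(cp) };
    }
    // Supplementary planes: split the 20-bit offset into a surrogate pair.
    const char32_t v = cp - 0x10000;
    return {
        static_cast<char16_t>(0xD800 + (v >> 10)),    // high surrogate
        static_cast<char16_t>(0xDC00 + (v & 0x3FF))   // low surrogate
    };
}
```

With this, 'A' (U+0041) comes out as the single unit 0x0041, while U+1F600 comes out as the pair 0xD83D 0xDE00, i.e. four bytes.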

On Windows, wstring is usually UTF-16 as well (2 or 4 bytes per code point). On Unix/Linux, however, wstring usually uses UTF-32 (always 4 bytes per code point).
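Because the width of wchar_t differs between platforms, a conversion from UTF-16 to wstring has to account for both cases. A minimal sketch, assuming well-formed input; utf16_to_wstring is a name invented here:

```cpp
#include <cstddef>
#include <string>

// Convert UTF-16 code units to std::wstring.
// With a 2-byte wchar_t (Windows) the units are copied as they are.
// With a 4-byte wchar_t (typical Linux) surrogate pairs are merged
// into a single code point. A lone surrogate is passed through unchanged.
std::wstring utf16_to_wstring(const std::u16string& in) {
    std::wstring out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        const bool is_high = cp >= 0xD800 && cp <= 0xDBFF;
        if (sizeof(wchar_t) >= 4 && is_high && i + 1 < in.size()) {
            const char32_t low = in[i + 1];
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;  // the low surrogate has been consumed
            }
        }
        out.push_back(static_cast<wchar_t>(cp));
    }
    return out;
}
```

For plain text such as u"ABC" the result is L"ABC" on either platform; the two cases only differ for characters outside the Basic Multilingual Plane.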

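The Unicode code points for ASCII characters have the same numeric values as in ASCII, so ASCII text encoded as UTF-16 (little-endian) usually looks like this: {num} 0 {num} 0 {num} 0... See here for details: https://en.wikipedia.org/wiki/UTF-16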
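The {num} 0 pattern can be reproduced with a few lines. This is only a sketch for pure-ASCII input, and ascii_to_utf16le is a hypothetical helper name:

```cpp
#include <string>
#include <vector>

// Encode pure-ASCII text as UTF-16LE bytes.
// Assumes every character is below 128; anything else is outside this sketch.
std::vector<unsigned char> ascii_to_utf16le(const std::string& ascii) {
    std::vector<unsigned char> bytes;
    bytes.reserve(ascii.size() * 2);
    for (unsigned char c : ascii) {
        bytes.push_back(c);  // low byte: same value as the ASCII code
        bytes.push_back(0);  // high byte: always zero for ASCII
    }
    return bytes;
}
```

Calling it with "ABC" yields the bytes 65 0 66 0 67 0, which is exactly the sequence from the question.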

Can you show us some code for how you construct your wstring object? The null byte is critical here, because it is the end marker for ASCII / ANSI strings.

Answer 3

I was able to solve the problem using std::u16string. Here is some sample code:

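#include <cstring>
#include <string>
#include <vector>
std::vector<char> data = { 65, 0, 66, 0, 67, 0 };  // UTF-16LE bytes received from the socket
std::u16string text(data.size() / 2, u'\0');       // one char16_t per two bytes
std::memcpy(&text[0], data.data(), text.size() * sizeof(char16_t));  // text is now u"ABC" on a little-endian host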
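A variant of the snippet above that does not depend on the host's byte order might look like the following. It is a sketch: utf16le_to_u16string is an invented name, and silently ignoring a trailing odd byte is an assumption made here, not something from the answer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Build a std::u16string from UTF-16LE bytes, independent of host byte order.
// A trailing odd byte, if any, is ignored (assumption for this sketch).
std::u16string utf16le_to_u16string(const std::vector<unsigned char>& bytes) {
    std::u16string out;
    out.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        // Low byte first, then the high byte shifted into place.
        out.push_back(static_cast<char16_t>(bytes[i] | (bytes[i + 1] << 8)));
    }
    return out;
}
```

Assembling each code unit from its two bytes avoids both the pointer cast and the little-endian assumption, at the cost of a loop instead of a single copy.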
