JsonCpp: Serializing JSON causes loss of data in byte string
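I have a simple use case: I want to serialize and transmit vectors of integers between 0 and 255. I assumed the most space-efficient approach would be to serialize each vector as a string in which the character code of the nth character equals the nth element of the vector. To that end, I wrote the following two functions: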
#include <string>
#include <vector>

// Packs each value (expected to be 0..255) into one byte of the string.
std::string SerializeToBytes(const std::vector<int> &frag)
{
    std::string res;
    res.reserve(frag.size());
    for (int val : frag) {
        res.push_back(static_cast<char>(val));
    }
    return res;
}

// Reads each byte of the string back as an unsigned value 0..255.
std::vector<int> ParseFromBytes(const std::string &serialized_frag)
{
    std::vector<int> res;
    res.reserve(serialized_frag.length());
    for (unsigned char c : serialized_frag) {
        res.push_back(c);
    }
    return res;
}
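However, I ran into a problem when sending this data with JsonCpp. The minimal reproducible example below shows that the problem does not come from the functions above; it appears only once the Json::Value is serialized and subsequently parsed. Some of the data encoded in the string is lost in the process.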
#include <cassert>
#include <string>
#include <vector>
#include <json/json.h>

int main() {
    std::vector<int> frag = { 230 };
    std::string serialized = SerializeToBytes(frag);
    // Will pass, indicating that the SerializeToBytes and ParseFromBytes functions are not the issue.
    assert(frag == ParseFromBytes(serialized));

    Json::Value val;
    val["STR"] = serialized;
    // Will pass, showing that the issue does not appear until JSON is serialized and then parsed.
    assert(frag == ParseFromBytes(val["STR"].asString()));

    Json::StreamWriterBuilder builder;
    builder["indentation"] = "";
    std::string serialized_json = Json::writeString(builder, val);
    // Will be serialized to "{\"STR\":\"\ufffd\"}".

    Json::Value reconstructed_json;
    Json::Reader reader;
    reader.parse(serialized_json, reconstructed_json);
    // Will produce { 239, 191, 189 }, rather than { 230 }, as it should.
    std::vector<int> frag_from_json = ParseFromBytes(reconstructed_json["STR"].asString());
    // Will fail, showing that the issue stems from the serialize/parsing process.
    assert(frag == frag_from_json);
    return 0;
}
What causes this problem, and how can I fix it? Thanks for any help.
JsonCpp, class Value:
This class is a discriminated union wrapper that can represents a:
- ...
- UTF-8 string
- ...
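{ 230 } is not a valid UTF-8 string, so it is not reasonable to expect a correct result from Json::writeString(builder, val). The byte 230 (0xE6) is the lead byte of a three-byte UTF-8 sequence and must be followed by two continuation bytes. On its own it cannot be decoded, and the serialized output shown in the question contains \ufffd (U+FFFD, the replacement character) in its place. The UTF-8 encoding of U+FFFD is the three bytes 239, 191, 189, which is exactly what comes back after parsing.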
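One way around this is to put only ASCII text into the JSON string, for example by hex-encoding the bytes before assigning them to the Json::Value. The following is a minimal sketch of that idea, not part of JsonCpp: SerializeToHex, ParseFromHex and HexDigitValue are hypothetical helper names introduced here for illustration, and the input values are assumed to lie in the range 0 to 255.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: writes each value (assumed 0..255) as two lowercase
// hex digits. The result contains only ASCII characters, so it is valid UTF-8.
std::string SerializeToHex(const std::vector<int> &frag)
{
    static const char kDigits[] = "0123456789abcdef";
    std::string out;
    out.reserve(frag.size() * 2);
    for (int val : frag) {
        if (val < 0 || val > 255) {
            throw std::out_of_range("value does not fit in one byte");
        }
        out.push_back(kDigits[val >> 4]);    // high nibble
        out.push_back(kDigits[val & 0x0F]);  // low nibble
    }
    return out;
}

// Hypothetical helper: converts one hex digit to its numeric value.
int HexDigitValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::invalid_argument("not a hex digit");
}

// Hypothetical helper: inverse of SerializeToHex.
std::vector<int> ParseFromHex(const std::string &hex)
{
    if (hex.size() % 2 != 0) {
        throw std::invalid_argument("hex string has odd length");
    }
    std::vector<int> res;
    res.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        res.push_back(HexDigitValue(hex[i]) * 16 + HexDigitValue(hex[i + 1]));
    }
    return res;
}
```

In the example from the question, val["STR"] = SerializeToHex(frag) would store the ASCII string "e6", and ParseFromHex(reconstructed_json["STR"].asString()) would be expected to return { 230 } again. Hex costs two characters per value; Base64 is denser (four characters per three bytes) at the price of a little more code, and a JSON array of numbers is another option that avoids strings entirely.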