Crash after serializing JObject with DataContractSerializer and XmlDictionaryWriter
I have to serialize a Newtonsoft JObject with DataContractSerializer, and it crashes with a stack overflow. How can I make this work?
My code is:
var serializer = new DataContractSerializer(typeof(JObject));
MemoryStream stream1 = new MemoryStream();
var writer = XmlDictionaryWriter.CreateBinaryWriter(stream1);
var obj = new JObject();
serializer.WriteObject(writer, obj);
writer.Flush();
The following example uses the ISerializationSurrogateProvider feature to convert the JObject to a plain type. It also crashes with a stack overflow.
using System;
using System.IO;
using Newtonsoft.Json.Linq;
using System.Runtime.Serialization;
using System.Xml;

class Program
{
    [DataContract(Name = "JTokenReference", Namespace = "urn:actors")]
    [Serializable]
    public sealed class JTokenReference
    {
        public JTokenReference()
        {
        }

        [DataMember(Name = "JType", Order = 0, IsRequired = true)]
        public JTokenType JType { get; set; }

        [DataMember(Name = "Value", Order = 1, IsRequired = true)]
        public string Value { get; set; }

        public static JTokenReference From(JToken jt)
        {
            if (jt == null)
            {
                return null;
            }
            return new JTokenReference()
            {
                Value = jt.ToString(),
                JType = jt.Type
            };
        }

        public object To()
        {
            switch (JType)
            {
                case JTokenType.Object:
                    {
                        return JObject.Parse(Value);
                    }
                case JTokenType.Array:
                    {
                        return JArray.Parse(Value);
                    }
                default:
                    {
                        return JToken.Parse(Value);
                    }
            }
        }
    }

    internal class ActorDataContractSurrogate : ISerializationSurrogateProvider
    {
        public static readonly ISerializationSurrogateProvider Instance = new ActorDataContractSurrogate();

        public Type GetSurrogateType(Type type)
        {
            if (typeof(JToken).IsAssignableFrom(type))
            {
                return typeof(JTokenReference);
            }
            return type;
        }

        public object GetObjectToSerialize(object obj, Type targetType)
        {
            if (obj == null)
            {
                return null;
            }
            else if (obj is JToken jt)
            {
                return JTokenReference.From(jt);
            }
            return obj;
        }

        public object GetDeserializedObject(object obj, Type targetType)
        {
            if (obj == null)
            {
                return null;
            }
            else if (obj is JTokenReference reference &&
                     typeof(JToken).IsAssignableFrom(targetType))
            {
                return reference.To();
            }
            return obj;
        }
    }

    [DataContract(Name = "Test", Namespace = "urn:actors")]
    [Serializable]
    public class Test
    {
        [DataMember(Name = "obj", Order = 0, IsRequired = false)]
        public JObject obj;
    }

    static void Main(string[] args)
    {
        var serializer = new DataContractSerializer(typeof(Test),
            new DataContractSerializerSettings()
            {
                MaxItemsInObjectGraph = int.MaxValue,
                KnownTypes = new Type[] { typeof(JTokenReference), typeof(JObject), typeof(JToken) },
            });
        serializer.SetSerializationSurrogateProvider(ActorDataContractSurrogate.Instance);
        MemoryStream stream1 = new MemoryStream();
        var writer = XmlDictionaryWriter.CreateBinaryWriter(stream1);
        var obj = new JObject();
        var test = new Test()
        {
            obj = obj,
        };
        serializer.WriteObject(writer, test);
        writer.Flush();
        Console.WriteLine(System.Text.Encoding.UTF8.GetString(stream1.GetBuffer(), 0, checked((int)stream1.Length)));
    }
}
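I am trying to define a new type, JTokenReference, to replace JObject/JToken during serialization, but it crashes before the replacement happens. It looks like resolving the type fails.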
TL;DR
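Your approach is reasonable and should work, but it fails because of what appears to be a bug in the ISerializationSurrogateProvider feature with recursive collection types. You will need to change your design to use surrogate properties whenever you need to serialize a JToken, for example: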
[IgnoreDataMember]
public JObject obj { get; set; }
[DataMember(Name = "obj", Order = 0, IsRequired = false)]
string objSurrogate { get { return obj?.ToString(Newtonsoft.Json.Formatting.None); } set { obj = (value == null ? null : JObject.Parse(value)); } }
Explanation
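The crash you are seeing is a stack overflow, and it can be reproduced more simply as follows. When the data contract serializer writes a generic type such as List<string>, it constructs a data contract name by combining the generic class and parameter names, like so (and so on for deeper nesting):
List<string>: ArrayOfstring
List<List<string>>: ArrayOfArrayOfstring
List<List<List<string>>>: ArrayOfArrayOfArrayOfstring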
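The naming scheme listed above can be observed directly by serializing empty lists and reading back the root element name. The following is only a minimal sketch: the class ContractNameDemo and the helper RootElementName are hypothetical names introduced for illustration, not part of the question or of any library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

public static class ContractNameDemo
{
    // Hypothetical helper: serializes the value as the root object and returns
    // the local name of the root XML element, which is the generated contract name.
    public static string RootElementName(object value)
    {
        var serializer = new DataContractSerializer(value.GetType());
        using var stream = new MemoryStream();
        serializer.WriteObject(stream, value);
        stream.Position = 0;
        using var reader = XmlReader.Create(stream);
        reader.MoveToContent();
        return reader.LocalName;
    }

    public static void Main()
    {
        // Each extra level of List<> nesting adds another "ArrayOf" prefix.
        Console.WriteLine(RootElementName(new List<string>()));
        Console.WriteLine(RootElementName(new List<List<string>>()));
        Console.WriteLine(RootElementName(new List<List<List<string>>>()));
    }
}
```

Run as a console program, this should print the three names from the list above, one per line.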
The deeper the generic nesting, the longer the name becomes. So what happens if we define a self-recursive collection type like the following?
public class RecursiveList<T> : List<RecursiveList<T>>
{
}
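Well, if we try to serialize one of these lists with the data contract serializer, it crashes with a stack overflow exception while trying to work out the contract name. Demo fiddle #1 here; you need to uncomment the line //Test(new RecursiveList<string>()); to see the crash: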
Stack overflow.
at System.ModuleHandle.ResolveType(System.Runtime.CompilerServices.QCallModule, Int32, IntPtr*, Int32, IntPtr*, Int32, System.Runtime.CompilerServices.ObjectHandleOnStack)
at System.ModuleHandle.ResolveTypeHandleInternal(System.Reflection.RuntimeModule, Int32, System.RuntimeTypeHandle[], System.RuntimeTypeHandle[])
at System.Reflection.RuntimeModule.ResolveType(Int32, System.Type[], System.Type[])
at System.Reflection.CustomAttribute.FilterCustomAttributeRecord(System.Reflection.MetadataToken, System.Reflection.MetadataImport ByRef, System.Reflection.RuntimeModule, System.Reflection.MetadataToken, System.RuntimeType, Boolean, ListBuilder`1<System.Object> ByRef, System.RuntimeType ByRef, System.IRuntimeMethodInfo ByRef, Boolean ByRef)
at System.Reflection.CustomAttribute.IsCustomAttributeDefined(System.Reflection.RuntimeModule, Int32, System.RuntimeType, Int32, Boolean)
at System.Reflection.CustomAttribute.IsDefined(System.RuntimeType, System.RuntimeType, Boolean)
at System.Runtime.Serialization.CollectionDataContract.IsCollectionOrTryCreate(System.Type, Boolean, System.Runtime.Serialization.DataContract ByRef, System.Type ByRef, Boolean)
at System.Runtime.Serialization.CollectionDataContract.IsCollectionHelper(System.Type, System.Type ByRef, Boolean)
at System.Runtime.Serialization.DataContract.GetNonDCTypeStableName(System.Type)
at System.Runtime.Serialization.DataContract.GetStableName(System.Type, Boolean ByRef)
at System.Runtime.Serialization.DataContract.GetCollectionStableName(System.Type, System.Type, System.Runtime.Serialization.CollectionDataContractAttribute ByRef)
at System.Runtime.Serialization.DataContract.GetNonDCTypeStableName(System.Type)
at System.Runtime.Serialization.DataContract.GetStableName(System.Type, Boolean ByRef)
at System.Runtime.Serialization.DataContract.GetCollectionStableName(System.Type, System.Type, System.Runtime.Serialization.CollectionDataContractAttribute ByRef)
at System.Runtime.Serialization.DataContract.GetNonDCTypeStableName(System.Type)
at System.Runtime.Serialization.DataContract.GetStableName(System.Type, Boolean ByRef)
Oops. Well, what if we create a serialization surrogate for RecursiveList<string>, such as the following dummy surrogate?
public class RecursiveListStringSurrogate
{
    // A dummy surrogate that serializes nothing, for testing purposes.
}

public class RecursiveListStringSurrogateSelector : ISerializationSurrogateProvider
{
    public object GetDeserializedObject(object obj, Type targetType)
    {
        if (obj is RecursiveListStringSurrogate)
            return new RecursiveList<string>();
        return obj;
    }

    public object GetObjectToSerialize(object obj, Type targetType)
    {
        if (obj is RecursiveList<string>)
            return new RecursiveListStringSurrogate();
        return obj;
    }

    public Type GetSurrogateType(Type type)
    {
        if (type == typeof(RecursiveList<string>))
            return typeof(RecursiveListStringSurrogate);
        return type;
    }
}
With that surrogate, an empty new RecursiveList<string>() can indeed be serialized successfully, as
<RecursiveListStringSurrogate xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/" />
Demo fiddle #2 here.
OK, now let's try using the surrogate when the RecursiveList<string> is embedded in a model, for example:
public class Model
{
    public RecursiveList<string> List { get; set; }
}
Well, when I try to serialize an instance of this model with an empty list, the crash comes back. Demo fiddle #3 here; you need to uncomment the line //Test(new Model { List = new RecursiveList<string>() }); to see the crash.
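Oops again. It is not entirely clear why this fails. I can only speculate that somewhere Microsoft keeps a dictionary mapping original data contract names to surrogate data contract names, so that merely generating a dictionary key is enough to cause the stack overflow.
Now what does this have to do with JObject and your Test class? Well, it turns out that JObject is another example of a recursive collection type. It implements IDictionary<string, JToken?>, and JToken in turn implements IEnumerable<JToken>, thereby triggering the same stack overflow we saw with the simple model containing a RecursiveList<string>.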
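The structural parallel between JToken and RecursiveList<string> can be checked with a little reflection. This is only an illustrative sketch: JTokenShapeDemo and IsEnumerableOfItself are hypothetical names, and the check merely shows that each type is a collection of itself. It does not reproduce the serializer's internal logic.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

// Same self-recursive list as defined earlier in this answer.
public class RecursiveList<T> : List<RecursiveList<T>>
{
}

public static class JTokenShapeDemo
{
    // Hypothetical helper: true when the type implements IEnumerable<itself>,
    // i.e. when it is a collection whose items have the collection's own type.
    public static bool IsEnumerableOfItself(Type type) =>
        typeof(IEnumerable<>).MakeGenericType(type).IsAssignableFrom(type);

    public static void Main()
    {
        // An ordinary list is not a collection of itself.
        Console.WriteLine(IsEnumerableOfItself(typeof(List<string>)));
        // Both of these are, so naming their item type leads straight back to them.
        Console.WriteLine(IsEnumerableOfItself(typeof(RecursiveList<string>)));
        Console.WriteLine(IsEnumerableOfItself(typeof(JToken)));
        // JObject is a dictionary whose values are JTokens.
        Console.WriteLine(typeof(IDictionary<string, JToken>).IsAssignableFrom(typeof(JObject)));
    }
}
```

Only the first check should report False; the other three should report True.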
You might even want to report an issue to Microsoft about this (though I don't know whether they are still fixing bugs in the data contract serializer).
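Workaround
To avoid this problem, you will need to modify your model to use surrogate properties for JToken members, as shown at the beginning of this answer: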
[DataContract(Name = "Test", Namespace = "urn:actors")]
public class Test
{
    [IgnoreDataMember]
    public JObject obj { get; set; }

    [DataMember(Name = "obj", Order = 0, IsRequired = false)]
    string objSurrogate { get { return obj?.ToString(Newtonsoft.Json.Formatting.None); } set { obj = (value == null ? null : JObject.Parse(value)); } }
}
This can be serialized successfully as follows:
var obj = new JObject();
var test = new Test()
{
obj = obj,
};
var serializer = new DataContractSerializer(test.GetType());
MemoryStream stream1 = new MemoryStream();
var writer = XmlDictionaryWriter.CreateBinaryWriter(stream1);
serializer.WriteObject(writer, test);
writer.Flush();
Console.WriteLine(System.Text.Encoding.UTF8.GetString(stream1.GetBuffer(), 0, checked((int)stream1.Length)));
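The snippet above only covers writing. To check that the surrogate property also restores the JObject on the way back in, a round trip through the binary reader could look roughly like this. It is a sketch under assumptions: RoundTripDemo and its RoundTrip helper are hypothetical names, and the sample JSON is made up for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;
using Newtonsoft.Json.Linq;

[DataContract(Name = "Test", Namespace = "urn:actors")]
public class Test
{
    [IgnoreDataMember]
    public JObject obj { get; set; }

    // Private surrogate property: the JObject travels as a compact JSON string.
    [DataMember(Name = "obj", Order = 0, IsRequired = false)]
    string objSurrogate
    {
        get { return obj?.ToString(Newtonsoft.Json.Formatting.None); }
        set { obj = (value == null ? null : JObject.Parse(value)); }
    }
}

public static class RoundTripDemo
{
    // Hypothetical helper: writes the model with the binary XML writer,
    // then reads it back with the matching binary XML reader.
    public static Test RoundTrip(Test input)
    {
        var serializer = new DataContractSerializer(typeof(Test));
        using var stream = new MemoryStream();
        var writer = XmlDictionaryWriter.CreateBinaryWriter(stream);
        serializer.WriteObject(writer, input);
        writer.Flush();

        stream.Position = 0;
        using var reader = XmlDictionaryReader.CreateBinaryReader(stream, XmlDictionaryReaderQuotas.Max);
        return (Test)serializer.ReadObject(reader);
    }

    public static void Main()
    {
        var input = new Test { obj = JObject.Parse("{\"name\":\"value\",\"items\":[1,2,3]}") };
        var output = RoundTrip(input);
        // Compare the JSON trees structurally rather than by reference.
        Console.WriteLine(JToken.DeepEquals(input.obj, output.obj));
    }
}
```

If the round trip preserves the JSON, the program prints True. A null obj should also survive, since the surrogate getter and setter both pass null through.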
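Notes:
If you need to serialize a JToken as the root object, you can either wrap it in some container object or use the ActorDataContractSurrogate from your question. As we have seen, the surrogate functionality does seem to work for recursive collection types when they are the root object.
Since you are serializing to binary, I suggest formatting the JObject with Formatting.None for efficiency.
The surrogate property can be private as long as it is marked with [DataMember].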
演示 fiddle #4 here.
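For the first note, wrapping a root-level JToken in a container might look something like the sketch below. JTokenContainer and RootTokenDemo are hypothetical names chosen for illustration and do not appear in the question; the container simply reuses the surrogate-property technique from the workaround.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;
using Newtonsoft.Json.Linq;

// Hypothetical container: the only data member is a JSON string, so the
// serializer never has to build a contract for JToken itself.
[DataContract(Name = "JTokenContainer", Namespace = "urn:actors")]
public class JTokenContainer
{
    [IgnoreDataMember]
    public JToken Token { get; set; }

    [DataMember(Name = "Json", Order = 0, IsRequired = false)]
    string Json
    {
        get { return Token?.ToString(Newtonsoft.Json.Formatting.None); }
        set { Token = (value == null ? null : JToken.Parse(value)); }
    }
}

public static class RootTokenDemo
{
    // Wraps the token in the container and writes it as binary XML.
    public static byte[] Serialize(JToken token)
    {
        var serializer = new DataContractSerializer(typeof(JTokenContainer));
        var stream = new MemoryStream();
        var writer = XmlDictionaryWriter.CreateBinaryWriter(stream);
        serializer.WriteObject(writer, new JTokenContainer { Token = token });
        writer.Flush();
        return stream.ToArray();
    }

    // Reads the container back and unwraps the token.
    public static JToken Deserialize(byte[] data)
    {
        var serializer = new DataContractSerializer(typeof(JTokenContainer));
        using var reader = XmlDictionaryReader.CreateBinaryReader(data, XmlDictionaryReaderQuotas.Max);
        return ((JTokenContainer)serializer.ReadObject(reader)).Token;
    }

    public static void Main()
    {
        // The root token here is an array, not an object.
        var original = JToken.Parse("[1, \"two\", {\"three\": 3}]");
        var copy = Deserialize(Serialize(original));
        Console.WriteLine(JToken.DeepEquals(original, copy));
    }
}
```

Because the token is stored via ToString(Formatting.None) rather than the parameterless ToString(), string values are written as quoted JSON and can be parsed back by JToken.Parse.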