Crash after serializing JObject with DataContractSerializer and XmlDictionaryWriter

I have to serialize a Newtonsoft JObject with DataContractSerializer, and it crashes with a stack overflow. How can I make it work? My code is:

var serializer = new DataContractSerializer(typeof(JObject));
MemoryStream stream1 = new MemoryStream();
var writer = XmlDictionaryWriter.CreateBinaryWriter(stream1);
var obj = new JObject();
serializer.WriteObject(writer, obj);
writer.Flush();

The following example uses the ISerializationSurrogateProvider functionality to convert the JObject to a plain type. It also crashes with a stack overflow.


using System;
using System.IO;
using Newtonsoft.Json.Linq;
using System.Runtime.Serialization;
using System.Xml;

class Program
{
    [DataContract(Name = "JTokenReference", Namespace = "urn:actors")]
    [Serializable]
    public sealed class JTokenReference
    {
        public JTokenReference()
        {
        }

        [DataMember(Name = "JType", Order = 0, IsRequired = true)]
        public JTokenType JType { get; set; }

        [DataMember(Name = "Value", Order = 1, IsRequired = true)]
        public string Value { get; set; }

        public static JTokenReference From(JToken jt)
        {
            if (jt == null)
            {
                return null;
            }
            return new JTokenReference()
            {
                Value = jt.ToString(),
                JType = jt.Type
            };
        }
        public object To()
        {
            switch (JType)
            {
                case JTokenType.Object:
                    {
                        return JObject.Parse(Value);
                    }
                case JTokenType.Array:
                    {
                        return JArray.Parse(Value);
                    }
                default:
                    {
                        return JToken.Parse(Value);
                    }
            }
        }
    }

    internal class ActorDataContractSurrogate : ISerializationSurrogateProvider
    {
        public static readonly ISerializationSurrogateProvider Instance = new ActorDataContractSurrogate();

        public Type GetSurrogateType(Type type)
        {
            if (typeof(JToken).IsAssignableFrom(type))
            {
                return typeof(JTokenReference);
            }

            return type;
        }

        public object GetObjectToSerialize(object obj, Type targetType)
        {
            if (obj == null)
            {
                return null;
            }
            else if (obj is JToken jt)
            {
                return JTokenReference.From(jt);
            }

            return obj;
        }

        public object GetDeserializedObject(object obj, Type targetType)
        {
            if (obj == null)
            {
                return null;
            }
            else if (obj is JTokenReference reference &&
                    typeof(JToken).IsAssignableFrom(targetType))
            {
                return reference.To();
            }
            return obj;
        }
    }

    [DataContract(Name = "Test", Namespace = "urn:actors")]
    [Serializable]
    public class Test
    {
        [DataMember(Name = "obj", Order = 0, IsRequired = false)]
        public JObject obj;
    }

    static void Main(string[] args)
    {
        var serializer = new DataContractSerializer(typeof(Test),
        new DataContractSerializerSettings()
        {
            MaxItemsInObjectGraph = int.MaxValue,
            KnownTypes = new Type[] { typeof(JTokenReference), typeof(JObject), typeof(JToken) },
        });

        serializer.SetSerializationSurrogateProvider(ActorDataContractSurrogate.Instance);

        MemoryStream stream1 = new MemoryStream();
        var writer = XmlDictionaryWriter.CreateBinaryWriter(stream1);
        var obj = new JObject();
        var test = new Test()
        {
            obj = obj,
        };
        serializer.WriteObject(writer, test);
        writer.Flush();
        Console.WriteLine(System.Text.Encoding.UTF8.GetString(stream1.GetBuffer(), 0, checked((int)stream1.Length)));
    }
}

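I am trying to define a new type, JTokenReference, to replace JObject/JToken during serialization, but it crashes before the replacement happens. It seems to fail while resolving the type.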

TL;DR

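Your approach is reasonable and should work, but it fails because of what seems to be a bug in the ISerializationSurrogateProvider functionality with recursive collection types. You will need to change your design to use surrogate properties whenever you need to serialize a JToken, for example as follows: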

[IgnoreDataMember]
public JObject obj { get; set; }

[DataMember(Name = "obj", Order = 0, IsRequired = false)]
string objSurrogate { get { return obj?.ToString(Newtonsoft.Json.Formatting.None); } set { obj = (value == null ? null : JObject.Parse(value)); } }

Explanation

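The crash you are experiencing is a stack overflow, and it can be reproduced more simply as follows. When the data contract serializer writes a generic type such as List<string>, it constructs a data contract name by combining the generic class and parameter names like so: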

  • List<string>: ArrayOfstring
  • List<List<string>>: ArrayOfArrayOfstring
  • List<List<List<string>>>: ArrayOfArrayOfArrayOfstring
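
The naming pattern listed above can be checked with a minimal sketch like the following. It serializes empty nested lists and prints the local name of the root XML element, which the serializer derives from the data contract name. The class name ContractNameDemo and the helper RootName are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Xml.Linq;

public static class ContractNameDemo
{
    // Hypothetical helper: serializes a value as the root object and
    // returns the local name of the resulting root XML element.
    public static string RootName<T>(T value)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using var stream = new MemoryStream();
        serializer.WriteObject(stream, value);
        stream.Position = 0;
        return XDocument.Load(stream).Root.Name.LocalName;
    }

    public static void Main()
    {
        Console.WriteLine(RootName(new List<string>()));             // ArrayOfstring
        Console.WriteLine(RootName(new List<List<string>>()));       // ArrayOfArrayOfstring
        Console.WriteLine(RootName(new List<List<List<string>>>())); // ArrayOfArrayOfArrayOfstring
    }
}
```

Each extra level of nesting adds one more "ArrayOf" prefix, which is why a type whose element type is itself never reaches a finite name.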

And so on. As the generic nesting gets deeper, the name gets longer. So what happens if we define a self-recursive collection type like the following?

public class RecursiveList<T> : List<RecursiveList<T>>
{
}

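Well, if we try to serialize one of these lists with the data contract serializer, it crashes with a stack overflow exception while trying to figure out the contract name. Demo fiddle #1 here (you will need to uncomment the line //Test(new RecursiveList<string>()); to see the crash):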

Stack overflow.
   at System.ModuleHandle.ResolveType(System.Runtime.CompilerServices.QCallModule, Int32, IntPtr*, Int32, IntPtr*, Int32, System.Runtime.CompilerServices.ObjectHandleOnStack)
   at System.ModuleHandle.ResolveTypeHandleInternal(System.Reflection.RuntimeModule, Int32, System.RuntimeTypeHandle[], System.RuntimeTypeHandle[])
   at System.Reflection.RuntimeModule.ResolveType(Int32, System.Type[], System.Type[])
   at System.Reflection.CustomAttribute.FilterCustomAttributeRecord(System.Reflection.MetadataToken, System.Reflection.MetadataImport ByRef, System.Reflection.RuntimeModule, System.Reflection.MetadataToken, System.RuntimeType, Boolean, ListBuilder`1<System.Object> ByRef, System.RuntimeType ByRef, System.IRuntimeMethodInfo ByRef, Boolean ByRef)
   at System.Reflection.CustomAttribute.IsCustomAttributeDefined(System.Reflection.RuntimeModule, Int32, System.RuntimeType, Int32, Boolean)
   at System.Reflection.CustomAttribute.IsDefined(System.RuntimeType, System.RuntimeType, Boolean)
   at System.Runtime.Serialization.CollectionDataContract.IsCollectionOrTryCreate(System.Type, Boolean, System.Runtime.Serialization.DataContract ByRef, System.Type ByRef, Boolean)
   at System.Runtime.Serialization.CollectionDataContract.IsCollectionHelper(System.Type, System.Type ByRef, Boolean)
   at System.Runtime.Serialization.DataContract.GetNonDCTypeStableName(System.Type)
   at System.Runtime.Serialization.DataContract.GetStableName(System.Type, Boolean ByRef)
   at System.Runtime.Serialization.DataContract.GetCollectionStableName(System.Type, System.Type, System.Runtime.Serialization.CollectionDataContractAttribute ByRef)
   at System.Runtime.Serialization.DataContract.GetNonDCTypeStableName(System.Type)
   at System.Runtime.Serialization.DataContract.GetStableName(System.Type, Boolean ByRef)
   at System.Runtime.Serialization.DataContract.GetCollectionStableName(System.Type, System.Type, System.Runtime.Serialization.CollectionDataContractAttribute ByRef)
   at System.Runtime.Serialization.DataContract.GetNonDCTypeStableName(System.Type)
   at System.Runtime.Serialization.DataContract.GetStableName(System.Type, Boolean ByRef)

Oops. Well, what if we create a serialization surrogate for RecursiveList<string>, such as the following dummy surrogate?

public class RecursiveListStringSurrogate
{
    // A dummy surrogate that serializes nothing, for testing purposes.
}

public class RecursiveListStringSurrogateSelector : ISerializationSurrogateProvider
{
    public object GetDeserializedObject(object obj, Type targetType)
    {
        if (obj is RecursiveListStringSurrogate)
            return new RecursiveList<string>();
        return obj;
    }

    public object GetObjectToSerialize(object obj, Type targetType)
    {
        if (obj is RecursiveList<string>)
            return new RecursiveListStringSurrogate();
        return obj;
    }

    public Type GetSurrogateType(Type type) 
    {
        if (type == typeof(RecursiveList<string>))
            return typeof(RecursiveListStringSurrogate);
        return type;
    }
}

Using that surrogate, an empty new RecursiveList<string>() can indeed be serialized successfully, as:

<RecursiveListStringSurrogate xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/" />

Demo fiddle #2 here.

OK, now let's try to use the surrogate when a RecursiveList<string> is embedded in a model, for example:

public class Model
{
    public RecursiveList<string> List { get; set; }
}

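Well, when I try to serialize an instance of this model with an empty list, the crash comes back. Demo fiddle #3 here (you will need to uncomment the line //Test(new Model { List = new RecursiveList<string>() }); to see the crash).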

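Oops again. It is not entirely clear why this fails. I can only speculate that, somewhere, Microsoft is keeping a dictionary that maps original data contract names to surrogate data contract names, so that merely generating a dictionary key triggers the stack overflow.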

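Now what does this have to do with JObject and your Test class? Well, it turns out that JObject is another example of a recursive collection type. It implements IDictionary<string, JToken?>, and JToken in turn implements IEnumerable<JToken>, thereby triggering the same stack overflow we saw with the simple model containing a RecursiveList<string>.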
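
The recursive shape described above can be confirmed with a small reflection check. This is only a sketch: the class name RecursiveShapeDemo and its two helper methods are hypothetical, and the nullable annotation on JToken is omitted because it has no effect at run time.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

public static class RecursiveShapeDemo
{
    // JObject is a dictionary whose values are JToken instances.
    public static bool JObjectIsDictionaryOfJToken() =>
        typeof(IDictionary<string, JToken>).IsAssignableFrom(typeof(JObject));

    // JToken is itself an enumerable of JToken, which closes the loop.
    public static bool JTokenIsEnumerableOfJToken() =>
        typeof(IEnumerable<JToken>).IsAssignableFrom(typeof(JToken));

    public static void Main()
    {
        Console.WriteLine(JObjectIsDictionaryOfJToken()); // True
        Console.WriteLine(JTokenIsEnumerableOfJToken());  // True
    }
}
```

Because the element type of the collection is the collection's own base type, the contract name computation has no base case, just as with RecursiveList<T>.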

You might even want to report an issue to Microsoft about this (though I don't know whether they are still fixing bugs in the data contract serializer).

Workaround

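To avoid this problem, you will need to modify your model to use surrogate properties for JToken members, as shown at the beginning of this answer: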

[DataContract(Name = "Test", Namespace = "urn:actors")]
public class Test
{
    [IgnoreDataMember]
    public JObject obj { get; set; }
    
    [DataMember(Name = "obj", Order = 0, IsRequired = false)]
    string objSurrogate { get { return obj?.ToString(Newtonsoft.Json.Formatting.None); } set { obj = (value == null ? null : JObject.Parse(value)); } }
}

It can then be serialized successfully as follows:

var obj = new JObject();
var test = new Test()
{
    obj = obj,
};

var serializer = new DataContractSerializer(test.GetType());

MemoryStream stream1 = new MemoryStream();
var writer = XmlDictionaryWriter.CreateBinaryWriter(stream1);
serializer.WriteObject(writer, test);
writer.Flush();
Console.WriteLine(System.Text.Encoding.UTF8.GetString(stream1.GetBuffer(), 0, checked((int)stream1.Length)));
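
To check that the value also survives deserialization, a round trip through the binary writer and reader might look like the sketch below. It repeats the Test class from this answer so that it is self-contained. The RoundTripDemo class, its RoundTrip helper, and the sample JSON are hypothetical and exist only for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;
using Newtonsoft.Json.Linq;

[DataContract(Name = "Test", Namespace = "urn:actors")]
public class Test
{
    [IgnoreDataMember]
    public JObject obj { get; set; }

    // Private surrogate property: the JObject travels as a compact JSON string.
    [DataMember(Name = "obj", Order = 0, IsRequired = false)]
    string objSurrogate
    {
        get { return obj?.ToString(Newtonsoft.Json.Formatting.None); }
        set { obj = (value == null ? null : JObject.Parse(value)); }
    }
}

public static class RoundTripDemo
{
    // Hypothetical helper: writes the object as binary XML, then reads it back.
    public static Test RoundTrip(Test input)
    {
        var serializer = new DataContractSerializer(typeof(Test));
        using var stream = new MemoryStream();
        var writer = XmlDictionaryWriter.CreateBinaryWriter(stream);
        serializer.WriteObject(writer, input);
        writer.Flush();

        stream.Position = 0;
        using var reader = XmlDictionaryReader.CreateBinaryReader(stream, XmlDictionaryReaderQuotas.Max);
        return (Test)serializer.ReadObject(reader);
    }

    public static void Main()
    {
        var test = new Test { obj = JObject.Parse(@"{""name"":""actor"",""count"":3}") };
        var copy = RoundTrip(test);
        // Prints whether the deserialized JObject is deeply equal to the original.
        Console.WriteLine(JToken.DeepEquals(test.obj, copy.obj));
    }
}
```

The serializer only ever sees a string member here, so it never has to compute a contract name for JObject.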

Notes:

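  • If you need to serialize a JToken as the root object, you can either wrap it in some container object or use the ActorDataContractSurrogate from your question. As we have seen, the serialization surrogate functionality does seem to work for recursive collection types when they are the root object.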

  • Since you are serializing to binary, I suggest formatting the JObject with Formatting.None for efficiency.

  • The surrogate property can be private as long as it is marked with [DataMember].

Demo fiddle #4 here.