反序列化实现 IXmlSerializable 的类型集合永远运行

Deserializing collection of types implementing IXmlSerializable runs forever

我有一个 class 实现 IXmlSerializable。 class 包含一些属性。序列化和反序列化 class 的单个实例工作正常。但是在收集 class 的情况下,序列化工作正常但反序列化永远运行。这是一个代码片段。我正在使用 .Net 4.6.2。

public class MyClass : IXmlSerializable
{
    public int A { get; set; }
    public int B { get; set; }

    public XmlSchema GetSchema()
    {
        return null;
    }

    public void ReadXml(XmlReader reader)
    {
        this.A = Convert.ToInt32(reader.GetAttribute("A"));
        this.B = Convert.ToInt32(reader.GetAttribute("B"));
    }

    public void WriteXml(XmlWriter writer)
    {
        writer.WriteAttributeString("A", this.A.ToString());
        writer.WriteAttributeString("B", this.B.ToString());
    }
}
class Program
{
    static void Main(string[] args)
    {
        var instance = new MyClass { A = 1, B = 2 };
        Serialize(instance);
        instance = Deserialize<MyClass>();//works fine

        var list = new List<MyClass> { new MyClass { A = 10, B = 20 } };
        Serialize(list);
        list = Deserialize<List<MyClass>>();//runs forever
    }

    private static void Serialize(object o)
    {
        XmlSerializer ser = new XmlSerializer(o.GetType());
        using (TextWriter writer = new StreamWriter("xml.xml", false, Encoding.UTF8))
        {
            ser.Serialize(writer, o);
        }
    }

    private static T Deserialize<T>()
    {
        XmlSerializer ser = new XmlSerializer(typeof(T));
        using (TextReader reader = new StreamReader("xml.xml"))
        {
            return (T)ser.Deserialize(reader);
        }
    }
}

您的问题是,如 documentation 中所述,ReadXml() 必须消耗其包装元素及其内容:

The ReadXml method must reconstitute your object using the information that was written by the WriteXml method.

When this method is called, the reader is positioned on the start tag that wraps the information for your type. That is, directly on the start tag that indicates the beginning of a serialized object. When this method returns, it must have read the entire element from beginning to end, including all of its contents. Unlike the WriteXml method, the framework does not handle the wrapper element automatically. Your implementation must do so. Failing to observe these positioning rules may cause code to generate unexpected runtime exceptions or corrupt data.

MyClass.ReadXml() 没有这样做,当 MyClass 对象没有序列化为根元素时会导致无限循环。相反,您的 MyClass 必须看起来像这样:

public class MyClass : IXmlSerializable
{
    public int A { get; set; }
    public int B { get; set; }

    public XmlSchema GetSchema()
    {
        return null;
    }

    public void ReadXml(XmlReader reader)
    {
        /*
         * https://msdn.microsoft.com/en-us/library/system.xml.serialization.ixmlserializable.readxml.aspx
         * 
         * When this method is called, the reader is positioned at the start of the element that wraps the information for your type. 
         * That is, just before the start tag that indicates the beginning of a serialized object. When this method returns, 
         * it must have read the entire element from beginning to end, including all of its contents. Unlike the WriteXml method, 
         * the framework does not handle the wrapper element automatically. Your implementation must do so. Failing to observe these 
         * positioning rules may cause code to generate unexpected runtime exceptions or corrupt data.
         */
        var isEmptyElement = reader.IsEmptyElement;
        this.A = XmlConvert.ToInt32(reader.GetAttribute("A"));
        this.B = XmlConvert.ToInt32(reader.GetAttribute("B"));
        reader.ReadStartElement();
        if (!isEmptyElement)
        {
            reader.ReadEndElement();
        }
    }

    public void WriteXml(XmlWriter writer)
    {
        writer.WriteAttributeString("A", XmlConvert.ToString(this.A));
        writer.WriteAttributeString("B", XmlConvert.ToString(this.B));
    }
}

现在您的 <MyClass> 元素非常简单,没有嵌套或可选元素。对于更复杂的自定义序列化,您可以采用几种策略来保证您的 ReadXml() 方法读取的内容与它应该的一样多,不多也不少。

首先,您可以调用 XNode.ReadFrom() 将当前元素加载到 XElement 中。这比直接从 XmlReader 解析需要更多的内存,但是 更容易使用:

public class MyClass : IXmlSerializable
{
    public int A { get; set; }
    public int B { get; set; }

    public XmlSchema GetSchema()
    {
        return null;
    }

    public void ReadXml(XmlReader reader)
    {
        var element = (XElement)XNode.ReadFrom(reader);
        this.A = (int)element.Attribute("A");
        this.B = (int)element.Attribute("B");
    }

    public void WriteXml(XmlWriter writer)
    {
        writer.WriteAttributeString("A", XmlConvert.ToString(this.A));
        writer.WriteAttributeString("B", XmlConvert.ToString(this.B));
    }
}

其次,您可以使用 XmlReader.ReadSubtree() 来确保消耗所需的 XML 内容:

public class MyClass : IXmlSerializable
{
    public int A { get; set; }
    public int B { get; set; }

    public XmlSchema GetSchema()
    {
        return null;
    }

    protected virtual void ReadXmlSubtree(XmlReader reader)
    {
        this.A = XmlConvert.ToInt32(reader.GetAttribute("A"));
        this.B = XmlConvert.ToInt32(reader.GetAttribute("B"));
    }

    public void ReadXml(XmlReader reader)
    {
        // Consume all child nodes of the current element using ReadSubtree()
        using (var subReader = reader.ReadSubtree())
        {
            subReader.MoveToContent();
            ReadXmlSubtree(subReader);
        }
        reader.Read(); // Consume the end element itself.
    }

    public void WriteXml(XmlWriter writer)
    {
        writer.WriteAttributeString("A", XmlConvert.ToString(this.A));
        writer.WriteAttributeString("B", XmlConvert.ToString(this.B));
    }
}

一些最后的说明:

  • 一定要同时处理<MyClass /><MyClass></MyClass>。这两种形式在语义上是相同的,发送系统可以选择其中一种。

  • 优先使用 XmlConvert class 中的方法将基元与 XML 相互转换。这样做可以正确处理国际化。

  • 一定要测试是否有缩进。有时 ReadXml() 方法会消耗一个额外的 XML 节点,但是当启用缩进时错误将被隐藏——因为它是被吃掉的空白节点。

  • 如需进一步阅读,请参阅 How to Implement IXmlSerializable Correctly