XmlSerializer warns about unknown nodes/attributes when deserializing derived types

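I recently registered event handlers for unknown nodes, elements and attributes with the XmlSerializer that I use to deserialize complex types from a type hierarchy. I did this because some of the XML I receive comes from third parties, and I want to notice data format changes that could cause trouble on my end.

In the XML that the XmlSerializer produces, it uses the standard XML attribute xsi:type="somederivedtypename" to identify the actual derived type represented by an XML element.

I was surprised to see that the same serializer treats the very attribute it just produced as unknown when deserializing. Interestingly, the deserialization is correct and complete (also with more complicated types and data in my real-world program). That means that the serializer evaluates the type information properly during an early stage in the deserialization. But during a later data-extraction stage the attribute is apparently mistaken for a true data part of the object, which is of course unknown.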

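In my application the gratuitous warnings end up cluttering a general-purpose log file, which is undesirable. In my opinion the serializer should read back the XML it produced without hiccups. My questions: Am I doing something wrong? Is there a workaround?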

Here is a minimal example:

using System;
using System.IO;
using System.Xml.Serialization;

namespace XsiTypeAnomaly
{
    /// <summary>
    /// A trivial base type.
    /// </summary>
    [XmlInclude(typeof(DerivedT))]
    public class BaseT{}

    /// <summary>
    /// A trivial derived type to demonstrate a serialization issue.
    /// </summary>
    public class DerivedT : BaseT
    {
        public int anInt { get; set; }
    }

    class Program
    {
        private static void serializer_UnknownAttribute(object sender, XmlAttributeEventArgs e)
        {
            Console.Error.WriteLine("Warning: Deserializing "
                    + e.ObjectBeingDeserialized
                    + ": Unknown attribute "
                    + e.Attr.Name);
        }

        private static void serializer_UnknownNode(object sender, XmlNodeEventArgs e)
        {
            Console.Error.WriteLine("Warning: Deserializing "
                    + e.ObjectBeingDeserialized
                    + ": Unknown node "
                    + e.Name);
        }

        private static void serializer_UnknownElement(object sender, XmlElementEventArgs e)
        {
            Console.Error.WriteLine("Warning: Deserializing "
                    + e.ObjectBeingDeserialized
                    + ": Unknown element "
                    + e.Element.Name);
        }

        /// <summary>
        /// Serialize, display the xml, and deserialize a trivial object.
        /// </summary>
        /// <param name="args"></param>
        static void Main(string[] args)
        {
            BaseT aTypeObj = new DerivedT() { anInt = 1 };
            using (MemoryStream stream = new MemoryStream())
            {
                var serializer = new XmlSerializer(typeof(BaseT));

                // register event handlers for unknown XML bits
                serializer.UnknownAttribute += serializer_UnknownAttribute;
                serializer.UnknownElement += serializer_UnknownElement;
                serializer.UnknownNode += serializer_UnknownNode;

                serializer.Serialize(stream, aTypeObj);
                stream.Flush();

                // output the xml
                stream.Position = 0;
                Console.Write((new StreamReader(stream)).ReadToEnd() + Environment.NewLine);
                stream.Position = 0;
                var serResult = serializer.Deserialize(stream) as DerivedT;

                Console.WriteLine(
                        (serResult.anInt == 1 ? "Successfully " : "Unsuccessfully ")
                    + "read back object");
            }
        }
    }
}

Output:

<?xml version="1.0"?>
<BaseT xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:type="DerivedT">
  <anInt>1</anInt>
</BaseT>
Warning: Deserializing XsiTypeAnomaly.DerivedT: Unknown node xsi:type
Warning: Deserializing XsiTypeAnomaly.DerivedT: Unknown attribute xsi:type
Successfully read back object

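I don't think this is the right way to use XmlSerializer. Even though you get the correct output along with the warnings, there is no telling what could go wrong in more advanced scenarios.

You should construct the serializer with the actual derived type (aTypeObj.GetType()), or even use generics, to sort this out.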
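A minimal sketch of that suggestion, assuming the concrete type is known on both the writing and the reading side (which may not hold for third-party XML). The class name RuntimeTypeDemo and the lambda handler are illustrative and not part of the original post. With a serializer built for DerivedT itself, the root element is named after the derived type, so no xsi:type attribute should be needed on it.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class BaseT { }

public class DerivedT : BaseT
{
    public int anInt { get; set; }
}

// Hypothetical demo class, not from the original question.
public static class RuntimeTypeDemo
{
    public static void Main()
    {
        BaseT aTypeObj = new DerivedT { anInt = 1 };

        // Build the serializer from the runtime type instead of typeof(BaseT).
        var serializer = new XmlSerializer(aTypeObj.GetType());
        serializer.UnknownAttribute += (sender, e) =>
            Console.Error.WriteLine("Warning: Unknown attribute " + e.Attr.Name);

        string xml;
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, aTypeObj);
            xml = writer.ToString();
        }
        Console.WriteLine(xml);

        // The reader must use a serializer for the same concrete type.
        using (var reader = new StringReader(xml))
        {
            var result = (DerivedT)serializer.Deserialize(reader);
            Console.WriteLine(result.anInt == 1 ? "Successfully read back object"
                                                : "Unsuccessfully read back object");
        }
    }
}
```

The trade-off is that the reader can no longer accept any member of the hierarchy through a single BaseT serializer, so this only fits cases where the concrete type is known before deserializing.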

Am I doing something wrong?

I don't think so. I agree with you that XmlSerializer ought to deserialize its own output without any warnings. Also, xsi:type is a standard attribute defined in the XML Schema specification, and obviously it is supported by XmlSerializer, as demonstrated by your example and documented in MSDN Library.

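Therefore, this behavior simply looks like an oversight. I can imagine a group of Microsoft developers working on different aspects of XmlSerializer during the development of the .NET Framework, and never testing xsi:type and the events together.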

That means that the serializer evaluates the type information properly during an early stage in the deserialization. But during a later data-extraction stage the attribute is apparently mistaken for a true data part of the object, which is of course unknown.

Your observations are correct.

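The XmlSerializer class generates a dynamic assembly to serialize and deserialize objects. In your example, the generated method that deserializes DerivedT instances looks something like this: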

private DerivedT Read2_DerivedT(bool isNullable, bool checkType)
{
    // [Code that uses isNullable and checkType omitted...]

    DerivedT derivedT = new DerivedT();
    while (this.Reader.MoveToNextAttribute())
    {
        if (!this.IsXmlnsAttribute(this.Reader.Name))
            this.UnknownNode(derivedT);
    }

    this.Reader.MoveToElement();
    // [Code that reads child elements and populates derivedT.anInt omitted...]
    return derivedT;
}

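The deserializer calls this method after it reads the xsi:type attribute and decides to create a DerivedT instance. As you can see, the while loop raises the UnknownNode event for all attributes except xmlns attributes. That is why you get the UnknownNode (and UnknownAttribute) event for xsi:type.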

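The while loop is generated by the internal XmlSerializationReaderILGen.WriteAttributes method. The code is rather complicated, but I see no code path that would cause xsi:type attributes to be skipped (except for the second workaround I describe below).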

Is there a workaround?

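I would just ignore the UnknownNode and UnknownAttribute events for xsi:type:

// Requires: using System.Xml; using System.Xml.Schema; using System.Xml.Serialization;
private static void serializer_UnknownNode(object sender, XmlNodeEventArgs e)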
{
    if (e.NodeType == XmlNodeType.Attribute &&
        e.NamespaceURI == XmlSchema.InstanceNamespace && e.LocalName == "type")
    {
        // Ignore xsi:type attributes.
    }
    else
    {
        // [Log it...]
    }
}

// [And similarly for UnknownAttribute using e.Attr instead of e...]
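For completeness, here is one way both handlers might look side by side. This is a sketch rather than the answer's exact code: the IsXsiType helper, the handler names and the XsiTypeFilterDemo class are made up for illustration, while XmlSchema.InstanceNamespace, e.Attr.NamespaceURI and e.Attr.LocalName are the standard framework members.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

[XmlInclude(typeof(DerivedT))]
public class BaseT { }

public class DerivedT : BaseT
{
    public int anInt { get; set; }
}

// Hypothetical demo class, not from the original answer.
public static class XsiTypeFilterDemo
{
    // Hypothetical helper: true when the qualified name is xsi:type.
    private static bool IsXsiType(string namespaceUri, string localName)
    {
        return namespaceUri == XmlSchema.InstanceNamespace && localName == "type";
    }

    private static void OnUnknownNode(object sender, XmlNodeEventArgs e)
    {
        if (e.NodeType == XmlNodeType.Attribute && IsXsiType(e.NamespaceURI, e.LocalName))
            return; // Ignore xsi:type attributes.

        Console.Error.WriteLine("Warning: Unknown node " + e.Name);
    }

    private static void OnUnknownAttribute(object sender, XmlAttributeEventArgs e)
    {
        if (IsXsiType(e.Attr.NamespaceURI, e.Attr.LocalName))
            return; // Ignore xsi:type attributes.

        Console.Error.WriteLine("Warning: Unknown attribute " + e.Attr.Name);
    }

    public static void Main()
    {
        var serializer = new XmlSerializer(typeof(BaseT));
        serializer.UnknownNode += OnUnknownNode;
        serializer.UnknownAttribute += OnUnknownAttribute;

        using (var stream = new MemoryStream())
        {
            serializer.Serialize(stream, new DerivedT { anInt = 1 });
            stream.Position = 0;
            var result = (DerivedT)serializer.Deserialize(stream);
            Console.WriteLine(result.anInt);
        }
    }
}
```

Comparing the namespace URI and local name, rather than the literal string "xsi:type", keeps the filter working if a third party binds the schema-instance namespace to a different prefix.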

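Another (hackier, IMO) workaround is to map xsi:type to a dummy property in the BaseT class:

// Requires: using System.Diagnostics; using System.Xml.Schema; using System.Xml.Serialization;
[XmlInclude(typeof(DerivedT))]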
public class BaseT
{
    [XmlAttribute("type", Namespace = XmlSchema.InstanceNamespace)]
    [DebuggerBrowsable(DebuggerBrowsableState.Never)] // Hide this useless property
    public string XmlSchemaType
    {
        get { return null; } // Must return null for XmlSerializer.Serialize to work
        set { }
    }
}

Have you tried the XmlSerializer constructor that lets you pass the derived types as extraTypes?

See here: https://msdn.microsoft.com/en-us/library/e5aakyae%28v=vs.110%29.aspx

You can use it like this:

var serializer = new XmlSerializer(typeof(BaseT), new Type[] { typeof(DerivedT) });

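Of course, in general you will probably want to retrieve the list of derived types from somewhere else, for example from another assembly.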
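As a rough illustration of retrieving that list, the sketch below uses reflection to collect the public, concrete subclasses of BaseT from the assembly that defines it. The ExtraTypesDemo class and the choice of assembly are assumptions made for the example; a real application might scan a plug-in assembly instead. Nothing in this thread confirms whether passing extraTypes changes the xsi:type warnings, so treat it only as a way to build the type array.

```csharp
using System;
using System.Linq;
using System.Xml.Serialization;

public class BaseT { }

public class DerivedT : BaseT
{
    public int anInt { get; set; }
}

// Hypothetical demo class, not from the original answer.
public static class ExtraTypesDemo
{
    public static void Main()
    {
        // Collect every public, concrete class deriving from BaseT.
        // Assumption: the derived types live in the same assembly as BaseT.
        Type[] derivedTypes = typeof(BaseT).Assembly
            .GetTypes()
            .Where(t => t.IsClass && !t.IsAbstract && t.IsPublic
                        && t.IsSubclassOf(typeof(BaseT)))
            .ToArray();

        var serializer = new XmlSerializer(typeof(BaseT), derivedTypes);

        // Write the XML to the console to inspect how the derived type is marked.
        serializer.Serialize(Console.Out, new DerivedT { anInt = 1 });
    }
}
```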