Deserializing a single element in a large XML document: xmlSerializer.Deserialize(xmlReader.ReadSubtree()) fails due to namespace issues
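I am trying to process large XML documents in a single pass (using XmlReader) and to deserialize only certain elements in them using XmlSerializer.
Below is some code and a small mock XML document showing how I am attempting to do this.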
Rationale for using XmlReader:
1. I am dealing with very large XML documents (10–250 MB), which I do not want to load into memory for that reason. So XmlDocument is out of the question.
2. I want to extract only certain elements. Typically I will be able to ignore most other content. XmlReader appears to give me an efficient means of skipping irrelevant content.
3. I do not know in advance whether any or all of the elements that I can deal with will be present; therefore I am not using a bunch of XPath/XQuery or LINQ to XML-based queries, because I want to make only a single pass over the XML files (due to their size).
public class ElementOfInterest { }
…
var xml = @"<?xml version='1.0' encoding='utf-8' ?>
<Root xmlns:ex='urn:stakx:example'
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
<ElementOfInterest xsi:type='ex:ElementOfInterest' />
</Root>";
var reader = System.Xml.XmlReader.Create(new System.IO.StringReader(xml));
reader.ReadToFollowing("ElementOfInterest");
var serializer = new System.Xml.Serialization.XmlSerializer(typeof(ElementOfInterest));
serializer.Deserialize(reader.ReadSubtree());
The last line of code throws the following inner exception:
InvalidOperationException: "Namespace prefix ex is not defined."
Apparently, XmlSerializer does not recognise the ex namespace prefix in the value of the xsi:type attribute.
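This is just the one error I ran into, but frankly the bigger problem is that I do not know how to handle the whole namespace issue. I am only looking for a convenient way to deserialize a single node in an XML document, but this seems to require registering and managing namespaces by hand and somehow passing them from the XmlReader to the XmlSerializer.
Can someone demonstrate how to deserialize a single node from an XML document that is being read with XmlReader, either by pointing out the mistake in my code or by showing an alternative approach?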
The following works:
using System.IO;
using System.Xml;
using System.Xml.Serialization;
static void Main()
{
var xml = @"<?xml version='1.0' encoding='utf-8' ?>
<Root
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
xmlns:ex='urn:stakx:example'
>
<ex:ElementOfInterest xsi:type='ex:ElementOfInterest' />
</Root>";
var nt = new NameTable();
var mgr = new XmlNamespaceManager(nt);
mgr.AddNamespace("ex", "urn:stakx:example");
var ctxt = new XmlParserContext(nt, mgr, "", XmlSpace.Default);
var reader = XmlReader.Create(new StringReader(xml), null, ctxt);
var serializer = new XmlSerializer(typeof(ElementOfInterest));
reader.ReadToFollowing("ElementOfInterest", "urn:stakx:example");
var eoi = (ElementOfInterest)serializer.Deserialize(reader.ReadSubtree());
}
[XmlRoot(Namespace = "urn:stakx:example")]
public class ElementOfInterest { }
Note the namespace on the element in the input: <ex:ElementOfInterest>.
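Since the question is about making a single pass over a large file, the same approach could be extended to a loop that deserializes every matching element it meets. The following is only a sketch under assumptions: the names SinglePassExample and ReadElementsOfInterest are hypothetical, and the XmlParserContext setup is copied from the code above rather than shown to be required.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

[XmlRoot(Namespace = "urn:stakx:example")]
public class ElementOfInterest { }

public static class SinglePassExample
{
    private const string ExampleNs = "urn:stakx:example";

    // Hypothetical helper: walks the document once, front to back, and
    // yields one deserialized object per <ex:ElementOfInterest> element.
    public static IEnumerable<ElementOfInterest> ReadElementsOfInterest(TextReader input)
    {
        // Same namespace registration as in the code above.
        var nt = new NameTable();
        var mgr = new XmlNamespaceManager(nt);
        mgr.AddNamespace("ex", ExampleNs);
        var ctxt = new XmlParserContext(nt, mgr, "", XmlSpace.Default);

        var serializer = new XmlSerializer(typeof(ElementOfInterest));

        using (var reader = XmlReader.Create(input, null, ctxt))
        {
            while (reader.Read())
            {
                // Ignore every node that is not a start tag of the element we want.
                if (reader.NodeType != XmlNodeType.Element
                    || reader.LocalName != "ElementOfInterest"
                    || reader.NamespaceURI != ExampleNs)
                {
                    continue;
                }

                ElementOfInterest item;
                using (var subtree = reader.ReadSubtree())
                {
                    item = (ElementOfInterest)serializer.Deserialize(subtree);
                }
                // Closing the subtree reader leaves the outer reader on the
                // element's end tag (or on the empty element itself), so the
                // next reader.Read() continues after it.
                yield return item;
            }
        }
    }
}
```

Because the method is an iterator, only one element is materialized at a time. A caller would pass a StreamReader over the large file instead of a StringReader and process each item inside a foreach loop.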