Accessing schema information during XML validation
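I have some bad XML that fails validation against its schema. The errors are almost all of the same kind (empty elements that violate the document model), but they can occur on hundreds of different elements in the document.
My intended solution is to validate the document, capture the offending empty elements in a list of XElement from the SourceObject property of any exception objects that are generated, and then remove those elements from the document. However, the SourceObject property is always null.
From reading around, I understand that the document object is not populated with schema information until validation has occurred. Even taking that into account, I still can't get anything useful out of the validation process, because the relevant object properties are always null no matter when I try to access them.
Here is what I have so far: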
// Offending elements collected by Callback during validation.
private readonly List<XElement> errors = new List<XElement>();

public void FixXml(string xmlDoc)
{
    XDocument doc = XDocument.Parse(xmlDoc);
    XmlSchemaSet schema = new XmlSchemaSet();
    schema.Add("", @"../../test.xsd");
    schema.Compile();
    doc.Validate(schema, (Callback));
    foreach (XElement element in errors)
    {
        // This is where I'd start making changes to the document if the list didn't contain a bunch of nulls.
    }
}
The callback method (I'll probably fold this into a lambda once I'm confident the code works):
private void Callback(object sender, ValidationEventArgs eventArgs)
{
    XmlSchemaValidationException ex = (eventArgs.Exception as XmlSchemaValidationException);
    if (ex != null)
    {
        XElement element = (ex.SourceObject as XElement);
        errors.Add(element);
    }
}
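This question and its answers were useful to me, and I have applied parts of that solution to my own project, but it still doesn't seem to work. I feel like I'm missing something obvious and silly here.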
The reason XmlSchemaValidationException.SourceObject is null is explained in the docs:
"When an XmlSchemaValidationException is thrown during validation of a class that implements the IXPathNavigable interface such as the XPathNavigator or XmlNode class, the object returned by the SourceObject property is an instance of a class that implements the IXPathNavigable interface.
When an XmlSchemaValidationException is thrown during validation by a validating XmlReader object, the value of the SourceObject property is null."
Unfortunately, XDocument does not implement IXPathNavigable, so SourceObject is null.
If you only need SourceObject, you can call Extensions.CreateNavigator(this XNode node) to create a navigator for your document, then validate using XPathNavigator.CheckValidity(XmlSchemaSet, ValidationEventHandler):
var errors = new List<XmlSchemaValidationException>();
ValidationEventHandler callback = (sender, args) =>
{
    var exception = (args.Exception as XmlSchemaValidationException);
    if (exception != null)
    {
        errors.Add(exception);
    }
};
var navigator = doc.CreateNavigator();
navigator.CheckValidity(schema, callback);
foreach (var exception in errors)
{
    var node = (XObject)exception.SourceObject;
    // Check for null before the node is dereferenced below.
    Assert.IsTrue(node != null, "node != null");
    // Do something with the node.
    Console.WriteLine();
    Console.WriteLine(exception);
    Console.WriteLine("{0}: {1}", node.GetType(), node.ToString());
}
However, experiment shows that XmlSchemaException.SourceSchemaObject always seems to be null with this approach, and XElement.IXmlSerializable.GetSchema() is not populated either. I'm not sure why the source schema object is not passed in, but testing in .NET Core 3.0.0 shows it is not. (Possibly this is related to Issue #38748: "XSD Validation Errors- Missing details on xsd schema error code", which was closed as not currently implemented.)
If you also need the source schema object, you will need to follow the approach in the documentation for Extensions.GetSchemaInfo() and validate the XDocument using the extension method Extensions.Validate(this XDocument, XmlSchemaSet, ValidationEventHandler, Boolean addSchemaInfo). This populates the schema information into the LINQ to XML tree but, sadly, prevents SourceObject from being set. Instead, when errors are detected, you will need to traverse the XElement hierarchy looking for elements and attributes for which GetSchemaInfo() returns an IXmlSchemaInfo whose Validity is not Valid:
var errors = new List<XmlSchemaValidationException>();
ValidationEventHandler callback = (sender, args) =>
{
    var exception = (args.Exception as XmlSchemaValidationException);
    if (exception != null)
    {
        errors.Add(exception);
    }
};
doc.Validate(schema, callback, true);
foreach (var exception in errors)
{
    // Handle the exception itself.
    Console.WriteLine(exception);
}
if (errors.Count > 0)
{
    // If there were any errors, traverse the entire document looking for invalid nodes:
    DumpInvalidNodes(doc.Root);
}
The sample method DumpInvalidNodes, adapted from the Microsoft docs:
//Taken from https://docs.microsoft.com/en-us/dotnet/api/system.xml.schema.extensions.getschemainfo?view=netframework-4.8#System_Xml_Schema_Extensions_GetSchemaInfo_System_Xml_Linq_XElement_
//with an added null check:
static void DumpInvalidNodes(XElement el)
{
    if (el.GetSchemaInfo().Validity != XmlSchemaValidity.Valid)
        Console.WriteLine("Invalid Element {0}",
            el.AncestorsAndSelf()
              .InDocumentOrder()
              .Aggregate("", (s, i) => s + "/" + i.Name.ToString()));
    foreach (XAttribute att in el.Attributes())
    {
        var si = att.GetSchemaInfo();
        // MUST CHECK FOR NULL HERE
        // Because w3 standard attributes like xmlns:xsi will have null SchemaInfo
        // when not included in the schema, rather than being reported as Invalid.
        if (si != null && si.Validity != XmlSchemaValidity.Valid)
            Console.WriteLine("Invalid Attribute {0}",
                att
                    .Parent
                    .AncestorsAndSelf()
                    .InDocumentOrder()
                    .Aggregate("",
                        (s, i) => s + "/" + i.Name.ToString()) + "/@" + att.Name.ToString()
            );
    }
    foreach (XElement child in el.Elements())
        DumpInvalidNodes(child);
}
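Note that my testing shows the documentation code needs to be modified to check whether XAttribute.GetSchemaInfo() returns null. This seems to happen for W3C standard attributes such as xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" when they are not explicitly included in the schema.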
Demo fiddle #2 here.
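Update: it seems that doc.CreateNavigator().CheckValidity(schema, callback) does not work on older versions of the full .NET Framework; for example, on .NET 4.7 it throws System.NotSupportedException: This XPathNavigator does not support XSD validation. Demo fiddle #3 here. If you run into this problem, you will have to use the second approach.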
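To tie the second approach back to the original goal of removing the offending empty elements, here is a minimal sketch. The class name InvalidElementRemover and the method RemoveInvalidEmptyElements are hypothetical names introduced for illustration, and the sketch assumes that an "empty" element means one with no child nodes at all. It validates with addSchemaInfo set to true, then removes only those empty elements whose schema info reports Invalid.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.Schema;

// Hypothetical helper, not part of any library.
static class InvalidElementRemover
{
    // Validates doc against schema and removes every element that has no
    // child nodes and was marked Invalid. Returns the number removed.
    public static int RemoveInvalidEmptyElements(XDocument doc, XmlSchemaSet schema)
    {
        bool hasErrors = false;
        ValidationEventHandler callback = (sender, args) =>
        {
            // Warnings also arrive here; only errors matter for this sketch.
            if (args.Severity == XmlSeverityType.Error)
                hasErrors = true;
        };

        // addSchemaInfo: true annotates the tree so GetSchemaInfo() works afterwards.
        doc.Validate(schema, callback, true);
        if (!hasErrors)
            return 0;

        // Materialize the list first: removing nodes while enumerating
        // Descendants() would modify the tree being walked.
        List<XElement> toRemove = doc.Root
            .Descendants()
            .Where(e => !e.Nodes().Any())
            .Where(e => e.GetSchemaInfo()?.Validity == XmlSchemaValidity.Invalid)
            .ToList();

        foreach (XElement element in toRemove)
            element.Remove();

        return toRemove.Count;
    }
}
```

Restricting the removal to elements without child nodes is deliberate: an ancestor of an invalid element may itself be reported as not Valid, and you probably don't want to delete those. After removing elements it is worth validating the document again, since removing a required element can introduce a new error in its parent.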