Validate XML against DTD - expected markup not found
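I know there are similar questions here. Unfortunately, I couldn't find anything that answered mine.
I'm trying to validate XML against an existing DTD file, but my code keeps throwing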
expected DTD markup not found. Line 1 Position 1.
This is what the XML looks like (header only, abbreviated):
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE tms PUBLIC "-//Schema//DTD DocuMan TMS V5//EN" "Tms.dtd"[]>
<tms name=...
This is what the referenced DTD looks like (copyright comment at the top omitted, abbreviated for readability):
<!ENTITY % para 'p|codeblock|procedural-instructions'>
<!ENTITY % list '(ul|ol)'>
...
...
<!-- simple reference to original dtd -->
<!ENTITY % ST4.dtd SYSTEM "ST4.dtd">
%ST4.dtd;
...
...
<!ELEMENT tms (tmsnode|node|rtf)*>
<!ATTLIST tms
...
...
The second DTD it references looks like this:
<!ENTITY lt "&#60;"> <!-- < -->
<!ENTITY gt "&#62;"> <!-- > -->
<!ENTITY amp "&#38;"> <!-- & -->
...
...
<!ELEMENT comment (#PCDATA| br | tab)*>
...
...
Neither DTD contains an additional "DOCTYPE" element, in case you're wondering.
This is my code for reading the XML file and validating it against the DTD:
var xml = new XmlDocument();
try
{
    xml.Load(fil);
    var settings = new XmlReaderSettings
    {
        DtdProcessing = DtdProcessing.Parse,
        ValidationType = ValidationType.DTD,
        XmlResolver = new XmlUrlResolver()
    };
    var context = new XmlParserContext(xml.NameTable,
        new XmlNamespaceManager(xml.NameTable),
        xml.DocumentType.Name, "", xml.DocumentType.PublicId, xml.DocumentType.SystemId, "", "en", XmlSpace.Default);
    using (var reader = XmlReader.Create(fil, settings, context))
    {
        try
        {
            while (reader.Read()){}
        }
        catch (Exception except)
        {
            bkwValidate.ReportProgress(index, Path.GetFileName(fil) + ": " + except.Message);
        }
    }
}
catch (Exception exception)
{
    bkwValidate.ReportProgress(index, Path.GetFileName(fil) + ": " + exception.Message);
}
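Update:
It turns out the code I found via Google had a bug: the arguments to XmlParserContext were in the wrong order. The empty string for internalSubset has to come after sysId. That gets me one step further: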
var context = new XmlParserContext(xml.NameTable,
    new XmlNamespaceManager(xml.NameTable),
    xml.DocumentType.Name, xml.DocumentType.PublicId, xml.DocumentType.SystemId, "", "", "en", XmlSpace.Default);
Unfortunately, I now get a different error:
cannot have multiple DTDs
Eureka!
The problem was the XmlParserContext: it was too detailed!
If I strip it down to the bare essentials, it works, even with multiple DTDs:
var xml = new XmlDocument();
try
{
    xml.Load(fil);
    var settings = new XmlReaderSettings
    {
        DtdProcessing = DtdProcessing.Parse,
        ValidationType = ValidationType.DTD,
        XmlResolver = new XmlUrlResolver(),
        NameTable = xml.NameTable
    };
    var context = new XmlParserContext(xml.NameTable, new XmlNamespaceManager(xml.NameTable), "en",
        XmlSpace.Preserve);
    using (var reader = XmlReader.Create(fil, settings, context))
    {
        try
        {
            while (reader.Read()) { }
        }
        catch (Exception except)
        {
            bkwValidate.ReportProgress(index, Path.GetFileName(fil) + ": " + except.Message);
        }
    }
}
catch (Exception exception)
{
    bkwValidate.ReportProgress(index, Path.GetFileName(fil) + ": " + exception.Message);
}
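For anyone landing here later, here is my best reading of the two errors, followed by a sketch that drops XmlParserContext altogether. In the first attempt the system ID ended up in the internalSubset slot, so the reader presumably tried to parse "Tms.dtd" itself as DTD markup. In the second attempt the context supplied DOCTYPE information for a file that already declares its own DOCTYPE, which would explain the complaint about multiple DTDs. If that reading is right, the context is only needed for fragments without a DOCTYPE, and a complete document can be validated with just the settings. The names DtdValidationSketch and Validate are made up for illustration, and collecting messages through ValidationEventHandler is my addition, not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class DtdValidationSketch
{
    // Hypothetical helper: validates one XML file against the DTD named in
    // its own DOCTYPE declaration and returns every validation message.
    public static List<string> Validate(string path)
    {
        var messages = new List<string>();
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,  // allow the DOCTYPE to be processed
            ValidationType = ValidationType.DTD,  // validate against that DTD
            XmlResolver = new XmlUrlResolver()    // resolves "Tms.dtd" relative to the file
        };
        // Without a handler the reader throws on the first validation error;
        // with one, it reports each error here and keeps reading.
        settings.ValidationEventHandler += (sender, e) =>
            messages.Add($"{e.Severity}: {e.Message}");

        // No XmlParserContext: the DOCTYPE comes from the document itself.
        using (var reader = XmlReader.Create(path, settings))
        {
            // Well-formedness problems still surface as XmlException.
            while (reader.Read()) { }
        }
        return messages;
    }
}
```

An empty list from DtdValidationSketch.Validate(fil) would mean the file passed; each entry could be handed to bkwValidate.ReportProgress the same way the exception messages are above.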