Continue parsing after reading an ugly character

c
libxml2

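I have an XML response (which should be UTF-8 according to its encoding attribute) that contains the character "\uffff\u0551" in an element. xmlParseMemory() returns a NULL document with the error "XML-Verarbeitungsfehler: nicht wohlgeformt" [XML processing error: not well-formed].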

Can I set up a parserChain so that the library skips such characters and continues parsing, still giving me a resulting document?

I did read some (not all) of the manual pages at xmlsoft.org but didn't find anything.

No. U+FFFF is not a valid Unicode character (the XML 1.0 Char production explicitly excludes U+FFFE and U+FFFF), invalid characters are fatal errors, and the XML spec declares that fatal errors are unrecoverable:

Once a fatal error is detected, however, the processor must not continue normal processing (i.e., it must not continue to pass character data and information about the document's logical structure to the application in the normal way).

If you want to parse this document, you need to clean out the invalid characters before handing it to the XML parser.
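
A minimal sketch of such a cleanup pass, assuming the buffer really is UTF-8 and that simply dropping the offending characters is acceptable for your data. The function names (is_xml_char, decode_utf8, strip_invalid_xml_chars) are made up for illustration and are not part of libxml2. It filters the buffer in place against the XML 1.0 Char production and also drops bytes that are not well-formed UTF-8:

```c
#include <stddef.h>

/* Returns 1 if code point c matches the XML 1.0 Char production:
 * #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] */
static int is_xml_char(unsigned long c)
{
    return c == 0x9 || c == 0xA || c == 0xD ||
           (c >= 0x20 && c <= 0xD7FF) ||
           (c >= 0xE000 && c <= 0xFFFD) ||
           (c >= 0x10000 && c <= 0x10FFFF);
}

/* Decodes one UTF-8 sequence at s (n bytes available).
 * Stores the code point in *cp and returns the sequence length,
 * or returns 0 if the bytes are not well-formed UTF-8. */
static size_t decode_utf8(const unsigned char *s, size_t n, unsigned long *cp)
{
    size_t len, i;
    unsigned long c, min;

    if (n == 0) return 0;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
    else return 0;                      /* stray continuation or invalid lead byte */

    if (len > n) return 0;              /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        c = (c << 6) | (unsigned long)(s[i] & 0x3F);
    }
    if (c < min) return 0;              /* overlong encoding */
    *cp = c;
    return len;
}

/* Hypothetical helper: removes, in place, every character that XML 1.0
 * does not allow, plus any malformed UTF-8 bytes. Returns the new length. */
size_t strip_invalid_xml_chars(unsigned char *buf, size_t len)
{
    size_t in = 0, out = 0;

    while (in < len) {
        unsigned long cp;
        size_t k = decode_utf8(buf + in, len - in, &cp);

        if (k == 0) { in++; continue; } /* skip one malformed byte */
        if (is_xml_char(cp)) {
            size_t j;
            for (j = 0; j < k; j++)     /* out <= in, so forward copy is safe */
                buf[out++] = buf[in + j];
        }
        in += k;
    }
    return out;
}
```

You would then call something like xmlParseMemory((const char *)buf, (int)new_len) on the cleaned buffer. For the input from the question, U+FFFF (bytes EF BF BF) is dropped while U+0551 (bytes D5 91) is kept. Keep in mind that this silently changes the data, so it is only appropriate if losing those characters is fine for you.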
