Java: unmarshal XML with illegal XML characters
I am trying to unmarshal an XML string using javax.xml.bind.Unmarshaller, but I get the following error:
Caused by: org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x13) was found in the element content of the document.
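Is there a general solution for removing all illegal XML characters from the input string?
For example, I tried the following method, but it did not help: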
public static String illegalXML11CharactersPattern = "[^"
        + "\u0001-\uD7FF"
        + "\uE000-\uFFFD"
        + "\ud800\udc00-\udbff\udfff"
        + "]+";

public static String stripNonValidXML11Characters(String xml) {
    return xml.replaceAll(illegalXML11CharactersPattern, "");
}
In the end, I solved it with the following approach:
xml = org.apache.commons.lang3.StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml10(xml));
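A likely reason the first attempt did not help: that pattern keeps everything in the XML 1.1 character range, which starts at U+0001 and therefore includes 0x13, while a document without a version="1.1" declaration is parsed under the stricter XML 1.0 rules. For anyone who wants to avoid the Commons Lang dependency, below is a minimal sketch of a regex that follows the XML 1.0 Char production instead (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). The class name XmlSanitizer and the method name stripNonValidXML10Characters are made up for illustration and do not come from any library.

```java
import java.util.regex.Pattern;

public class XmlSanitizer {

    // Matches every code point that is NOT allowed by the XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    private static final Pattern ILLEGAL_XML10_CHARS = Pattern.compile(
            "[^\\x09\\x0A\\x0D\\x20-\\uD7FF\\uE000-\\uFFFD\\x{10000}-\\x{10FFFF}]");

    // Hypothetical helper: removes the illegal characters and leaves the rest untouched.
    public static String stripNonValidXML10Characters(String xml) {
        return ILLEGAL_XML10_CHARS.matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        String dirty = "<a>bad\u0013char</a>";
        // The 0x13 control character is dropped before the string reaches the unmarshaller.
        System.out.println(stripNonValidXML10Characters(dirty)); // prints <a>badchar</a>
    }
}
```

Tab, line feed, and carriage return are kept, and valid surrogate pairs are treated as single supplementary code points, so characters such as emoji survive. Note that this silently deletes data; if the control characters carry meaning, replacing them with a placeholder may be a better choice.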