Java com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl 的 11 个 UTF-16 BOM 问题
Java 11 UTF-16 BOM issues with com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl
我有一个 UTF-16 XML 文件:
<?xml version="1.0" encoding="utf-16" standalone="yes"?>
它以 BOM FE FF 开头。
将我的代码迁移到 Java11,我得到:
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
Message: Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:652) ~[?:?]
这是使用 JAXB 对其进行解组。
如果我使用参考实现,就会发生:
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
Message: Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:652) ~[?:?]
at com.sun.xml.bind.v2.runtime.unmarshaller.StAXStreamConnector.bridge(StAXStreamConnector.java:134) ~[jaxb-runtime-2.4.0-SNAPSHOT.jar:?]
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:385) ~[jaxb-runtime-2.4.0-SNAPSHOT.jar:?]
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:356) ~[jaxb-runtime-2.4.0-SNAPSHOT.jar:?]
或 MOXy:
Message: Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:652) ~[?:?]
at org.eclipse.persistence.internal.oxm.record.XMLStreamReaderReader.parse(XMLStreamReaderReader.java:98) ~[org.eclipse.persistence.core-2.5.2.jar:?]
at org.eclipse.persistence.internal.oxm.record.XMLStreamReaderReader.parse(XMLStreamReaderReader.java:86) ~[org.eclipse.persistence.core-2.5.2.jar:?]
at org.eclipse.persistence.internal.oxm.record.SAXUnmarshaller.unmarshal(SAXUnmarshaller.java:895) ~[org.eclipse.persistence.core-2.5.2.jar:?]
at org.eclipse.persistence.oxm.XMLUnmarshaller.unmarshal(XMLUnmarshaller.java:659) ~[org.eclipse.persistence.core-2.5.2.jar:?]
at org.eclipse.persistence.jaxb.JAXBUnmarshaller.unmarshal(JAXBUnmarshaller.java:585) ~[org.eclipse.persistence.moxy-2.5.2.jar:?]
他们都用com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl
使用 Java 6 到 8 对该文件进行解组工作正常。Java 9 或 11 有什么变化吗?
如果我删除 FE FF BOM,它会解组 OK Java 11。
原来我的问题是由 maven-resources-plugin 引起的,过滤设置为 true。那是在破坏任何 UTF-16 资源,将前 2 个字节更改为 EF BF。
我有一个 UTF-16 XML 文件:
<?xml version="1.0" encoding="utf-16" standalone="yes"?>
它以 BOM FE FF 开头。
将我的代码迁移到 Java11,我得到:
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
Message: Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:652) ~[?:?]
这是使用 JAXB 对其进行解组。
如果我使用参考实现,就会发生:
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
Message: Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:652) ~[?:?]
at com.sun.xml.bind.v2.runtime.unmarshaller.StAXStreamConnector.bridge(StAXStreamConnector.java:134) ~[jaxb-runtime-2.4.0-SNAPSHOT.jar:?]
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:385) ~[jaxb-runtime-2.4.0-SNAPSHOT.jar:?]
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:356) ~[jaxb-runtime-2.4.0-SNAPSHOT.jar:?]
或 MOXy:
Message: Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:652) ~[?:?]
at org.eclipse.persistence.internal.oxm.record.XMLStreamReaderReader.parse(XMLStreamReaderReader.java:98) ~[org.eclipse.persistence.core-2.5.2.jar:?]
at org.eclipse.persistence.internal.oxm.record.XMLStreamReaderReader.parse(XMLStreamReaderReader.java:86) ~[org.eclipse.persistence.core-2.5.2.jar:?]
at org.eclipse.persistence.internal.oxm.record.SAXUnmarshaller.unmarshal(SAXUnmarshaller.java:895) ~[org.eclipse.persistence.core-2.5.2.jar:?]
at org.eclipse.persistence.oxm.XMLUnmarshaller.unmarshal(XMLUnmarshaller.java:659) ~[org.eclipse.persistence.core-2.5.2.jar:?]
at org.eclipse.persistence.jaxb.JAXBUnmarshaller.unmarshal(JAXBUnmarshaller.java:585) ~[org.eclipse.persistence.moxy-2.5.2.jar:?]
他们都用com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl
使用 Java 6 到 8 对该文件进行解组工作正常。Java 9 或 11 有什么变化吗?
如果我删除 FE FF BOM,它会解组 OK Java 11。
原来我的问题是由 maven-resources-plugin 引起的,过滤设置为 true。那是在破坏任何 UTF-16 资源,将前 2 个字节更改为 EF BF。