Strange characters prepended to the beginning of a file in Hadoop

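Whenever I create a new file in Hadoop from Java and write content to it, special characters are prepended to the beginning of the file. Is there a way to get rid of them? Here is the code: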

TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(document), new StreamResult(writer));
String extractedXML = writer.getBuffer().toString().replaceAll("\r$", "");
// FileSystem.create expects an org.apache.hadoop.fs.Path, not a String
FSDataOutputStream fin = fs.create(new Path("/filelocation/input.txt"));
fin.writeUTF(extractedXML);
fin.close();


$ hadoop fs -cat /filelocation/input.txt|head -5
)▒hello world
input1
hello again
hello
welcome again

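This worked for me. writeUTF writes a two-byte length prefix before the string data (in Java's modified UTF-8), and that prefix is what shows up as the stray characters at the start of the file. Just replace the following lines: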

FSDataOutputStream fin = fs.create(new Path("/filelocation/input.txt"));
fin.writeUTF(extractedXML);
fin.close();

with the following code:

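// Open the HDFS file as a plain byte stream; the Progressable callback is a no-op here.
OutputStream os = fs.create(new Path("/filelocation/input.txt"), new Progressable() {
    @Override
    public void progress() {
    }
});
// Encode the text as UTF-8 through a Writer instead of writeUTF, so no length prefix is written.
BufferedWriter br = new BufferedWriter(new OutputStreamWriter(os, "UTF-8"));
br.write(extractedXML);
br.close();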
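
To see where the extra bytes come from without a cluster, here is a minimal sketch using only the JDK. It relies on the fact that FSDataOutputStream extends java.io.DataOutputStream, so writeUTF behaves the same on an in-memory stream. The class name WriteUtfDemo and its helper methods are made up for illustration and are not part of Hadoop or of the original code.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares the bytes produced by writeUTF
// with the bytes produced by a UTF-8 OutputStreamWriter.
public class WriteUtfDemo {

    // Mirrors fin.writeUTF(text) from the question, but writes into memory.
    static byte[] viaWriteUTF(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeUTF(text); // 2-byte length prefix + modified UTF-8 data
        }
        return sink.toByteArray();
    }

    // Mirrors the OutputStreamWriter approach from the answer.
    static byte[] viaWriter(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            out.write(text); // plain UTF-8 bytes, no prefix
        }
        return sink.toByteArray();
    }

    // Formats bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        String text = "hello world";
        // prints: 00 0b 68 65 6c 6c 6f 20 77 6f 72 6c 64
        System.out.println(hex(viaWriteUTF(text)));
        // prints: 68 65 6c 6c 6f 20 77 6f 72 6c 64
        System.out.println(hex(viaWriter(text)));
    }
}
```

The leading 00 0b in the first line is the length (11) that writeUTF stores ahead of the text. With a longer string, such as the XML in the question, those two bytes take other values and can render as odd characters in hadoop fs -cat. Note also that writeUTF throws UTFDataFormatException when the encoded string exceeds 65535 bytes, which is another reason to prefer a Writer for whole documents.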