EDI 到 XML 巨大的文件转换
EDI to XML Huge file conversions
我正在将 EDI 文件转换为 XML。但是,我的输入文件恰好也在 BIF 中,大约为 100Mb,这给我一个 JAVA 内存不足错误。
我试图查阅 Smook 的文档以了解大文件转换,但它是从 XML 到 EDI 的转换。
下面是 运行 我的 main
时我得到的响应
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
at java.lang.StringBuffer.append(StringBuffer.java:367)
at java.io.StringWriter.write(StringWriter.java:94)
at java.io.Writer.write(Writer.java:127)
at freemarker.core.TextBlock.accept(TextBlock.java:56)
at freemarker.core.Environment.visit(Environment.java:257)
at freemarker.core.MixedContent.accept(MixedContent.java:57)
at freemarker.core.Environment.visitByHiddingParent(Environment.java:278)
at freemarker.core.IteratorBlock$Context.runLoop(IteratorBlock.java:157)
at freemarker.core.Environment.visitIteratorBlock(Environment.java:501)
at freemarker.core.IteratorBlock.accept(IteratorBlock.java:67)
at freemarker.core.Environment.visit(Environment.java:257)
at freemarker.core.Macro$Context.runMacro(Macro.java:173)
at freemarker.core.Environment.visit(Environment.java:686)
at freemarker.core.UnifiedCall.accept(UnifiedCall.java:80)
at freemarker.core.Environment.visit(Environment.java:257)
at freemarker.core.MixedContent.accept(MixedContent.java:57)
at freemarker.core.Environment.visit(Environment.java:257)
at freemarker.core.Environment.process(Environment.java:235)
at freemarker.template.Template.process(Template.java:262)
at org.milyn.util.FreeMarkerTemplate.apply(FreeMarkerTemplate.java:92)
at org.milyn.util.FreeMarkerTemplate.apply(FreeMarkerTemplate.java:86)
at org.milyn.event.report.HtmlReportGenerator.applyTemplate(HtmlReportGenerator.java:76)
at org.milyn.event.report.AbstractReportGenerator.processFinishEvent(AbstractReportGenerator.java:197)
at org.milyn.event.report.AbstractReportGenerator.processLifecycleEvent(AbstractReportGenerator.java:157)
at org.milyn.event.report.AbstractReportGenerator.onEvent(AbstractReportGenerator.java:92)
at org.milyn.Smooks._filter(Smooks.java:558)
at org.milyn.Smooks.filterSource(Smooks.java:482)
at com.***.xfunctional.EdiToXml.runSmooksTransform(EdiToXml.java:40)
at com.***.xfunctional.EdiToXml.main(EdiToXml.java:57)
import java.io.*;
import java.util.Arrays;
import java.util.Locale;
import javax.xml.transform.stream.StreamSource;
import org.milyn.Smooks;
import org.milyn.SmooksException;
import org.milyn.container.ExecutionContext;
import org.milyn.event.report.HtmlReportGenerator;
import org.milyn.io.StreamUtils;
import org.milyn.payload.StringResult;
import org.milyn.payload.SystemOutResult;
import org.xml.sax.SAXException;
public class EdiToXml {
private static byte[] messageIn = readInputMessage();
protected static String runSmooksTransform() throws IOException, SAXException, SmooksException {
Locale defaultLocale = Locale.getDefault();
Locale.setDefault(new Locale("en", "EN"));
// Instantiate Smooks with the config...
Smooks smooks = new Smooks("smooks-config.xml");
try {
// Create an exec context - no profiles....
ExecutionContext executionContext = smooks.createExecutionContext();
StringResult result = new StringResult();
// Configure the execution context to generate a report...
executionContext.setEventListener(new HtmlReportGenerator("target/report/report.html"));
// Filter the input message to the outputWriter, using the execution context...
smooks.filterSource(executionContext, new StreamSource(new ByteArrayInputStream(messageIn)),result);
Locale.setDefault(defaultLocale);
return result.getResult();
} finally {
smooks.close();
}
}
public static void main(String[] args) throws IOException, SAXException, SmooksException {
System.out.println("\n\n==============Message In==============");
System.out.println("======================================\n");
pause(
"The EDI input stream can be seen above. Press 'enter' to see this stream transformed into XML...");
String messageOut = EdiToXml.runSmooksTransform();
System.out.println("==============Message Out=============");
System.out.println(messageOut);
System.out.println("======================================\n\n");
pause("And that's it! Press 'enter' to finish...");
}
private static byte[] readInputMessage() {
try {
InputStream input = new BufferedInputStream(new FileInputStream("/home/****/Downloads/BifInputFile.DATA"));
return StreamUtils.readStream(input);
} catch (IOException e) {
e.printStackTrace();
return "<no-message/>".getBytes();
}
}
private static void pause(String message) {
try {
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
System.out.print("> " + message);
in.readLine();
} catch (IOException e) {
}
System.out.println("\n");
}
}
<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:edi="http://www.milyn.org/xsd/smooks/edi-1.4.xsd">
<!--
Configure the EDI Reader to parse the message stream into a stream of SAX events.
-->
<edi:reader mappingModel="edi-to-xml-bif-mapping.xml" validate="false"/>
</smooks-resource-list>
我在代码中编辑了这一行以反映流的用法:-
smooks.filterSource(executionContext, new StreamSource(new FileInputStream("/home/***/Downloads/sample-text-file.txt")), result);
但是我现在把下面这个当作错误。有人猜到什么是最好的方法吗?
Exception in thread "main" org.milyn.SmooksException: Failed to filter source.
at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:97)
at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:64)
at org.milyn.Smooks._filter(Smooks.java:526)
at org.milyn.Smooks.filterSource(Smooks.java:482)
at ****.EdiToXml.runSmooksTransform(EdiToXml.java:41)
at com.***.***.EdiToXml.main(EdiToXml.java:58)
Caused by: org.milyn.edisax.EDIParseException: EDI message processing failed [EDIFACT-BIF-TO-XML][1.0]. Must be a minimum of 1 instances of segment [UNH]. Currently at segment number 1.
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:504)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:453)
at org.milyn.edisax.EDIParser.parse(EDIParser.java:428)
at org.milyn.edisax.EDIParser.parse(EDIParser.java:386)
at org.milyn.smooks.edi.EDIReader.parse(EDIReader.java:111)
at org.milyn.delivery.sax.SAXParser.parse(SAXParser.java:76)
at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:86)
... 5 more
消息有效,xml 映射良好。我只是没有使用最佳的消息读写方法。
我开始意识到 Smooks 的 filterSource 方法可以直接使用 InputStream 和 OutputStream 作为变量。请在下方找到导致程序高效 运行 而无需经历 JAVA 内存错误的代码段。
//Instantiate a FileInputStream
FileInputStream inputStream = new FileInputStream(inputFileName);
//Instantiate an FileOutputStream
FileOutputStream outputStream = new FileOutputStream(outputFileName);
try {
// Filter the input message to the outputWriter...
smooks.filterSource(new StreamSource(inputStream), new StreamResult(outputStream));
Locale.setDefault(defaultLocale);
} finally {
smooks.close();
inputStream.close();
outputStream.close();
}
感谢社区。
此致。
我是 Smooks 和 Edifact 解析工具的原作者。 Jason 给我发邮件征求这方面的建议,但我已经多年没有参与其中了,所以不确定我能提供多大帮助。
Smooks 没有将完整的消息读入内存。它通过一个将其转换为 SAX 事件流的解析器对其进行流式传输,使其“看起来像”XML 到它的任何下游。如果这些事件随后用于在男性中构建一个大的 Java 对象模型,那么可能会导致 OOM 错误等。
查看异常消息,看起来 EDIFACT 输入与正在使用的定义文件不匹配。
Caused by: org.milyn.edisax.EDIParseException: EDI message processing failed [EDIFACT-BIF-TO-XML][1.0]. Must be a minimum of 1 instances of segment [UNH]. Currently at segment number 1.
那些 EDIFACT 定义文件最初是直接从 EDIFACT 组发布的定义生成的,但我确实记得很多人“调整”了消息格式,这似乎是这里可能发生的事情(因此出现上述错误).一种解决方案是调整 pre-generated 定义以匹配。
我知道在过去的一两年里,Smooks 在这个领域发生了很多变化(使用 Apache Daffodil 进行定义),但我不是谈论这些的最佳人选。您可以尝试使用 Smooks 邮件列表寻求帮助。
我正在将 EDI 文件转换为 XML。但是,我的输入文件恰好也在 BIF 中,大约为 100Mb,这给我一个 JAVA 内存不足错误。
我试图查阅 Smook 的文档以了解大文件转换,但它是从 XML 到 EDI 的转换。
下面是 运行 我的 main
时我得到的响应Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
at java.lang.StringBuffer.append(StringBuffer.java:367)
at java.io.StringWriter.write(StringWriter.java:94)
at java.io.Writer.write(Writer.java:127)
at freemarker.core.TextBlock.accept(TextBlock.java:56)
at freemarker.core.Environment.visit(Environment.java:257)
at freemarker.core.MixedContent.accept(MixedContent.java:57)
at freemarker.core.Environment.visitByHiddingParent(Environment.java:278)
at freemarker.core.IteratorBlock$Context.runLoop(IteratorBlock.java:157)
at freemarker.core.Environment.visitIteratorBlock(Environment.java:501)
at freemarker.core.IteratorBlock.accept(IteratorBlock.java:67)
at freemarker.core.Environment.visit(Environment.java:257)
at freemarker.core.Macro$Context.runMacro(Macro.java:173)
at freemarker.core.Environment.visit(Environment.java:686)
at freemarker.core.UnifiedCall.accept(UnifiedCall.java:80)
at freemarker.core.Environment.visit(Environment.java:257)
at freemarker.core.MixedContent.accept(MixedContent.java:57)
at freemarker.core.Environment.visit(Environment.java:257)
at freemarker.core.Environment.process(Environment.java:235)
at freemarker.template.Template.process(Template.java:262)
at org.milyn.util.FreeMarkerTemplate.apply(FreeMarkerTemplate.java:92)
at org.milyn.util.FreeMarkerTemplate.apply(FreeMarkerTemplate.java:86)
at org.milyn.event.report.HtmlReportGenerator.applyTemplate(HtmlReportGenerator.java:76)
at org.milyn.event.report.AbstractReportGenerator.processFinishEvent(AbstractReportGenerator.java:197)
at org.milyn.event.report.AbstractReportGenerator.processLifecycleEvent(AbstractReportGenerator.java:157)
at org.milyn.event.report.AbstractReportGenerator.onEvent(AbstractReportGenerator.java:92)
at org.milyn.Smooks._filter(Smooks.java:558)
at org.milyn.Smooks.filterSource(Smooks.java:482)
at com.***.xfunctional.EdiToXml.runSmooksTransform(EdiToXml.java:40)
at com.***.xfunctional.EdiToXml.main(EdiToXml.java:57)
import java.io.*;
import java.util.Arrays;
import java.util.Locale;
import javax.xml.transform.stream.StreamSource;
import org.milyn.Smooks;
import org.milyn.SmooksException;
import org.milyn.container.ExecutionContext;
import org.milyn.event.report.HtmlReportGenerator;
import org.milyn.io.StreamUtils;
import org.milyn.payload.StringResult;
import org.milyn.payload.SystemOutResult;
import org.xml.sax.SAXException;
public class EdiToXml {
private static byte[] messageIn = readInputMessage();
protected static String runSmooksTransform() throws IOException, SAXException, SmooksException {
Locale defaultLocale = Locale.getDefault();
Locale.setDefault(new Locale("en", "EN"));
// Instantiate Smooks with the config...
Smooks smooks = new Smooks("smooks-config.xml");
try {
// Create an exec context - no profiles....
ExecutionContext executionContext = smooks.createExecutionContext();
StringResult result = new StringResult();
// Configure the execution context to generate a report...
executionContext.setEventListener(new HtmlReportGenerator("target/report/report.html"));
// Filter the input message to the outputWriter, using the execution context...
smooks.filterSource(executionContext, new StreamSource(new ByteArrayInputStream(messageIn)),result);
Locale.setDefault(defaultLocale);
return result.getResult();
} finally {
smooks.close();
}
}
public static void main(String[] args) throws IOException, SAXException, SmooksException {
System.out.println("\n\n==============Message In==============");
System.out.println("======================================\n");
pause(
"The EDI input stream can be seen above. Press 'enter' to see this stream transformed into XML...");
String messageOut = EdiToXml.runSmooksTransform();
System.out.println("==============Message Out=============");
System.out.println(messageOut);
System.out.println("======================================\n\n");
pause("And that's it! Press 'enter' to finish...");
}
private static byte[] readInputMessage() {
try {
InputStream input = new BufferedInputStream(new FileInputStream("/home/****/Downloads/BifInputFile.DATA"));
return StreamUtils.readStream(input);
} catch (IOException e) {
e.printStackTrace();
return "<no-message/>".getBytes();
}
}
private static void pause(String message) {
try {
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
System.out.print("> " + message);
in.readLine();
} catch (IOException e) {
}
System.out.println("\n");
}
}
<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:edi="http://www.milyn.org/xsd/smooks/edi-1.4.xsd">
<!--
Configure the EDI Reader to parse the message stream into a stream of SAX events.
-->
<edi:reader mappingModel="edi-to-xml-bif-mapping.xml" validate="false"/>
</smooks-resource-list>
我在代码中编辑了这一行以反映流的用法:-
smooks.filterSource(executionContext, new StreamSource(new FileInputStream("/home/***/Downloads/sample-text-file.txt")), result);
但是我现在把下面这个当作错误。有人猜到什么是最好的方法吗?
Exception in thread "main" org.milyn.SmooksException: Failed to filter source.
at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:97)
at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:64)
at org.milyn.Smooks._filter(Smooks.java:526)
at org.milyn.Smooks.filterSource(Smooks.java:482)
at ****.EdiToXml.runSmooksTransform(EdiToXml.java:41)
at com.***.***.EdiToXml.main(EdiToXml.java:58)
Caused by: org.milyn.edisax.EDIParseException: EDI message processing failed [EDIFACT-BIF-TO-XML][1.0]. Must be a minimum of 1 instances of segment [UNH]. Currently at segment number 1.
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:504)
at org.milyn.edisax.EDIParser.mapSegments(EDIParser.java:453)
at org.milyn.edisax.EDIParser.parse(EDIParser.java:428)
at org.milyn.edisax.EDIParser.parse(EDIParser.java:386)
at org.milyn.smooks.edi.EDIReader.parse(EDIReader.java:111)
at org.milyn.delivery.sax.SAXParser.parse(SAXParser.java:76)
at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:86)
... 5 more
消息有效,xml 映射良好。我只是没有使用最佳的消息读写方法。
我开始意识到 Smooks 的 filterSource 方法可以直接使用 InputStream 和 OutputStream 作为变量。请在下方找到导致程序高效 运行 而无需经历 JAVA 内存错误的代码段。
//Instantiate a FileInputStream
FileInputStream inputStream = new FileInputStream(inputFileName);
//Instantiate an FileOutputStream
FileOutputStream outputStream = new FileOutputStream(outputFileName);
try {
// Filter the input message to the outputWriter...
smooks.filterSource(new StreamSource(inputStream), new StreamResult(outputStream));
Locale.setDefault(defaultLocale);
} finally {
smooks.close();
inputStream.close();
outputStream.close();
}
感谢社区。
此致。
我是 Smooks 和 Edifact 解析工具的原作者。 Jason 给我发邮件征求这方面的建议,但我已经多年没有参与其中了,所以不确定我能提供多大帮助。
Smooks 没有将完整的消息读入内存。它通过一个将其转换为 SAX 事件流的解析器对其进行流式传输,使其“看起来像”XML 到它的任何下游。如果这些事件随后用于在男性中构建一个大的 Java 对象模型,那么可能会导致 OOM 错误等。
查看异常消息,看起来 EDIFACT 输入与正在使用的定义文件不匹配。
Caused by: org.milyn.edisax.EDIParseException: EDI message processing failed [EDIFACT-BIF-TO-XML][1.0]. Must be a minimum of 1 instances of segment [UNH]. Currently at segment number 1.
那些 EDIFACT 定义文件最初是直接从 EDIFACT 组发布的定义生成的,但我确实记得很多人“调整”了消息格式,这似乎是这里可能发生的事情(因此出现上述错误).一种解决方案是调整 pre-generated 定义以匹配。
我知道在过去的一两年里,Smooks 在这个领域发生了很多变化(使用 Apache Daffodil 进行定义),但我不是谈论这些的最佳人选。您可以尝试使用 Smooks 邮件列表寻求帮助。