How to convert a given hierarchical XML file into a denormalized relational database table using Java
Given the following XML structure:
<clinical_study>
    <primary_outcome>
        <measure></measure>
        <time_frame></time_frame>
        <safety_issue></safety_issue>
        <description></description>
    </primary_outcome>
    <secondary_outcome>
        <measure></measure>
        <time_frame></time_frame>
        <safety_issue></safety_issue>
        <description></description>
    </secondary_outcome>
</clinical_study>
I would like to parse the values inside these element tags and dump them into an Oracle table named "CLINICAL_STUDY" with the following column structure:
desc clinical_study
Name                           Null     Type
------------------------------ -------- --------------
PRIMARY_OUTCOME_MEASURE                 VARCHAR2(50)
PRIMARY_OUTCOME_TIME_FRAME              VARCHAR2(50)
PRIMARY_OUTCOME_SAFETY_ISSUE            VARCHAR2(50)
PRIMARY_OUTCOME_DESCRIPTION             VARCHAR2(4000)
SECONDARY_OUTCOME_MEASURE               VARCHAR2(50)
SECONDARY_OUTCOME_TIME_FRAME            VARCHAR2(50)
SECONDARY_OUTCOME_SAFETY_ISSUE          VARCHAR2(50)
SECONDARY_OUTCOME_DESCRIPTION           VARCHAR2(4000)
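I know there are several ways to accomplish this, but I'm not sure which is the most straightforward. I'm leaning towards an XSLT that "flattens" the data so it can be converted easily to the table structure, but I'm curious whether there is an easier way. Thanks in advance.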
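If you are programming in Java, you don't need to run an XSLT transformation. You should be able to read the data from the XML file and write it straight to the database.
There is certainly more than one way to do this, but one approach is to read the data in with JAXB. (This works well if the data is small enough to hold in memory while you process it. If you have a large amount of data, you may want to use a streaming XML parser API such as StAX instead.)
First, you can create a pair of classes to represent your input data and annotate them with JAXB annotations that define the XML mapping.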
// The annotations live in javax.xml.bind.annotation (jakarta.xml.bind.annotation from JAXB 3.0 on).
@XmlRootElement(name="clinical_study")
@XmlAccessorType(XmlAccessType.FIELD)
public class ClinicalStudy {
    @XmlElement(name="primary_outcome")
    private Outcome primaryOutcome;
    @XmlElement(name="secondary_outcome")
    private Outcome secondaryOutcome;
    // getters and setters omitted for brevity
}
and
@XmlAccessorType(XmlAccessType.FIELD)
public class Outcome {
    @XmlElement(name="measure")
    private String measure;
    @XmlElement(name="time_frame")
    private String timeFrame;
    @XmlElement(name="safety_issue")
    private String safetyIssue;
    @XmlElement(name="description")
    private String description;
    // getters and setters omitted for brevity
}
You can then read, or "unmarshal", the data directly from the XML (assuming inputStream is a stream of the XML content).
JAXBContext context = JAXBContext.newInstance(ClinicalStudy.class);
Unmarshaller unmarshaller = context.createUnmarshaller();
ClinicalStudy clinicalStudy = (ClinicalStudy) unmarshaller.unmarshal(inputStream);
and insert it into your database (assuming conn is a JDBC database connection).
String sql = "INSERT INTO clinical_study ("
        + "PRIMARY_OUTCOME_MEASURE, "
        + "PRIMARY_OUTCOME_TIME_FRAME, "
        + "PRIMARY_OUTCOME_SAFETY_ISSUE, "
        + "PRIMARY_OUTCOME_DESCRIPTION, "
        + "SECONDARY_OUTCOME_MEASURE, "
        + "SECONDARY_OUTCOME_TIME_FRAME, "
        + "SECONDARY_OUTCOME_SAFETY_ISSUE, "
        + "SECONDARY_OUTCOME_DESCRIPTION) "
        + "VALUES (?, ?, ?, ?, ?, ?, ?, ?)";
// try-with-resources closes the statement even if the insert fails
try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
    Outcome primary = clinicalStudy.getPrimaryOutcome();
    Outcome secondary = clinicalStudy.getSecondaryOutcome();
    pstmt.setString(1, primary.getMeasure());
    pstmt.setString(2, primary.getTimeFrame());
    pstmt.setString(3, primary.getSafetyIssue());
    pstmt.setString(4, primary.getDescription());
    pstmt.setString(5, secondary.getMeasure());
    pstmt.setString(6, secondary.getTimeFrame());
    pstmt.setString(7, secondary.getSafetyIssue());
    pstmt.setString(8, secondary.getDescription());
    pstmt.executeUpdate();
}
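For the large-data case mentioned earlier, a streaming version with StAX (javax.xml.stream, which ships with the JDK) might look roughly like the following. This is only a sketch, under the assumption that every child of primary_outcome and secondary_outcome holds plain text with no nested elements. The class name ClinicalStudyFlattener and the method flatten are made up for illustration and are not part of any library. It builds a map keyed by the target column names instead of going through annotated classes.

```java
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, named here only for illustration.
public class ClinicalStudyFlattener {

    /** Returns column name -> element text, e.g. PRIMARY_OUTCOME_MEASURE -> "...". */
    public static Map<String, String> flatten(Reader xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // DTDs are not needed for this input, so switch them off.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(xml);

        Map<String, String> row = new LinkedHashMap<>();
        String outcome = null; // name of the outcome element we are currently inside, if any
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("primary_outcome") || name.equals("secondary_outcome")) {
                        outcome = name;
                    } else if (outcome != null) {
                        // Assumes a text-only element; getElementText() throws otherwise.
                        String column = (outcome + "_" + name).toUpperCase(Locale.ROOT);
                        row.put(column, reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals(outcome)) {
                    outcome = null; // left the outcome element
                }
            }
        } finally {
            reader.close();
        }
        return row;
    }
}
```

Because the map keys match the column names of the table, the values can be bound to the same INSERT statement with setString, looking each one up by its column name.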