How to transform a CSV file to a structured XML file using XSLT 2.0?
I want to convert the following CSV to XML.
Sample CSV input:
01,TeacherHeader1
02,StudentHeader1
03,SubjectHeader1
10,Grade1,Score99
10,Grade2,Score99
48,SubjectTrailer1
49,StudentTrailer1
02,StudentHeader2
03,SubjectHeader1
10,Grade1,Score50
10,Grade2,Score50
48,SubjectTrailer1
49,StudentTrailer2
50,TeacherTrailer1
The output should be:
<FileHeader>
<id>01</id>
<name>TeacherHeader1</name>
</FileHeader>
<GroupRecord>
<GroupHeader>
<id>02</id>
<name>StudentHeader1</name>
</GroupHeader>
<AccountRecord>
<AccountHeader>
<id>03</id>
<name>SubjectHeader1</name>
</AccountHeader>
<AccountDetails>
<Details>
<id>10</id>
<name>Grade1</name>
<value>Score99</value>
</Details>
<Details>
<id>10</id>
<name>Grade2</name>
<value>Score99</value>
</Details>
</AccountDetails>
<AccountTrailer>
<id>48</id>
<name>SubjectTrailer1</name>
</AccountTrailer>
</AccountRecord>
<GroupTrailer>
<id>49</id>
<name>StudentTrailer1</name>
</GroupTrailer>
</GroupRecord>
<GroupRecord>
<GroupHeader>
<id>02</id>
<name>StudentHeader2</name>
</GroupHeader>
<AccountRecord>
<AccountHeader>
<id>03</id>
<name>SubjectHeader1</name>
</AccountHeader>
<AccountDetails>
<Details>
<id>10</id>
<name>Grade1</name>
<value>Score50</value>
</Details>
<Details>
<id>10</id>
<name>Grade2</name>
<value>Score50</value>
</Details>
</AccountDetails>
<AccountTrailer>
<id>48</id>
<name>SubjectTrailer1</name>
</AccountTrailer>
</AccountRecord>
<GroupTrailer>
<id>49</id>
<name>StudentTrailer2</name>
</GroupTrailer>
</GroupRecord>
<FileTrailer>
<id>50</id>
<name>TeacherTrailer1</name>
</FileTrailer>
where
01 = FileHeader
02 = GroupHeader (grouped inside GroupRecord)
03 = AccountHeader (grouped inside AccountRecord)
10 = Details (grouped inside AccountDetails)
48 = AccountTrailer (grouped inside AccountRecord)
49 = GroupTrailer (grouped inside GroupRecord)
50 = FileTrailer
I want to convert the CSV above into correctly structured XML, as shown above.
Any help would be much appreciated. Thanks.
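As I said in a comment, you can use unparsed-text and tokenize to process the text file and convert it to XML (or unparsed-text-lines and tokenize in XSLT 3, if available). The rest of the task can then be done with nested xsl:for-each-group instructions or, once you have established a regular pattern, even with one or two recursive functions. The following attempt spells out the nested for-each-group instructions: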
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    expand-text="yes"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="#all"
    version="3.0">

  <!-- Sample data, inlined to keep the example compact -->
  <xsl:param name="data" as="xs:string">01,TeacherHeader1
02,StudentHeader1
03,SubjectHeader1
10,Grade1,Score99
10,Grade2,Score99
48,SubjectTrailer1
49,StudentTrailer1
02,StudentHeader2
03,SubjectHeader1
10,Grade1,Score50
10,Grade2,Score50
48,SubjectTrailer1
49,StudentTrailer2
50,TeacherTrailer1</xsl:param>

  <!-- Record ids and the element names they map to, in the same order -->
  <xsl:param name="header-ids" as="xs:string*"
      select="'01', '02', '03', '10', '48', '49', '50'"/>
  <xsl:param name="header-names" as="xs:string*"
      select="'FileHeader', 'GroupHeader', 'AccountHeader', 'Details', 'AccountTrailer', 'GroupTrailer', 'FileTrailer'"/>

  <!-- Turn each CSV line into a line element with id, name and optional value -->
  <xsl:variable name="lines">
    <xsl:for-each select="tokenize($data, '\r?\n')">
      <line>
        <xsl:variable name="tokens" as="xs:string*" select="tokenize(., ',')"/>
        <id>{$tokens[1]}</id>
        <name>{$tokens[2]}</name>
        <xsl:if test="$tokens[3]">
          <value>{$tokens[3]}</value>
        </xsl:if>
      </line>
    </xsl:for-each>
  </xsl:variable>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/" name="xsl:initial-template">
    <!-- One File per 01 header line -->
    <xsl:for-each-group select="$lines/line" group-starting-with="line[id = '01']">
      <File>
        <xsl:apply-templates select="."/>
        <!-- Everything after the 01 line, up to and including the 50 trailer -->
        <xsl:for-each-group select="current-group() except ." group-ending-with="line[id = '50']">
          <!-- Set the 50 trailer aside and start a GroupRecord at each 02 line -->
          <xsl:for-each-group select="current-group()[position() lt last()]" group-starting-with="line[id = '02']">
            <GroupRecord>
              <xsl:apply-templates select="."/>
              <xsl:for-each-group select="current-group() except ." group-ending-with="line[id = '49']">
                <!-- Set the 49 trailer aside and start an AccountRecord at each 03 line -->
                <xsl:for-each-group select="current-group()[position() lt last()]" group-starting-with="line[id = '03']">
                  <AccountRecord>
                    <xsl:apply-templates select="."/>
                    <AccountDetails>
                      <xsl:apply-templates select="(current-group() except .)[id != '48']"/>
                    </AccountDetails>
                    <xsl:apply-templates select="current-group()[id = '48']"/>
                  </AccountRecord>
                </xsl:for-each-group>
                <!-- The 49 trailer closes the GroupRecord -->
                <xsl:apply-templates select="current-group()[last()]"/>
              </xsl:for-each-group>
            </GroupRecord>
          </xsl:for-each-group>
          <!-- The 50 trailer closes the File -->
          <xsl:apply-templates select="current-group()[last()]"/>
        </xsl:for-each-group>
      </File>
    </xsl:for-each-group>
  </xsl:template>

  <!-- Rename each line to the element name that corresponds to its id -->
  <xsl:template match="line">
    <xsl:element name="{$header-names[index-of($header-ids, current()/id)]}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
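https://xsltfiddle.liberty-development.net/gWEaSv8. For completeness and compactness of the example, the sample data is inlined, but you can of course use <xsl:param name="data" as="xs:string" select="unparsed-text('file.txt')"/> instead. I also used the xsl:mode declaration and name="xsl:initial-template"; both are XSLT 3 features that need to be adapted for an XSLT 2 processor by spelling out the identity transformation and using a different template name, e.g. name="main", as the entry point of the code. I also used text value templates like <id>{$tokens[1]}</id> there; for an XSLT 2 processor you need to use e.g. <id><xsl:value-of select="$tokens[1]"/></id>.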
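Putting those adjustments together, an XSLT 2.0 variant might look like the sketch below. It has not been run against a particular processor, so treat it as a starting point. The template name main is only the example name mentioned above, the data is still inlined (swap in unparsed-text with your own file name), and the element name FileHeader is written without a trailing space.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    version="2.0">

  <!-- Inlined sample data; alternatively use select="unparsed-text('file.txt')" -->
  <xsl:param name="data" as="xs:string">01,TeacherHeader1
02,StudentHeader1
03,SubjectHeader1
10,Grade1,Score99
10,Grade2,Score99
48,SubjectTrailer1
49,StudentTrailer1
02,StudentHeader2
03,SubjectHeader1
10,Grade1,Score50
10,Grade2,Score50
48,SubjectTrailer1
49,StudentTrailer2
50,TeacherTrailer1</xsl:param>

  <xsl:param name="header-ids" as="xs:string*"
      select="'01', '02', '03', '10', '48', '49', '50'"/>
  <xsl:param name="header-names" as="xs:string*"
      select="'FileHeader', 'GroupHeader', 'AccountHeader', 'Details', 'AccountTrailer', 'GroupTrailer', 'FileTrailer'"/>

  <!-- xsl:value-of replaces the XSLT 3 text value templates -->
  <xsl:variable name="lines">
    <xsl:for-each select="tokenize($data, '\r?\n')">
      <xsl:variable name="tokens" as="xs:string*" select="tokenize(., ',')"/>
      <line>
        <id><xsl:value-of select="$tokens[1]"/></id>
        <name><xsl:value-of select="$tokens[2]"/></name>
        <xsl:if test="$tokens[3]">
          <value><xsl:value-of select="$tokens[3]"/></value>
        </xsl:if>
      </line>
    </xsl:for-each>
  </xsl:variable>

  <xsl:output method="xml" indent="yes"/>

  <!-- Identity transformation, spelled out instead of xsl:mode on-no-match="shallow-copy" -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Entry point: named main instead of xsl:initial-template; grouping is unchanged -->
  <xsl:template match="/" name="main">
    <xsl:for-each-group select="$lines/line" group-starting-with="line[id = '01']">
      <File>
        <xsl:apply-templates select="."/>
        <xsl:for-each-group select="current-group() except ." group-ending-with="line[id = '50']">
          <xsl:for-each-group select="current-group()[position() lt last()]" group-starting-with="line[id = '02']">
            <GroupRecord>
              <xsl:apply-templates select="."/>
              <xsl:for-each-group select="current-group() except ." group-ending-with="line[id = '49']">
                <xsl:for-each-group select="current-group()[position() lt last()]" group-starting-with="line[id = '03']">
                  <AccountRecord>
                    <xsl:apply-templates select="."/>
                    <AccountDetails>
                      <xsl:apply-templates select="(current-group() except .)[id != '48']"/>
                    </AccountDetails>
                    <xsl:apply-templates select="current-group()[id = '48']"/>
                  </AccountRecord>
                </xsl:for-each-group>
                <xsl:apply-templates select="current-group()[last()]"/>
              </xsl:for-each-group>
            </GroupRecord>
          </xsl:for-each-group>
          <xsl:apply-templates select="current-group()[last()]"/>
        </xsl:for-each-group>
      </File>
    </xsl:for-each-group>
  </xsl:template>

  <xsl:template match="line">
    <xsl:element name="{$header-names[index-of($header-ids, current()/id)]}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Because the template both matches / and is named main, you can either run the stylesheet against any dummy input document or start it at the named template (with Saxon, for example, via the -it:main command-line option). As in the XSLT 3 version, the result is wrapped in a File element so that the output has a single root.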