Merge two or more XML files using XSLT 3
I have many XML files that I need to merge into one:
hotel1.xml:
<?xml version="1.0" encoding="UTF-8"?>
<menu>
<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>.95</price>
<description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
<calories>650</calories>
</food>
<food>
<name>Strawberry Belgian Waffles</name>
<price>.95</price>
<description>Light Belgian waffles covered with strawberries and whipped cream</description>
<calories>900</calories>
</food>
</breakfast_menu>
</menu>
hotel2.xml:
<?xml version="1.0" encoding="UTF-8"?>
<menu>
<breakfast_menu>
<food>
<name>Berry-Berry Belgian Waffles</name>
<price>.95</price>
<description>Light Belgian waffles covered with an assortment of fresh berries and whipped cream</description>
<calories>900</calories>
</food>
</breakfast_menu>
</menu>
hotel3.xml:
<?xml version="1.0" encoding="UTF-8"?>
<menu>
<breakfast_menu>
<food>
<name>French Toast</name>
<price>.50</price>
<description>Thick slices made from our homemade sourdough bread</description>
<calories>600</calories>
</food>
<food>
<name>Homestyle Breakfast</name>
<price>.95</price>
<description>Two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
<calories>950</calories>
</food>
</breakfast_menu>
</menu>
I first need to add a value to the name element so that I know which file it came from, and then merge all the XML files.
Expected output:
<?xml version="1.0" encoding="UTF-8"?>
<menu>
<breakfast_menu>
<food>
<name>Belgian Waffles-hotel1</name>
<price>.95</price>
<description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
<calories>650</calories>
</food>
<food>
<name>Strawberry Belgian Waffles-hotel1</name>
<price>.95</price>
<description>Light Belgian waffles covered with strawberries and whipped cream</description>
<calories>900</calories>
</food>
<food>
<name>Berry-Berry Belgian Waffles-hotel2</name>
<price>.95</price>
<description>Light Belgian waffles covered with an assortment of fresh berries and whipped cream</description>
<calories>900</calories>
</food>
<food>
<name>French Toast-hotel3</name>
<price>.50</price>
<description>Thick slices made from our homemade sourdough bread</description>
<calories>600</calories>
</food>
<food>
<name>Homestyle Breakfast-hotel3</name>
<price>.95</price>
<description>Two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
<calories>950</calories>
</food>
</breakfast_menu>
</menu>
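I need to merge all the XML files in a given folder into one. Only their contents need to be merged; no checks or updates are required. I also need to keep the source file of each entry in the name element for future reference.
This is my attempt so far.
I would like help from someone more experienced so that the solution uses the latest XSLT 3 and picks up whatever XML files exist in the given folder.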
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="breakfast_menu">
    <xsl:copy>
      <xsl:apply-templates select="*"/>
      <xsl:apply-templates select="document('hotel1.xml')/menu/breakfast_menu/*" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
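There are several ways to do this in XSLT 3. One is to try the new xsl:merge instruction, but I ran into problems when I used it with Saxon 9.8 (see https://saxonica.plan.io/issues/3883 and https://saxonica.plan.io/issues/3884).
Here is a different approach. It seems you simply want to copy all elements at a certain level, in your case the food elements at the third level. The generic stylesheet below takes the select expression as a static parameter (I made it generic as */*/*, but you can of course spell it out as menu/breakfast_menu/food), together with the URI of the input folder and a file name pattern. It then starts with xsl:initial-template (option -it on the Saxon command line), as follows: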
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="#all"
    expand-text="yes"
    version="3.0">

  <xsl:param name="input-uri" as="xs:string" select="'.'"/>
  <xsl:param name="file-pattern" as="xs:string" select="'hotel*.xml'"/>
  <xsl:param name="merge-select-expression" as="xs:string" static="yes" select="'*/*/*'"/>
  <xsl:param name="xslt-pattern-to-add-file-name" as="xs:string" static="yes" select="$merge-select-expression || '/name'"/>

  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:mode on-no-match="shallow-copy" streamable="yes"/>

  <xsl:template _match="{$xslt-pattern-to-add-file-name}">
    <xsl:comment>Copied this {node-name()} element from {tokenize(document-uri(/), '/')[last()]}</xsl:comment>
    <xsl:next-match/>
  </xsl:template>

  <xsl:template name="xsl:initial-template">
    <xsl:sequence select="mf:append-docs(uri-collection($input-uri || '?select=' || $file-pattern))"/>
  </xsl:template>

  <xsl:function name="mf:append-docs" as="document-node()">
    <xsl:param name="doc-uris" as="xs:anyURI+"/>
    <xsl:source-document href="{head($doc-uris)}" streamable="yes">
      <xsl:apply-templates select="." mode="construct">
        <xsl:with-param name="remaining-doc-uris" as="xs:anyURI*" select="tail($doc-uris)" tunnel="yes"/>
      </xsl:apply-templates>
    </xsl:source-document>
  </xsl:function>

  <xsl:mode name="construct" on-no-match="shallow-copy" streamable="yes"/>

  <xsl:template _match="{string-join(tokenize($merge-select-expression, '/')[position() lt last()], '/')}" mode="construct">
    <xsl:param name="remaining-doc-uris" as="xs:anyURI*" tunnel="yes"/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates/>
      <xsl:for-each select="$remaining-doc-uris">
        <xsl:source-document href="{.}" streamable="yes">
          <xsl:apply-templates _select="{$merge-select-expression}"/>
        </xsl:source-document>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
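It should work by simply adjusting the parameters and possibly the body of the <xsl:template _match="{$xslt-pattern-to-add-file-name}"> template, because I chose to output the file name in a comment rather than putting it into the content of the element.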
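If the file name should end up inside the name element, as in the expected output of the question, the body of that template could be changed along the following lines. This is only a sketch and not part of the tested stylesheet: it is meant as a drop-in replacement for the comment-emitting template, it relies on the declarations of the stylesheet above (the xsl namespace and the static parameter xslt-pattern-to-add-file-name), and it assumes that every input file name ends in .xml. I have not run it with streaming in Saxon EE, so treat the streamability as an assumption as well.

```xml
<!-- Sketch: replaces the template that emits the xsl:comment.
     Copies the matched element and appends "-" plus the source file name
     (without its .xml suffix) to the element's text content. -->
<xsl:template _match="{$xslt-pattern-to-add-file-name}">
  <xsl:copy>
    <xsl:apply-templates select="@*"/>
    <!-- Assumption: the last path segment of the document URI is the file name
         and it ends in .xml, so hotel1.xml contributes "-hotel1". -->
    <xsl:value-of select=". || '-' || replace(tokenize(document-uri(/), '/')[last()], '\.xml$', '')"/>
  </xsl:copy>
</xsl:template>
```

With that change the first name element should come out as Belgian Waffles-hotel1 instead of being preceded by a comment. The xsl:value-of form is used rather than a text value template so that indentation in the stylesheet cannot leak into the element content.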
With the three samples in a subdirectory hotel-Whosebug-test of the directory containing the stylesheet append.xsl, and the Saxon command line options -it -xsl:.\append.xsl input-uri=hotel-Whosebug-test file-pattern=hotel*.xml, I get the output
<menu>
<breakfast_menu>
<food><!--Copied this name element from hotel1.xml-->
<name>Belgian Waffles</name>
<price>.95</price>
<description>Two of our famous Belgian Waffles with plenty of real maple syrup</description>
<calories>650</calories>
</food>
<food><!--Copied this name element from hotel1.xml-->
<name>Strawberry Belgian Waffles</name>
<price>.95</price>
<description>Light Belgian waffles covered with strawberries and whipped cream</description>
<calories>900</calories>
</food>
<food><!--Copied this name element from hotel2.xml-->
<name>Berry-Berry Belgian Waffles</name>
<price>.95</price>
<description>Light Belgian waffles covered with an assortment of fresh berries and whipped cream</description>
<calories>900</calories>
</food>
<food><!--Copied this name element from hotel3.xml-->
<name>French Toast</name>
<price>.50</price>
<description>Thick slices made from our homemade sourdough bread</description>
<calories>600</calories>
</food>
<food><!--Copied this name element from hotel3.xml-->
<name>Homestyle Breakfast</name>
<price>.95</price>
<description>Two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
<calories>950</calories>
</food>
</breakfast_menu>
</menu>
The code should work with streaming in Saxon 9.8 EE and with normal XSLT processing in Saxon 9.8 HE or PE.