How to flatten a very complex XML into a new XML containing all nodes at root level
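I'm currently trying to flatten a large recursive XML document so that all nested elements end up at root level but get an additional new attribute (parent_id="...") to preserve the relationship between the nodes.
Each node has a lot of child nodes that I also need to keep, so the content must remain unchanged.
The file is very large (500k lines, 33 MB).
Sample XML: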
<product-catalog ...>
  <category id="1">
    <content>
      ...
    </content>
    <category id="2">
      <content>
        ...
      </content>
    </category>
    <category id="3">
      <content>
        ...
      </content>
      <category id="4">
        ...
      </category>
      <category id="5">
        ...
      </category>
    </category>
  </category>
</product-catalog>
Desired flattened output:
<product-catalog>
  <category id="1" parent_id="0">
    <content>...</content>
  </category>
  <category id="2" parent_id="1">
    <content>...</content>
  </category>
  <category id="3" parent_id="1">
    <content>...</content>
  </category>
  <category id="4" parent_id="3">
    <content>...</content>
  </category>
  <category id="5" parent_id="3">
    <content>...</content>
  </category>
</product-catalog>
This is what I have tried so far, but it only returns the root categories (I'm not really an XSLT expert... ;))
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="category">
    <xsl:element name="category">
      <xsl:apply-templates select="@* | node() [not(child::category)]"/>
    </xsl:element>
  </xsl:template>
  <!-- remove -->
  <xsl:template match="translations" />
</xsl:stylesheet>
Consider the following example:
XML
<product-catalog>
  <category id="1">
    <content>A1</content>
    <category id="2">
      <content>B</content>
    </category>
    <category id="3">
      <content>C1</content>
      <content>C2</content>
      <category id="4">
        <content>D</content>
      </category>
      <category id="5">
        <content>E</content>
      </category>
    </category>
    <content>A2</content>
  </category>
</product-catalog>
XSLT 1.0
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="/product-catalog">
    <xsl:copy>
      <xsl:apply-templates select="category"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="category">
    <category id="{@id}" parent_id="{parent::category/@id}">
      <xsl:copy-of select="content"/>
    </category>
    <xsl:apply-templates select="category"/>
  </xsl:template>
</xsl:stylesheet>
Result
<?xml version="1.0" encoding="UTF-8"?>
<product-catalog>
  <category id="1" parent_id="">
    <content>A1</content>
    <content>A2</content>
  </category>
  <category id="2" parent_id="1">
    <content>B</content>
  </category>
  <category id="3" parent_id="1">
    <content>C1</content>
    <content>C2</content>
  </category>
  <category id="4" parent_id="3">
    <content>D</content>
  </category>
  <category id="5" parent_id="3">
    <content>E</content>
  </category>
</product-catalog>
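Note that this result leaves parent_id empty for the top-level category, while the desired output in the question shows parent_id="0". A minimal sketch of one way to close that gap, assuming "0" is the intended marker for "no parent": build the attribute with xsl:choose instead of an attribute value template. Everything else is the same stylesheet as above.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Emit the wrapper element, then walk the category tree. -->
  <xsl:template match="/product-catalog">
    <xsl:copy>
      <xsl:apply-templates select="category"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="category">
    <category id="{@id}">
      <xsl:attribute name="parent_id">
        <xsl:choose>
          <!-- Nested category: use the id of the enclosing category. -->
          <xsl:when test="parent::category">
            <xsl:value-of select="parent::category/@id"/>
          </xsl:when>
          <!-- Top-level category: "0" is an assumed "no parent" marker. -->
          <xsl:otherwise>0</xsl:otherwise>
        </xsl:choose>
      </xsl:attribute>
      <xsl:copy-of select="content"/>
    </category>
    <!-- Nested categories are processed after the element is closed,
         so they are written as siblings rather than children. -->
    <xsl:apply-templates select="category"/>
  </xsl:template>
</xsl:stylesheet>
```

With the example input, the only difference should be that the category with id="1" is written with parent_id="0".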
How could I copy all existing attributes of <category ...> and add only parent_id?
Try:
<xsl:template match="category">
  <category parent_id="{parent::category/@id}">
    <xsl:copy-of select="@* | content"/>
  </category>
  <xsl:apply-templates select="category"/>
</xsl:template>
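The question also says each category has many child nodes that must be preserved, not only content. The following is an untested sketch of how the template above might be generalized: it copies every attribute and every child node except nested category elements. It assumes the source category elements do not already carry a parent_id attribute, and it still writes an empty parent_id for top-level categories.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Emit the wrapper element, then walk the category tree. -->
  <xsl:template match="/product-catalog">
    <xsl:copy>
      <xsl:apply-templates select="category"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="category">
    <!-- xsl:copy makes a shallow copy of the category element. -->
    <xsl:copy>
      <!-- Attributes have to be written before any child nodes. -->
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="parent_id">
        <xsl:value-of select="parent::category/@id"/>
      </xsl:attribute>
      <!-- Deep-copy all children except nested categories. -->
      <xsl:copy-of select="node()[not(self::category)]"/>
    </xsl:copy>
    <!-- Nested categories follow as siblings. -->
    <xsl:apply-templates select="category"/>
  </xsl:template>
</xsl:stylesheet>
```

If some children should be dropped, as with the translations element in the question's own attempt, the predicate could presumably be extended to `node()[not(self::category or self::translations)]`.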