XSL - How to create GraphML edges from XML to connect nodes with the same author/actors?

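I have an XML file that lists movies. Each movie has some metadata describing its plot, actors, directors, and so on. Here is a sample of the structure: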

<movies>
    <movie>
        <title>The Shawshank Redemption</title>
        <year>1994</year>
        <rated>R</rated>
        <released>1994 Oct 14</released>
        <runtime>142 min</runtime>
        <genres>
            <genre>Crime</genre>
            <genre>Drama</genre>
        </genres>
        <directors>
            <director>Name Surname</director>
        </directors>
        <writers>
            <writer>Stephen King (short story 'Rita Hayworth and Shawshank Redemption')</writer>
            <writer>Frank Darabont (screenplay)</writer>
        </writers>
        <actors>
            <actor>Tim Robbins</actor>
            <actor>Morgan Freeman</actor>
            <actor>Bob Gunton</actor>
            <actor>William Sadler</actor>
        </actors>
        <plot>Two imprisoned men bond over a number of years, finding solace and eventual redemption through acts of common decency.</plot>
        <languages>
            <language>English</language>
        </languages>
        <countries>
            <country>USA</country>
        </countries>
        <awards>Nominated for 7 Oscars. Another 16 wins and 16 nominations.</awards>
        <poster>http://ia.media-imdb.com/images/M/MV5BODU4MjU4NjIwNl5BMl5BanBnXkFtZTgwMDU2MjEyMDE@._V1_SX300.jpg</poster>
        <metascore>80</metascore>
        <imdbRating>9.3</imdbRating>
        <imdbVotes>1358212</imdbVotes>
        <imdbID>tt0111161</imdbID>
        <type>movie</type>
    </movie>
    <movie>
        ...
    </movie>
    <movie>
        ...
    </movie>
    ...
</movies>

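I have to create an XSL stylesheet that transforms this file into a GraphML file showing the relationship between actors and movies: the nodes are movies, and two nodes are connected by an edge if at least one actor appears in both movies. For example: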

<key id="actors" for="edge" attr.name="actors" attr.type="int">
    <default>1</default>
</key>

<graph id="movies" edgedefault="undirected">

<node id="movie title 1"/>
<node id="movie title 2"/>
<node id="movie title 3"/>
...

<edge source="movie title 1" target="movie title 2">
    <data key="actors">2</data> <!-- number of actors who appear in both "movie title 1" and "movie title 2" -->
</edge>

Here is the XSL fragment that lists the nodes:

<!-- group by title so that each distinct movie title yields one node -->
<xsl:for-each-group select="/movies/movie" group-by="title">
    <xsl:sort select="current-grouping-key()"/>
    <node><xsl:attribute name="id"><xsl:value-of select="current-grouping-key()"/></xsl:attribute></node>
    <xsl:text>&#xa;</xsl:text>
</xsl:for-each-group>
<xsl:text>&#xa;</xsl:text>

Thanks in advance for your answers.

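I don't think your question is very clear. If, as it seems, you want a graph that connects movies sharing the same actors, then you should start with an example that (a) has more than one movie, and (b) has some movies with actors in common: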

XML

<movies>
   <movie>
      <title>Alpha</title>
      <actors>
         <actor>Adam</actor>
         <actor>Betty</actor>
         <actor>Cecil</actor>
      </actors>
   </movie>
   <movie>
      <title>Bravo</title>
      <actors>
         <actor>Adam</actor>
         <actor>Betty</actor>
         <actor>David</actor>
      </actors>
   </movie>
   <movie>
      <title>Charlie</title>
      <actors>
         <actor>Adam</actor>
         <actor>David</actor>
         <actor>Eve</actor>
      </actors>
   </movie>
   <movie>
      <title>Delta</title>
      <actors>
         <actor>Cecil</actor>
         <actor>Eve</actor>
      </actors>
   </movie>
   <movie>
      <title>Echo</title>
      <actors>
         <actor>Frank</actor>
         <actor>George</actor>
      </actors>
   </movie>
</movies>

Now, applying the following stylesheet:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:key name="movie-by-actor" match="movie" use="actors/actor" />

<xsl:template match="/movies">
    <graphml xmlns="http://graphml.graphdrawing.org/xmlns"  
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
        <key id="actors" for="edge" attr.name="actors" attr.type="int"/>
        <graph id="movies" edgedefault="undirected">
            <xsl:for-each select="movie">
                <xsl:variable name="source" select="." />
                <node id="{title}"/>
                <!-- every other movie that shares at least one actor with the current movie -->
                <xsl:for-each select="key('movie-by-actor', actors/actor)[not(title=$source/title)]">
                    <edge source="{$source/title}" target="{title}">
                        <data key="actors">
                            <!-- number of actors the two movies have in common -->
                            <xsl:value-of select="count(actors/actor[.=$source/actors/actor])"/>
                        </data>
                    </edge>
                </xsl:for-each>
            </xsl:for-each>
        </graph>
    </graphml>
</xsl:template>

</xsl:stylesheet>

will produce the following result:

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
   <key id="actors" for="edge" attr.name="actors" attr.type="int"/>
   <graph id="movies" edgedefault="undirected">
      <node id="Alpha"/>
      <edge source="Alpha" target="Bravo">
         <data key="actors">2</data>
      </edge>
      <edge source="Alpha" target="Charlie">
         <data key="actors">1</data>
      </edge>
      <edge source="Alpha" target="Delta">
         <data key="actors">1</data>
      </edge>
      <node id="Bravo"/>
      <edge source="Bravo" target="Alpha">
         <data key="actors">2</data>
      </edge>
      <edge source="Bravo" target="Charlie">
         <data key="actors">2</data>
      </edge>
      <node id="Charlie"/>
      <edge source="Charlie" target="Alpha">
         <data key="actors">1</data>
      </edge>
      <edge source="Charlie" target="Bravo">
         <data key="actors">2</data>
      </edge>
      <edge source="Charlie" target="Delta">
         <data key="actors">1</data>
      </edge>
      <node id="Delta"/>
      <edge source="Delta" target="Alpha">
         <data key="actors">1</data>
      </edge>
      <edge source="Delta" target="Charlie">
         <data key="actors">1</data>
      </edge>
      <node id="Echo"/>
   </graph>
</graphml>

This is quite possibly the result you are looking for (I couldn't find an online GraphML viewer, so I cannot be sure).
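To see why the stylesheet above needs the `[not(title=$source/title)]` predicate, it may help to look at what the key lookup returns on its own. The following is only an illustrative sketch, not part of the solution: it reuses the same `movie-by-actor` key and prints, as plain text, the titles that `key()` returns for each movie.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>

<!-- Same key as above: each movie is indexed under every one of its actors -->
<xsl:key name="movie-by-actor" match="movie" use="actors/actor" />

<xsl:template match="/movies">
    <xsl:for-each select="movie">
        <xsl:value-of select="title"/>
        <xsl:text>: </xsl:text>
        <!-- With a node-set as the second argument, key() returns the union of the
             movies indexed under any of the current movie's actors, each movie once -->
        <xsl:for-each select="key('movie-by-actor', actors/actor)">
            <xsl:value-of select="title"/>
            <xsl:if test="position() != last()">, </xsl:if>
        </xsl:for-each>
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

With the sample input, the line for Alpha should read `Alpha: Alpha, Bravo, Charlie, Delta`, and the line for Echo just `Echo: Echo`. The lookup always includes the current movie itself, which is why the solution filters it out before creating edges.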

However, in the graph above each edge appears twice, once in each direction. If that is a problem, you can eliminate the duplicates by using:

XSLT 1.0

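<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/movies">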
    <graphml xmlns="http://graphml.graphdrawing.org/xmlns"  
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
        <key id="actors" for="edge" attr.name="actors" attr.type="int"/>
        <graph id="movies" edgedefault="undirected">
            <xsl:for-each select="movie">
                <xsl:variable name="source" select="." />
                <node id="{title}"/>
                <!-- only movies that come later in the document, so each pair is emitted once -->
                <xsl:for-each select="following-sibling::movie[actors/actor=$source/actors/actor]">
                    <edge source="{$source/title}" target="{title}">
                        <data key="actors">
                            <!-- number of actors the two movies have in common -->
                            <xsl:value-of select="count(actors/actor[.=$source/actors/actor])"/>
                        </data>
                    </edge>
                </xsl:for-each>
            </xsl:for-each>
        </graph>
    </graphml>
</xsl:template>

</xsl:stylesheet>

to obtain this result:

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
   <key id="actors" for="edge" attr.name="actors" attr.type="int"/>
   <graph id="movies" edgedefault="undirected">
      <node id="Alpha"/>
      <edge source="Alpha" target="Bravo">
         <data key="actors">2</data>
      </edge>
      <edge source="Alpha" target="Charlie">
         <data key="actors">1</data>
      </edge>
      <edge source="Alpha" target="Delta">
         <data key="actors">1</data>
      </edge>
      <node id="Bravo"/>
      <edge source="Bravo" target="Charlie">
         <data key="actors">2</data>
      </edge>
      <node id="Charlie"/>
      <edge source="Charlie" target="Delta">
         <data key="actors">1</data>
      </edge>
      <node id="Delta"/>
      <node id="Echo"/>
   </graph>
</graphml>
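
Since the `xsl:for-each-group` fragment in the question suggests that an XSLT 2.0 processor is available, the two approaches above could probably be combined: keep the key for the lookup and use the XPath 2.0 node-order operator `>>` to retain only the movies that come after the current one. This is a sketch under that assumption (it will not run on an XSLT 1.0 processor); the schema location attributes are left out for brevity.

```xml
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:key name="movie-by-actor" match="movie" use="actors/actor" />

<xsl:template match="/movies">
    <graphml xmlns="http://graphml.graphdrawing.org/xmlns">
        <key id="actors" for="edge" attr.name="actors" attr.type="int"/>
        <graph id="movies" edgedefault="undirected">
            <xsl:for-each select="movie">
                <xsl:variable name="source" select="." />
                <node id="{title}"/>
                <!-- key() finds the movies sharing an actor with the current one;
                     ". >> $source" keeps only those later in document order,
                     so each pair of movies yields a single edge -->
                <xsl:for-each select="key('movie-by-actor', actors/actor)[. >> $source]">
                    <edge source="{$source/title}" target="{title}">
                        <data key="actors">
                            <xsl:value-of select="count(actors/actor[. = $source/actors/actor])"/>
                        </data>
                    </edge>
                </xsl:for-each>
            </xsl:for-each>
        </graph>
    </graphml>
</xsl:template>

</xsl:stylesheet>
```

For the sample input this should give the same five edges as the result above. Note that, like the other versions, it uses the movie title as the node id, which assumes titles are unique.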