Edit XML files with Python

I am trying to edit several XML files at once using Python. In the original XML files I have speakers and what they say, but not as parent-child tags. Like this:

     <p Speaker>John</p>
     <p Text>Speech he's giving</p>
     <p Text>Speech he's giving</p>
     <p Speaker>Laura</p>
     <p Text>Speech she's giving</p>
     <p Text>Speech she's giving</p>

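But I want to create a parent-child relationship between each speaker and their text. I also have a database with information about the speakers, and I would like to add that information too, such as their speaker ID and their role, and to number the speakers and their speeches. Like this: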

    <u xml:id="speakercount.u1"
       who="#speakerid"
       ana="#role">
        <seg xml:id="speechcount.u.1.1">text</seg>
        <seg xml:id="speechcount.u.1.2">text</seg>
    </u>
    <u xml:id="speakercount.u2"
       who="#speakerid"
       ana="#role">
        <seg xml:id="speechcount.u.2.1">text</seg>
        <seg xml:id="speechcount.u.2.2">text</seg>
    </u>

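Is this possible? And can it be done for several XML files at once? Which Python modules would I need? I can't seem to find the information I need to do this...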

You can use XSLT for this.

To give you some guidance: build a single XML file that combines the database info with the content of those XML files, like this:

<root>
  <speakers>
    <speaker id="1" role="A" name="John"/>
    <speaker id="2" role="B" name="Laura"/>
  </speakers>
  <ps>
    <p type="Speaker">John</p>
    <p type="Text">Speech he's giving</p>
    <p type="Text">Speech he's giving</p>
    <p type="Speaker">Laura</p>
    <p type="Text">Speech she's giving</p>
    <p type="Text">Speech she's giving</p>    
  </ps>
</root>
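
For illustration only, here is a minimal Python sketch of assembling such a combined file with the standard-library `xml.etree.ElementTree`. The `speakers` list stands in for rows fetched from your database, and `build_combined` is a hypothetical helper name. It assumes the source paragraphs carry a `type` attribute as in the sample above.

```python
import xml.etree.ElementTree as ET


def build_combined(speakers, source_root):
    """Wrap the speaker records and the <p> elements of one document in <root>."""
    root = ET.Element("root")
    speakers_el = ET.SubElement(root, "speakers")
    for rec in speakers:
        ET.SubElement(
            speakers_el,
            "speaker",
            id=str(rec["id"]),
            role=rec["role"],
            name=rec["name"],
        )
    ps = ET.SubElement(root, "ps")
    # Copy every <p> found in the source document, in document order.
    for p in source_root.iter("p"):
        ps.append(p)
    return root


# Hypothetical database rows; in practice these would come from a query.
speakers = [
    {"id": 1, "role": "A", "name": "John"},
    {"id": 2, "role": "B", "name": "Laura"},
]

# Hypothetical source document with the same shape as the sample above.
source = ET.fromstring(
    "<doc>"
    '<p type="Speaker">John</p>'
    '<p type="Text">Speech he\'s giving</p>'
    '<p type="Speaker">Laura</p>'
    '<p type="Text">Speech she\'s giving</p>'
    "</doc>"
)

combined = build_combined(speakers, source)
print(ET.tostring(combined, encoding="unicode"))
```

To process several files, the same function could be called once per parsed file (`ET.parse(path).getroot()`), writing each combined tree to its own output file.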

Then use XSLT like this (`xsl:for-each-group` requires an XSLT 2.0 processor such as Saxon):

<?xml version='1.0' encoding='UTF-8'?>
<xsl:stylesheet 
  version="2.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
  <xsl:output indent="yes"/>
  
  <xsl:variable name="speakers" as="element()*" select="/*/speakers/speaker"/>

  <xsl:template match="*[p]">
    <us>
      <xsl:for-each-group select="p" group-starting-with="p[@type='Speaker']">
        <xsl:variable name="speakerName" select="current-group()[1]/text()"/>
        <xsl:variable name="speakerDbRecord" select="$speakers[@name=$speakerName]"/>
        <xsl:variable name="posSpeaker" select="position()"/>
        <u xml:id="speakercount.u{$posSpeaker}"
          who="#{$speakerDbRecord/@id}"
          ana="#{$speakerDbRecord/@role}">
          <xsl:for-each select="current-group()[ position() gt 1]">
            <!-- position() is already 1-based within the selected Text paragraphs -->
            <xsl:variable name="posText" select="position()"/>
            <seg xml:id="speechcount.u.{$posSpeaker}.{$posText}"><xsl:value-of select="."/></seg>
          </xsl:for-each>
        </u>
      </xsl:for-each-group>
    </us>
  </xsl:template>

</xsl:stylesheet>

This will give you:

<us>
   <u xml:id="speakercount.u1" who="#1" ana="#A">
      <seg xml:id="speechcount.u.1.1">Speech he's giving</seg>
      <seg xml:id="speechcount.u.1.2">Speech he's giving</seg>
   </u>
   <u xml:id="speakercount.u2" who="#2" ana="#B">
      <seg xml:id="speechcount.u.2.1">Speech she's giving</seg>
      <seg xml:id="speechcount.u.2.2">Speech she's giving</seg>
   </u>
</us>
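
Note that `lxml` only supports XSLT 1.0, so the stylesheet above needs a 2.0 processor. If you would rather stay in plain Python, the same grouping can be sketched with the standard library. This is a rough sketch rather than a drop-in solution: `group_utterances` and `convert_folder` are hypothetical names, the `speakers` dict stands in for your database lookup, and the input is assumed to use `<p type="Speaker">` / `<p type="Text">` as above. Segments are numbered from 1, as in the format the question asks for.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# ElementTree serializes this namespace with the reserved "xml" prefix.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def group_utterances(ps, speakers):
    """Turn flat <p type="Speaker|Text"> elements into nested <u>/<seg>."""
    us = ET.Element("us")
    u = None
    u_count = 0
    seg_count = 0
    for p in ps:
        if p.get("type") == "Speaker":
            # A Speaker paragraph starts a new utterance group.
            u_count += 1
            seg_count = 0
            rec = speakers.get((p.text or "").strip(), {})
            u = ET.SubElement(us, "u", {
                XML_ID: f"speakercount.u{u_count}",
                "who": f"#{rec.get('id', '')}",
                "ana": f"#{rec.get('role', '')}",
            })
        elif u is not None:
            # Any other paragraph belongs to the most recent speaker.
            seg_count += 1
            seg = ET.SubElement(
                u, "seg", {XML_ID: f"speechcount.u.{u_count}.{seg_count}"}
            )
            seg.text = p.text
    return us


def convert_folder(src_dir, dst_dir, speakers):
    """Apply the grouping to every *.xml file in src_dir (assumed layout)."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(src_dir).glob("*.xml")):
        ps = ET.parse(str(path)).getroot().iter("p")
        us = group_utterances(ps, speakers)
        ET.ElementTree(us).write(
            str(dst / path.name), encoding="utf-8", xml_declaration=True
        )


# Hypothetical speaker records keyed by name.
speakers = {
    "John": {"id": "1", "role": "A"},
    "Laura": {"id": "2", "role": "B"},
}

sample = ET.fromstring(
    "<ps>"
    '<p type="Speaker">John</p>'
    '<p type="Text">Speech he\'s giving</p>'
    '<p type="Text">Speech he\'s giving</p>'
    '<p type="Speaker">Laura</p>'
    '<p type="Text">Speech she\'s giving</p>'
    '<p type="Text">Speech she\'s giving</p>'
    "</ps>"
)

us = group_utterances(sample.iter("p"), speakers)
print(ET.tostring(us, encoding="unicode"))
```

Calling `convert_folder("in", "out", speakers)` would then handle a whole directory in one pass; the directory names here are placeholders.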