Edit XML files with Python
I am trying to edit several XML files at once with Python.
In the original XML files I have speakers and what they say, but not as parent and child tags. Like this:
<p Speaker>John</p>
<p Text>Speech he's giving</p>
<p Text>Speech he's giving</p>
<p Speaker>Laura</p>
<p Text>Speech she's giving</p>
<p Text>Speech she's giving</p>
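But I want a parent-child relationship between each speaker and their text. In addition, I want to add information from a database I already have that holds speaker details, such as their speaker ID and their role, and to count how many times they speak. Like this: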
<u xml:id="speakercount.u1"
who="#speakerid"
ana="#role">
<seg xml:id="speechcount.u.1.1">text</seg>
<seg xml:id="speechcount.u.1.2">text</seg>
</u>
<u xml:id="speakercount.u2"
who="#speakerid"
ana="#role">
<seg xml:id="speechcount.u.2.1">text</seg>
<seg xml:id="speechcount.u.2.2">text</seg>
</u>
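Is this possible? And can it be done for several XML files at once? Which Python modules do I need? I can't seem to find the information I need to do this...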
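You can use XSLT for this.
To give you some guidance: build a separate XML file that combines the database information with the content of those XML files (here the Speaker/Text marker is written as a type attribute), like this: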
<root>
<speakers>
<speaker id="1" role="A" name="John"/>
<speaker id="2" role="B" name="Laura"/>
</speakers>
<ps>
<p type="Speaker">John</p>
<p type="Text">Speech he's giving</p>
<p type="Text">Speech he's giving</p>
<p type="Speaker">Laura</p>
<p type="Text">Speech she's giving</p>
<p type="Text">Speech she's giving</p>
</ps>
</root>
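Since the question asks about Python, here is a minimal sketch of how such a combined file could be built with the standard-library xml.etree.ElementTree module. SPEAKER_DB and build_combined are hypothetical names standing in for the real speaker database and whatever helper you write; the sketch also assumes the source paragraphs carry a type attribute, as in the sample above.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the speaker database: name -> (id, role).
SPEAKER_DB = {"John": ("1", "A"), "Laura": ("2", "B")}


def build_combined(paragraphs, speaker_db):
    """Return a <root> element with <speakers> from the database and the <p> elements under <ps>."""
    root = ET.Element("root")
    speakers = ET.SubElement(root, "speakers")
    for name, (speaker_id, role) in speaker_db.items():
        ET.SubElement(speakers, "speaker", {"id": speaker_id, "role": role, "name": name})
    ps = ET.SubElement(root, "ps")
    for p in paragraphs:
        ps.append(p)
    return root


# Small in-memory source document (assumed shape, for illustration only).
source = ET.fromstring(
    "<doc>"
    '<p type="Speaker">John</p>'
    "<p type=\"Text\">Speech he's giving</p>"
    "</doc>"
)
combined = build_combined(list(source.iter("p")), SPEAKER_DB)
print(ET.tostring(combined, encoding="unicode"))
```

For real files, ET.parse(path).getroot() would replace the in-memory source, and ET.ElementTree(combined).write(...) would save the result.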
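Then use an XSLT 2.0 stylesheet like this (xsl:for-each-group needs an XSLT 2.0 processor such as Saxon; the XSLT support in lxml is 1.0 only):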
<?xml version='1.0' encoding='UTF-8'?>
<xsl:stylesheet
version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:output indent="yes"/>
<xsl:variable name="speakers" as="element()*" select="/*/speakers/speaker"/>
<xsl:template match="*[p]">
<us>
<xsl:for-each-group select="p" group-starting-with="p[@type='Speaker']">
<xsl:variable name="speakerName" select="current-group()[1]/text()"/>
<xsl:variable name="speakerDbRecord" select="$speakers[@name=$speakerName]"/>
<xsl:variable name="posSpeaker" select="position()"/>
<u xml:id="speakercount.u{$posSpeaker}"
who="#{$speakerDbRecord/@id}"
ana="#{$speakerDbRecord/@role}">
<xsl:for-each select="current-group()[ position() gt 1]">
<xsl:variable name="posText" select="position()"/>
<seg xml:id="speechcount.u.{$posSpeaker}.{$posText}"><xsl:value-of select="."/></seg>
</xsl:for-each>
</u>
</xsl:for-each-group>
</us>
</xsl:template>
</xsl:stylesheet>
This will give you:
<us>
<u xml:id="speakercount.u1" who="#1" ana="#A">
<seg xml:id="speechcount.u.1.1">Speech he's giving</seg>
<seg xml:id="speechcount.u.1.2">Speech he's giving</seg>
</u>
<u xml:id="speakercount.u2" who="#2" ana="#B">
<seg xml:id="speechcount.u.2.1">Speech she's giving</seg>
<seg xml:id="speechcount.u.2.2">Speech she's giving</seg>
</u>
</us>
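If you would rather stay entirely in Python, the same grouping can be sketched with the standard library alone: xml.etree.ElementTree for the XML and pathlib for looping over several files. This is only a sketch under assumptions. SPEAKER_DB, group_utterances and convert_folder are hypothetical names, the source paragraphs are assumed to carry a type attribute, segments are numbered from 1 as in the question's target format, and the "unknown" fallback for speakers missing from the database is my own choice.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# ElementTree serializes this namespace with the reserved "xml" prefix, i.e. as xml:id.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical stand-in for the speaker database: name -> (id, role).
SPEAKER_DB = {"John": ("1", "A"), "Laura": ("2", "B")}


def group_utterances(root, speaker_db):
    """Wrap each Speaker paragraph and the Text paragraphs that follow it in one <u>."""
    us = ET.Element("us")
    current = None
    u_count = 0
    seg_count = 0
    for p in root.iter("p"):
        text = (p.text or "").strip()
        if p.get("type") == "Speaker":
            u_count += 1
            seg_count = 0
            # Assumption: speakers missing from the database get a placeholder.
            speaker_id, role = speaker_db.get(text, ("unknown", "unknown"))
            current = ET.SubElement(us, "u", {
                XML_ID: f"speakercount.u{u_count}",
                "who": f"#{speaker_id}",
                "ana": f"#{role}",
            })
        elif current is not None:
            seg_count += 1
            seg = ET.SubElement(current, "seg", {XML_ID: f"speechcount.u.{u_count}.{seg_count}"})
            seg.text = text
    return us


def convert_folder(in_dir, out_dir, speaker_db):
    """Convert every *.xml file in in_dir and write the result under the same name in out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(in_dir).glob("*.xml")):
        us = group_utterances(ET.parse(path).getroot(), speaker_db)
        ET.ElementTree(us).write(str(out / path.name), encoding="utf-8", xml_declaration=True)
```

Calling convert_folder("input", "output", SPEAKER_DB), with folder names of your choosing, would process a whole directory in one run. Text paragraphs that appear before the first Speaker paragraph are skipped in this sketch.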