Trying to take an Atom feed and parse out an XHTML section with XSLT
I am trying to take a NOAA RSS feed (the NOAA website says it uses ATOM and CAPS) and transform it with XSLT for use in SharePoint. I am new to this and have limited experience working with XSLT. Here is a sample of the feed.
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
xmlns:georss="http://www.georss.org/georss">
<id>urn:uuid:9ae4ae29-830f-4870-bace-0f70984b76bd</id><title>
TSUNAMI INFORMATION STATEMENT NUMBER 1 </title>
<updated>2022-01-29T03:00:32Z</updated>
<author>
<name>NWS PACIFIC TSUNAMI WARNING CENTER HONOLULU HI</name>
<uri>http://ntwc.arh.noaa.gov/</uri>
<email>ntwc@noaa.gov</email>
</author>
<icon>http://ntwc.arh.noaa.gov/images/favicon.ico</icon>
<link type="application/atom+xml" rel="self" title="self"
href="http://ntwc.arh.noaa.gov/events/xml/PAAQAtom.xml"/>
<link rel="related" title="Energy Map"
<entry>
<title>KERMADEC ISLANDS REGION</title><updated>2022-01-29T03:00:32Z</updated>
<geo:lat>-29.751</geo:lat>
<geo:long>-174.709</geo:long>
<summary type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<strong>Category:</strong> Information<br/>
<strong>Bulletin Issue Time: </strong> 2022.01.29 03:00:32 UTC
<br/><strong>Preliminary Magnitude: </strong>6.6(Mwp)<br/>
<strong>Lat/Lon: </strong>-29.751 / -174.709<br/>
<strong>Affected Region: </strong>KERMADEC ISLANDS REGION<br/>
</div>
</summary>
</entry>
</feed>
My problem is getting the "summary type=xhtml" section into a readable format (as shown below) instead of one long run-on sentence.
CATEGORY: Information
BULLETIN ISSUE TIME:
PRELIMINARY MAGNITUDE:
Can anyone give me some advice on how to parse this information in XSLT?
Thank you in advance.
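As far as I know, there is no standard format for the content of an Atom summary. If your data provider sticks to the format shown in your example, then, given well-formed XML input such as: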
XML
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
xmlns:georss="http://www.georss.org/georss">
<id>urn:uuid:9ae4ae29-830f-4870-bace-0f70984b76bd</id><title>
TSUNAMI INFORMATION STATEMENT NUMBER 1 </title>
<updated>2022-01-29T03:00:32Z</updated>
<author>
<name>NWS PACIFIC TSUNAMI WARNING CENTER HONOLULU HI</name>
<uri>http://ntwc.arh.noaa.gov/</uri>
<email>ntwc@noaa.gov</email>
</author>
<icon>http://ntwc.arh.noaa.gov/images/favicon.ico</icon>
<link type="application/atom+xml" rel="self" title="self"
href="http://ntwc.arh.noaa.gov/events/xml/PAAQAtom.xml"/>
<entry>
<title>KERMADEC ISLANDS REGION</title><updated>2022-01-29T03:00:32Z</updated>
<geo:lat>-29.751</geo:lat>
<geo:long>-174.709</geo:long>
<summary type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<strong>Category:</strong> Information<br/>
<strong>Bulletin Issue Time: </strong> 2022.01.29 03:00:32 UTC
<br/><strong>Preliminary Magnitude: </strong>6.6(Mwp)<br/>
<strong>Lat/Lon: </strong>-29.751 / -174.709<br/>
<strong>Affected Region: </strong>KERMADEC ISLANDS REGION<br/>
</div>
</summary>
</entry>
</feed>
you could do:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:a="http://www.w3.org/2005/Atom"
xmlns:x="http://www.w3.org/1999/xhtml">
<xsl:output method="text" encoding="UTF-8" />
<xsl:template match="/a:feed">
<xsl:for-each select="a:entry/a:summary/x:div/x:strong">
<xsl:value-of select="." />
<xsl:value-of select="normalize-space(following-sibling::text()[1])" />
<xsl:text>&#10;</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
to get:
Result
Category:Information
Bulletin Issue Time: 2022.01.29 03:00:32 UTC
Preliminary Magnitude: 6.6(Mwp)
Lat/Lon: -29.751 / -174.709
Affected Region: KERMADEC ISLANDS REGION
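The question asks for upper-case labels (CATEGORY:, BULLETIN ISSUE TIME:, and so on). A minimal sketch of one way to get that from the same approach, assuming the labels contain only ASCII letters: XSLT 1.0 has no upper-case function, so the labels are run through translate(). The variable names `lower` and `upper` are introduced here for illustration only.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:a="http://www.w3.org/2005/Atom"
xmlns:x="http://www.w3.org/1999/xhtml">
<xsl:output method="text" encoding="UTF-8"/>

<!-- Illustrative helper alphabets for translate(); ASCII letters only (assumption) -->
<xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
<xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

<xsl:template match="/a:feed">
<xsl:for-each select="a:entry/a:summary/x:div/x:strong">
<!-- Label: trim surrounding whitespace, then upper-case it -->
<xsl:value-of select="translate(normalize-space(.), $lower, $upper)"/>
<!-- One space between label and value -->
<xsl:text> </xsl:text>
<!-- Value: the text node immediately following the label element -->
<xsl:value-of select="normalize-space(following-sibling::text()[1])"/>
<!-- Line break after each label/value pair -->
<xsl:text>&#10;</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
```

Applied to the well-formed input above, the first line of output should read `CATEGORY: Information`. Because the label is normalized before the separator space is written, every line gets exactly one space after the colon, whether or not the source `strong` element includes a trailing space.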