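Get attribute from different RSS Feeds Format
I have to extract the author of each article in an RSS feed. The problem is that one feed lists the author name in a dc:creator element, while another uses author (samples below). Is there a way to make my query handle both cases?
Query: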
CREATE PROCEDURE feed.usp_importXML(@file VARCHAR(8000))
AS
BEGIN
DECLARE @Query VARCHAR(8000)
SET @Query ='
DECLARE @xmlFile as XML
SET @xmlFile =(SELECT CONVERT(XML,BulkColumn) as BulkColumn
FROM OPENROWSET (BULK '''+@file+''', SINGLE_BLOB) AS t)
INSERT INTO feed.tempXML (source, title,link,author,[date])
SELECT
source = t.value (''../link[1]'', ''NVARCHAR(300)''),
title = t.value (''title[1]'', ''NVARCHAR(300)''),
link = t.value (''./link[1]'', ''NVARCHAR(300)''),
author = t.value(''(*:creator)[1]'',''NVARCHAR(50)''),
[date] = t.value(''pubDate[1]'',''NVARCHAR(50)'')
FROM @xmlFile.nodes(''/rss/channel/item'') AS xTable(t);'
EXEC(@Query)
END
GO
RSS 1:
<item>
<guid isPermaLink="false">http://www.espnfc.com/story/3154621/wojciech-szczesny-completes-transfer-to-juventus-from-arsenal3154621</guid>
<title><![CDATA[Wojciech Szczesny completes transfer to Juventus from Arsenal]]></title>
<description>
<img style="float: left; margin-right: 10px;" src="http://a.espncdn.com/combiner/i/?img=/photo/2016/1002/r134626_1296x729_16-9.jpg&amp;w=100&amp;h=80&amp;scale=crop&amp;site=espnfc" /><![CDATA[Douglas Costa hopes to evolve as a player with Juventus and gain recognition for Brazil's World Cup squad next year.
Juventus have completed the signing of Wojciech Szczesny from Arsenal for a fee of &#8364;12.2 million.
Poland goalkeeper Szczesny underwent his medical in Turin on Tuesday and officially became a Juventus player on Wednesday in a deal that could rise to &#8364;15.3 million, depending on performance.
The 27-year-old, who spent the last two seasons on loan at Roma, has signed a four-year contract for the Bianconeri, where he is expected to be understudy to Italy No. 1 Gianlugi Buffon in the coming...]]>
</description>
<link>http://www.espnfc.com/story/3154621/wojciech-szczesny-completes-transfer-to-juventus-from-arsenal</link>
<pubDate>Wed, 19 Jul 2017 06:19:00 PDT</pubDate>
<enclosure length="150" url="http://a.espncdn.com/combiner/i/?img=/photo/2016/1002/r134626_1296x729_16-9.jpg&amp;w=100&amp;h=80&amp;scale=crop&amp;site=espnfc" type="image/jpeg" />
<category>Story</category>
<category><![CDATA[Transfers]]></category>
<category><![CDATA[Juventus]]></category>
<category><![CDATA[Arsenal]]></category>
<category><![CDATA[Wojciech Szczesny]]></category>
<category><![CDATA[English Premier League]]></category>
<category><![CDATA[Italian Serie A]]></category>
<dc:creator>Ben Gladwell</dc:creator>
</item>
RSS 2:
<item>
<title>Sampdoria Striker Patrick Schick Could Be Set to Join Inter After Collapse of Juventus Deal</title>
<link>http://www.90min.com/posts/5285895-sampdoria-striker-patrick-schick-could-be-set-to-join-inter-after-collapse-of-juventus-deal?utm_source=RSS</link>
<author>Callum Rice-Coates</author>
<guid isPermaLink="false">d5a2ba8b504a22fcdb405ec687f91956</guid>
<description>Sampdoria striker Patrick Schick could be on the verge of a move to Inter after a proposed deal to join Juventus fell through. Ginaluca Di Marzio has reported that the Czech forward's representatives have met with the Inter hierarchy to discuss the details of the potential transfer. According to the Italian journalist, Schick, who found the net 11 times in 32 Serie A appearances last season, 'could soon enjoy a new experience at Inter.' #Calciomercato | #Inter, incontro in corso con la...</description>
<media:thumbnail type="image/jpg" url="https://images0.minutemediacdn.com/production/912x516/596f80ed6bd5c5594b000001.jpg?main_image=true&imageType=.jpg"/>
<pubDate>Wed, 19 Jul 2017 19:43:56 +0000</pubDate>
</item>
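Instead of converting BulkColumn directly to XML, convert it to NVARCHAR(MAX) first.
Then use the REPLACE function on that string to replace <dc:creator> with <author> and </dc:creator> with </author>.
Convert the new string to XML and read the author element.
Proceed with the SELECT from the XML as before.
Code snippet: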
SET @Query ='
DECLARE @xmlFile as XML
DECLARE @xmlString NVARCHAR(MAX);
SET @xmlString =(SELECT CONVERT(NVARCHAR(MAX),BulkColumn) as BulkColumn
FROM OPENROWSET (BULK '''+@file+''', SINGLE_BLOB) AS t);
SET @xmlString = REPLACE(@xmlString, ''<dc:creator>'', ''<author>'')
SET @xmlString = REPLACE(@xmlString, ''</dc:creator>'', ''</author>'')
SELECT @xmlFile = CONVERT(XML, @xmlString);
INSERT INTO feed.tempXML (source, title,link,author,[date])
SELECT
source = t.value (''../link[1]'', ''NVARCHAR(300)''),
title = t.value (''title[1]'', ''NVARCHAR(300)''),
link = t.value (''./link[1]'', ''NVARCHAR(300)''),
author = t.value(''author[1]'',''NVARCHAR(50)''),
[date] = t.value(''pubDate[1]'',''NVARCHAR(50)'')
FROM @xmlFile.nodes(''/rss/channel/item'') AS xTable(t);'
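The same idea can be tried without a file or dynamic SQL. The following is only a minimal sketch: the inline feed, its namespace URI and the example.com link are made-up stand-ins for the real file contents. One thing I have not tested against the real files: casting the SINGLE_BLOB bytes straight to NVARCHAR(MAX) reinterprets them as UTF-16, so a UTF-8 feed may need an intermediate VARCHAR(MAX) or SINGLE_CLOB instead.

```sql
-- Hypothetical inline sample standing in for the file contents.
DECLARE @xmlString NVARCHAR(MAX) = N'<rss xmlns:dc="http://purl.org/dc/elements/1.1/"><channel>
  <link>http://example.com</link>
  <item><title>First</title><dc:creator>Ben Gladwell</dc:creator></item>
  <item><title>Second</title><author>Callum Rice-Coates</author></item>
</channel></rss>';

-- Normalise the element name before the string is parsed as XML.
SET @xmlString = REPLACE(@xmlString, N'<dc:creator>', N'<author>');
SET @xmlString = REPLACE(@xmlString, N'</dc:creator>', N'</author>');

DECLARE @xmlFile XML = CONVERT(XML, @xmlString);

-- Both items now expose their author under the same element name.
SELECT title  = t.value('(title/text())[1]',  'NVARCHAR(300)'),
       author = t.value('(author/text())[1]', 'NVARCHAR(50)')
FROM @xmlFile.nodes('/rss/channel/item') AS xTable(t);
```

Note that a plain string replace also rewrites any literal <dc:creator> text that happens to appear inside a CDATA section, which is one reason to prefer a query-side solution.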
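You can use a predicate calling local-name() to get this generically.
Hints:
You reduced your XML, which is good, but what remained was not fully valid, so I had to correct a few things (missing namespaces)...
Look at the URL in the second feed: the unescaped & sign will get you into trouble...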
declare @mockup TABLE(ID INT IDENTITY, YourXML XML);
INSERT INTO @mockup VALUES
(N'<item xmlns:dc="dummy">
<guid isPermaLink="false">http://www.espnfc.com/story/3154621/wojciech-szczesny-completes-transfer-to-juventus-from-arsenal3154621</guid>
<title><![CDATA[Wojciech Szczesny completes transfer to Juventus from Arsenal]]></title>
<description>
<img style="float: left; margin-right: 10px;" src="http://a.espncdn.com/combiner/i/?img=/photo/2016/1002/r134626_1296x729_16-9.jpg&amp;w=100&amp;h=80&amp;scale=crop&amp;site=espnfc" /><![CDATA[Douglas Costa hopes to evolve as a player with Juventus and gain recognition for Brazil's World Cup squad next year.
Juventus have completed the signing of Wojciech Szczesny from Arsenal for a fee of &#8364;12.2 million.
Poland goalkeeper Szczesny underwent his medical in Turin on Tuesday and officially became a Juventus player on Wednesday in a deal that could rise to &#8364;15.3 million, depending on performance.
The 27-year-old, who spent the last two seasons on loan at Roma, has signed a four-year contract for the Bianconeri, where he is expected to be understudy to Italy No. 1 Gianlugi Buffon in the coming...]]>
</description>
<link>http://www.espnfc.com/story/3154621/wojciech-szczesny-completes-transfer-to-juventus-from-arsenal</link>
<pubDate>Wed, 19 Jul 2017 06:19:00 PDT</pubDate>
<enclosure length="150" url="http://a.espncdn.com/combiner/i/?img=/photo/2016/1002/r134626_1296x729_16-9.jpg&amp;w=100&amp;h=80&amp;scale=crop&amp;site=espnfc" type="image/jpeg" />
<category>Story</category>
<category><![CDATA[Transfers]]></category>
<category><![CDATA[Juventus]]></category>
<category><![CDATA[Arsenal]]></category>
<category><![CDATA[Wojciech Szczesny]]></category>
<category><![CDATA[English Premier League]]></category>
<category><![CDATA[Italian Serie A]]></category>
<dc:creator>Ben Gladwell</dc:creator>
</item>')
,(N'<item xmlns:media="dummy">
<title>Sampdoria Striker Patrick Schick Could Be Set to Join Inter After Collapse of Juventus Deal</title>
<link>http://www.90min.com/posts/5285895-sampdoria-striker-patrick-schick-could-be-set-to-join-inter-after-collapse-of-juventus-deal?utm_source=RSS</link>
<author>Callum Rice-Coates</author>
<guid isPermaLink="false">d5a2ba8b504a22fcdb405ec687f91956</guid>
<description>Sampdoria striker Patrick Schick could be on the verge of a move to Inter after a proposed deal to join Juventus fell through. Ginaluca Di Marzio has reported that the Czech forward''s representatives have met with the Inter hierarchy to discuss the details of the potential transfer. According to the Italian journalist, Schick, who found the net 11 times in 32 Serie A appearances last season, ''could soon enjoy a new experience at Inter.'' #Calciomercato | #Inter, incontro in corso con la...</description>
<media:thumbnail type="image/jpg" />
<pubDate>Wed, 19 Jul 2017 19:43:56 +0000</pubDate>
</item>');
This is the query:
SELECT itm.value(N'(link/text())[1]','nvarchar(max)') AS link
,itm.value(N'(title/text())[1]','nvarchar(max)') AS title
,itm.value(N'(*[local-name()="creator" or local-name()="author"]/text())[1]','nvarchar(max)') AS author
FROM @mockup AS m
CROSS APPLY m.YourXML.nodes(N'/item') AS A(itm)
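To illustrate the hint about the & sign, here is a small sketch assuming SQL Server 2012 or later (for TRY_CONVERT). The two one-element samples and the example.com URL are made up; they differ only in whether the ampersand is escaped. As far as I know TRY_CONVERT returns NULL for a string that does not parse as XML, so I would expect the first flag to come back as 0 and the second as 1.

```sql
-- Hypothetical samples: identical except for the escaping of the ampersand.
DECLARE @raw     NVARCHAR(MAX) = N'<thumb url="https://example.com/a.jpg?main_image=true&imageType=.jpg"/>';
DECLARE @escaped NVARCHAR(MAX) = N'<thumb url="https://example.com/a.jpg?main_image=true&amp;imageType=.jpg"/>';

-- TRY_CONVERT is used so a parse failure shows up as NULL instead of aborting the batch.
SELECT raw_parses     = CASE WHEN TRY_CONVERT(XML, @raw)     IS NULL THEN 0 ELSE 1 END,
       escaped_parses = CASE WHEN TRY_CONVERT(XML, @escaped) IS NULL THEN 0 ELSE 1 END;
```

The same check could be run on a whole file before the import to reject feeds that are not well-formed.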
Using the one-liner suggested by @Shnugo, I solved the problem by replacing this part of the code:
author = t.value(''author[1]'',''NVARCHAR(50)'')
with this part:
author = t.value(N''(*[local-name()="creator" or local-name()="author"]/text())[1]'',''NVARCHAR(50)'')
So now the query looks like this:
SET @Query ='
DECLARE @xmlFile as XML
SET @xmlFile =(SELECT CONVERT(XML,BulkColumn) as BulkColumn
FROM OPENROWSET (BULK '''+@file+''', SINGLE_BLOB) AS t)
INSERT INTO feed.tempXML (source, title,link,author,[date])
SELECT
source = t.value (''../link[1]'', ''NVARCHAR(300)''),
title = t.value (''title[1]'', ''NVARCHAR(300)''),
link = t.value (''./link[1]'', ''NVARCHAR(300)''),
author = t.value(N''(*[local-name()="creator" or local-name()="author"]/text())[1]'',''NVARCHAR(50)''),
[date] = t.value(''pubDate[1]'',''NVARCHAR(50)'')
FROM @xmlFile.nodes(''/rss/channel/item'') AS xTable(t);'
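If an item could ever carry both elements, the local-name() predicate returns whichever comes first in document order. A possible variation, sketched here on a made-up inline feed (the namespace URI and titles are assumptions), reads the two names separately and lets COALESCE decide the preference. Inside the dynamic SQL string the single quotes would have to be doubled as in the query above.

```sql
-- Hypothetical sample with one item per case: dc:creator, author, and neither.
DECLARE @items XML = N'<rss xmlns:dc="http://purl.org/dc/elements/1.1/"><channel>
  <item><title>First</title><dc:creator>Ben Gladwell</dc:creator></item>
  <item><title>Second</title><author>Callum Rice-Coates</author></item>
  <item><title>Third</title></item>
</channel></rss>';

-- *:creator matches creator in any namespace; author is the fallback.
SELECT title  = t.value('(title/text())[1]', 'NVARCHAR(300)'),
       author = COALESCE(t.value('(*:creator/text())[1]', 'NVARCHAR(50)'),
                         t.value('(author/text())[1]',    'NVARCHAR(50)'))
FROM @items.nodes('/rss/channel/item') AS xTable(t);
```

An item with neither element simply yields NULL for author, so the insert does not fail on feeds that omit it.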