Accessing items in an RSS feed using SimpleXMLElement

I am trying to read this RSS feed using PHP. Here is a small snippet of the XML:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:cc="http://web.resource.org/cc/"
   xmlns:rss="http://purl.org/rss/1.0/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   xmlns:prism="http://prismstandard.org/namespaces/basic/2.0/"
   xmlns:dc="http://purl.org/dc/elements/1.1/">
   <rss:channel rdf:about="https://onlinelibrary.wiley.com/loi/18630669?af=R">
      <rss:title>Wiley: CLEAN – Soil, Air, Water: Table of Contents</rss:title>
      <rss:description>Table of Contents for CLEAN – Soil, Air, Water. List of articles from both the latest and EarlyView issues.</rss:description>
      <rss:link>https://onlinelibrary.wiley.com/loi/18630669?af=R</rss:link>
      <dc:title>Wiley: CLEAN – Soil, Air, Water: Table of Contents</dc:title>
      <dc:publisher>Wiley</dc:publisher>
      <dc:language>en-US</dc:language>
      <prism:publicationName>CLEAN – Soil, Air, Water</prism:publicationName>
      <rss:items>
         <rdf:Seq>
            <rdf:li rdf:resource="https://onlinelibrary.wiley.com/doi/abs/10.1002/clen.201800305?af=R"/>
            <rdf:li rdf:resource="https://onlinelibrary.wiley.com/doi/abs/10.1002/clen.201700117?af=R"/>
         </rdf:Seq>
      </rss:items>
   </rss:channel>
   <rss:image rdf:about="http://www.atypon.com/images/atypon_logo_small.gif">
      <rss:title>CLEAN – Soil, Air, Water</rss:title>
      <rss:url>http://www.atypon.com/images/atypon_logo_small.gif</rss:url>
      <rss:link>https://onlinelibrary.wiley.com/loi/18630669?af=R</rss:link>
   </rss:image>
   <rss:item rdf:about="https://onlinelibrary.wiley.com/doi/abs/10.1002/clen.201800305?af=R">
      <rss:title>The Limiting Factor to the Outbreak of Lake Black Bloom: Roles of Ferrous Iron and Sulfide Ions</rss:title>
      <dc:description>
         abc
      </dc:description>
      <dc:creator>
         Qiushi Shen, 
         Chengxin Fan, 
         Cheng Liu, 
         Chao Chen
      </dc:creator>
      <rss:link>https://onlinelibrary.wiley.com/doi/abs/10.1002/clen.201800305?af=R</rss:link>
      <content:encoded>CLEAN – Soil, Air, Water, &lt;a href="https://onlinelibrary.wiley.com/toc/18630669/2018/46/9"&gt;Volume 46, Issue 9&lt;/a&gt;, September 2018. &lt;br/&gt;</content:encoded>
      <rss:description>CLEAN – Soil, Air, Water, Volume 46, Issue 9, September 2018. &lt;br/&gt;</rss:description>
      <dc:title>The Limiting Factor to the Outbreak of Lake Black Bloom: Roles of Ferrous Iron and Sulfide Ions</dc:title>
      <dc:identifier>doi:10.1002/clen.201800305</dc:identifier>
      <dc:source>CLEAN – Soil, Air, Water</dc:source>
      <dc:date>2018-08-19T07:00:00Z</dc:date>
      <prism:publicationName>CLEAN – Soil, Air, Water</prism:publicationName>
      <prism:volume>46</prism:volume>
      <prism:number>9</prism:number>
      <prism:coverDate>2018-08-19T07:00:00Z</prism:coverDate>
      <prism:coverDisplayDate>2018-08-19T07:00:00Z</prism:coverDisplayDate>
      <prism:doi>10.1002/clen.201800305</prism:doi>
      <prism:url>https://onlinelibrary.wiley.com/doi/abs/10.1002/clen.201800305?af=R</prism:url>
      <prism:copyright/>
   </rss:item>
   <rss:item rdf:about="https://onlinelibrary.wiley.com/doi/abs/10.1002/clen.201700117?af=R">
      <rss:title>A Pilot‐Scale Diatomite Membrane Bioreactor for Slightly Polluted Surface Water Treatment</rss:title>
      <dc:description>
         abc
      </dc:description>
      <dc:creator>
         Wen Sun, 
         Cuimei Li, 
         Bingzhi Dong, 
         Huaqiang Chu
      </dc:creator>
      <rss:link>https://onlinelibrary.wiley.com/doi/abs/10.1002/clen.201700117?af=R</rss:link>
      <content:encoded>CLEAN – Soil, Air, Water, &lt;a href="https://onlinelibrary.wiley.com/toc/18630669/2018/46/9"&gt;Volume 46, Issue 9&lt;/a&gt;, September 2018. &lt;br/&gt;</content:encoded>
      <rss:description>CLEAN – Soil, Air, Water, Volume 46, Issue 9, September 2018. &lt;br/&gt;</rss:description>
      <dc:title>A Pilot‐Scale Diatomite Membrane Bioreactor for Slightly Polluted Surface Water Treatment</dc:title>
      <dc:identifier>doi:10.1002/clen.201700117</dc:identifier>
      <dc:source>CLEAN – Soil, Air, Water</dc:source>
      <dc:date>2018-08-24T07:00:00Z</dc:date>
      <prism:publicationName>CLEAN – Soil, Air, Water</prism:publicationName>
      <prism:volume>46</prism:volume>
      <prism:number>9</prism:number>
      <prism:coverDate>2018-08-24T07:00:00Z</prism:coverDate>
      <prism:coverDisplayDate>2018-08-24T07:00:00Z</prism:coverDisplayDate>
      <prism:doi>10.1002/clen.201700117</prism:doi>
      <prism:url>https://onlinelibrary.wiley.com/doi/abs/10.1002/clen.201700117?af=R</prism:url>
      <prism:copyright/>
   </rss:item>

In other words, the structure looks like this:

  rdf:RDF
    (some items I don't care about)
    rss:item
    rss:item
    rss:item

What I want to access are those rss:item objects, one after another in a foreach loop. I have tried many different versions:

$url = "https://onlinelibrary.wiley.com/action/showFeed?jc=18630669&type=etoc&feed=rss";
$xml = file_get_contents($url);
$xml = new SimpleXMLElement($xml);

$var = $xml->{'rdf:RDF'};

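But it only returns object(SimpleXMLElement)[598], and I cannot access the items. I have tried $xml->children(), $xml->{'rss:item'} and many other options, but all I ever get back is a SimpleXMLElement object with no access to the information, never an array containing all the items.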

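If you want to output the <rss:item> data, the easiest way is to fetch the children() of the root element, restricted to the namespace with the prefix rss (call $xml->children("rss", true)). This lets you access all of the data using object notation, as is normal for SimpleXML.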

$url = "https://onlinelibrary.wiley.com/action/showFeed?jc=18630669&type=etoc&feed=rss";
$xml = file_get_contents($url);
$xml = new SimpleXMLElement($xml);

foreach ( $xml->children("rss", true)->item as $item ) {
    echo (string)$item->title.PHP_EOL;
}

This outputs the title element of each item (the echo (string)$item->title.PHP_EOL line). The output, abbreviated:

The Limiting Factor to the Outbreak of Lake Black Bloom: Roles of Ferrous Iron and Sulfide Ions
A Pilot‐Scale Diatomite Membrane Bioreactor for Slightly Polluted Surface Water Treatment
Agronomic Valorization of Olive Mill Wastewaters: Effects on Medicago sativa Growth and Soil Characteristics
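The same idea extends to the other prefixes in the feed. The following is a minimal sketch, not a definitive implementation: it assumes a trimmed, inline copy of the feed from the question (so it runs without network access), and the names $feed and $rows are made up for illustration. Each namespace is selected with its own children() call, and the rdf:about attribute is read through attributes().

```php
<?php
// Trimmed-down inline copy of the feed (an assumption, so the sketch runs offline).
$feed = <<<'XML'
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:rss="http://purl.org/rss/1.0/"
   xmlns:prism="http://prismstandard.org/namespaces/basic/2.0/"
   xmlns:dc="http://purl.org/dc/elements/1.1/">
   <rss:item rdf:about="https://onlinelibrary.wiley.com/doi/abs/10.1002/clen.201800305?af=R">
      <rss:title>The Limiting Factor to the Outbreak of Lake Black Bloom: Roles of Ferrous Iron and Sulfide Ions</rss:title>
      <dc:date>2018-08-19T07:00:00Z</dc:date>
      <prism:doi>10.1002/clen.201800305</prism:doi>
   </rss:item>
   <rss:item rdf:about="https://onlinelibrary.wiley.com/doi/abs/10.1002/clen.201700117?af=R">
      <rss:title>A Pilot-Scale Diatomite Membrane Bioreactor for Slightly Polluted Surface Water Treatment</rss:title>
      <dc:date>2018-08-24T07:00:00Z</dc:date>
      <prism:doi>10.1002/clen.201700117</prism:doi>
   </rss:item>
</rdf:RDF>
XML;

$xml = new SimpleXMLElement($feed);

$rows = [];
foreach ($xml->children('rss', true)->item as $item) {
    // Select each namespace explicitly; true means "the first argument is a prefix".
    $rss   = $item->children('rss', true);
    $dc    = $item->children('dc', true);
    $prism = $item->children('prism', true);
    // Namespaced attributes need the same treatment via attributes().
    $rdf   = $item->attributes('rdf', true);

    $rows[] = [
        'title' => (string) $rss->title,
        'date'  => (string) $dc->date,
        'doi'   => (string) $prism->doi,
        'about' => (string) $rdf->about,
    ];
}

foreach ($rows as $row) {
    echo $row['doi'], ' | ', $row['date'], ' | ', $row['title'], PHP_EOL;
}
```

Casting to (string) matters here: without it each array entry would still be a SimpleXMLElement rather than plain text.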

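One thing to note: you said you only ever get object(SimpleXMLElement)[598] back, but a SimpleXMLElement may well contain a list of elements, so it is more a question of how you use that content. Also, print_r() and many of the other usual ways of inspecting a value do not show you the full data. With SimpleXML, use echo $xml->asXML(); to see what an element contains.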
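If you do want a real PHP array of items, xpath() is one option, and asXML() can then confirm what each element holds. This is a hedged sketch under the same assumption of a trimmed inline feed; $feed, $items and $firstItemXml are illustrative names. Elements returned by xpath() are read here by namespace URI rather than by prefix, since the prefix is only an alias chosen by the document.

```php
<?php
// Trimmed-down inline copy of the feed (an assumption, so the sketch runs offline).
$feed = <<<'XML'
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:rss="http://purl.org/rss/1.0/"
   xmlns:dc="http://purl.org/dc/elements/1.1/">
   <rss:item rdf:about="https://onlinelibrary.wiley.com/doi/abs/10.1002/clen.201800305?af=R">
      <rss:title>The Limiting Factor to the Outbreak of Lake Black Bloom: Roles of Ferrous Iron and Sulfide Ions</rss:title>
      <dc:identifier>doi:10.1002/clen.201800305</dc:identifier>
   </rss:item>
   <rss:item rdf:about="https://onlinelibrary.wiley.com/doi/abs/10.1002/clen.201700117?af=R">
      <rss:title>A Pilot-Scale Diatomite Membrane Bioreactor for Slightly Polluted Surface Water Treatment</rss:title>
      <dc:identifier>doi:10.1002/clen.201700117</dc:identifier>
   </rss:item>
</rdf:RDF>
XML;

$xml = new SimpleXMLElement($feed);

// Bind the prefix used in the XPath expression to the RSS 1.0 namespace URI.
$xml->registerXPathNamespace('rss', 'http://purl.org/rss/1.0/');

// xpath() returns a plain PHP array of SimpleXMLElement objects.
$items = $xml->xpath('//rss:item');

// asXML() serialises an element back to markup, namespaced children included.
$firstItemXml = $items[0]->asXML();

echo count($items), ' items found', PHP_EOL;
echo $firstItemXml, PHP_EOL;

// Select children by namespace URI (no second argument needed for a URI).
foreach ($items as $item) {
    echo (string) $item->children('http://purl.org/rss/1.0/')->title, PHP_EOL;
}
```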