Extract CDATA from RSS feed using NodeJS
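I am using feedparser version 2.2.9 to parse the feed: "https://www.veganlifemag.com/feed/".
The description tag of the RSS feed holds HTML (CDATA) content, with tags enclosing the content I need to extract. I would like to know if there is a way to extract the contents of the CDATA, or specific content within it.
Thanks in advance,
Jerry
Sample RSS feed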
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>VegNews.com (News)</title>
<description></description>
<link>https://vegnews.com/news</link>
<language>en</language>
<item>
<title>London Fashion Week Will Be Fur-Free This Year for the First Time</title>
<category>News</category>
<pubDate>Mon, 10 Sep 2018 01:50:00 -0700</pubDate>
<link>https://vegnews.com/2018/9/london-fashion-week-will-be-fur-free-this-year-for-the-first-time</link>
<guid>https://vegnews.com/2018/9/london-fashion-week-will-be-fur-free-this-year-for-the-first-time</guid>
<description>
<![CDATA[<img src="https://vegnews.com/media/W1siZiIsIjEyOTE1L1ZlZ05ld3MuRmFzaGlvbkxvbmRvbi5wbmciXSxbInAiLCJ0aHVtYiIsIjgwMHg0NzMjIix7ImZvcm1hdCI6ImpwZyJ9XSxbInAiLCJvcHRpbWl6ZSJdXQ/VegNews.FashionLondon.png?sha=ec3755007e36522e" /><p>Anticipated event London Fashion Week (LFW) kicks off September 14, this year with no fur in sight. While LFW did not impose a ban on fur, every designer that will present their collections this year has adopted a fur-free policy, including last-minute holdout Burberry. After more than a decade of pressure from animal-rights organizations, including <a href="http://www.hsi.org/" target="_blank" rel="noopener">Humane Society International UK</a> and <a href="https://www.peta.org/" target="_blank" rel="noopener">People for the Ethical Treatment of Animals</a>, Burberry announced this month that it would no longer use fur in its collections and appointed Riccardo Tisci as its new creative director to phase out any remaining fur items. “I don’t think it is compatible with modern luxury and with the environment in which we live, and Riccardo has a very strong view as well on this,” LFW CEO Marco Gobbetti told <a href="https://www.businessoffashion.com/articles/professional/burberry-stops-destroying-product-and-bans-real-fur" target="_blank" rel="noopener"><em>Business of Fashion</em></a>. “It’s part of what Burberry is today.” Similarly, animal fur is falling out of favor in the United States. 
Earlier this year, American designer <a href="https://vegnews.com/2018/3/dkny-and-donna-karan-go-fur-free" target="_blank" rel="noopener">Donna Karan</a> pledged to eliminate the material from her future collections, and the city of <a href="https://vegnews.com/2018/3/san-francisco-bans-fur-sales" target="_blank" rel="noopener">San Francisco</a> joined <a href="https://vegnews.com/2013/9/west-hollywood-says-no-to-real-fur-in-fashion" target="_blank" rel="noopener">West Hollywood</a> and <a href="https://vegnews.com/2017/4/berkeley-prohibits-fur-sales-citywide" target="_blank" rel="noopener">Berkeley</a> in banning fur sales within city limits.</p>]]>
</description>
</item>
CDATA just means "treat this content as plain text", so characters that normally have special meaning in XML (such as < meaning "start of tag") lose that meaning inside it.
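To illustrate the point, here is a minimal sketch (not a real XML parser) showing that a CDATA section and the entity-escaped form of the same text produce an identical string. The helper name `elementText` and its small entity table are assumptions made for this example only.

```javascript
// Hypothetical helper: return the text value of an XML element body.
// It handles only two cases: a single CDATA section, or entity-escaped text.
function elementText(body) {
  const trimmed = body.trim();
  const cdata = /^<!\[CDATA\[([\s\S]*?)\]\]>$/.exec(trimmed);
  if (cdata) {
    // Inside CDATA nothing is treated as markup, so the content is returned as-is.
    return cdata[1];
  }
  // Outside CDATA, special characters arrive escaped and must be decoded.
  const entities = { '&lt;': '<', '&gt;': '>', '&quot;': '"', '&apos;': "'", '&amp;': '&' };
  return trimmed.replace(/&(?:lt|gt|quot|apos|amp);/g, (m) => entities[m]);
}

const viaCdata = elementText('<![CDATA[<p>Fur-free</p>]]>');
const viaEscapes = elementText('&lt;p&gt;Fur-free&lt;/p&gt;');
console.log(viaCdata === viaEscapes); // true
```

In practice the feed parser already does this step, so the description usually arrives as an ordinary string with the CDATA wrapper removed.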
The value of the description is a fragment of HTML. If you want to extract specific content from it, run it through an HTML parser.
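As a rough illustration, the sketch below takes a description string (assumed to be what the feed parser hands back, for instance as `item.description`) and pulls out the first image URL and the plain text. To stay dependency-free it uses regular expressions as a stand-in for a real HTML parser, which only suits simple, well-formed fragments. The function name `summarizeDescription` and the sample string are hypothetical.

```javascript
// Hypothetical helper: extract the first image URL and the visible text
// from an HTML fragment. Regexes stand in for a real HTML parser here.
function summarizeDescription(html) {
  // First <img ... src="..."> in the fragment, if any.
  const img = /<img\b[^>]*\bsrc\s*=\s*["']([^"']+)["']/i.exec(html);
  // Replace every tag with a space, then collapse runs of whitespace.
  // Character entities such as &amp; are left undecoded.
  const text = html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
  return { image: img ? img[1] : null, text };
}

// Shortened stand-in for an item's description value.
const description =
  '<img src="https://example.com/cover.png" />' +
  '<p>London Fashion Week kicks off with <em>no fur</em> in sight.</p>';

const summary = summarizeDescription(description);
console.log(summary.image); // https://example.com/cover.png
console.log(summary.text);  // London Fashion Week kicks off with no fur in sight.
```

For anything beyond a single attribute or a flat text dump, a proper HTML parser such as cheerio or jsdom is likely the safer choice, since regular expressions break on nested or malformed markup.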