How do I access pubDate for RSS items using Python feedparser?

In this example RSS feed, the optional item element pubDate is included in all entries. But it is not available as an item element in the Python module feedparser. This code:

import feedparser
rss_object = feedparser.parse("http://cyber.law.harvard.edu/rss/examples/rss2sample.xml")
for entry in rss_object.entries:
    print(entry.pubDate)  # raises AttributeError

results in the error AttributeError: object has no attribute 'pubDate', but I can successfully print entry.description and see the contents of all the description tags.

feedparser is an opinionated parser; it does not simply return the XML in a dictionary. The text of pubDate is available as entries[i].published:

The date this entry was first published, as a string in the same format as it was published in the original feed.

Working code:

for entry in rss_object.entries:
    print(entry.published)

Note: published is taken from one of several possible XML tags, depending on the format of the feed. For the list, see the reference manual.

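The manual also claims that the pubDate element is parsed "as a date" in entries[i].published_parsed. What published_parsed actually holds is a time.struct_time object; if the original feed includes a time zone, you may want to re-parse the date yourself to maintain time zone information.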
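One way to re-parse the date is sketched below, using only the standard library. It assumes the published string is in the RFC 822 style that RSS 2.0 uses for pubDate; a feed with another date format would need a different parser. The helper names published_with_tz and parsed_to_utc are made up for this example, and the sample values stand in for entry.published and entry.published_parsed so the snippet runs without fetching a feed. The second helper also assumes that published_parsed is expressed in UTC, which is how the feedparser documentation describes it.

```python
import calendar
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def published_with_tz(published):
    """Re-parse an RFC 822 date string, keeping the UTC offset it carries."""
    return parsedate_to_datetime(published)


def parsed_to_utc(published_parsed):
    """Turn a time.struct_time (assumed to be UTC) into an aware datetime."""
    return datetime.fromtimestamp(calendar.timegm(published_parsed), tz=timezone.utc)


# Hypothetical pubDate text with a +0200 offset, standing in for entry.published.
sample = "Tue, 03 Jun 2003 09:39:21 +0200"
local_dt = published_with_tz(sample)
print(local_dt.isoformat())

# The same instant as a UTC struct_time, standing in for entry.published_parsed.
utc_struct = time.strptime("2003-06-03 07:39:21", "%Y-%m-%d %H:%M:%S")
print(parsed_to_utc(utc_struct).isoformat())
```

In the loop from the answer you would call published_with_tz(entry.published). The result keeps the offset the feed author wrote, while published_parsed only gives you the same instant with the original offset discarded.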