Yahoo Finance news pubDate not accessible with Ruby Nokogiri
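I can access the Yahoo Finance news title, but I am having trouble parsing pubDate. I need it so that I only look at news from the last week and ignore older items.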
require 'nokogiri'
require 'open-uri'
sym = "1313.HK"
url = "https://feeds.finance.yahoo.com/rss/2.0/headline?s=#{sym}&region=US&lang=en-US"
doc = Nokogiri::HTML(URI.open(url))
titles = doc.css("title")
puts titles.length # works, comes back with 0-20
puts titles.text # works
pubDates = doc.css("pubDate")
puts pubDates.length #does NOT work, always 0
puts pubDates.text #does NOT work, always blank
keywordregex = /bad news/ # must be a Regexp; String =~ String raises TypeError
nodes = doc.search('title') # search title tags only, for keywords
puts found_title = nodes.select { |n| n.text =~ keywordregex } # TODO && pubDate no more than 7 days old
Try Nokogiri::XML; RSS really is XML.
doc = Nokogiri::XML(URI.open(url))
Use pubdate. The node names in your parsed document are lowercase, because Nokogiri::HTML lowercases element names.
> doc.css("pubdate").length
=> 7
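Whichever selector you use, the node text is an RFC 822 date string, so the TODO from the question (skip items older than 7 days) could be sketched roughly like this. The helper name recent_matches, the sample pairs, and the fixed "now" are all assumptions for illustration; only Time.rfc2822 from Ruby's standard time library is relied on.

```ruby
require 'time'

# Hypothetical helper: keep [title, pub_date_string] pairs whose title
# matches `pattern` and whose date falls within `days` days before `now`.
def recent_matches(pairs, pattern, now: Time.now, days: 7)
  cutoff = now - days * 24 * 60 * 60
  pairs.select do |title, pub_date|
    title =~ pattern && Time.rfc2822(pub_date) >= cutoff
  end
end

# Made-up sample data in the shape of (title text, pubDate text).
pairs = [
  ['Bad news for shareholders', 'Tue, 07 Jan 2020 14:00:00 +0000'],
  ['Bad news from last year',   'Mon, 02 Dec 2019 08:00:00 +0000'],
  ['Quarterly results',         'Tue, 07 Jan 2020 16:00:00 +0000']
]

# A fixed reference time keeps the example reproducible.
now = Time.utc(2020, 1, 10)
hits = recent_matches(pairs, /bad news/i, now: now)
hits.each { |title, pub_date| puts "#{pub_date}  #{title}" }
```

In real use you would leave out the now: argument so the cutoff is measured from the current time.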