How to scrape data from a website using Nokogiri

I wrote the code below to scrape the table data from the following link, but it prints nothing. I want the table data, i.e. last updated, weather, and temperature. Please help.

url = "http://w1.weather.gov/xml/current_obs/KM89.xml"

docs = Nokogiri::HTML(open(url))

puts docs.css("table")

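Go to that page and open your developer tools. When you find the response to the KM89.xml request under the Network tab, you will see that it returns XML rather than HTML, like this: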

<?xml version="1.0" encoding="ISO-8859-1"?> 
<?xml-stylesheet href="latest_ob.xsl" type="text/xsl"?>
<current_observation version="1.0">
  <credit>NOAA's National Weather Service</credit>
  <title>NOAA's National Weather Service</title>
  <suggested_pickup>15 minutes after the hour</suggested_pickup>
  <location>Dexter B Florence Memorial Field Airport, AR</location>
  <observation_time>Last Updated on Nov 23 2012, 7:56 am CST</observation_time>
  <observation_time_rfc822>Fri, 23 Nov 2012 07:56:00 -0600</observation_time_rfc822>
  <weather>Light Rain</weather>
  <temperature_string>57.0 F (13.8 C)</temperature_string>
  <wind_string>Northeast at 8.1 MPH (7 KT)</wind_string>
  <pressure_string>1027.5 mb</pressure_string>
  <dewpoint_string>52.9 F (11.6 C)</dewpoint_string>
  <windchill_string>55 F (13 C)</windchill_string>


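require 'open-uri'
require 'nokogiri'

url = 'http://w1.weather.gov/xml/current_obs/KM89.xml'
# The response is XML, so parse it with Nokogiri::XML. URI.open is
# needed on Ruby 3.0+, where Kernel#open no longer accepts URLs.
doc = Nokogiri::XML(URI.open(url))

p doc.at_css('station_id').text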