Weather data scraping and extraction in R

Question

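I'm working on a research project and have been assigned some data scraping and R coding to extract the current temperature for a particular zip code from a site such as wunderground.com. This may be an abstract question, but does anyone know how to do the following? I can extract the current temperature for a particular zip code like this: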

    # Download the page source, one vector element per line of HTML
    temps <- readLines("http://www.wunderground.com/q/zmw:20904.1.99999")
    edit(temps)  # open the source so I can find the line that contains the temperature
    temps        # print the source
    ldata <- temps[lnumber]  # lnumber: index of the line found by inspection
    ldata
    # then a few gsub() calls that extract just the numerical
    # data (57.8, for example) from that line

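I have a CSV file that contains the zip codes of every city in the country, and I have imported it into R. It is arranged in a table by zip code, city, and state. My challenge now is to write a method (using a Java analogy here because I'm new to R) that extracts 6-7 consecutive zip codes, starting from one I specify, and runs the code above for each of them by putting the corresponding zip code into the zmw:XXXXX segment of the link passed to readLines and then running everything that follows.

Right now I don't know how to extract the data from the table. Maybe with a for loop? But then I don't know how to use it to modify the link, and I think that's where I'm really stuck. I have some Java background, so I understand how to approach the problem; I just don't know the syntax. I realize this is quite an abstract question since I haven't provided much code, but I just want to know which functions or syntax would let me extract data from a table and use it to modify a link through a function rather than by hand.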

Answer 1

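This is just about the Weather Underground data.

You can download CSV files from individual weather stations on Wunderground, but you need to know the station identifier. Here is an example URL for a station in Kirkland, WA (KWAKIRKL8):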

http://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=KWAKIRKL8&day=31&month=1&year=2014&graphspan=day&format=1

Here is some R code:

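    library(RCurl)  # provides getURL()

    url <- 'http://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=KWAKIRKL8&day=31&month=1&year=2014&graphspan=day&format=1'
    s <- getURL(url)              # download the response as one string
    s <- gsub("<br>\n", "", s)    # strip the HTML line breaks mixed into the CSV
    wdf <- read.csv(text = s)     # parse the cleaned text into a data frame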
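To fetch other stations or dates without editing the URL by hand, the query string can be assembled from its parts. The sketch below is only an illustration: `station_url` is a made-up helper name, and it assumes the URL pattern from the example above still applies.

```r
# Hypothetical helper: build the daily-history CSV URL for one station and date.
# Assumes the URL pattern shown in the example above; the site may have changed it.
station_url <- function(station_id, date) {
  date <- as.Date(date)
  sprintf(
    "http://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=%s&day=%d&month=%d&year=%d&graphspan=day&format=1",
    station_id,
    as.integer(format(date, "%d")),  # day of month without a leading zero
    as.integer(format(date, "%m")),  # month without a leading zero
    as.integer(format(date, "%Y"))
  )
}

# Rebuilds the example URL for KWAKIRKL8 on 31 January 2014
url <- station_url("KWAKIRKL8", "2014-01-31")
```

Because `sprintf()` is vectorised, passing a vector of station IDs should return one URL per station.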

Here is a page where you can manually look up stations and their codes:

http://www.wunderground.com/wundermap/

Since you only need a few, you can pick them out manually.
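As for the part of the question about taking several consecutive zip codes from the table and putting them into the link: a data frame can be indexed by row, and `paste0()` can build the URL. Here is a minimal sketch, not a tested scraper. It assumes the imported table is a data frame with a `zip` column; the toy table `zips`, the helper names, and the sample HTML line are all made up for illustration, and the real page markup has to be checked by hand.

```r
# Toy stand-in for the imported CSV (made-up rows); in practice something like
# zips <- read.csv("zipcodes.csv", colClasses = "character")
zips <- data.frame(
  zip   = c("20901", "20902", "20903", "20904", "20905", "20906", "20910"),
  city  = "Example City",
  state = "XX",
  stringsAsFactors = FALSE
)

# Return up to n consecutive zip codes, starting at the row whose zip equals `start`.
next_zips <- function(tbl, start, n = 6) {
  i <- match(start, tbl$zip)
  if (is.na(i)) stop("zip code not found: ", start)
  tbl$zip[seq(i, min(i + n - 1, nrow(tbl)))]
}

# Put a zip code into the zmw:XXXXX segment of the link from the question.
zip_url <- function(zip) {
  paste0("http://www.wunderground.com/q/zmw:", zip, ".1.99999")
}

# Pull the first number out of one line of page source (NA if there is none).
extract_temp <- function(line) {
  m <- regmatches(line, regexpr("-?[0-9]+\\.?[0-9]*", line))
  if (length(m) == 0) NA_real_ else as.numeric(m)
}

codes <- next_zips(zips, "20904", n = 3)
urls  <- zip_url(codes)  # paste0() is vectorised, so no explicit loop is needed

# Hypothetical line of HTML, only to exercise the regular expression
sample_line <- '<span class="temp">57.8</span>'
temp <- extract_temp(sample_line)

# Fetching the pages would then look like this (not run here):
# pages <- lapply(urls, readLines)
```

A `for` loop over `urls` would work too; `lapply()` is just the more common R idiom for "run this function on each element and collect the results".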

Tags: regex, for-loop, r, weather, data-extraction