R - loop script that pulls data from an online database and gives each iteration a unique name

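I'm new to R and have been struggling to use a for loop to simplify some code. I'm trying to pull water quality data from an online database using the dataRetrieval package. So far I have copied the code for each site and changed the site number and output name. I've been trying to simplify this by putting the script in a for loop, but I'm having trouble creating separate data tables that each have a unique identifier.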

Here is the original code that creates a data table for each site. The only things that change are siteNumbers and the data table name, "x"_dataTable.

#BW00A
siteNumbers = c("383652091125002")
parameterCode = c("00010","00095", "00300", "00400", "34475", "34485", "45617")
startDate = "1900-01-01"
endDate = "2020-12-01"

BW00A_dataTable <- readNWISqw(siteNumbers, parameterCode,
                             startDate, endDate)
#BW01
siteNumbers = c("383648091124501")
parameterCode = c("00010","00095", "00300", "00400", "34475", "34485", "45617")
startDate = "1900-01-01"
endDate = "2020-12-01"

BW01_dataTable <- readNWISqw(siteNumbers, parameterCode,
                             startDate, endDate)
#BW01A
siteNumbers = c("383648091124502")
parameterCode = c("00010","00095", "00300", "00400", "34475", "34485", "45617")
startDate = "1900-01-01"
endDate = "2020-12-01"

BW01A_dataTable <- readNWISqw(siteNumbers, parameterCode,
                             startDate, endDate)

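Here is the new code that I can't get to work. I've put siteNumbers and siteName into a data frame. What I want is for the script in the for loop to iterate through siteNumbers to pull the data, and then assign each newly created data table to the corresponding siteName (the unique_siteName below). I'm not sure if this is possible.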

df <- data.frame(
  siteNumbers = c("383652091125001",    "383652091125002",  "383648091124501",  "383648091124502",  "383506091132201",  "383508091132002",  "383508091132004",  "383519091133701",  "383544091132601",  "383544091132502",  "383628091124801",  "383639091125902",  "383639091125901",  "383638091125001",  "383638091125002",  "383631091124803",  "383631091124804",  "383631091124801",  "383631091124802",  "383636091123801",  "383636091123811",  "383616091125701",  "383640091130701",  "383640091130702",  "383621091130701",  "383621091130703",  "383621091130702",  "383624091130501",  "383624091130502",  "383616091130801",  "383616091130802",  "383644091131601",  "383627091130201",  "383622091130604",  "383622091130605",  "383557091132001",  "383614091132801"),
  siteName = c("BW-00", "BW-00A",   "BW-01",    "BW-01A",   "MW-04",    "MW-04A",   "MW-04B",   "MW-11",    "BW-21",    "BW-21A",   "210TB-C6", "Bates Spring", "Bates Spring below dam",   "BW-02",    "BW-02A",   "BW-04A-D", "BW-04A-S", "BW-04D",   "BW-04S",   "BW-05",    "BW-05A",   "BW-07",    "BW-08",    "BW-08A",   "BW-11",    "BW-11A-D", "BW-11A-S", "BW-13",    "BW-13A",   "BW-14",    "BW-14A",   "BW4-15",   "BW4-16",   "BW4-17",   "BW4-18",   "W3",   "W4")
)

parameterCode = c("00010","00095", "00300", "00400", "34475", "34485", "45617")
startDate = "1900-01-01"
endDate = "2020-12-01"

for (row in df)
{
 unique_siteName <- readNWISqw(siteNumbers, parameterCode,
                             startDate, endDate)  
  
}

Thanks for your help!

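You need to iterate over the row indices, use the index to look up each value in the data frame inside the loop, and create a list to accumulate the results: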

results <- list()
# Loop over row indices so each site number can be looked up by position
for (i in seq_len(nrow(df))) {
  results[[i]] <- readNWISqw(df$siteNumbers[i], parameterCode,
                             startDate, endDate)
}
names(results) <- df$siteName
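
To try the pattern without calling the USGS service, here is a minimal sketch that swaps readNWISqw for a made-up stand-in. The function fake_fetch, the two-row sites data frame, and the shortened parameter list are all assumptions for illustration; only the loop-and-name structure mirrors the code above.

```r
# Hypothetical stand-in for readNWISqw(): builds a tiny data frame, no network access.
fake_fetch <- function(siteNumber, parameterCode, startDate, endDate) {
  data.frame(site_no = siteNumber, parm_cd = parameterCode,
             stringsAsFactors = FALSE)
}

# Two sites taken from the question, kept short for the example.
sites <- data.frame(
  siteNumbers = c("383652091125002", "383648091124501"),
  siteName = c("BW-00A", "BW-01"),
  stringsAsFactors = FALSE
)
parameterCode <- c("00010", "00095")
startDate <- "1900-01-01"
endDate <- "2020-12-01"

# Pre-allocate one slot per site, then fill each slot by row index.
results <- vector("list", nrow(sites))
for (i in seq_len(nrow(sites))) {
  results[[i]] <- fake_fetch(sites$siteNumbers[i], parameterCode,
                             startDate, endDate)
}
names(results) <- sites$siteName

# Each table can now be retrieved by its site name.
results[["BW-01"]]
```

Keeping the tables in one named list means results[["BW-01"]] plays the role that BW01_dataTable played in the copy-and-paste version, without creating dozens of separate variables.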

R also provides lapply as a way to simplify this common pattern. The loop above is equivalent to:

results <- lapply(df$siteNumbers, FUN = readNWISqw, parameterCode, startDate, endDate)
names(results) <- df$siteName
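
As a quick check of that equivalence, the sketch below runs both forms against a hypothetical stand-in function (fake_fetch is invented here so the example needs no network access) and compares the results. Any arguments given to lapply after FUN are passed along to every call.

```r
# Hypothetical stand-in for readNWISqw(): builds a tiny data frame, no network access.
fake_fetch <- function(siteNumber, parameterCode, startDate, endDate) {
  data.frame(site_no = siteNumber, parm_cd = parameterCode,
             stringsAsFactors = FALSE)
}

site_numbers <- c("383652091125002", "383648091124501")
site_names <- c("BW-00A", "BW-01")

# lapply form: the three extra arguments are forwarded to fake_fetch each time.
by_lapply <- lapply(site_numbers, FUN = fake_fetch,
                    "00010", "1900-01-01", "2020-12-01")
names(by_lapply) <- site_names

# Explicit loop form, written out for comparison.
by_loop <- list()
for (i in seq_along(site_numbers)) {
  by_loop[[i]] <- fake_fetch(site_numbers[i],
                             "00010", "1900-01-01", "2020-12-01")
}
names(by_loop) <- site_names

# Compare the two named lists element by element.
identical(by_lapply, by_loop)
```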

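I'd recommend reading my answer at How to make a list of data frames? for more discussion and explanation, including why we do it this way and how it helps with next steps (for example, combining the results list into a single data frame).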
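
For the "combine into a single data frame" step, one possible base R sketch follows. The results list here is built by hand with made-up placeholder values rather than real readNWISqw output, and the column names are assumptions for illustration; it also assumes every table has the same columns.

```r
# Made-up placeholder tables standing in for the downloaded data.
results <- list(
  "BW-00A" = data.frame(parm_cd = c("00010", "00095"),
                        result_va = c(14.2, 530),
                        stringsAsFactors = FALSE),
  "BW-01" = data.frame(parm_cd = "00010",
                       result_va = 13.8,
                       stringsAsFactors = FALSE)
)

# Tag each table with the site name it is stored under.
tagged <- Map(
  function(tbl, nm) data.frame(siteName = nm, tbl, stringsAsFactors = FALSE),
  results, names(results)
)

# Stack the tagged tables into one data frame and drop the generated row names.
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL
combined
```

Because the site name travels with each row, the combined table can still be filtered or grouped by site afterwards.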