Updating data regularly from a website in a Shiny app

Question

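I have a Shiny app that uses data from several websites, all of it monthly. Each site's maintainers update their data at different times. I originally read the data directly from the websites inside my app, but one of the sites was down for maintenance for 2 days and I could not run my app. I don't want that to happen again, so I thought of saving the data to local files so that the code always runs.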

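Since the data needs to be refreshed to the latest available values, I need help with scheduling. I want the following code to run once a month so that my data is always up to date.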

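library(data.table)  # fread()
library(dplyr)       # loaded second so its between() and last() are the ones used

# Mean of the November-March values for each winter,
# labelled with the year in which that winter ends.
dMean <- function(d) {
  d %>%
    filter(!between(month, 4, 10)) %>%        # keep Nov-Mar only
    arrange(Year, month) %>%
    # drop the incomplete winters at both ends of the record
    filter(!(Year == min(Year) & month %in% 1:3 |
               Year == max(Year) & month %in% 11:12)) %>%
    group_by(grp = cumsum(month == 11)) %>%   # a new group starts each November
    summarise(Year = last(Year),
              value = mean(value)) %>%
    select(-grp)
}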
    
dG1 <- fread('https://www.metoffice.gov.uk/hadobs/hadcrut4/data/current/time_series/HadCRUT.4.6.0.0.annual_ns_avg.txt',
             header = FALSE,select = c(1:2))
GTA1 <- as.matrix(dG1)
saveRDS(GTA1,"GTA.rds")

dM1 <- fread('https://psl.noaa.gov/data/correlation/mei.data',header = FALSE,fill = TRUE)
dM2 <- dM1[complete.cases(replace(dM1, dM1 == -999.000, NA)),]
dM3 <- matrix(as.numeric(unlist(dM2)),nrow=nrow(dM2))
dM4 <- data.frame(Year = rep(unique(dM3[,1]), each = 12),month = 1:12,value = as.vector(t(dM3[,2:13])))
MEI1 <- as.matrix(dMean(dM4))
saveRDS(MEI1,"MEI.rds")

dS1 <- fread('https://www.cpc.ncep.noaa.gov/data/indices/sstoi.indices',header = TRUE,select = c(1,2,10))
dS2 <- as.matrix(dS1)
dS3 <- data.frame(Year = dS2[,1],month = dS2[,2], value = dS2[,3])
SST1 <- as.matrix(dMean(dS3))
saveRDS(SST1,"SST.rds")

Any help would be appreciated.

Answer 1

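If you host the script on a Linux-based operating system, you can use the crontab command to run it periodically. cron executes programs on a user-defined schedule, for example anywhere from hourly to monthly. See a guide to the Linux crontab command, such as this one: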

https://phoenixnap.com/kb/set-up-cron-job-linux

For example, the following job runs an R script at midnight on the first day of every month:

0 0 1 * * Rscript userfile.R
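
Since the original problem was a source site being unavailable, the scheduled script can itself run into a failed download. Below is a minimal sketch of how the script could keep the previously saved `.rds` file in that case. The helper `safe_update()` is a hypothetical name introduced here for illustration; it is not part of the question's code or of any package.

```r
# Hypothetical helper: run `fetch` (a function that downloads and returns the
# new data) and save its result to `path`. If the download fails or returns
# nothing, the existing file at `path` is left untouched.
safe_update <- function(fetch, path) {
  result <- tryCatch(fetch(), error = function(e) {
    message("Update of ", path, " failed: ", conditionMessage(e))
    NULL
  })
  if (is.null(result) || NROW(result) == 0) {
    return(invisible(FALSE))  # keep the previous copy
  }
  # Write to a temporary file first, then rename it over the old one,
  # so the app is less likely to read a half-written file.
  tmp <- paste0(path, ".tmp")
  saveRDS(result, tmp)
  file.rename(tmp, path)
  invisible(TRUE)
}
```

With the first source from the question, a call might look like `safe_update(function() as.matrix(fread(url, header = FALSE, select = 1:2)), "GTA.rds")`, where `url` holds the HadCRUT address. The Shiny app would then load the data with `readRDS("GTA.rds")` instead of reading from the website. cron typically starts jobs in the user's home directory, so absolute paths are the safer choice for both the script and the `.rds` files.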

r

auto-update

shiny