Importing and merging files in a loop

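I have to merge datasets. They are .sav files, and I have 6-7 datasets per month, per year, for 13 years in total. That is a lot of datasets to import and merge, so I would like to automate this with a loop.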

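Because I'm a beginner, I wrote a first loop that simply merges the datasets for one year (so it only loops over months). Here is my code, which does exactly what I want. It is not the fastest, and certainly not the prettiest or most efficient, but it works. Note: for brevity I shortened the "C..." path in the posted code; in my real code it is the full path.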

for (m in months) {
  setwd(paste("C:.... survey\\DANE 2005\\", m, sep = ""))
  files_2005 <- list.files(path = paste("C:\\....survey\\DANE 2005\\", m, sep = ""), pattern = "Area.*.sav")

  #for (i in (paste("files_",m,sep=""))){
  df_2005 <- lapply(files_2005, read_sav)
  assign(paste("DANE2005_", m, sep = ""), df_2005 %>% reduce(rbind.fill))

  #}

  df_2005 <- mget(ls(pattern = "DANE2005_"))
  dane_2005 <- df_2005 %>% reduce(rbind.fill)

}

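This is my current code, which loops over years and months (thanks to @Onyambu's comments). However, it still doesn't work: if I don't use setwd, R says "current file does not exist in the directory" (and points back to my home directory rather than the specified path). If I do use setwd, I get a "cannot change working directory" error.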

for (y in years) {
  for (m in months) {

    #Go to a folder per year/month
    path <- paste("C:.... survey\\DANE ", y, "\\", m, sep = "")
    #Create a list of all the files in that folder by month, based on a pattern
    list_data<-list.files(path=path,  pattern=("Area.*.sav"))

    if(!is_empty(list_data)){
    #Read in all the files in the folder by month, based on the list
    df_2005 <- lapply(list_data, read_sav)
    #bind the files for one month together based on the list
    assign(paste("DANE2005_",m,sep=""), df_2005 %>% reduce(rbind.fill))
    }
  }
  #Bind together all the files for one year
  df_2005 <- mget(ls(pattern="DANE2005_"))
  dane_2005 <- df_2005 %>% reduce(left_join)
}

Any help is greatly appreciated.

Edit: cleaned up the code and rephrased the question for clarity after the initial comments.

Here is what you need to try:

library(haven)  # provides read_sav()

# Give the path only to the top-level folder that contains the year folders
path <- "C:.... survey"

# List every matching file under that folder: recursive = TRUE covers all years,
# all months and all files, and full.names = TRUE returns complete paths
all_files <- list.files(path, pattern = "Area.*\\.sav$", full.names = TRUE, recursive = TRUE)

# Read all these files into a list, keeping the year and the month of each file
my_read <- function(x) {
  # Remove everything up to "survey/"; what remains is the year folder, the month and the file name
  nm <- unlist(strsplit(sub(".*survey/", "", x), "/"))
  cbind(year = nm[1], month = nm[2], file = nm[3], read_sav(x))
}

# Now use the my_read function to read the data
dat_list <- lapply(all_files, my_read)
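
To end up with one data frame, the list still has to be combined. Below is a minimal sketch of the whole flow, assuming the haven and dplyr packages are installed. The temporary folder, the month name "Enero", the file names and the toy columns are all made up for illustration and only stand in for the real survey directory. It uses dplyr's bind_rows() in place of rbind.fill from the question, since it also fills columns that are missing from a file with NA. It also hints at why the loop in the question probably failed: without full.names = TRUE, list.files() returns names relative to the searched folder, so read_sav() looks for them in the working directory.

```r
library(haven)
library(dplyr)

# Hypothetical stand-in for the real survey folder: <root>/DANE <year>/<month>/<file>
root <- file.path(tempdir(), "survey")
dir.create(file.path(root, "DANE 2005", "Enero"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "DANE 2006", "Enero"), recursive = TRUE, showWarnings = FALSE)

# Toy files with partly different columns (illustrative data only)
write_sav(data.frame(id = c(1, 2), age = c(30, 41)),
          file.path(root, "DANE 2005", "Enero", "Area_a.sav"))
write_sav(data.frame(id = c(3, 4), income = c(100, 250)),
          file.path(root, "DANE 2006", "Enero", "Area_b.sav"))

# Relative names: only readable if the working directory is `root`
relative_names <- list.files(root, pattern = "Area.*\\.sav$", recursive = TRUE)
# Full paths: readable from any working directory, so no setwd() is needed
all_files <- list.files(root, pattern = "Area.*\\.sav$", full.names = TRUE, recursive = TRUE)

read_one <- function(x) {
  # The last three path components are the year folder, the month folder and the file name
  parts <- tail(strsplit(x, "/", fixed = TRUE)[[1]], 3)
  cbind(year = parts[1], month = parts[2], file = parts[3], read_sav(x))
}

dat_list <- lapply(all_files, read_one)

# Stack all files; columns absent from a file are filled with NA
dane_all <- bind_rows(dat_list)
```

If the real files carry SPSS value labels that differ between waves, the combined columns may need harmonising first (for example with haven::zap_labels()); whether that applies here depends on the data.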