List of .csv files in one specific directory

Question

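I have .csv files in a directory (say C:/Dowloads). I can list all the files in that directory using list.files("path"). But I am unable to read a specified number of files using a for loop. That is, suppose I have 332 files and I only want to read files 1 to 10 or 5 to 10.

Here is an example: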

files <- list.files("path")
files ## displays all the files.

Now I ran a test:

k <- files[1:10]
k
## here it displays the files from 1 to 10.

So I tried the same thing with a for loop, because I want to read the files one by one.

for(i in 1:length(k)){
  length(i) ## just tested the length 
}

But it gives NA or NULL or 1.

Can anyone explain how I can read specified .csv files using a for loop or any other way?

Answer 1

Unfortunately, there is no reproducible example to work with. Usually, when I have to do a similar task, I do this:

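files <- list.files(pattern = "\\.csv$")  # all .csv files in the current working directory
data <- vector("list", length(files))     # one list slot per file
for (i in seq_along(files)) {
    data[[i]] <- read.csv(files[i], stringsAsFactors = FALSE)  # keep each data frame
}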

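Your code does not work because you are testing the length of the index, not the length of the vector: i is a single number, so length(i) is always 1. Hope this helps.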
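To make the distinction concrete, here is a small sketch. The file names in k are made up for illustration and do not come from the question. It shows that length() of the loop index is always 1, while length() of the vector gives the number of files.

```r
# Hypothetical file names standing in for the output of list.files()
k <- c("001.csv", "002.csv", "003.csv")

length(k)  # number of elements in the vector

index_lengths <- integer(0)
for (i in seq_along(k)) {
  # i is a single integer, so length(i) is 1 on every iteration
  index_lengths <- c(index_lengths, length(i))
}
index_lengths

# Values are not auto-printed inside a for loop; call print() to see them
for (i in seq_along(k)) {
  print(k[i])
}
```

This may also explain why the loop in the question seemed to show nothing useful: an expression on its own line inside a for loop is evaluated but not printed.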

Answer 2

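list.files returns a character vector (class character). A character vector is a vector of strings. The function length, applied to the character vector files, to a range of its elements files[1:10], or to a single element files[i], returns respectively the number of strings in the vector, the number of strings in the range, or 1. Use nchar to get the number of characters in each element (each string) of a character vector. So: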

path.to.csv <- "/path/to/your/csv/files"
files<-list.files(path.to.csv)
print(files)  ## list all files in path

k<-files[1:10]
print(k)      ## list first 10 files in path

for(i in 1:length(k)) {  ## loop through the first 10 files
  print(k[i]) ## each file name
  print(nchar(k[i])) ## the number of characters in each file name
  df <- read.csv(paste0(path.to.csv,"/",k[i]))  ## read each as a csv file
  ## process each df in turn here
}

Note that we have to paste the path onto the file name when calling read.csv, because list.files returns bare file names by default.
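As a possible variation on the paste0 call, base R also offers file.path() and the full.names argument of list.files(). The sketch below is only an illustration: the directory csv_demo and the files a.csv and b.csv are assumptions created in a temporary location so the example can run on its own.

```r
# Assumption: a throwaway directory with two small CSV files, created only for this demo
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:3), file.path(csv_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 4:6), file.path(csv_dir, "b.csv"), row.names = FALSE)

# Default: bare file names, which read.csv() would look for in the working directory
bare_names <- list.files(csv_dir, pattern = "\\.csv$")

# Option 1: build the full paths yourself with file.path()
built_paths <- file.path(csv_dir, bare_names)

# Option 2: ask list.files() for full paths directly
full_paths <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

df <- read.csv(full_paths[1])  # reads a.csv
```

Either option avoids hard-coding the "/" separator by hand.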

Edit: I thought I would add this as an alternative:

path.to.csv <- "/path/to/your/csv/files"
files<-list.files(path.to.csv)

for(iFile in files) {  ## loop through the files
  print(iFile) ## each file name
  print(nchar(iFile)) ## the number of characters in each file name
  df <- read.csv(paste0(path.to.csv,"/",iFile))  ## read each as a csv file
  ## process each df in turn here
}

Here, the for loop iterates over the collection (vector) files directly, so iFile is the i-th file name.
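One practical difference between the index style and the element style shows up when no files are found. The sketch below uses an empty character vector as a stand-in for an empty list.files() result (an assumption for illustration) and counts how many times each style of loop runs.

```r
files <- character(0)  # stand-in for list.files() on a directory with no files

count_index <- 0
for (i in 1:length(files)) {   # 1:0 is c(1, 0), so the body runs twice
  count_index <- count_index + 1
}

count_element <- 0
for (iFile in files) {         # nothing to iterate over, so the body never runs
  count_element <- count_element + 1
}

count_seq <- 0
for (i in seq_along(files)) {  # seq_along() of an empty vector is empty too
  count_seq <- count_seq + 1
}

c(count_index, count_element, count_seq)
```

If you need the index, seq_along(files) is the safer way to get it.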

Hope this helps.

Answer 3

To read a specific number of files at a time, you can subset the vector of files. First create a vector of files, including the paths:

f = list.files("/dir/dir", full.names=TRUE, pattern="\\.csv$")
# note: full.names=TRUE returns the full path to each file;
# the pattern is a regular expression matching names that end in .csv

Then read each file into a separate list item (the first 10 in this case):

dl = lapply(f[1:10], read.csv)

Finally, take a look at list item 1:

head(dl[[1]])
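The same subsetting idea covers the "files 5 to 10" case from the question. Here is a minimal, self-contained sketch: the directory subset_demo and its twelve generated files are assumptions made up for the demo, and stacking the results with rbind is an optional extra step that only makes sense when all files share the same columns.

```r
# Assumption: twelve small demo CSV files written to a temporary directory
demo_dir <- file.path(tempdir(), "subset_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (n in 1:12) {
  write.csv(data.frame(id = n, value = n * 10),
            file.path(demo_dir, sprintf("%03d.csv", n)),
            row.names = FALSE)
}

f <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read only files 5 to 10 and name each list item after its file
wanted <- f[5:10]
dl <- lapply(wanted, read.csv)
names(dl) <- basename(wanted)

# Optionally stack the data frames into one (they share the same columns here)
combined <- do.call(rbind, dl)
head(combined)
```

Naming the list items makes it easy to pull out one file later, for example dl[["005.csv"]].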
