R - extracting column in dataframes of a loop

Question

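I need to save a list of csv files and extract the values from the 13th row of a specific column (the second column) of each data frame.

Here is my attempt: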

temp <- list.files(FILEPATH, pattern="\\.csv$", full.names = TRUE)

for (i in 1:length(temp)){ 
  assign(temp[i], read.csv(temp[i], header=TRUE, skip=13, na.strings=c("", "NA")))
  subset(temp[i], select=2) #extract the second column of the dataframe
  temp[i] <- na.omit(temp[i])
}

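However, this doesn't work. On the one hand, I think it's because of the skip argument of read.csv, since it apparently ignores the headers. On the other hand, if skip is not used, the following error pops up: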

Error in subset.default(temp[i], select = 2) : argument "subset" is missing, with no default

When I insert the argument subset=TRUE in the subset command, it gives no error, but the extraction isn't performed.

Any possible solutions?

Answer 1

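It's hard to tell without seeing the files, but I would use lapply rather than a for loop. Maybe you can get inspiration from something like the following. I use read.table because of skip = 13: read.csv defaults to header = TRUE and would read the first line after the skipped ones as column headers, whereas read.table defaults to header = FALSE. Note that I avoid using assign.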

df_list <- lapply(temp, read.table, sep = ",", skip = 13, na.strings = c("", "NA"))
names(df_list) <- temp
col2_list <- lapply(df_list, `[[`, 2)
col2_list <- lapply(col2_list, na.omit)
names(col2_list) <- temp
col2_list
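To see what the lines above return, and why the loop in the question fails, here is a minimal self-contained sketch. The file names, their contents, and skip = 2 (instead of 13) are assumptions made up for illustration, not the asker's real data.

```r
# Hypothetical sample data: two small CSV files in a temporary directory
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("meta line 1", "meta line 2", "1,10", "2,NA", "3,30"),
           file.path(demo_dir, "a.csv"))
writeLines(c("meta line 1", "meta line 2", "4,40", "5,50", "6,"),
           file.path(demo_dir, "b.csv"))

temp <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# temp[1] is a file name (a character string), not a data frame, so
# subset() dispatches to subset.default, which expects a `subset` argument
class(temp[1])

# read.table defaults to header = FALSE, so the first line after the
# skipped ones is kept as data instead of being used as column names
df_list <- lapply(temp, read.table, sep = ",", skip = 2,
                  na.strings = c("", "NA"))
names(df_list) <- basename(temp)

# Extract column 2 as a plain vector and drop missing values
col2_list <- lapply(df_list, function(d) as.vector(na.omit(d[[2]])))
col2_list
```

Because the data frames live in a named list, there is no need for assign, and each result can be reached by file name, for example col2_list[["a.csv"]].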

If you would rather have col2_list be a list of data frames, each with just one column (column 2 of the original file), then, as I said in the comments, use

col2_list <- lapply(df_list, `[`, 2)

and, to rename that column and renumber the rows consecutively,

new_name <- "the_column_of_choice"  #  change this!
col2_list <- lapply(col2_list, function(x){
  names(x) <- new_name
  row.names(x) <- NULL
  x
})
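As a small illustration of why the row renumbering can matter, the sketch below applies the same steps to a single data frame. The data frame df is a hypothetical stand-in for one element of df_list, and the na.omit step is an assumption added here so that a row is actually dropped.

```r
# Hypothetical stand-in for one element of df_list
df <- data.frame(V1 = 1:4, V2 = c(10, NA, 30, 40))

new_name <- "the_column_of_choice"

col2 <- df[2]            # single-bracket indexing keeps a data frame
col2 <- na.omit(col2)    # drops row 2; row names are now "1", "3", "4"
names(col2) <- new_name
row.names(col2) <- NULL  # renumber the rows as 1, 2, 3
col2
```

Without the row.names reset, the result would keep the gaps left by the removed rows.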


for-loop

r

subset

read.csv