Modify columns in a data frame by using a function

Question

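I am trying to modify the columns of my data frame and their positions. I eventually found a way to do each step, but I would like to do all of the processing in one function, apply it to every data set in a directory, and overwrite the original data.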

kw <- as.data.frame(matrix(1:11400, ncol = 19))  # sample data: 600 rows, columns V1..V19
kw <- kw[, !(colnames(kw) %in% c("V18", "V19"))]  # remove the last two columns
kw$V18 <- 0  # add new columns with all-zero values
kw$V19 <- 0
kw$V1 <- kw$V1 * 1000  # modify the first column of the data frame
kw <- kw[, c(1, 18:19, 2:17)]  # reorder the columns

Suppose I have a set of data files in a directory:

kw <- read.table("5LSTT-test10.avgm", header = FALSE, fill = FALSE)  # example showing how I read a single file
  `5LSTT-test10.avgm`
    .  
    .   
    .  
    .

  `5LSTT-test10.avgm`

How can I apply this column-modification process to each file separately and either overwrite the file or write a new one?

Edit: here is the output of readLines("5LSTT-test10.avgm", n = 1). You can see 19 columns; assume the data has 600 rows.

[1] "  9.0000E-02  0.0000E+00   2.3075E-03 -6.4467E-03  9.9866E-01   9.8648E-02  4.5981E-02  9.8004E-01   1.2359E-01  6.1175E-02  9.7701E-01   8.6662E-02  3.0034E-02  9.7884E-01   7.0891E-02  8.2247E-03  9.8564E-01  -8.7967E-11  4.3105E-02"

Answer 1

With "data.table", you can do the following:

setcolorder(
  fread(yourfile)[, c("V1", "V18", "V19") := list(V1 * 1000, 0, 0)], c(1, 18:19, 2:17))
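
To see what that one-liner does, here is a minimal sketch on made-up data instead of a file. The table `dt` and its values are assumptions for illustration only; it has 19 columns named V1 to V19, which is how fread names the columns of a headerless file.

```r
library(data.table)

# Made-up stand-in for fread(yourfile): 3 rows, 19 columns named V1..V19
dt <- as.data.table(matrix(as.numeric(1:57), ncol = 19))

# Scale V1 and overwrite V18 and V19 with zeros, all by reference
dt[, c("V1", "V18", "V19") := list(V1 * 1000, 0, 0)]

# Move V18 and V19 so that they come directly after V1
setcolorder(dt, c(1, 18:19, 2:17))

names(dt)[1:4]
dt$V1
```

Because `:=` overwrites V18 and V19 in place, there is no need to drop the two columns first and then add them back.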

So, if you really need a function, you can do something like this:

myFun <- function(infile) {
  library(data.table)
  write.table(
    setcolorder(
      fread(infile)[
        , c("V1", "V18", "V19") := list(V1 * 1000, 0, 0)], 
      c(1, 18:19, 2:17)), 
    # backslashes must be doubled inside R string literals
    file = gsub("(.*)(\\..*)", "\\1_new\\2", infile), 
    row.names = FALSE)
}

You can then use myFun in lapply over a vector of the files you want to read and process.

In other words:

lapply(myListOfFilePaths, myFun)
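
The vector myListOfFilePaths is not defined above. One way to build it, assuming all files sit in one directory and end in .avgm, is list.files with full.names = TRUE. The temporary directory and the file names below are made up so that the snippet runs on its own.

```r
# Made-up directory holding two empty .avgm files and one unrelated file
data_dir <- file.path(tempdir(), "avgm_demo")
dir.create(data_dir, showWarnings = FALSE)
file.create(file.path(data_dir, c("a.avgm", "b.avgm", "notes.txt")))

# full.names = TRUE returns paths that can be passed straight to a reader;
# the escaped dot and the $ anchor keep names like "xavgm.txt" out
myListOfFilePaths <- list.files(data_dir, pattern = "\\.avgm$", full.names = TRUE)
basename(myListOfFilePaths)
```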

By default, this function writes a new file rather than overwriting the original, appending "_new" to the file name just before the extension.
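
As a quick check of the naming rule, the renaming step can be pulled out into a small helper. The name new_name is hypothetical and only used for illustration; the regular expression is the one from myFun, written with the doubled backslashes that R string literals require.

```r
# Hypothetical helper that isolates the renaming step used inside myFun
new_name <- function(infile) gsub("(.*)(\\..*)", "\\1_new\\2", infile)

new_name("5LSTT-test10.avgm")
# "5LSTT-test10_new.avgm"
```

To overwrite the original instead, passing file = infile to write.table should be enough.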

Answer 2

This could be another way.

Read all the files and store them in a list like this:

# list all .avgm files under the directory (paths are relative to directory.path)
files.new = list.files(directory.path, recursive = TRUE, pattern = "\\.avgm$")

# read every file into a list named by file; the files are whitespace-separated with no header row
file.contents = lapply(file.path(directory.path, files.new), read.table, header = FALSE)
names(file.contents) = files.new

Next, you can apply the modifications to each data set in the list like this:

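outlist = lapply(file.contents, function(x) {
  # drop the original V18 and V19, then add them back as all-zero columns
  kw <- x[, !(colnames(x) %in% c("V18", "V19"))]
  kw$V18 <- 0
  kw$V19 <- 0
  kw$V1 <- kw$V1 * 1000  # scale the first column
  kw[, c(1, 18:19, 2:17)]  # reorder the columns and return the result
})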
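
To check what the modification does to a single data set, here is a minimal sketch on a made-up data frame. The object `x` stands in for one element of file.contents and its values are assumptions for illustration only.

```r
# Made-up stand-in for one element of file.contents: 2 rows, columns V1..V19
x <- as.data.frame(matrix(as.numeric(1:38), ncol = 19))

kw <- x[, !(colnames(x) %in% c("V18", "V19"))]  # 17 columns remain
kw$V18 <- 0                                     # appended as column 18
kw$V19 <- 0                                     # appended as column 19
kw$V1 <- kw$V1 * 1000
kw <- kw[, c(1, 18:19, 2:17)]                   # V1, V18, V19, then V2..V17

colnames(kw)[1:4]
dim(kw)
```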

Then use the function below to write the modified data to new files:

# function to write each data frame in a named list to a file
write.files = function(modified.list, path){
  # keep elements with more than one column; use the argument, not the global file.contents
  outlist = modified.list[sapply(modified.list, function(x) length(x) > 1)]
  invisible(sapply(names(outlist), function(x)
    write.table(outlist[[x]], file = file.path(path, x),
                sep = "\t", row.names = FALSE)))
}

Writing the data to files:

write.files(outlist, "/directory/path")
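
Note that write.table writes a header line by default, while the input files in the question have none. If the output should keep the headerless layout of the input (for example when overwriting the original), col.names = FALSE is one option. Here is a minimal round trip on a made-up temporary file, as a sketch rather than a drop-in replacement for the function above:

```r
# Made-up headerless input file: 2 rows, 19 whitespace-separated columns
infile <- tempfile(fileext = ".avgm")
write.table(matrix(as.numeric(1:38), ncol = 19), infile,
            row.names = FALSE, col.names = FALSE)

kw <- read.table(infile, header = FALSE)
kw$V1 <- kw$V1 * 1000
kw$V18 <- 0
kw$V19 <- 0
kw <- kw[, c(1, 18:19, 2:17)]

# Overwrite the input in place, without a header line or row names
write.table(kw, infile, row.names = FALSE, col.names = FALSE)

back <- read.table(infile, header = FALSE)
dim(back)
```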
