Beginner using pipes
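I'm a beginner, and I'm trying to find the most efficient way to rename the first column of the many CSV files I will be creating. Once the CSV files have been created, I load them into R as follows: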
data <- read.csv('filename.csv')
I have already used the names() function to rename the column for a single file:
names(data)[1] <- 'Y'
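However, I would like to find the most efficient way of combining/piping this name change with read.csv, so that the same name change is applied to each file as it is opened. I tried to write a 'simple' function to do this: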
addName <- function(data) {
  names(data)[1] <- 'Y'
  data
}
However, I don't yet fully understand the syntax for writing functions, and I can't get it to work.
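This reads a vector of filenames, changes the name of the first column of each file to "Y", and stores all of the resulting data frames in a list.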
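filenames <- c("filename1.csv", "filename2.csv")
# Read one CSV file and rename its first column to "Y".
addName <- function(filename) {
  data <- read.csv(filename)
  names(data)[1] <- 'Y'
  data
}
files <- list()
# seq_along() is safer than 1:length(), which yields c(1, 0) for an empty vector.
for (i in seq_along(filenames)) {
  files[[i]] <- addName(filenames[i])
}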
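The loop above can also be written with lapply(), which builds and returns the list directly. The following is a minimal sketch rather than a definitive implementation; the temporary directory and the two small CSV files are made up here purely so that the example runs on its own.

```r
# Create two small example CSV files in a temporary directory
# (hypothetical data, only so the sketch is self-contained).
tmp_dir <- tempfile("csv_demo_")
dir.create(tmp_dir)
filenames <- file.path(tmp_dir, c("filename1.csv", "filename2.csv"))
for (f in filenames) {
  write.csv(data.frame(a = 1:3, b = c("x", "y", "z")), f, row.names = FALSE)
}

# Read one CSV file and rename its first column to "Y".
addName <- function(filename) {
  data <- read.csv(filename)
  names(data)[1] <- "Y"
  data
}

# lapply() calls addName() on each filename and collects the results in a list.
files <- lapply(filenames, addName)

# Inspect the column names of each data frame.
lapply(files, names)
```

The result has the same shape as the list built by the for loop, with one data frame per filename, in the same order.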
Note
If you expected your original addName function to "mutate" an existing object, like this
x <- data.frame(Column_1 = c(1, 2, 3), Column_2 = c("a", "b", "c"))
# Try (unsuccessfully) to change title of "Column_1" to "Y" in x.
addName(x)
# Print x.
x
note that R passes arguments by value rather than by reference, so x itself will remain unchanged:
  Column_1 Column_2
1        1        a
2        2        b
3        3        c
Any "mutation" has to be achieved by overwriting x with the function's return value
x <- addName(x)
# Print x.
x
in which case x itself obviously will be changed:
  Y Column_2
1 1        a
2 2        b
3 3        c
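This also means the original addName() from the question works as written once its return value is captured, so it can be chained directly after read.csv(). A minimal sketch, assuming R 4.1 or later for the native pipe |> (the magrittr %>% can be used the same way); the temporary file and its contents are invented for illustration.

```r
# The function from the question, unchanged: it returns a renamed copy.
addName <- function(data) {
  names(data)[1] <- "Y"
  data
}

# Write a small example CSV file (hypothetical data) so the sketch runs on its own.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(Column_1 = c(1, 2, 3), Column_2 = c("a", "b", "c")),
          csv_path, row.names = FALSE)

# Read the file, pass the result to addName(), and keep the returned copy.
data <- read.csv(csv_path) |> addName()

# Show the column names of the renamed data frame.
names(data)
```

The assignment on the left is what keeps the renamed copy; without it the result would only be printed and then discarded.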
Answer
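In any case, here is a solution that compactly combines pipes (%>% from the magrittr package) with a custom function. Note that without the line breaks and comments (which I added for clarity), this can be condensed to just a few lines of code.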
# The dplyr package helps with easy renaming, and it includes the magrittr pipe.
library(dplyr)
# ...
filenames <- c("filename1.csv", "filename2.csv", "filename3.csv")
# A function to take a CSV filename and give back a renamed dataset taken from that file.
addName <- function(filename) {
  return(
    # Read in the named file as a data.frame.
    read.csv(file = filename) %>%
      # Take the resulting data.frame, and rename its first column as "Y";
      # quotes are optional, unless the name contains spaces: "My Column"
      # or `My Column` are needed then.
      dplyr::rename(Y = 1)
  )
}
# Get a list of all the renamed datasets, as taken by addName() from each of the filenames.
all_files <- sapply(filenames, FUN = addName,
                    # Keep the list structure, in which each element is a
                    # data.frame.
                    simplify = FALSE,
                    # Name each list element by its filename, to help keep track.
                    USE.NAMES = TRUE)
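Because USE.NAMES = TRUE names each list element after its filename, individual datasets can be retrieved by filename afterwards. The sketch below illustrates this with a base-R stand-in for addName() (it uses names<- instead of dplyr::rename(), so it runs without extra packages); the temporary directory, file contents, and the helper variables first and all_renamed are hypothetical.

```r
# Create three small example CSV files in a temporary directory (hypothetical data).
tmp_dir <- tempfile("csv_demo_")
dir.create(tmp_dir)
filenames <- file.path(tmp_dir, c("filename1.csv", "filename2.csv", "filename3.csv"))
for (i in seq_along(filenames)) {
  write.csv(data.frame(id = i, value = c(10, 20)), filenames[i], row.names = FALSE)
}

# Base-R stand-in for the renaming function: read a file, rename column 1 to "Y".
addName <- function(filename) {
  data <- read.csv(file = filename)
  names(data)[1] <- "Y"
  data
}

# Same sapply() call as in the answer: a named list of data frames.
all_files <- sapply(filenames, FUN = addName, simplify = FALSE, USE.NAMES = TRUE)

# Each element is named by the filename (here, the full path) it came from.
names(all_files)

# Retrieve one dataset by its filename.
first <- all_files[[filenames[1]]]

# Check that the first column of every dataset is now called "Y".
all_renamed <- all(vapply(all_files, function(d) names(d)[1] == "Y", logical(1)))
```

Since the list names are exactly the strings passed in, full paths produce full-path names; wrapping them in basename() is one way to shorten them if that is preferred.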
In fact, you can easily rename as many columns as you like, in one fell swoop:
dplyr::rename(Y = 1, 'X' = 2, "Z" = 3, "Column 4" = 4, `Column 5` = 5)
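The rename() call above is a fragment meant to follow a pipe, so it needs a data frame passed in from the left. As a rough, package-free illustration of the same by-position renaming, here is a base-R sketch; the data frame df and its original column names are invented for the example.

```r
# A hypothetical data frame with five columns.
df <- data.frame(V1 = 1:2, V2 = 3:4, V3 = 5:6, V4 = 7:8, V5 = 9:10)

# Rename columns 1 to 5 by position in a single assignment.
# Names containing spaces are stored as given.
names(df)[1:5] <- c("Y", "X", "Z", "Column 4", "Column 5")

# Show the new column names.
names(df)

# Columns whose names contain spaces need backticks or [[ ]] to be accessed.
df$`Column 4`
df[["Column 5"]]
```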