Pass a list containing list names to the `pmap` function in R and name the resulting data frames or tibbles
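I am trying to write a function in R that uses a pmap call and names the nested data frames (or tibbles) it creates, using names taken from the list that is passed to pmap. I think this is best explained with a reproducible toy example. Here is one (it assumes the user is running on Windows and that the directory C:\temp\ already exists and is currently empty, although you can set the paths below to any directory you choose):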
library(purrr) #provides pmap(), set_names() and the %>% pipe
#create some toy sample input data files
#(paths use forward slashes: inside an R string a single backslash starts an escape sequence)
write.csv(x=data.frame(var1=c(42,43),var2=c(43,45)), file="C:/temp/AL.csv")
write.csv(x=data.frame(var1=c(22,43),var2=c(43,45)), file="C:/temp/AK.csv")
write.csv(x=data.frame(var1=c(90,98),var2=c(97,96)), file="C:/temp/AZ.csv")
write.csv(x=data.frame(var1=c(43,55),var2=c(85,43)), file="C:/temp/PossiblyUnknownName.csv")
#Get list of files in C:/temp directory - assumes only files to be read in exist there
pathnames<-list.files(path = "C:/temp", full.names=TRUE)
ListIdNumber<-c("ID3413241", "ID3413242", "ID3413243", "ID3413244")
#Create a named list. In reality, my problem is more complex, but this gets at the root of the issue
mylistnames<-list(pathnames_in=pathnames, ListIdNumber_in=ListIdNumber)
#Functions that I've tried, where I'm passing the name ListIdNumber_in into the function so
#the resulting data frames are named.
#Attempt 1
get_data_files1<-function(pathnames_in, ListIdNumber_in){
  tempdf <- read.csv(pathnames_in) %>% set_names(nm=ListIdNumber_in)
}
#Attempt 2
get_data_files2<-function(pathnames_in, ListIdNumber_in){
  tempdf <- read.csv(pathnames_in)
  names(tempdf)<-ListIdNumber_in
  tempdf
}
#Attempt 3
get_data_files3<-function(pathnames_in, ListIdNumber_in){
  tempdf <- read.csv(pathnames_in)
  tempdf
}
#Fails
pmap(mylistnames, get_data_files1)->myoutput1
#Almost, but doesn't name the tibbles it creates and instead creates a variable named ListIdNumber_in
pmap(mylistnames, get_data_files2)->myoutput2
#This gets me the end result that I want, but I want to set the names inside the function
pmap(mylistnames, get_data_files3) %>% set_names(nm=mylistnames$ListIdNumber_in)->myoutput3
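So when I run pmap I would like to get the result below. I just want the naming of the nested data frames/tibbles to happen inside the function (and I don't really need the 'X' variable, which I think was created by mistake):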
$ID3413241
X var1 var2
1 1 22 43
2 2 43 45
$ID3413242
X var1 var2
1 1 42 43
2 2 43 45
$ID3413243
X var1 var2
1 1 90 97
2 2 98 96
$ID3413244
X var1 var2
1 1 43 85
2 2 55 43
Any idea how to achieve this?
Thanks!
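You can use map here. There is no need to create a named list: the names cannot be attached at the top level while the csv is being read, so add them separately afterwards.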
library(purrr)
map(pathnames, read.csv) %>% set_names(ListIdNumber)
#$ID3413241
# var1 var2
#1 22 43
#2 43 45
#$ID3413242
# var1 var2
#1 42 43
#2 43 45
#$ID3413243
# var1 var2
#1 90 97
#2 98 96
#$ID3413244
# var1 var2
#1 43 85
#2 55 43
In base R, this can be done as:
setNames(lapply(pathnames, read.csv), ListIdNumber)
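A related variant is to name the input vector before mapping, since map() keeps the names of its input. The following is only a minimal sketch: demo_dir, demo_paths, demo_ids and named_paths are hypothetical names introduced for illustration, and the files are written to a temporary directory rather than C:/temp so that it runs anywhere.

```r
library(purrr)

# Hypothetical scratch directory so the sketch does not depend on C:/temp
demo_dir <- file.path(tempdir(), "pmap_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Two toy files, written without row names
write.csv(data.frame(var1 = c(22, 43), var2 = c(43, 45)),
          file = file.path(demo_dir, "AK.csv"), row.names = FALSE)
write.csv(data.frame(var1 = c(42, 43), var2 = c(43, 45)),
          file = file.path(demo_dir, "AL.csv"), row.names = FALSE)

demo_paths <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
demo_ids <- c("ID3413241", "ID3413242")

# Name the input first; map() carries those names over to its result
named_paths <- set_names(demo_paths, demo_ids)
demo_out <- map(named_paths, read.csv)

names(demo_out)
```

The naming still happens outside the reading function, because that function only ever sees one file at a time and returns a single data frame.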
The reason you get the extra X column is that you are also writing the row names when you write the csv. Set row.names = FALSE and you will not have that column.
write.csv(x=data.frame(var1=c(42,43),var2=c(43,45)),
file="C:/temp/AL.csv", row.names = FALSE)
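To see where the X column comes from, here is a small round-trip sketch. The names demo_df, with_rownames and without_rownames are made up for illustration, and temporary files are used instead of C:/temp.

```r
demo_df <- data.frame(var1 = c(42, 43), var2 = c(43, 45))

with_rownames <- tempfile(fileext = ".csv")
without_rownames <- tempfile(fileext = ".csv")

# Default behaviour: row names are written as an unnamed first column
write.csv(demo_df, file = with_rownames)
# Row names suppressed at write time
write.csv(demo_df, file = without_rownames, row.names = FALSE)

# The unnamed first column comes back as "X"
cols_with <- names(read.csv(with_rownames))
# No extra column here
cols_without <- names(read.csv(without_rownames))
# For files that were already written with row names, the first
# column can be read back as row names instead
cols_fixed <- names(read.csv(with_rownames, row.names = 1))

cols_with
cols_without
cols_fixed
```

If the files already exist and cannot be rewritten, the row.names = 1 form on the reading side is one way to avoid the extra column.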
How about creating your own pmap for this?
# assume that your names are always stored in `ListIdNumber_in`
named_pmap <- function(.l, .f, ...) set_names(pmap(.l, .f, ...), .l$ListIdNumber_in)
Then you can simply call named_pmap(mylistnames, get_data_files3). Apart from the naming part, this named_pmap is essentially the same as pmap.
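If the names are not always stored in ListIdNumber_in, the wrapper could take the element name as an argument. This is just a sketch of that idea: named_pmap2, the .names_from argument, and the toy inputs and add_xy function are all hypothetical, chosen so the example runs without any files.

```r
library(purrr)

# Like named_pmap above, but the element holding the names is configurable.
# `.names_from` is a hypothetical argument, not part of purrr.
named_pmap2 <- function(.l, .f, ..., .names_from = "ListIdNumber_in") {
  out <- pmap(.l, .f, ...)
  set_names(out, .l[[.names_from]])
}

# Toy input: pmap matches the list elements to the function arguments by name
inputs <- list(x = c(1, 2, 3), y = c(10, 20, 30), id = c("a", "b", "c"))
add_xy <- function(x, y, id) x + y

res <- named_pmap2(inputs, add_xy, .names_from = "id")
res
```

With the question's data, the equivalent call would be named_pmap2(mylistnames, get_data_files3), which falls back to the default ListIdNumber_in.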