How to build a variable that summarizes multiple variables

Question

I have data that looks like this:

The sample data can be created with the following code:

ID<-c(1,1,1,1,2,2,2,3,3,3,4,4,4,4)
Days<-c(-5,1,18,30,1,8,16,1,8,6,-6,1,7,15)
Event_P<-c("","","P","","","","P","","","P","","","P","P")
Event_N<-c("","","","","N","","N","","","N","N","","N","N")
Event_C<-c("C","","C","","","","C","","","C","","","","")

Sample.data <- data.frame(ID, Days, Event_P, Event_N,Event_C)

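I want to build a variable "Event" that captures all events in a row. The final result would look like this:

How can I do this? I would like to know as many approaches as possible. Thanks.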

Answer 1

One option is to use apply() as shown below. @AllanCameron's suggestion is also a good choice. Here is the code for this option:

#Vectors
ID<-c(1,1,1,1,2,2,2,3,3,3,4,4,4,4)
Days<-c(-5,1,18,30,1,8,16,1,8,6,-6,1,7,15)
Event_P<-c("","","P","","","","P","","","P","","","P","P")
Event_N<-c("","","","","N","","N","","","N","N","","N","N")
Event_C<-c("C","","C","","","","C","","","C","","","","")
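#Data (keep the event columns as character, not factor)
Sample.data <- data.frame(ID, Days, Event_P, Event_N,Event_C,stringsAsFactors = FALSE)
#Option 1
#Positions of all columns whose name contains 'Event'
index <- grep('Event',names(Sample.data))
#For each row, keep the non-empty values and join them with '/'
Sample.data$Event <- apply(Sample.data[,index],1,function(x) paste0(x[x!=''],collapse='/'))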

Output:

   ID Days Event_P Event_N Event_C Event
1   1   -5                       C     C
2   1    1                              
3   1   18       P               C   P/C
4   1   30                              
5   2    1               N             N
6   2    8                              
7   2   16       P       N       C P/N/C
8   3    1                              
9   3    8                              
10  3    6       P       N       C P/N/C
11  4   -6               N             N
12  4    1                              
13  4    7       P       N           P/N
14  4   15       P       N           P/N
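
To make the per-row step easier to follow, here is a minimal sketch of what the anonymous function passed to apply() does for a single row. The names row_vals, non_empty, one_event, no_event and demo are made up for this illustration and are not part of the answer above. The last part shows, as an assumption about how you might adapt the code, that drop = FALSE keeps the selection two-dimensional if only one Event column happens to match.

```r
# One row of the Event_* columns, written out by hand for illustration
row_vals <- c(Event_P = "P", Event_N = "", Event_C = "C")

# Keep only the non-empty entries, then join them with "/"
non_empty <- row_vals[row_vals != ""]
one_event <- paste0(non_empty, collapse = "/")

# A row with no events collapses to an empty string
no_event <- paste0(character(0), collapse = "/")

# With a single matching column, drop = FALSE keeps a data frame,
# so apply() still receives a two-dimensional object
demo <- data.frame(Event_P = c("P", ""), stringsAsFactors = FALSE)
demo$Event <- apply(demo[, "Event_P", drop = FALSE], 1,
                    function(x) paste0(x[x != ""], collapse = "/"))
```

Because empty strings are filtered out before pasting, this approach never produces stray separators and needs no cleanup step afterwards.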

Answer 2

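Duck's answer is great, but since you mentioned you want as many approaches as possible, here are two more:

You can combine the columns either with tidyverse's mutate() and paste(), or with base R's interaction(), and then use gsub() to strip out the leftover separators: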

ID<-c(1,1,1,1,2,2,2,3,3,3,4,4,4,4)
Days<-c(-5,1,18,30,1,8,16,1,8,6,-6,1,7,15)
Event_P<-c("","","P","","","","P","","","P","","","P","P")
Event_N<-c("","","","","N","","N","","","N","N","","N","N")
Event_C<-c("C","","C","","","","C","","","C","","","","")

Sample.data <- data.frame(ID, Days, Event_P, Event_N,Event_C)

library(tidyverse)

Sample.data %>% 
  mutate(Event = paste(Event_P, Event_N, Event_C, sep='/'),
         Event = gsub('^/|^//|/$|//$', '', Event),
         Event = gsub('//', '/', Event))
#>    ID Days Event_P Event_N Event_C Event
#> 1   1   -5                       C     C
#> 2   1    1                              
#> 3   1   18       P               C   P/C
#> 4   1   30                              
#> 5   2    1               N             N
#> 6   2    8                              
#> 7   2   16       P       N       C P/N/C
#> 8   3    1                              
#> 9   3    8                              
#> 10  3    6       P       N       C P/N/C
#> 11  4   -6               N             N
#> 12  4    1                              
#> 13  4    7       P       N           P/N
#> 14  4   15       P       N           P/N

Sample.data$Event <- 
  interaction(Sample.data$Event_P, Sample.data$Event_N, Sample.data$Event_C, sep = '/') %>% 
  gsub('^/|^//|/$|//$', '', .) %>% 
  gsub('//', '/', .)

Sample.data
#>    ID Days Event_P Event_N Event_C Event
#> 1   1   -5                       C     C
#> 2   1    1                              
#> 3   1   18       P               C   P/C
#> 4   1   30                              
#> 5   2    1               N             N
#> 6   2    8                              
#> 7   2   16       P       N       C P/N/C
#> 8   3    1                              
#> 9   3    8                              
#> 10  3    6       P       N       C P/N/C
#> 11  4   -6               N             N
#> 12  4    1                              
#> 13  4    7       P       N           P/N
#> 14  4   15       P       N           P/N

Created on 2020-09-18 by the reprex package (v0.3.0)
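
Both variants above name each Event column explicitly. As a hedged sketch of how the same paste-then-clean idea could be applied without listing the columns, the example below selects every column whose name starts with Event_ and hands them to paste() through do.call(). The data frame toy and the variables event_cols and joined are invented for this illustration; the two gsub() patterns are the ones from the answer.

```r
# Small made-up data frame with the same column layout as Sample.data
toy <- data.frame(
  ID      = c(1, 1, 2, 2),
  Event_P = c("",  "P", "P", ""),
  Event_N = c("",  "",  "N", ""),
  Event_C = c("C", "C", "C", ""),
  stringsAsFactors = FALSE
)

# Names of all columns that start with "Event_"
event_cols <- grep("^Event_", names(toy), value = TRUE)

# Equivalent to paste(toy$Event_P, toy$Event_N, toy$Event_C, sep = "/")
joined <- do.call(paste, c(toy[event_cols], sep = "/"))

# Same cleanup as in the answer: trim separators at the ends,
# then collapse a doubled separator in the middle
joined <- gsub("^/|^//|/$|//$", "", joined)
toy$Event <- gsub("//", "/", joined)
```

This keeps the code unchanged if an Event column is renamed, though the cleanup patterns would need revisiting if more than three columns are combined.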

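Here is what the pattern in gsub('^/|^//|/$|//$', '', ...) does:

^/|^// matches a / or // at the start of the string, which is then removed.

/$|//$ matches a / or // at the end of the string, which is then removed.

The second call, gsub('//', '/', ...), turns a doubled separator left behind by an empty middle column into a single /.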
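
To check the two cleanup steps in isolation, the sketch below runs them on a few hand-written strings of the kind paste(..., sep = '/') produces for three columns. The vector raw and the variable names are made up for this illustration. The last line tries a more compact pattern using /+ (one or more slashes); this is an alternative I am suggesting, not part of the answer, and for these inputs it gives the same result.

```r
# Hand-written examples of pasted rows, including empty columns
raw <- c("//C", "P//C", "/N/", "P/N/C", "//")

# Step 1: remove one or two slashes at the start or end of the string
trimmed <- gsub("^/|^//|/$|//$", "", raw)

# Step 2: collapse a doubled slash in the middle into a single one
cleaned <- gsub("//", "/", trimmed)

# Alternative patterns: "/+" matches a run of slashes of any length
compact <- gsub("/+", "/", gsub("^/+|/+$", "", raw))

# Compare the two approaches on these inputs
identical(cleaned, compact)
```

The original patterns only cover runs of up to two slashes, which is enough for three columns. With four or more columns, longer runs can appear, and the /+ form would likely be the safer choice.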
