Making a character string from the names of columns with zero values

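The last column (NextStep) is the column I want. Video, Webinar, Meeting and Conference are four types of activity that different customers (Name) can take part in. As you can see, for a given row the last column holds, as a comma-separated string, the names of all columns whose value is zero, and leaves out the names of columns with non-zero values. The names in that string generally appear in column order, with two exceptions: if Webinar is zero, it always comes first, and if Video is zero, it always comes last.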

    library(data.table)
    dt <- fread('
    Name     Video   Webinar Meeting Conference   NextStep
    John       1         0        0       0         Webinar,Meeting,Conference
    John       1         1        0       0         Meeting,Conference
    John       1         1        1       0         Conference
    Tom        0         0        1       0         Webinar,Conference,Video
    Tom        0         0        1       1         Webinar,Video
    Kyle       0         0        0       1         Webinar,Meeting,Video')

My question is how to create the NextStep column. Any help is much appreciated!

A possible solution:

DT[, nextstep := paste0(names(.SD)[.SD==0], collapse = ','), 1:nrow(DT), .SDcols = 2:5][]

which gives:

   Name Video Webinar Meeting Conference                   nextstep
1: John     1       0       0          0 Webinar,Meeting,Conference
2: John     1       1       0          0         Meeting,Conference
3: John     1       1       1          0                 Conference
4:  Tom     0       0       1          0   Video,Webinar,Conference
5:  Tom     0       0       1          1              Video,Webinar
6: Kyle     0       0       0          1      Video,Webinar,Meeting

If you want the names in the order specified in the comments (Webinar first, Video last), you can do:

lvls <- c('Webinar', 'Meeting', 'Conference', 'Video')
DT[, nextstep := paste0(lvls[lvls %in% names(.SD)[.SD==0]], collapse = ','), 
   1:nrow(DT), .SDcols = 2:5][]

which gives:

   Name Video Webinar Meeting Conference                   nextstep
1: John     1       0       0          0 Webinar,Meeting,Conference
2: John     1       1       0          0         Meeting,Conference
3: John     1       1       1          0                 Conference
4:  Tom     0       0       1          0   Webinar,Conference,Video
5:  Tom     0       0       1          1              Webinar,Video
6: Kyle     0       0       0          1      Webinar,Meeting,Video
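
The reordering in that call comes from indexing lvls by membership: lvls[lvls %in% x] keeps the elements of lvls that occur in x, in the order of lvls rather than the order of x. A minimal base R sketch of just that step, where zero_cols is a made-up vector standing in for names(.SD)[.SD==0] on the fourth row:

```r
# Desired output order: Webinar first, Video last
lvls <- c("Webinar", "Meeting", "Conference", "Video")

# Hypothetical input: zero-valued column names in their original column order
zero_cols <- c("Video", "Webinar", "Conference")

# Indexing lvls by membership returns the matches in the order of lvls
ordered_cols <- lvls[lvls %in% zero_cols]
next_step <- paste0(ordered_cols, collapse = ",")

print(next_step)
```

Because the result follows lvls, the column order of the table itself does not need to change.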

Instead of paste0 (with collapse = ','), you can also use toString.
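
The two are not quite interchangeable, since toString separates the elements with a comma followed by a space. A small base R comparison, using a made-up vector zero_cols for illustration:

```r
# Hypothetical vector of column names whose value is zero in one row
zero_cols <- c("Webinar", "Conference", "Video")

with_paste    <- paste0(zero_cols, collapse = ",")  # comma only
with_tostring <- toString(zero_cols)                # comma plus a space

print(with_paste)
print(with_tostring)
```

If the result has to match the NextStep column from the question exactly, the paste0 form is the one without the extra spaces.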


Used data:

DT <- fread('Name     Video   Webinar  Meeting  Conference
             John       1         0        0        0
             John       1         1        0        0
             John       1         1        1        0
             Tom        0         0        1        0
             Tom        0         0        1        1
             Kyle       0         0        0        1')

Here you go:

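# put the columns in the order the names should appear in the string
setcolorder(dt, c("Name", "Webinar", "Meeting", "Conference", "Video", "NextStep"))
# for each row, paste the names of the activity columns that are zero;
# .SDcols restricts the check to the four activity columns
dt[, NextStepNew := apply(.SD, 1, function(x) paste0(names(x)[x == 0], collapse = ",")),
   .SDcols = c("Webinar", "Meeting", "Conference", "Video")][]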
   Name Webinar Meeting Conference Video                   NextStep                NextStepNew
1: John       0       0          0     1 Webinar,Meeting,Conference Webinar,Meeting,Conference
2: John       1       0          0     1         Meeting,Conference         Meeting,Conference
3: John       1       1          0     1                 Conference                 Conference
4:  Tom       0       1          0     0   Webinar,Conference,Video   Webinar,Conference,Video
5:  Tom       0       1          1     0              Webinar,Video              Webinar,Video
6: Kyle       0       0          1     0      Webinar,Meeting,Video      Webinar,Meeting,Video

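If you are looking for a way to do this without simply reordering the columns into the order you want (I don't really see why you wouldn't, but anyway), you can try the following approach. It melts the data to long format and then updates by reference in a join: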

lvls <- c("Webinar", "Meeting", "Conference", "Video")  # make sure order is correct
dt[, row := .I]   # add a row-identifier
dtm <- melt(dt, id.vars = c("Name", "row"), measure.vars = lvls) # melt to long format
# summarise dtm by using factor, sorting it and converting to string; then join to dt
dt[dtm[value == 0, list(NextStep2 = toString(sort(factor(variable, levels = lvls)))), 
    by = row], NextStep2 := NextStep2, on = "row"][, row := NULL]

#    Name Video Webinar Meeting Conference                   NextStep                    NextStep2
# 1: John     1       0       0          0 Webinar,Meeting,Conference Webinar, Meeting, Conference
# 2: John     1       1       0          0         Meeting,Conference          Meeting, Conference
# 3: John     1       1       1          0                 Conference                   Conference
# 4:  Tom     0       0       1          0   Webinar,Conference,Video   Webinar, Conference, Video
# 5:  Tom     0       0       1          1              Webinar,Video               Webinar, Video
# 6: Kyle     0       0       0          1      Webinar,Meeting,Video      Webinar, Meeting, Video
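
The ordering step inside that join can be looked at on its own: turning the melted variable names into a factor with levels = lvls makes sort() follow the level order instead of alphabetical order. A minimal base R sketch, where zero_vars is a made-up vector standing in for the variable values of one row after filtering on value == 0:

```r
# Desired output order: Webinar first, Video last
lvls <- c("Webinar", "Meeting", "Conference", "Video")

# Hypothetical input: zero-valued variables of one row, in column order
zero_vars <- c("Video", "Webinar", "Conference")

# sort() on a factor orders by level, so the result follows lvls
sorted_vars <- sort(factor(zero_vars, levels = lvls))
next_step2 <- toString(sorted_vars)

print(next_step2)
```

toString is also the reason NextStep2 in the output above has a space after each comma.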

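If you want to paste all the column names for rows where there is no activity at all (all four values are zero), you can add the following line to your code: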

# use lvls rather than names(dt)[2:5] so Webinar stays first and Video last
dt[rowSums(dt[, mget(lvls)]) == 0, NextStep2 := toString(lvls)]