Making a character string with column names with zero values
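The last column (NextStep) is the column I want. Video, Webinar, Meeting and Conference are 4 types of activities that the different customers (Name) can engage in. In a given row, the last column (NextStep) holds the names of all columns whose value is zero, as a comma-separated character string; column names with non-zero values are excluded. The names in that string generally follow the column order, with two exceptions: Webinar always comes first if its value is zero, and Video always comes last if its value is zero.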
library(data.table)
dt <- fread('
Name Video Webinar Meeting Conference NextStep
John 1 0 0 0 Webinar,Meeting,Conference
John 1 1 0 0 Meeting,Conference
John 1 1 1 0 Conference
Tom 0 0 1 0 Webinar,Conference,Video
Tom 0 0 1 1 Webinar,Video
Kyle 0 0 0 1 Webinar,Meeting,Video
')
My question is how to create the NextStep column. Thanks a lot for your help!
A possible solution:
DT[, nextstep := paste0(names(.SD)[.SD==0], collapse = ','), 1:nrow(DT), .SDcols = 2:5][]
which gives:
Name Video Webinar Meeting Conference nextstep
1: John 1 0 0 0 Webinar,Meeting,Conference
2: John 1 1 0 0 Meeting,Conference
3: John 1 1 1 0 Conference
4: Tom 0 0 1 0 Video,Webinar,Conference
5: Tom 0 0 1 1 Video,Webinar
6: Kyle 0 0 0 1 Video,Webinar,Meeting
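To make the row-wise step easier to follow, here is a minimal base-R sketch of what `names(.SD)[.SD==0]` does for a single row. The named vector `row4` is a hypothetical stand-in for the fourth row (Tom) and is not part of the original answer.

```r
# Hypothetical stand-in for one row of the four activity columns (row 4, Tom)
row4 <- c(Video = 0L, Webinar = 0L, Meeting = 1L, Conference = 0L)

# Keep the names whose value is zero; the result follows the column order
zero_names <- names(row4)[row4 == 0]

# Collapse the names into one comma-separated string
res <- paste0(zero_names, collapse = ",")
print(res)
```

Because the names come out in column order, Video appears first here, which is why the output above does not yet match the requested ordering.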
If you want the names in the order specified in the comments, you could do:
lvls <- c('Webinar', 'Meeting', 'Conference', 'Video')
DT[, nextstep := paste0(lvls[lvls %in% names(.SD)[.SD==0]], collapse = ','),
1:nrow(DT), .SDcols = 2:5][]
which gives:
Name Video Webinar Meeting Conference nextstep
1: John 1 0 0 0 Webinar,Meeting,Conference
2: John 1 1 0 0 Meeting,Conference
3: John 1 1 1 0 Conference
4: Tom 0 0 1 0 Webinar,Conference,Video
5: Tom 0 0 1 1 Webinar,Video
6: Kyle 0 0 0 1 Webinar,Meeting,Video
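The reordering in the second version comes from indexing `lvls` by membership. A small base-R sketch of that step for one row, assuming `zero_names` is a made-up stand-in for `names(.SD)[.SD==0]` on row 4:

```r
# Desired output order (same vector as lvls in the answer above)
lvls <- c("Webinar", "Meeting", "Conference", "Video")

# Hypothetical stand-in for names(.SD)[.SD == 0] on row 4 (Tom),
# where the zero-valued columns come out in column order
zero_names <- c("Video", "Webinar", "Conference")

# Subsetting lvls by membership keeps the order of lvls, not of zero_names
ordered_names <- lvls[lvls %in% zero_names]
result <- paste0(ordered_names, collapse = ",")
print(result)
```

Since the subset is taken from `lvls`, Webinar always lands first and Video last whenever they are present, regardless of the column order in the table.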
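Instead of using paste0 (with collapse = ','), you could also use toString. Note that toString separates the elements with a comma followed by a space.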
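As a quick illustration of the difference between the two, a minimal sketch with a made-up vector `x`:

```r
# Made-up vector of column names, for illustration only
x <- c("Webinar", "Meeting", "Conference")

a <- paste0(x, collapse = ",")  # separator is a bare comma
b <- toString(x)                # separator is a comma plus a space

print(a)
print(b)
```

Which one to use depends on whether the result has to match an existing column such as NextStep character for character.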
Used data:
DT <- fread('Name Video Webinar Meeting Conference
John 1 0 0 0
John 1 1 0 0
John 1 1 1 0
Tom 0 0 1 0
Tom 0 0 1 1
Kyle 0 0 0 1')
Here you go:
setcolorder(dt, c("Name", "Webinar", "Meeting", "Conference", "Video", "NextStep"))
dt[, NextStepNew:=apply(dt, 1, function(x) paste0(names(x)[x==0], collapse=","))][]
Name Webinar Meeting Conference Video NextStep NextStepNew
1: John 0 0 0 1 Webinar,Meeting,Conference Webinar,Meeting,Conference
2: John 1 0 0 1 Meeting,Conference Meeting,Conference
3: John 1 1 0 1 Conference Conference
4: Tom 0 1 0 0 Webinar,Conference,Video Webinar,Conference,Video
5: Tom 0 1 1 0 Webinar,Video Webinar,Video
6: Kyle 0 0 1 0 Webinar,Meeting,Video Webinar,Meeting,Video
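If you are looking for a way to do this without simply reordering the columns in the order you want (I actually don't see why you wouldn't, but anyway...), you could try the following. It melts the data and updates by reference in a join: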
lvls <- c("Webinar", "Meeting", "Conference", "Video") # make sure order is correct
dt[, row := .I] # add a row-identifier
dtm <- melt(dt, id.vars = c("Name", "row"), measure.vars = lvls) # melt to long format
# summarise dtm by using factor, sorting it and converting to string; then join to dt
dt[dtm[value == 0, list(NextStep2 = toString(sort(factor(variable, levels = lvls)))),
by = row], NextStep2 := NextStep2, on = "row"][, row := NULL]
# Name Video Webinar Meeting Conference NextStep NextStep2
# 1: John 1 0 0 0 Webinar,Meeting,Conference Webinar, Meeting, Conference
# 2: John 1 1 0 0 Meeting,Conference Meeting, Conference
# 3: John 1 1 1 0 Conference Conference
# 4: Tom 0 0 1 0 Webinar,Conference,Video Webinar, Conference, Video
# 5: Tom 0 0 1 1 Webinar,Video Webinar, Video
# 6: Kyle 0 0 0 1 Webinar,Meeting,Video Webinar, Meeting, Video
If you want to paste all column names for rows that have no activity at all, you could add the following line to your code:
dt[rowSums(dt[, mget(lvls)]) == 0, NextStep2 := toString(names(dt)[2:5])]