Combining Boolean columns into one with priority in R
I went through the following links, but they only partially solved my problem.
merge multiple TRUE/FALSE columns into one
Combining a matrix of TRUE/FALSE into one
R: Converting multiple boolean columns to single factor column
I have a data frame that looks like this:
dat <- data.frame(Id = c(1,2,3,4,5,6,7,8),
A = c('Y','N','N','N','N','N','N','N'),
B = c('N','Y','N','N','N','N','Y','N'),
C = c('N','N','Y','N','N','Y','N','N'),
D = c('N','N','N','Y','N','Y','N','N'),
E = c('N','N','N','N','Y','N','Y','N')
)
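I want to reshape my df into a single column, but when a row contains 2 "Y" values it must apply a priority.
The priority is A > B > C > D > E, meaning that if there is a "Y" in A, the resulting value should be A. Similarly, in the example above, both C and D have "Y" for Id 6, but the resulting df should have "C".
So the output should look like this: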
resultant_dat <- data.frame(Id = c(1,2,3,4,5,6,7,8),
Result = c('A','B','C','D','E','C','B','NA')
)
I tried this:
library(reshape2)
new_df <- melt(dat, "Id", variable.name = "Result")
new_df <- new_df[new_df$value == "Y", c("Id", "Result")]
But the problem is that it does not handle the priority, and it creates 2 rows for the same Id.
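The melt attempt can be made to respect the priority by ranking the "Y" rows and keeping only the best one per Id. The following is a minimal sketch of that idea, not a definitive solution. The names `col_order`, `long`, `hits`, `best` and `result` are introduced here for illustration, and the priority vector is assumed from the stated order A > B > C > D > E.

```r
library(reshape2)

dat <- data.frame(Id = c(1, 2, 3, 4, 5, 6, 7, 8),
                  A = c('Y', 'N', 'N', 'N', 'N', 'N', 'N', 'N'),
                  B = c('N', 'Y', 'N', 'N', 'N', 'N', 'Y', 'N'),
                  C = c('N', 'N', 'Y', 'N', 'N', 'Y', 'N', 'N'),
                  D = c('N', 'N', 'N', 'Y', 'N', 'Y', 'N', 'N'),
                  E = c('N', 'N', 'N', 'N', 'Y', 'N', 'Y', 'N'))

# Assumed priority order, highest first (A > B > C > D > E).
col_order <- c("A", "B", "C", "D", "E")

# Long format: one row per (Id, column) pair.
long <- melt(dat, id.vars = "Id", measure.vars = col_order,
             variable.name = "Result", value.name = "value")
long$Result <- as.character(long$Result)

# Keep only the "Y" rows, then sort by Id and by priority rank.
hits <- long[long$value == "Y", c("Id", "Result")]
hits <- hits[order(hits$Id, match(hits$Result, col_order)), ]

# After sorting, the first row per Id is the highest-priority column.
best <- hits[!duplicated(hits$Id), ]

# Merge back so that Ids without any "Y" (here Id 8) are kept as NA.
result <- merge(dat["Id"], best, by = "Id", all.x = TRUE)
result
```

The final `merge()` is needed because filtering on "Y" drops Id 8 entirely; with `all.x = TRUE` it comes back with a missing `Result`.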
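# Columns in priority order, highest first (assumed from the question: A > B > C > D > E)
col_order = c("A", "B", "C", "D", "E")
# For each row, pick the first column (in priority order) that holds "Y"
tmp = data.frame(ID = dat[,1],
                 Result = col_order[apply(
                     X = dat[col_order],
                     MARGIN = 1,
                     FUN = function(x) which(x == "Y")[1])],
                 stringsAsFactors = FALSE)
# Rows without any "Y" yield NA; label them explicitly
tmp$Result[is.na(tmp$Result)] = "Not Present"
tmp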
# ID Result
#1 1 A
#2 2 B
#3 3 C
#4 4 D
#5 5 E
#6 6 C
#7 7 B
#8 8 Not Present
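For larger data frames, the row-wise `apply()` can be swapped for a vectorised lookup with base R's `max.col()`. This is an alternative sketch, not the method used above. `col_order`, `is_y`, `first_y` and `result` are illustrative names, and the priority order is again assumed to be A > B > C > D > E.

```r
dat <- data.frame(Id = c(1, 2, 3, 4, 5, 6, 7, 8),
                  A = c('Y', 'N', 'N', 'N', 'N', 'N', 'N', 'N'),
                  B = c('N', 'Y', 'N', 'N', 'N', 'N', 'Y', 'N'),
                  C = c('N', 'N', 'Y', 'N', 'N', 'Y', 'N', 'N'),
                  D = c('N', 'N', 'N', 'Y', 'N', 'Y', 'N', 'N'),
                  E = c('N', 'N', 'N', 'N', 'Y', 'N', 'Y', 'N'))

# Assumed priority order, highest first.
col_order <- c("A", "B", "C", "D", "E")

# Logical matrix: TRUE wherever a column holds "Y".
is_y <- dat[col_order] == "Y"

# max.col() gives the column index of the row maximum; with
# ties.method = "first" the leftmost TRUE wins, i.e. the highest priority.
first_y <- max.col(is_y, ties.method = "first")

# A row with no "Y" is an all-FALSE tie, so mask it out explicitly.
first_y[rowSums(is_y) == 0] <- NA

result <- data.frame(Id = dat$Id,
                     Result = col_order[first_y],
                     stringsAsFactors = FALSE)
result
```

The explicit `rowSums()` mask matters: without it, a row with no "Y" would be reported as "A", because every column ties at FALSE and the first one is returned.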