Conditions to match all columns with one column
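I have a data frame (df) in which I want to compare each column against the last column and give every cell a new value based on that comparison.
Here is my example data frame (df):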
> df
S1 S2 S3 S4 S5 main
Gene1 1 1 1 1 2 1
Gene2 1 2 1 1 1 1
Gene3 1 1 1 1 2 2
Gene4 2 1 1 1 1 1
Gene5 1 2 1 2 1 1
Gene6 1 1 1 1 1 2
Gene7 NA NA 2 1 1 1
Gene8 1 2 1 1 1 2
Gene9 2 1 1 2 1 1
I want to compare each of columns 1 to 5 against the last column using the conditions below. 'S' below refers to each of columns 1 to 5.
If S = 2 and main = 2, then value is True Positive (TP)
If S = 2 and main = 1, then value is False Positive (FP)
If S = 1 and main = 2, then value is False Negative (FN)
If S = 1 and main = 1, then value is True Negative (TN)
And NAs to remain as NAs.
So my new data frame (df_updated) should look like this.
> df_updated
S1 S2 S3 S4 S5
Gene1 TN TN TN TN FP
Gene2 TN FP TN TN TN
Gene3 FN FN FN FN TP
Gene4 FP TN TN TN TN
Gene5 TN FP TN FP TN
Gene6 FN FN FN FN FN
Gene7 NA NA FP TN TN
Gene8 FN TP FN FN FN
Gene9 FP TN TN FP TN
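I know about the match function, but I'm not sure how to loop over the columns and apply the specific rules above to each one.
Any help is appreciated,
Thanks.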
You can use dplyr's case_when:
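library(dplyr)

# Compare every column with `main`; rows with NA match no branch and stay NA
mutate_all(df, ~case_when(
  .x < main ~ "FN",                       # S = 1, main = 2
  .x > main ~ "FP",                       # S = 2, main = 1
  near(.x, 1) & near(.x, main) ~ "TN",    # both are 1
  near(.x, 2) & near(.x, main) ~ "TP"     # both are 2
)) %>%
  select(-main)                           # drop the reference column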
#> S1 S2 S3 S4 S5
#> 1 TN TN TN TN FP
#> 2 TN FP TN TN TN
#> 3 FN FN FN FN TP
#> 4 FP TN TN TN TN
#> 5 TN FP TN FP TN
#> 6 FN FN FN FN FN
#> 7 <NA> <NA> FP TN TN
#> 8 FN TP FN FN FN
#> 9 FP TN TN FP TN
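As a hedged alternative sketch, the same rules can be spelled out literally with across(), which supersedes mutate_all() in dplyr 1.0.0 and later. The data.frame() call that rebuilds df from the question's printout is my own reconstruction, and the sketch assumes the S columns only ever contain 1, 2, or NA.

```r
library(dplyr)

# Reconstruction of the question's example data (assumed numeric columns)
df <- data.frame(
  S1 = c(1, 1, 1, 2, 1, 1, NA, 1, 2),
  S2 = c(1, 2, 1, 1, 2, 1, NA, 2, 1),
  S3 = c(1, 1, 1, 1, 1, 1, 2, 1, 1),
  S4 = c(1, 1, 1, 1, 2, 1, 1, 1, 2),
  S5 = c(2, 1, 2, 1, 1, 1, 1, 1, 1),
  main = c(1, 1, 2, 1, 1, 2, 1, 2, 1),
  row.names = paste0("Gene", 1:9)
)

# Apply the four rules to every column except `main`.
# A row where .x is NA satisfies no condition, so case_when() returns NA.
df_updated <- df %>%
  mutate(across(-main, ~ case_when(
    .x == 2 & main == 2 ~ "TP",
    .x == 2 & main == 1 ~ "FP",
    .x == 1 & main == 2 ~ "FN",
    .x == 1 & main == 1 ~ "TN"
  ))) %>%
  select(-main)

df_updated
```

Writing each condition as an explicit pair keeps the code in one-to-one correspondence with the rules in the question, at the cost of being slightly longer than the `<` / `>` shortcut.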
With base R, you can also write a function with nested ifelse calls and apply it to each column to get the values.
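# Classify one column `x` against the reference column `main`
get_value <- function(x, main) {
  ifelse(main == 2 & x == 2, "TP",
  ifelse(main == 1 & x == 2, "FP",
  ifelse(main == 2 & x == 1, "FN",
  ifelse(main == 1 & x == 1, "TN", NA))))
}

# Drop the last column (main), then overwrite every remaining column in place
df1 <- df[-ncol(df)]
df1[] <- lapply(df1, get_value, df$main)
df1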
# S1 S2 S3 S4 S5
#Gene1 TN TN TN TN FP
#Gene2 TN FP TN TN TN
#Gene3 FN FN FN FN TP
#Gene4 FP TN TN TN TN
#Gene5 TN FP TN FP TN
#Gene6 FN FN FN FN FN
#Gene7 <NA> <NA> FP TN TN
#Gene8 FN TP FN FN FN
#Gene9 FP TN TN FP TN
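Another base R option, offered only as a rough sketch, is to skip the per-column function and look every cell up in a 2 x 2 table of labels using matrix indexing. The names `labels`, `s`, and `idx` are made up for this illustration, the data.frame() call is my reconstruction of the question's printout, and the approach assumes the values are strictly 1, 2, or NA (anything else would index outside the table).

```r
# Reconstruction of the question's example data (assumed numeric columns)
df <- data.frame(
  S1 = c(1, 1, 1, 2, 1, 1, NA, 1, 2),
  S2 = c(1, 2, 1, 1, 2, 1, NA, 2, 1),
  S3 = c(1, 1, 1, 1, 1, 1, 2, 1, 1),
  S4 = c(1, 1, 1, 1, 2, 1, 1, 1, 2),
  S5 = c(2, 1, 2, 1, 1, 1, 1, 1, 1),
  main = c(1, 1, 2, 1, 1, 2, 1, 2, 1),
  row.names = paste0("Gene", 1:9)
)

# All S columns as one numeric matrix (row and column names are kept)
s <- as.matrix(df[setdiff(names(df), "main")])

# Lookup table: row = S value (1 or 2), column = main value (1 or 2)
labels <- matrix(c("TN", "FP", "FN", "TP"), nrow = 2,
                 dimnames = list(S = c("1", "2"), main = c("1", "2")))

# One (S, main) index pair per cell, in column-major order.
# An index row containing NA yields NA in the result.
idx <- cbind(as.vector(s), rep(df$main, times = ncol(s)))

df_updated <- as.data.frame(
  matrix(labels[idx], nrow = nrow(s), dimnames = dimnames(s)),
  stringsAsFactors = FALSE
)
df_updated
```

This does the whole classification in a single vectorized lookup, which may be convenient when there are many columns, though the nested ifelse version is arguably easier to read.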