How to use Coalesce function on a dataframe
I am trying to find a way to handle NA values in my data but have not been able to. Please help.
I have the following data:
R1 <- c("15515","5156","65656","1566", "2857","8888","65656","1566","65651")
R2 <- c("515","5156.11-","415-","1455-","886","888","777","666","4457")
RC1 <- c("AW","FG","ZA","ZI","","CW","","","")
RC2 <- c("SSSBB","","","ZXXQA","","CQAER","","KKHDY","TTQWW")
RC3 <- c("KKAJDJHW","XVVJAKWA","","","","CDDGAJJA","GGGAJTTD","","BBNMNJJI")
df <- data.frame(R1,R2,RC1,RC2,RC3)
I am using the code below to handle the NAs in RC1, RC2 and RC3:
df_1$RCC <- with(df_1, coalesce(df_1$RC3,df_1$RC2,df_1$RC1))
df_1
I am not able to get the values from RC1 and RC2. I need your help.
For coalesce to work, you need NA values rather than blanks. Replace the empty strings with NA and try:
library(dplyr)
df[df == ''] <- NA
df %>% mutate(RCC = coalesce(RC3, RC2, RC1))
#     R1       R2  RC1   RC2      RC3      RCC
#1 15515      515   AW SSSBB KKAJDJHW KKAJDJHW
#2  5156 5156.11-   FG  <NA> XVVJAKWA XVVJAKWA
#3 65656     415-   ZA  <NA>     <NA>       ZA
#4  1566    1455-   ZI ZXXQA     <NA>    ZXXQA
#5  2857      886 <NA>  <NA>     <NA>     <NA>
#6  8888      888   CW CQAER CDDGAJJA CDDGAJJA
#7 65656      777 <NA>  <NA> GGGAJTTD GGGAJTTD
#8  1566      666 <NA> KKHDY     <NA>    KKHDY
#9 65651     4457 <NA> TTQWW BBNMNJJI BBNMNJJI
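To see why the blanks matter, here is a minimal sketch on two made-up vectors (x and y are illustrative names, not part of the question's data). coalesce() only skips NA, so an empty string counts as a real value and is kept; once na_if() turns it into NA, the fallback from the next vector is used.

```r
library(dplyr)

# Hypothetical vectors, for illustration only
x <- c("a", "", NA)
y <- c("p", "q", "r")

# "" is a non-missing value, so coalesce() keeps it and only fills the NA
with_blanks <- coalesce(x, y)

# After na_if() converts "" to NA, the value from y is used there as well
with_na <- coalesce(na_if(x, ""), y)

print(with_blanks)
print(with_na)
```

In the question's data every "missing" entry is an empty string, which is why the original call always returned RC3 unchanged.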
We can also do this without typing out all the column names:
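library(dplyr) # >= 1.0.0
df %>%
  # turn empty strings into NA in every column
  mutate(across(everything(), ~ na_if(.x, ""))) %>%
  # splice the columns, in RC3, RC2, RC1 order, into coalesce()
  mutate(RCC = coalesce(!!!select(., RC3:RC1)))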
#     R1       R2  RC1   RC2      RC3      RCC
#1 15515      515   AW SSSBB KKAJDJHW KKAJDJHW
#2  5156 5156.11-   FG  <NA> XVVJAKWA XVVJAKWA
#3 65656     415-   ZA  <NA>     <NA>       ZA
#4  1566    1455-   ZI ZXXQA     <NA>    ZXXQA
#5  2857      886 <NA>  <NA>     <NA>     <NA>
#6  8888      888   CW CQAER CDDGAJJA CDDGAJJA
#7 65656      777 <NA>  <NA> GGGAJTTD GGGAJTTD
#8  1566      666 <NA> KKHDY     <NA>    KKHDY
#9 65651     4457 <NA> TTQWW BBNMNJJI BBNMNJJI
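If the !!! splice feels opaque, roughly the same result can be sketched with base R's do.call(), which passes each column to coalesce() as a separate argument. This is only one possible variant, not the method from the answers above: the rc_cols name is introduced here for illustration, and only the three RC columns from the question are rebuilt to keep the example short.

```r
library(dplyr)

# The three RC columns from the question (R1 and R2 omitted for brevity)
df <- data.frame(
  RC1 = c("AW", "FG", "ZA", "ZI", "", "CW", "", "", ""),
  RC2 = c("SSSBB", "", "", "ZXXQA", "", "CQAER", "", "KKHDY", "TTQWW"),
  RC3 = c("KKAJDJHW", "XVVJAKWA", "", "", "", "CDDGAJJA", "GGGAJTTD", "", "BBNMNJJI"),
  stringsAsFactors = FALSE
)

# Priority order: the first non-NA value wins, so RC3 comes first
rc_cols <- c("RC3", "RC2", "RC1")

# Convert empty strings to NA, but only in the RC columns
df[rc_cols] <- lapply(df[rc_cols], function(col) na_if(col, ""))

# do.call() hands each column to coalesce() as its own argument
df$RCC <- do.call(coalesce, unname(as.list(df[rc_cols])))

print(df)
```

Restricting the blank-to-NA conversion to the RC columns leaves any other columns untouched, which may be preferable when an empty string is meaningful elsewhere in the data.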