R No Mode and Exclude NA
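I'm looking for a mode function in R that I can use with dplyr. The two posts I've seen treat ties very differently. This post (Ken Williams) treats ties by selecting the first-appearing value of the set of modes. The other post handles ties by recording both values in the same cell.
I'm looking for a mode function that treats ties as NA and excludes missing values. I have code that treats ties as NA, but I can't seem to exclude the missing values.
The variable DF$Color is of type character.
Here is an example DF: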
library(dplyr)  # needed for arrange(), group_by(), summarise() and %>%

Category <- c("A","B","B","C","A","A","A","B","C","B","C","C","D","D")
Color <- c("Red","Blue","Yellow","Blue","Green","Blue","Green","Yellow","Blue","Red","Red","Red","Yellow",NA)
DF <- data.frame(Category, Color)
DF <- arrange(DF, Category)
DF
DF$Color <- as.character(DF$Color)
With NA included, the code looks like this:
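mode <- function(x) {
  ux <- unique(x)               # distinct values (NA counts as one of them)
  tx <- tabulate(match(x, ux))  # how often each distinct value occurs
  if (length(unique(tx)) == 1) {
    return(NA)                  # every value occurs equally often: no mode
  }
  max_tx <- tx == max(tx)
  return(ux[max_tx])
}

DF %>%
  group_by(Category) %>%
  summarise(Mode = mode(Color))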
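To see what the missing value does here, the function can be called directly on a few vectors. This is only a sketch: mode_with_na is a hypothetical name (chosen so that base::mode() isn't masked), and its body is otherwise a copy of the function above.

```r
# Hypothetical copy of the function above under a different name.
mode_with_na <- function(x) {
  ux <- unique(x)
  tx <- tabulate(match(x, ux))
  if (length(unique(tx)) == 1) {
    return(NA)
  }
  ux[tx == max(tx)]
}

# Group D: NA is counted as a value of its own, so it ties with "Yellow".
d_result <- mode_with_na(c("Yellow", NA))

# Group A: "Green" occurs twice, the other colors once.
a_result <- mode_with_na(c("Red", "Green", "Blue", "Green"))

# A single value also yields NA, because all counts are equal.
single_result <- mode_with_na("Yellow")

print(d_result)
print(a_result)
print(single_result)
```

So for group D the NA result comes from the tie check, not from any special handling of missing values.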
I'm trying to work out code that excludes NA. The resulting df should look like:
  Category Mode
  <fct>    <fct>
1 A        Green
2 B        Yellow
3 C        NA
4 D        Yellow
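The following change to the function ensures that the correct type of NA is returned for the input, and that it works for vectors of length 1.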
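mode <- function(x) {
  ux <- unique(na.omit(x))      # distinct non-missing values
  tx <- tabulate(match(x, ux))  # NA in x matches nothing and is not counted
  # More than one distinct value and more than one with the top count: a tie
  if (length(ux) != 1 && sum(max(tx) == tx) > 1) {
    if (is.character(ux)) return(NA_character_) else return(NA_real_)
  }
  max_tx <- tx == max(tx)
  return(ux[max_tx])
}

DF %>%
  group_by(Category) %>%
  summarise(Mode = mode(Color))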
# A tibble: 4 x 2
  Category Mode
  <fct>    <chr>
1 A        Green
2 B        Yellow
3 C        NA
4 D        Yellow
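One case the example data doesn't cover is a group whose values are all NA. Below is a rough sketch of a variant that also returns a typed NA there; mode_or_na is a hypothetical name, and indexing with NA_integer_ to get an NA of the input's type is my own choice rather than something taken from the function above.

```r
# Hypothetical variant: drops NA first, returns a typed NA for ties
# and for input that is entirely missing.
mode_or_na <- function(x) {
  x <- x[!is.na(x)]                  # exclude missing values
  typed_na <- x[NA_integer_]         # length-1 NA of the same type as x
  if (length(x) == 0) return(typed_na)
  ux <- unique(x)
  tx <- tabulate(match(x, ux))       # count of each distinct value
  top <- ux[tx == max(tx)]           # value(s) with the highest count
  if (length(top) > 1) return(typed_na)  # tie
  top
}

print(mode_or_na(c("Yellow", NA)))                 # group D
print(mode_or_na(c("Blue", "Blue", "Red", "Red"))) # group C, a tie
print(mode_or_na(c(NA_character_, NA_character_))) # all missing
print(mode_or_na(c(1, 1, 2, NA)))                  # numeric input
```

It should drop into the same summarise() call in place of mode(), though I have only checked it on plain vectors like the ones above.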