How to plot a data frame with multiple frequencies for a factor?

I have this data frame:

df <- data.frame(make = c("dodge", "dodge", "toyota", "ford", "dodge", "toyota","toyota","ford",  "ford", "dodge"),
                  grn = c(    1,      1,        NA,      1,     NA,      NA,       1,         1,      NA,      NA),
                  blu = c(    NA,     NA,       1,       NA,    1,       NA,       NA,        NA,     1,       NA),
                  blk = c(    NA,     NA,       NA,      NA,    NA,      1,        NA,        NA,     NA,       1))   

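I am trying to create a plot with "make" on the x axis and the total count for each "make" on the y axis, with the bars filled by color. I think I need a table of counts by make and color, but I'm not sure how to build it. For example, the table would look like this: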

 DF <- read.table(text = "make  grn  blu  blk
                          dodge  2    1    1
                          ford   2    1    0
                          toyota 1    1    1", header = TRUE)

Then the solution would be simple:

library(reshape2)
library(ggplot2)

DF1 <- melt(DF, id.vars = "make")

ggplot(DF1, aes(x = make, y = value, fill = variable)) +
  geom_bar(stat = "identity")

So how do I convert the data frame "df" into "DF"?

You can use dplyr::summarise_all:

library(dplyr)
df %>% group_by(make) %>% summarise_all(sum, na.rm=TRUE)

# A tibble: 3 × 4
#    make   grn   blu   blk
#  <fctr> <dbl> <dbl> <dbl>
#1  dodge     2     1     1
#2   ford     2     1     0
#3 toyota     1     1     1
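
As a side note, in dplyr 1.0 and later `summarise_all()` is marked as superseded in favor of `across()`. Here is a minimal sketch of the same aggregation in that style, assuming a recent dplyr; the name `totals` is only illustrative and not part of the question:

```r
library(dplyr)

# Same data frame as in the question
df <- data.frame(
  make = c("dodge", "dodge", "toyota", "ford", "dodge",
           "toyota", "toyota", "ford", "ford", "dodge"),
  grn = c(1, 1, NA, 1, NA, NA, 1, 1, NA, NA),
  blu = c(NA, NA, 1, NA, 1, NA, NA, NA, 1, NA),
  blk = c(NA, NA, NA, NA, NA, 1, NA, NA, NA, 1)
)

# Sum every non-grouping column within each make, ignoring NAs
totals <- df %>%
  group_by(make) %>%
  summarise(across(everything(), ~ sum(.x, na.rm = TRUE)))

totals
```

The result should have the same shape as the DF table in the question, so it can be passed to melt() as before.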

Well, you don't have to do all of these transformations. A shorter way to make the plot:

library(dplyr)
library(tidyr)    # provides gather()
library(ggplot2)

df %>%
  gather(key = col, value = num, -make) %>%
  na.omit() %>%
  ggplot(aes(make, fill = col)) +
    geom_bar()

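The pipeline up to na.omit() reshapes the input data into long format, with one row per make and color and the NA rows dropped. That is then passed to ggplot, where geom_bar() does the statistical transformation (counting the rows) for you.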
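
For what it's worth, `gather()` still works, but tidyr now recommends `pivot_longer()` for new code. Below is a sketch of the same idea in that style, assuming tidyr 1.0 or later; `long`, `counts`, and `p` are illustrative names. The `count()` call is only there to show the numbers that `geom_bar()` computes on its own:

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same data frame as in the question
df <- data.frame(
  make = c("dodge", "dodge", "toyota", "ford", "dodge",
           "toyota", "toyota", "ford", "ford", "dodge"),
  grn = c(1, 1, NA, 1, NA, NA, 1, 1, NA, NA),
  blu = c(NA, NA, 1, NA, 1, NA, NA, NA, 1, NA),
  blk = c(NA, NA, NA, NA, NA, 1, NA, NA, NA, 1)
)

# Long format: one row per (make, color) observation, NA cells dropped
long <- df %>%
  pivot_longer(-make, names_to = "col", values_to = "num",
               values_drop_na = TRUE)

# Rows per make/color pair, i.e. the bar segment heights
counts <- long %>% count(make, col)
counts

# geom_bar() performs the same tally internally (stat = "count")
p <- ggplot(long, aes(make, fill = col)) +
  geom_bar()
p
```

Since every row of `df` has exactly one color set, `long` should end up with one row per car, and the stacked bar heights are simply the number of cars of each make.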