Change the color of 20% of bars in geom_bar ggplot

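I am trying to change the color of 9 states in the plot below. These states are the largest mining states, and I would like them to stand out in the figure. Modifying my data frame is probably the simplest step, but are there any other ideas?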

ggplot(data = media_impact_by_state) +
  #geom_hline(yintercept=0,linetype="dashed", color = "red") +
  geom_bar(aes(x= reorder(GeoName,trustclimsciSSTOppose - mean(trustclimsciSSTOppose)), 
               y= CO2limitsOppose-mean(CO2limitsOppose), fill = "fill1"),
           stat = 'identity') +
  geom_point(aes(x = GeoName,  
                 y = trustclimsciSSTOppose - mean(trustclimsciSSTOppose),
                color = "dot1"),
                 size=3) +
  scale_color_manual(values = c("black"),
                     label = "Distrust of Scientists",
                     name = "Mean Deviation") +
  scale_fill_manual(values = c(fill1 = "darkorange1",fill2 = "blue"),
                    labels = c(fill1 = "Oppose Limits to Co2 Emissions",fill2 = "poop"),
                    name = "Mean Deviation") +
  labs(x = "State",
       y = "(%)",
       title = "Distrust of Scientists") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1,size=12),
        axis.text.y = element_text(size=14),
        axis.title.y = element_text(size=16),
        axis.title.x = element_text(size=16),
        plot.title = element_text(size=16,hjust=0.5))

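Without seeing a subset of your data, it is hard to give precise guidance. As a suggestion, try using ifelse() to recode the appropriate column (i.e., variable) before supplying it to the fill aesthetic, and make sure it sits inside the aes() call. Your legend titled "Mean Deviation" should then split into two categories. After that, adjust the colors inside scale_fill_manual() as needed.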

ggplot(data = media_impact_by_state) +
  geom_bar(aes(x = reorder(GeoName, trustclimsciSSTOppose - mean(trustclimsciSSTOppose)), 
               y = CO2limitsOppose - mean(CO2limitsOppose), 
               fill = factor(ifelse(GeoName %in% c(...), "Top 20", "Bottom 80"))),  # index the states
           stat = 'identity') +
  geom_point(aes(x = GeoName,  
                 y = trustclimsciSSTOppose - mean(trustclimsciSSTOppose),
                 color = "dot1"),
             size = 3) +
  scale_color_manual(name = "Mean Deviation",
                     values = c("black"),
                     labels = "Distrust of Scientists") +
  scale_fill_manual(name = "Mean Deviation", 
                    values = c("Top 20" = "darkorange1",  # named, so each color matches its factor level
                               "Bottom 80" = "blue"),
                    labels = c("Top 20" = "Oppose (Top 20)",  # named, so each label matches its factor level
                               "Bottom 80" = "Oppose (Bottom 80)")) +
  labs(x = "State",
       y = "(%)",
       title = "Distrust of Scientists") +
  theme(
    axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1, size = 12),
    axis.text.y = element_text(size = 14),
    axis.title.y = element_text(size = 16),
    axis.title.x = element_text(size = 16),
    plot.title = element_text(size = 16, hjust = 0.5)
    )
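
Since I can't see your data, here is a minimal, self-contained sketch of the same idea on a toy data frame. The values, the `highlight` vector, and the `group` column are all made up for illustration; only the technique (flag with ifelse(), then map named colors) carries over.

```r
library(ggplot2)

# Toy stand-in for media_impact_by_state; the numbers are invented.
toy <- data.frame(
  GeoName = c("Alaska", "Nevada", "Ohio", "Texas", "Utah"),
  CO2limitsOppose = c(42, 35, 31, 44, 39)
)

# Hypothetical list of states to highlight.
highlight <- c("Nevada", "Utah")

# Flag each row; setting the levels explicitly fixes the legend order.
toy$group <- factor(
  ifelse(toy$GeoName %in% highlight, "Highlighted", "Other"),
  levels = c("Highlighted", "Other")
)

p <- ggplot(toy) +
  geom_col(aes(x = reorder(GeoName, CO2limitsOppose),
               y = CO2limitsOppose - mean(CO2limitsOppose),
               fill = group)) +
  # Named vector: each color is matched to a level by name, not by position.
  scale_fill_manual(name = "Mean Deviation",
                    values = c(Highlighted = "blue", Other = "darkorange1"))

p
```

geom_col() is shorthand for geom_bar(stat = "identity"). Naming the entries of `values` avoids surprises from factor levels being sorted alphabetically.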

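However, if you want to flag the top 20% of states by some other measure of mining output, consider modifying the existing data frame with one of R's standard data-manipulation functions. I'm not sure what criterion you use to decide which states count as the "top" mining states, but that is up to you. For example, try creating a variable ahead of time, call it fill_col, and pass it to fill inside the aes() call. Here is one way to pre-process the data: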

library(dplyr)

media_impact_by_state <- media_impact_by_state %>% 
  arrange(desc(mining_output)) %>%  # order in descending order by mining output
  mutate(fill_col = mining_output > quantile(mining_output, .8))  # flag the top 20 percent
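
To show how the flag might feed into the plot, here is a rough sketch on invented data. The `mining_output` column is hypothetical (I don't know what your data frame actually contains), and the numbers are made up. Because fill_col is logical, the colors are keyed by "TRUE" and "FALSE".

```r
library(dplyr)
library(ggplot2)

# Toy data; mining_output is a hypothetical column with invented values.
toy <- data.frame(
  GeoName = LETTERS[1:10],
  mining_output = c(5, 80, 12, 95, 30, 7, 60, 22, 3, 41),
  CO2limitsOppose = c(30, 45, 33, 48, 36, 29, 41, 35, 28, 38)
)

# TRUE for rows strictly above the 80th percentile of mining_output.
toy <- toy %>%
  mutate(fill_col = mining_output > quantile(mining_output, 0.8))

p <- ggplot(toy) +
  geom_col(aes(x = reorder(GeoName, CO2limitsOppose),
               y = CO2limitsOppose - mean(CO2limitsOppose),
               fill = fill_col)) +
  # Logical values are matched by their string names "TRUE" / "FALSE".
  scale_fill_manual(name = "Mean Deviation",
                    values = c("TRUE" = "blue", "FALSE" = "darkorange1"),
                    labels = c("TRUE" = "Top 20% by mining output",
                               "FALSE" = "Other states"))

p
```

Note that the strict `>` comparison can flag slightly fewer than 20% of the rows when values tie at the cutoff.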

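Finally, there is nothing wrong with manually typing in all the states you want to highlight, but that gets tedious with more than 50 entries (if you include the District of Columbia).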
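
If you would rather not type the names by hand, one possible approach is to derive the vector from the data. This sketch assumes a hypothetical `mining_output` column and fills it with random numbers purely for illustration; `state.name` is the built-in vector of the 50 state names that ships with R.

```r
library(dplyr)

set.seed(1)  # only so the invented numbers are reproducible

# Toy data: real state names, made-up mining_output values.
toy <- data.frame(
  GeoName = state.name,
  mining_output = runif(length(state.name))
)

# Names of the 9 states with the largest mining_output.
top_states <- toy %>%
  slice_max(mining_output, n = 9, with_ties = FALSE) %>%
  pull(GeoName)

# Flag column that can be passed to fill inside aes().
toy <- toy %>%
  mutate(fill_col = ifelse(GeoName %in% top_states, "Top 9", "Other"))
```

Here `with_ties = FALSE` keeps exactly 9 rows even if two states share the cutoff value.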

Hope this helps!