R ggplot2 geom_smooth without adding negative values

Suppose we run install.packages("ggplot2") and install.packages("babynames") (the code below also uses dplyr for the pipe and the data-manipulation verbs), and then

library(babynames)
library(dplyr)    # provides %>%, filter(), group_by(), summarise(), arrange()
library(ggplot2)

data(babynames)

# Yearly counts for four name/sex combinations
my_d <- babynames %>%
  filter(
    name == "Josiah"   & sex == "M" |
      name == "Alicia"  & sex == "F" |
      name == "Gabriel"  & sex == "M" |
      name == "Joshua" & sex == "M"
  ) %>%
  group_by(name, year, sex) %>%
  summarise(n = sum(n)) %>%
  arrange(year)

# geom_line() has no se argument, so it is not passed here
ggplot(my_d, aes(x = year, y = n, color = name)) +
  geom_line() +
  scale_x_continuous(breaks = seq(1900, 2020, by = 10))

This plots the raw yearly counts, which is fine, but I want to smooth it "a little", so I did

ggplot(my_d, aes(x = year, y = n, color = name) ) +
  geom_smooth(se = FALSE) +
  scale_x_continuous(breaks = seq(1900, 2020, by = 10) )

The result is smooth, but it adds negative values for "Joshua".

How can I avoid this "side effect"?

Edit: changing

geom_smooth(se = FALSE)

to

geom_smooth(se = FALSE, method = "loess") +
  ylim(0, 30000)

removed the negative values, but the smoothing is still too coarse, so to speak. It shows "Gabriel" with an increasing trend, which is not the case.

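According to the documentation, the span argument controls the amount of smoothing: smaller values give wigglier lines, larger values give smoother ones. Playing around with it may solve your problem. Below is an example with span = .1. The rough edges are gone from the lines, but the overall trends remain visible and true to the data. Setting span too low may cause memory issues, depending on the size of the data.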

library(babynames)
library(tidyverse)  # loads dplyr (for %>% and the verbs) and ggplot2
data(babynames)

# Yearly counts for four name/sex combinations
my_d <- babynames %>%
  filter(
    name == "Josiah"   & sex == "M" |
      name == "Alicia"  & sex == "F" |
      name == "Gabriel"  & sex == "M" |
      name == "Joshua" & sex == "M"
  ) %>%
  group_by(name, year, sex) %>%
  summarise(n = sum(n)) %>%
  arrange(year)

# A small span makes each local fit use only a narrow window of years
ggplot(my_d, aes(x = year, y = n, color = name)) +
  geom_smooth(se = FALSE, method = "loess", span = .1) +
  scale_x_continuous(breaks = seq(1900, 2020, by = 10))
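
To see the effect of span without plotting anything, here is a minimal base R sketch. It calls stats::loess() directly, which is the smoother geom_smooth(method = "loess") relies on, and compares span = 0.75 (the geom_smooth default) with span = 0.1. The toy data frame and its bell-shaped counts are made up for illustration and are not the babynames data, so treat the printed numbers as indicative only.

```r
# Synthetic yearly counts (an assumption, not the babynames data):
# close to zero for decades, then a sharp peak around 1990.
year <- 1900:2017
n <- 25000 * exp(-((year - 1990) / 12)^2)
toy <- data.frame(year = year, n = n)

# Fit the same kind of smoother with a wide and a narrow window.
fit_wide   <- loess(n ~ year, data = toy, span = 0.75)  # geom_smooth's default span
fit_narrow <- loess(n ~ year, data = toy, span = 0.1)   # span used in the answer

# predict() without newdata returns the fitted values at the observed years.
smooth_wide   <- predict(fit_wide)
smooth_narrow <- predict(fit_narrow)

# Mean absolute deviation from the data:
# lower means the curve tracks the counts more closely.
err_wide   <- mean(abs(toy$n - smooth_wide))
err_narrow <- mean(abs(toy$n - smooth_narrow))

# Lowest point of each curve; a value below zero here would be
# the "side effect" described in the question.
print(c(wide = min(smooth_wide), narrow = min(smooth_narrow)))
print(c(wide = err_wide, narrow = err_narrow))
```

A wide window has to bend one local curve through both the long flat stretch and the steep rise, which is where undershoot below zero tends to come from. A narrow window only sees nearby years, so it follows the counts more closely.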

Created on 2020-02-21 by the reprex package (v0.3.0)
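
If some negative values remain after tuning span, one blunt workaround is to compute the smooth yourself and clamp it at zero before plotting. The sketch below is only an illustration under assumptions: the toy data is synthetic, and the smooth_clamped column name is made up here. It is a cosmetic fix, since it flattens the curve at zero rather than improving the fit.

```r
# Synthetic yearly counts (an assumption, not the babynames data).
year <- 1900:2017
n <- 25000 * exp(-((year - 1990) / 12)^2)
toy <- data.frame(year = year, n = n)

# Smooth with the default-sized window, then clamp the fitted values at zero.
fit <- loess(n ~ year, data = toy, span = 0.75)
toy$smooth <- predict(fit)
toy$smooth_clamped <- pmax(toy$smooth, 0)  # hypothetical column name

# Compare the value ranges before and after clamping.
print(range(toy$smooth))
print(range(toy$smooth_clamped))
```

The clamped column could then be drawn with geom_line(aes(y = smooth_clamped)) instead of geom_smooth(), with the same calculation done per name on the real data.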