Why are stats::loess and geom_smooth(method = "loess") different?
The geom_smooth() line (red) in ggplot2 looks "smoother" than the stats::loess values that I plot with geom_line() (blue).
Why? And how can I make the geom_line() line look like the one geom_smooth() produces?
Reprex:
library(dplyr)
library(ggplot2)

# Data
data <- structure(list(date_int = c(0.834136630343671, 0.848910310142498,
0.851948868398994, 0.857082984073764, 0.866093880972339, 0.86955155071249,
0.874895222129086, 0.925660100586756, 0.937709555741827, 0.957355406538139,
0.977525146689019, 0.996070829840738, 0.998428331936295, 0.998428331936295,
0.998480720871752, 0.998795054484493, 0.999161777032691, 0.999528499580889,
0.999895222129086, 1, 1), value = c(51.78, 46.2, 44.01, 41.1,
39.1, 38.19, 42.87, 42.47, 37.22, 41.6, 44.7, 39.7, 23, 28.7,
23, 30.9, 35.4, 35.8, 32.4, 31, 31)), row.names = c(NA, -21L), class = c("tbl_df",
"tbl", "data.frame"))
# Add the loess fitted values manually (one value per observed row)
data <- data %>%
  mutate(pred_loess = stats::loess(value ~ date_int, method = "loess")$fitted)

# Plot red and blue
ggplot(data,
       aes(x = date_int,
           y = value)) +
  geom_point() +
  geom_smooth(colour = "red", size = 1, se = FALSE) +
  # geom_line() has no `se` argument, so it is not passed here
  geom_line(aes(y = pred_loess), colour = "blue", size = 1) +
  labs(title = "RED (geom_smooth) is smoother\nthan BLUE (geom_line)")
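To draw the loess line manually, create a new data frame with regularly spaced x values and use the predict() function to find the corresponding values of the y variable.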
library(dplyr)
library(ggplot2)
# Data
data <- structure(list(date_int = c(0.834136630343671, 0.848910310142498,
0.851948868398994, 0.857082984073764, 0.866093880972339, 0.86955155071249,
0.874895222129086, 0.925660100586756, 0.937709555741827, 0.957355406538139,
0.977525146689019, 0.996070829840738, 0.998428331936295, 0.998428331936295,
0.998480720871752, 0.998795054484493, 0.999161777032691, 0.999528499580889,
0.999895222129086, 1, 1), value = c(51.78, 46.2, 44.01, 41.1,
39.1, 38.19, 42.87, 42.47, 37.22, 41.6, 44.7, 39.7, 23, 28.7,
23, 30.9, 35.4, 35.8, 32.4, 31, 31)), row.names = c(NA, -21L), class = c("tbl_df",
"tbl", "data.frame"))
fit <- stats::loess(value ~ date_int, data = data)

# Make data.frame for loess trend
fit_df <- data.frame(
  date_int = seq(min(data$date_int), max(data$date_int), length.out = 500)
)
fit_df$value <- predict(fit, newdata = fit_df)

# Plot red and blue
ggplot(data,
       aes(x = date_int,
           y = value)) +
  geom_point() +
  geom_smooth(colour = "red", size = 1, se = FALSE) +
  geom_line(data = fit_df, colour = "blue", size = 1) +
  labs(title = "RED (geom_smooth) is smoother\nthan BLUE (geom_line)")
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Created on 2022-04-20 by the reprex package (v0.3.0)
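One way to check that a manual predict() curve really matches the red line is to pull out the values geom_smooth() computed and compare them with a plain stats::loess fit evaluated at the same x positions. The sketch below is only an illustration: it assumes geom_smooth()'s defaults (loess with span = 0.75, evaluated at n = 80 points), and the names p, smooth_df and manual are made up for this example.

```r
library(ggplot2)

# Same data as in the question, rebuilt as a plain data.frame
data <- data.frame(
  date_int = c(0.834136630343671, 0.848910310142498, 0.851948868398994,
               0.857082984073764, 0.866093880972339, 0.86955155071249,
               0.874895222129086, 0.925660100586756, 0.937709555741827,
               0.957355406538139, 0.977525146689019, 0.996070829840738,
               0.998428331936295, 0.998428331936295, 0.998480720871752,
               0.998795054484493, 0.999161777032691, 0.999528499580889,
               0.999895222129086, 1, 1),
  value = c(51.78, 46.2, 44.01, 41.1, 39.1, 38.19, 42.87, 42.47, 37.22,
            41.6, 44.7, 39.7, 23, 28.7, 23, 30.9, 35.4, 35.8, 32.4, 31, 31)
)

# Red line only; method and formula are spelled out to match what
# geom_smooth() reports for this data set
p <- ggplot(data, aes(x = date_int, y = value)) +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE)

# The x/y pairs that geom_smooth() draws (one row per evaluation point)
smooth_df <- layer_data(p, 1)

# Evaluate a plain stats::loess fit at exactly the same x positions
fit <- stats::loess(value ~ date_int, data = data)
manual <- predict(fit, newdata = data.frame(date_int = smooth_df$x))

# TRUE if both curves agree up to numerical tolerance
isTRUE(all.equal(smooth_df$y, unname(manual), tolerance = 1e-6))
```

If the comparison holds, the two approaches use the same model and any visible difference in a plot comes only from where the curve is evaluated.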
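As noted in the comments, your previous approach only gives fitted values at the data points in the data frame (rather than over a regular sequence along the x-axis).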
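The difference can be seen without any plotting. The base R sketch below, using the question's data, shows that fit$fitted has exactly one value per observation, at unevenly spaced x positions, whereas predict() can evaluate the same fit on an evenly spaced grid. The grid size of 80 is chosen here only because it is geom_smooth()'s default n; the name grid is made up for this example.

```r
# Same data as in the question, rebuilt as a plain data.frame
data <- data.frame(
  date_int = c(0.834136630343671, 0.848910310142498, 0.851948868398994,
               0.857082984073764, 0.866093880972339, 0.86955155071249,
               0.874895222129086, 0.925660100586756, 0.937709555741827,
               0.957355406538139, 0.977525146689019, 0.996070829840738,
               0.998428331936295, 0.998428331936295, 0.998480720871752,
               0.998795054484493, 0.999161777032691, 0.999528499580889,
               0.999895222129086, 1, 1),
  value = c(51.78, 46.2, 44.01, 41.1, 39.1, 38.19, 42.87, 42.47, 37.22,
            41.6, 44.7, 39.7, 23, 28.7, 23, 30.9, 35.4, 35.8, 32.4, 31, 31)
)

fit <- stats::loess(value ~ date_int, data = data)

# One fitted value per observation, located at the observed x values only
length(fit$fitted) == nrow(data)

# The observed x values are very unevenly spaced (many sit close to 1),
# so straight segments between them make the line look jagged
summary(diff(data$date_int))

# Evaluating the same fit on an evenly spaced grid gives a curve that
# geom_line() can draw smoothly
grid <- data.frame(
  date_int = seq(min(data$date_int), max(data$date_int), length.out = 80)
)
grid$value <- predict(fit, newdata = grid)
head(grid)
```

Both sets of values come from the same fitted model; only the x positions at which it is evaluated differ.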