mgcv: How to set number and / or locations of knots for splines

Question

I want to use the function gam in the mgcv package:

 library(mgcv)
 x <- seq(0, 60, len = 600)
 y <- seq(0, 1, len = 600)
 prova <- gam(y ~ s(x, bs = 'cr'))

Can I set the number of knots in s()? And can I then find out where the knots used by the spline are? Thanks!

Answer 1

While setting k is the correct way to go, fx = TRUE is definitely not right: it forces a pure regression spline with no penalization.
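A quick way to see the difference, as a minimal sketch on simulated data (the data and the names pen_fit and unpen_fit are made up for illustration): with fx = TRUE the smooth spends all k - 1 degrees of freedom, whereas a penalized fit normally uses fewer.

```r
library(mgcv)

## illustrative toy data (not from the question)
set.seed(0)
x <- sort(runif(300, 0, 10))
y <- sin(x) + rnorm(300, 0, 0.3)

## penalized regression spline (the default, fx = FALSE)
pen_fit <- gam(y ~ s(x, bs = 'cr', k = 20))
## unpenalized regression spline
unpen_fit <- gam(y ~ s(x, bs = 'cr', k = 20, fx = TRUE))

## effective degrees of freedom of the smooth term (intercept dropped)
pen_edf <- sum(pen_fit$edf[-1])
unpen_edf <- sum(unpen_fit$edf[-1])  ## fixed at k - 1
c(penalized = pen_edf, unpenalized = unpen_edf)
```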

Locations of knots

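For a penalized regression spline, the exact knot locations are not important, as long as:

k is adequately big;
the spread of knots gives good, reasonable coverage of the data.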
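Whether k is big enough can be checked informally after fitting. The sketch below uses simulated data and k.check() from mgcv (the same diagnostic that gam.check() prints), assuming a reasonably recent mgcv version that exports it. An edf close to k' together with a low p-value would suggest that k is too small. The object names are illustrative.

```r
library(mgcv)

## illustrative toy data
set.seed(0)
x <- sort(rnorm(400, 0, pi))
y <- sin(x) + 0.2 * x + cos(abs(x)) + rnorm(400, 0, 0.4)

fit <- gam(y ~ s(x, bs = 'cr', k = 20))

## one row per smooth; columns are k', edf, k-index and p-value
kc <- k.check(fit)
print(kc)
```

The p-value comes from a randomization test, so it can change slightly from run to run.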

By default:

natural cubic regression spline bs = 'cr' places knots by quantile;
B-splines family (bs = 'bs', bs = 'ps', bs = 'ad') place knots evenly.

Compare the following:

library(mgcv)

## toy data
set.seed(0); x <- sort(rnorm(400, 0, pi))  ## note, my x are not uniformly sampled
set.seed(1); e <- rnorm(400, 0, 0.4)
y0 <- sin(x) + 0.2 * x + cos(abs(x))
y <- y0 + e

## fitting natural cubic spline
cr_fit <- gam(y ~ s(x, bs = 'cr', k = 20))
cr_knots <- cr_fit$smooth[[1]]$xp  ## extract knots locations

## fitting B-spline
bs_fit <- gam(y ~ s(x, bs = 'bs', k = 20))
bs_knots <- bs_fit$smooth[[1]]$knots  ## extract knots locations

## summary plot
par(mfrow = c(1,2))
plot(x, y, col= "grey", main = "natural cubic spline");
lines(x, cr_fit$linear.predictors, col = 2, lwd = 2)
abline(v = cr_knots, lty = 2)
plot(x, y, col= "grey", main = "B-spline");
lines(x, bs_fit$linear.predictors, col = 2, lwd = 2)
abline(v = bs_knots, lty = 2)

You can see the difference in knot placement.
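The same difference can be checked numerically rather than by eye. This is a small self-contained sketch that regenerates the toy data; cr_gaps and bs_gaps are names introduced here for illustration.

```r
library(mgcv)

## same toy data as in the comparison
set.seed(0); x <- sort(rnorm(400, 0, pi))
set.seed(1); y <- sin(x) + 0.2 * x + cos(abs(x)) + rnorm(400, 0, 0.4)

cr_knots <- gam(y ~ s(x, bs = 'cr', k = 20))$smooth[[1]]$xp
bs_knots <- gam(y ~ s(x, bs = 'bs', k = 20))$smooth[[1]]$knots

## spacing between consecutive knots
cr_gaps <- diff(cr_knots)
bs_gaps <- diff(bs_knots)

range(cr_gaps)  ## unequal gaps: knots follow the density of x
range(bs_gaps)  ## constant gaps (up to rounding): evenly spaced knots

## the B-spline knot vector is longer than k because it also
## contains knots outside the range of the data
c(cr = length(cr_knots), bs = length(bs_knots))
```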

Setting your own knot locations:

You can also provide your own knot locations via the knots argument of gam() (yes, knots are not fed to s(), but to gam()). For example, you can use evenly spaced knots for cr:

xlim <- range(x)  ## get range of x
myfit <- gam(y ~ s(x, bs = 'cr', k =20),
         knots = list(x = seq(xlim[1], xlim[2], length = 20)))

Now you can see that:

my_knots <- myfit$smooth[[1]]$xp
plot(x, y, col= "grey", main = "my knots");
lines(x, myfit$linear.predictors, col = 2, lwd = 2)
abline(v = my_knots, lty = 2)

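However, there is usually no need to set knots yourself. If you do want to do this, you must be clear about what you are doing. Also, the number of knots you provide must match k in s().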
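The matching requirement can be illustrated with a short sketch on simulated data. The names my_knots, fit_ok and res are made up here. The second call deliberately supplies 10 knots while k = 20; I would expect mgcv to refuse this for a cr smooth, but that is an assumption about its behaviour, so the outcome is captured with tryCatch() rather than taken for granted.

```r
library(mgcv)

## illustrative toy data
set.seed(0)
x <- sort(rnorm(400, 0, pi))
y <- sin(x) + 0.2 * x + cos(abs(x)) + rnorm(400, 0, 0.4)

## exactly 20 knots to go with k = 20
my_knots <- seq(min(x), max(x), length.out = 20)
fit_ok <- gam(y ~ s(x, bs = 'cr', k = 20), knots = list(x = my_knots))

## the supplied knots are the ones stored in the smooth
all.equal(as.numeric(fit_ok$smooth[[1]]$xp), my_knots)

## 10 knots with k = 20: capture the outcome instead of stopping
res <- tryCatch(
  gam(y ~ s(x, bs = 'cr', k = 20),
      knots = list(x = seq(min(x), max(x), length.out = 10))),
  error = function(e) e
)
if (inherits(res, "error")) message("gam() failed: ", conditionMessage(res))
```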
