Get `bs` (splines) equivalent in Python

In R, the following

require(stats)
require(splines)
knots = quantile(women$height, seq(0.1,0.9,length.out = 5))
bs(women$height, knots=knots, degree=3)

returns

1   2   3   4   5   6   7   8
0.0000000000    0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.0000000000    0.00000000
0.6284418529    0.323939099 0.024295432 0.000000000 0.000000000 0.000000000 0.0000000000    0.00000000
0.2155814707    0.599894720 0.182883868 0.001639942 0.000000000 0.000000000 0.0000000000    0.00000000
0.0349854227    0.495626822 0.438289602 0.031098154 0.000000000 0.000000000 0.0000000000    0.00000000
0.0001619695    0.245586330 0.620809038 0.133442663 0.000000000 0.000000000 0.0000000000    0.00000000
0.0000000000    0.072886297 0.584548105 0.338678328 0.003887269 0.000000000 0.0000000000    0.00000000
0.0000000000    0.009110787 0.384718173 0.561892614 0.044278426 0.000000000 0.0000000000    0.00000000
0.0000000000    0.000000000 0.166666667 0.666666667 0.166666667 0.000000000 0.0000000000    0.00000000
0.0000000000    0.000000000 0.044278426 0.561892614 0.384718173 0.009110787 0.0000000000    0.00000000
0.0000000000    0.000000000 0.003887269 0.338678328 0.584548105 0.072886297 0.0000000000    0.00000000
0.0000000000    0.000000000 0.000000000 0.133442663 0.620809038 0.245586330 0.0001619695    0.00000000
0.0000000000    0.000000000 0.000000000 0.031098154 0.438289602 0.495626822 0.0349854227    0.00000000
0.0000000000    0.000000000 0.000000000 0.001639942 0.182883868 0.599894720 0.2155814707    0.00000000
0.0000000000    0.000000000 0.000000000 0.000000000 0.024295432 0.323939099 0.6284418529    0.02332362
0.0000000000    0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.0000000000    1.00000000

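Is there a Python equivalent? I tried `BSpline` from scipy, but it requires the coefficients to be known and passed in.

How can I generate the B-spline basis matrix by passing in the array, the knots, and the degree?


To reproduce the input in Python, you can do: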

import numpy as np

women_height = np.array([58,59,60,61,62,63,64,65,66,67,68,69,70,71,72])
knots = np.array([59.4, 62.2, 65. , 67.8, 70.6])
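
The hard-coded knots mirror the R call `quantile(women$height, seq(0.1, 0.9, length.out = 5))`. As a minimal sketch, they can also be computed in NumPy; this assumes that `np.quantile` with its default linear interpolation agrees with R's default quantile type for this data, and the name `probs` is just illustrative.

```python
import numpy as np

women_height = np.array([58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72])

# Five evenly spaced probabilities from 0.1 to 0.9,
# the counterpart of seq(0.1, 0.9, length.out = 5) in R.
probs = np.linspace(0.1, 0.9, 5)

# Default method is linear interpolation between order statistics.
knots = np.quantile(women_height, probs)
print(knots)
```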

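Turning the comment into an answer: `BSpline.design_matrix` builds what you want, in CSR sparse format. It will be available once scipy 1.8 is released. Until then, you can either grab scipy's master branch or use the workaround suggested in the docs (https://scipy.github.io/devdocs/reference/generated/scipy.interpolate.BSpline.design_matrix.html#scipy.interpolate.BSpline.design_matrix):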

t = ... 
c = np.eye(len(t) - k - 1)
design_matrix_gh = BSpline(t, c, k)(x)
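
For illustration, here is the same workaround written out with the OP's data. This is only a sketch: it assumes the full knot vector `t` is formed by repeating the data range endpoints `k + 1` times around the interior knots, and the names `interior_knots` and `design_matrix_dense` are mine, not scipy's.

```python
import numpy as np
from scipy.interpolate import BSpline

x = np.array([58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72], dtype=float)
interior_knots = np.array([59.4, 62.2, 65.0, 67.8, 70.6])
k = 3  # cubic splines

# Assumption: boundary knots are the data range, each repeated k + 1 times.
t = np.r_[[x.min()] * (k + 1), interior_knots, [x.max()] * (k + 1)]

# An identity coefficient matrix makes column j of the result
# the j-th basis function evaluated at every x.
c = np.eye(len(t) - k - 1)
design_matrix_dense = BSpline(t, c, k)(x)

print(design_matrix_dense.shape)
```

Unlike `BSpline.design_matrix`, this returns a dense array, which is fine for small inputs like this one.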

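EDIT: the R documentation, https://www.rdocumentation.org/packages/splines/versions/3.6.2/topics/bs, states that the `knots` argument is the *internal* knots. scipy's `BSpline` does not pad the knots automatically; you need to do that yourself. Using the OP's data: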

In [22]: women_height = np.array([58,59,60,61,62,63,64,65,66,67,68,69,70,71,72])
    ...: knots = np.array([59.4, 62.2, 65. , 67.8, 70.6])

In [23]: t = np.r_[(58,)*4, knots, (72,)*4]   # <<<<<< here

In [24]: m = BSpline.design_matrix(women_height, t, k=3)

In [25]: with np.printoptions(linewidth=120, precision=5):
    ...:     print(m.toarray())
    ...:
[[1.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00]
 [2.33236e-02 6.28442e-01 3.23939e-01 2.42954e-02 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00]
 [0.00000e+00 2.15581e-01 5.99895e-01 1.82884e-01 1.63994e-03 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00]
 [0.00000e+00 3.49854e-02 4.95627e-01 4.38290e-01 3.10982e-02 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00]
 [0.00000e+00 1.61970e-04 2.45586e-01 6.20809e-01 1.33443e-01 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00]
 [0.00000e+00 0.00000e+00 7.28863e-02 5.84548e-01 3.38678e-01 3.88727e-03 0.00000e+00 0.00000e+00 0.00000e+00]
 [0.00000e+00 0.00000e+00 9.11079e-03 3.84718e-01 5.61893e-01 4.42784e-02 0.00000e+00 0.00000e+00 0.00000e+00]
 [0.00000e+00 0.00000e+00 0.00000e+00 1.66667e-01 6.66667e-01 1.66667e-01 0.00000e+00 0.00000e+00 0.00000e+00]
 [0.00000e+00 0.00000e+00 0.00000e+00 4.42784e-02 5.61893e-01 3.84718e-01 9.11079e-03 0.00000e+00 0.00000e+00]
 [0.00000e+00 0.00000e+00 0.00000e+00 3.88727e-03 3.38678e-01 5.84548e-01 7.28863e-02 0.00000e+00 0.00000e+00]
 [0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.33443e-01 6.20809e-01 2.45586e-01 1.61970e-04 0.00000e+00]
 [0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 3.10982e-02 4.38290e-01 4.95627e-01 3.49854e-02 0.00000e+00]
 [0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.63994e-03 1.82884e-01 5.99895e-01 2.15581e-01 0.00000e+00]
 [0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 2.42954e-02 3.23939e-01 6.28442e-01 2.33236e-02]
 [0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 1.00000e+00]]

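This looks like the OP's R output, modulo the first column. It is not immediately clear from R's documentation how it pads the knot vector, but if you want the same output, you can chop off the first column (`m.toarray()[:, 1:]` or similar).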
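
Putting the pieces together, a small helper along these lines reproduces the R matrix for this data. Treat it as a sketch: `bs_basis` is a hypothetical name, it needs a scipy version that ships `BSpline.design_matrix`, and it assumes the boundary knots are the range of `x`. My understanding is that R's `bs` defaults to `intercept = FALSE`, which would explain the missing first column, so the helper drops that column unless asked otherwise.

```python
import numpy as np
from scipy.interpolate import BSpline


def bs_basis(x, knots, degree=3, intercept=False):
    """Dense B-spline basis matrix for x, given interior knots (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    # Assumption: boundary knots are min(x) and max(x), each repeated degree + 1 times.
    t = np.r_[[x.min()] * (degree + 1), knots, [x.max()] * (degree + 1)]
    basis = BSpline.design_matrix(x, t, degree).toarray()
    # Drop the first basis column unless an intercept column is requested.
    return basis if intercept else basis[:, 1:]


women_height = np.array([58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72])
knots = np.array([59.4, 62.2, 65.0, 67.8, 70.6])

B = bs_basis(women_height, knots, degree=3)
with np.printoptions(linewidth=120, precision=5, suppress=True):
    print(B)
```

With `intercept=True` the helper returns all nine columns, i.e. the same matrix as `m.toarray()` above.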