Finding values by linear interpolation in R
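I have a large amount of data and need to find the values of several variables at standard heights.
I want to linearly interpolate the values of the other variables at Height=c(0,100,200,250,400,500) and add them as new columns to the existing data. Here is my attempt to get the values of one variable at the standard Height=c(0,100,200,250,400,500). It only works for one variable:
data2<-approx(data2$Height,data2$ozone,xout=c(0,100,200,250,400,500))
The expected result should be a data frame with 18 rows and 4 columns.
Here is the sample data (data2):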
ozone Height Temp Wind
23.224833 0.000000 253.005798 3.631531
23.750044 35.218689 253.299332 5.178889
24.589071 70.661133 253.538574 6.892455
25.619747 106.267334 253.492661 8.050934
26.443541 142.014648 253.279053 8.648781
27.235945 213.897034 252.815262 9.263882
27.698713 286.280518 252.10556 9.269853
27.865248 359.172363 251.390045 9.3006
28.361752 432.788086 251.379913 8.90488
30.279163 507.276733 251.849655 7.817647
23.048151 0.000000 251.528275 4.174027
23.477306 34.998413 251.6698 5.630364
24.16725 70.187622 251.759369 7.237537
25.239206 105.544006 251.744934 8.859097
26.319073 141.05011 251.601654 9.928196
27.409718 212.47052 251.214279 10.75243
27.825275 284.45282 250.738007 10.812123
28.214966 357.184631 250.87706 9.980968
29.726873 430.919983 251.84964 9.139032
32.482925 505.574097 252.471924 8.063484
22.369734 0.000000 250.876144 3.82036
22.916582 34.908447 251.044205 5.281044
23.732521 70.014038 251.170456 6.970277
24.998178 105.296021 251.221603 8.801399
26.30809 140.736084 251.133591 10.039667
27.572966 212.052795 250.852631 11.118568
28.233795 283.998474 250.61908 10.677624
29.079391 356.812012 251.179962 9.466641
31.244007 430.597534 252.042175 9.016301
33.636559 505.305542 252.659393 8.103294
Thanks in advance for your help.
Update
This is the desired answer:
Height ozone Temp Wind
0 23.22483 253.0058 3.631531
100 25.43833 253.5007 7.847021
200 27.08275 252.9049 9.144964
300 27.73006 251.9709 9.275640
400 28.14061 251.3844 9.081132
500 30.09185 251.8038 7.923858
0 23.04815 251.5283 4.174027
100 25.07112 251.7472 8.604831
200 27.21928 251.2819 10.608513
300 27.90858 250.7677 10.634455
400 29.09287 251.4418 9.492087
500 32.27714 252.4255 8.143790
0 22.36973 250.8761 3.820360
100 24.80820 251.2139 8.526537
200 27.35920 250.9001 10.936230
300 28.41962 250.7423 10.411498
400 30.34638 251.6846 9.203049
500 33.46665 252.6156 8.168133
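You just need to use lapply to loop over the columns. Also, you cannot append the interpolated values to data2: data2 has 30 rows, while xout has length 6. You need another data frame to hold the interpolation results.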
cbind.data.frame(data.frame(Height = 0:5 * 100),
lapply(data2[-2], function (u) approx(data2[[2]], u, 0:5 * 100)$y))
# Height ozone Temp Wind
#1 0 22.88091 251.8034 3.875306
#2 100 24.93562 251.5759 8.509502
#3 200 27.37860 251.2702 10.693545
#4 300 27.96728 251.9255 9.308131
#5 400 29.79659 251.7628 9.138091
#6 500 33.25064 252.5658 8.161940
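When reading the output above, it may help to check how approx treats repeated x values, since Height 0 occurs once per day in the stacked data. The sketch below uses made-up numbers (x and y are hypothetical, not taken from data2) and passes ties = mean explicitly, which is the documented default; recent R versions may warn about collapsing to unique x values if ties is left unspecified.

```r
# Made-up data: x = 0 and x = 10 each appear twice
x <- c(0, 10, 0, 10)
y <- c(1, 2, 3, 6)

# Tied x values are collapsed with mean() before interpolating:
#   x = 0  -> mean(1, 3) = 2
#   x = 10 -> mean(2, 6) = 4
res <- approx(x, y, xout = c(0, 5, 10), ties = mean)$y
print(res)
```

With these numbers the interpolated values are 2, 3 and 4, that is, the interpolation runs through the per-x averages rather than through any single series.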
Follow-up
The original data is model output for 3 days, and I want to keep it to some standard heights for comparing with other data. So each data frame represents one-day data. So I merge them in one big data frame data2, with the same height as the other variables vary each day.
OK, so your data2 has a time dimension: every 10 rows correspond to one day of data. You should not stack data from different days row by row like that. If you do, you should add a new column, say day, to make this block/group structure explicit.
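As a minimal sketch of that suggestion, assuming every day contributes the same number of rows, a day column can be built with rep. The data frame toy and the variable rows_per_day are hypothetical stand-ins with made-up values, not the real data2.

```r
# Toy stand-in for data2: 2 days x 3 heights (values are made up)
toy <- data.frame(
  ozone  = c(23.2, 25.6, 27.2, 23.0, 25.2, 27.4),
  Height = c(0, 106, 214, 0, 105, 212)
)

# Assumption: each day occupies a contiguous block of equal size
rows_per_day <- 3

# Label every row with the day it belongs to: 1 1 1 2 2 2
toy$day <- rep(seq_len(nrow(toy) / rows_per_day), each = rows_per_day)
print(toy)
```

For the data in the question the same idea would use blocks of 10 rows, which matches the gl(3, 10) grouping factor used in this answer.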
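So what you really need is an independent linear interpolation for each day. My initial answer performs a single interpolation using the data from all three days together. Since you have tied values on Height, it is effectively interpolating the average of ozone, Temp and Wind over the 3 days. The following code does what you expect.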
## change my previous code to a function
result_per_day <- function (dat) {
  cbind.data.frame(data.frame(Height = 0:5 * 100),
                   lapply(dat[-2], function (u) approx(dat[[2]], u, 0:5 * 100)$y))
}
datalst <- split(data2, gl(3, 10, labels = 1:3))
do.call(rbind.data.frame, lapply(datalst, result_per_day))
# Height ozone Temp Wind
#1.1 0 23.22483 253.0058 3.631531
#1.2 100 25.43833 253.5007 7.847021
#1.3 200 27.08275 252.9049 9.144964
#1.4 300 27.73006 251.9709 9.275640
#1.5 400 28.14061 251.3844 9.081132
#1.6 500 30.09185 251.8038 7.923858
#2.1 0 23.04815 251.5283 4.174027
#2.2 100 25.07112 251.7472 8.604831
#2.3 200 27.21928 251.2819 10.608513
#2.4 300 27.90858 250.7677 10.634455
#2.5 400 29.09287 251.4418 9.492087
#2.6 500 32.27714 252.4255 8.143790
#3.1 0 22.36973 250.8761 3.820360
#3.2 100 24.80820 251.2139 8.526537
#3.3 200 27.35920 250.9001 10.936230
#3.4 300 28.41962 250.7423 10.411498
#3.5 400 30.34638 251.6846 9.203049
#3.6 500 33.46665 252.6156 8.168133
The row names of this final data frame are quite informative: "1.1" to "1.6" are day 1, "2.1" to "2.6" are day 2, and so on.
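If an explicit column is preferred over encoding the day in the row names, one possible variant is sketched below. It assumes the data already carries a day column; toy, std_heights and interp_day are hypothetical names, and the values are made up so the result is easy to check by hand.

```r
# Toy stand-in for data2 with an explicit day column (made-up values)
toy <- data.frame(
  day    = rep(1:2, each = 3),
  Height = c(0, 50, 100, 0, 50, 100),
  ozone  = c(1, 2, 3, 10, 20, 40)
)
std_heights <- c(0, 25, 100)

# Interpolate every variable except day and Height for one day's rows
interp_day <- function(d) {
  vars <- setdiff(names(d), c("day", "Height"))
  out <- lapply(d[vars], function(u) approx(d$Height, u, xout = std_heights)$y)
  data.frame(day = d$day[1], Height = std_heights, out)
}

# Split by day, interpolate each block, and stack the results
res <- do.call(rbind, lapply(split(toy, toy$day), interp_day))
rownames(res) <- NULL
print(res)
```

Here the day survives as an ordinary column, which can be more convenient for later merging or plotting than parsing row names.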