R - Adding an extrapolated (lm) value to a matrix of observations
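I am trying to add a set of extrapolated "observations" to a matrix in R. I know how to do this using normal programming techniques (read: a bunch of nested loops and functions), but I feel this must be possible in a much cleaner way using built-in R functionality.
The code below illustrates this, and where it breaks down.
Thanks very much for your help!
Kind regards
Sylvain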
library(dplyr)
library(tidyr) # gather() and spread() come from tidyr
# The idea is that I have a table of observations for e.g. x = 5, 6, 7, 8, 9 and 10. The observations (in this example 2)
# conform fairly decently to sets of 2nd order polynomials.
# Now, I want to add an extrapolated value to this table (e.g. x = 4). I know how to do this programmatically
# but I feel there must be a cleaner solution to do this.
#generate dummy data table
x <- 5:10
myData <- tibble(x, a = x^2 * 2 + x * 3 + 4 + rnorm(1,0,0.01), b = x^2 * 3 + x * 4 + 5 + rnorm(1,0,0.01) )
#Gather (put in Data-Key format)
myDataKeyFormat <- gather(myData,key = "someLabel", value = "myObservation", -x)
fitted_models <- myDataKeyFormat %>% group_by(someLabel) %>% do(model = lm(myObservation ~ poly(x,2), data = .))
myExtrapolatedDataPointx <- tibble(x = 4)
#Add the x=4 field
fitted_points <- fitted_models %>% group_by(someLabel) %>% do(predict(.$model,myExtrapolatedDataPointx)) #R really doesnt like this bit
#append the fitted_points to the myDataKeyFormat
myDataKeyFormatWithExtrapolation <- union(myDataKeyFormat,fitted_points)
#use spread to
myDataWithExtrapolation <- myDataKeyFormatWithExtrapolation %>% spread(someLabel,myObservation)
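Here is a solution in the tidyverse that uses purrr to create the different models. The idea is to nest the data (using tidyr::nest) and then use purrr::map to train one model per group. I then add new values and compute the predictions with modelr::add_predictions. This keeps everything in one place: training data, models, test data, and predictions, keyed by your variable someLabel. I also show a way to visualise the data.
You can check out R for Data Science by Hadley Wickham and Garrett Grolemund, especially the part on models, for more information.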
library(dplyr)
library(tibble)
library(tidyr)
library(purrr)
library(modelr)
library(ggplot2)
set.seed(1) # For reproducibility
x <- 5:10
myData <- tibble(x,
                 a = x^2 * 2 + x * 3 + 4 + rnorm(1, 0, 0.01),
                 b = x^2 * 3 + x * 4 + 5 + rnorm(1, 0, 0.01))
#Gather (put in Data-Key format)
myDataKeyFormat <- gather(myData,key = "someLabel", value = "myObservation", -x)
myModels <- myDataKeyFormat %>%
  nest(-someLabel) %>%
  mutate(model = map(data, ~lm(myObservation ~ poly(x, 2), data = .x)))
At this point the result looks as follows: you have one model for each value of someLabel.
# A tibble: 2 × 3
someLabel data model
<chr> <list> <list>
1 a <tibble [6 × 2]> <S3: lm>
2 b <tibble [6 × 2]> <S3: lm>
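One thing worth knowing when inspecting these models: poly(x, 2) builds orthogonal polynomials by default, so the fitted coefficients are not the 2, 3, 4 used to generate the data, even though the predictions are the same. A minimal base R sketch, assuming noise-free data for label a so the numbers are easy to check; the names obs, fit_orth, fit_raw and raw_coefs are made up for this illustration:

```r
# Noise-free version of series "a" from the question (assumption for clarity)
x <- 5:10
obs <- data.frame(x = x, y = x^2 * 2 + x * 3 + 4)

# Default: orthogonal polynomial basis
fit_orth <- lm(y ~ poly(x, 2), data = obs)
# raw = TRUE: plain x and x^2 terms, so coefficients are directly readable
fit_raw <- lm(y ~ poly(x, 2, raw = TRUE), data = obs)

# Intercept, x term, x^2 term of the raw fit
raw_coefs <- unname(coef(fit_raw))

# Both parameterisations give the same extrapolated value at x = 4
new_x <- data.frame(x = 4)
pred_orth <- as.numeric(predict(fit_orth, newdata = new_x))
pred_raw <- as.numeric(predict(fit_raw, newdata = new_x))
```

Either form works for extrapolation; the raw form is only more convenient if you want to read the coefficients.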
I add some data points in a new column (map creates it as a tibble for each row of the data frame).
# New data
new_data <- myModels %>%
  mutate(new = map(data, ~tibble(x = c(3, 4, 11, 12))))
I add the predictions: add_predictions takes a data frame and a model as arguments, so I use map2 to map over the new data and the models.
fitted_models <- new_data %>%
  mutate(new = map2(new, model, ~add_predictions(.x, .y)))
fitted_models
# A tibble: 2 × 4
someLabel data model new
<chr> <list> <list> <list>
1 a <tibble [6 × 2]> <S3: lm> <tibble [4 × 2]>
2 b <tibble [6 × 2]> <S3: lm> <tibble [4 × 2]>
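For a single model, the pred column is just what predict returns for the new x values. A minimal base R sketch of that step, assuming noise-free data so the values can be checked by hand; obs, model, models and new_points are made-up names. It also hints at why the call in the question probably fails: a list column holds each lm wrapped in a list, so the model has to be extracted (for example with [[1]]) before predict can dispatch on it.

```r
# Noise-free version of series "a" from the question (assumption for clarity)
x <- 5:10
obs <- data.frame(x = x, y = x^2 * 2 + x * 3 + 4)
model <- lm(y ~ poly(x, 2), data = obs)

# A list column stores the model wrapped in a list
models <- list(model)
is_lm_wrapped <- inherits(models, "lm")        # FALSE: this is a plain list
is_lm_extracted <- inherits(models[[1]], "lm") # TRUE: the fitted model itself

# Predict on the extracted model and store the result next to the new x values
new_points <- data.frame(x = c(3, 4, 11, 12))
new_points$pred <- as.numeric(predict(models[[1]], newdata = new_points))
new_points
```

With map2 the extraction happens automatically, because each element of the model column is passed to the function on its own.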
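There you go: for each label you have the data, the model trained on that data, and the new data with predictions.
To plot it, I use unnest to get the data back into a data frame, then bind the rows to put the "old" data and the new values together.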
my_points <- bind_rows(unnest(fitted_models, data),
                       unnest(fitted_models, new))
ggplot(my_points) +
  geom_point(aes(x = x, y = myObservation), color = "black") +
  geom_point(aes(x = x, y = pred), color = "red") +
  facet_wrap(~someLabel)
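The question also asked for the extrapolated value to end up back in the wide table. Here is a minimal sketch of that last step, assuming a single extrapolation point x = 4. Note that it swaps in dplyr::group_modify and a direct predict call in place of the nest/map/add_predictions pipeline above, only to keep the example short; the name extrapolated is made up for this illustration.

```r
library(dplyr)
library(tidyr)

set.seed(1) # For reproducibility
x <- 5:10
myData <- tibble(x,
                 a = x^2 * 2 + x * 3 + 4 + rnorm(1, 0, 0.01),
                 b = x^2 * 3 + x * 4 + 5 + rnorm(1, 0, 0.01))
myDataKeyFormat <- gather(myData, key = "someLabel", value = "myObservation", -x)

# Fit one model per label and return the prediction at x = 4 (assumed point),
# stored under the same column name as the observations
extrapolated <- myDataKeyFormat %>%
  group_by(someLabel) %>%
  group_modify(~ {
    model <- lm(myObservation ~ poly(x, 2), data = .x)
    new_x <- tibble(x = 4)
    mutate(new_x, myObservation = unname(predict(model, newdata = new_x)))
  }) %>%
  ungroup()

# Append the extrapolated rows and spread back to one column per label
myDataWithExtrapolation <- bind_rows(myDataKeyFormat, extrapolated) %>%
  spread(someLabel, myObservation) %>%
  arrange(x)

myDataWithExtrapolation
```

The result has one row per x value (4 through 10) and one column per label, which is the shape of the original myData with the extra row added.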