Feeding new data to an existing model and using broom::augment to add predictions

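I am using tidyverse, broom, and purrr to fit a model to some data, by group. I then try to use this model to predict on some new data, again by group. broom's 'augment' function nicely adds not only the predictions but also other values such as standard errors. However, I cannot get 'augment' to use the new data instead of the old data. As a result, my two sets of predictions are exactly the same. The question is: how do I get 'augment' to use the new data rather than the old data (which was used to fit the model)?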

Here is a reproducible example:

library(tidyverse)
library(broom)
library(purrr)

# nest the iris dataset by Species and fit a linear model
iris.nest <- nest(iris, data = c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)) %>% 
  mutate(model = map(data, function(df) lm(Sepal.Width ~ Sepal.Length, data=df)))

# create a new dataset where the Sepal.Length is 5x as big
newdata <- iris %>% 
  mutate(Sepal.Length = Sepal.Length*5) %>% 
  nest(data = c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)) %>% 
  rename("newdata"="data")

# join these two nested datasets together
iris.nest.new <- left_join(iris.nest, newdata)

# now form two new columns of predictions -- one using the "old" data that the model was
# initially fit on, and the second using the new data where the Sepal.Length has been increased
iris.nest.new <- iris.nest.new %>% 
  mutate(preds = map(model, broom::augment),
         preds.new = map2(model, newdata, broom::augment))  # THIS LINE DOESN'T WORK ****
                             
# unnest the predictions on the "old" data
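preds <- select(iris.nest.new, preds) %>% 
  unnest(cols = c(preds))
# prefix the augment columns (.fitted, .resid, ...) with "old" prior to merging;
# matching by name avoids depending on how many columns this broom version returns
preds <- rename_with(preds, ~ paste0("old", .x), starts_with("."))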

# now unnest the predictions on the "new" data
preds.new <- select(iris.nest.new, preds.new) %>% 
  unnest(cols = c(preds.new))
# ... and also prefix the augment columns with "new" prior to merging
preds.new <- rename_with(preds.new, ~ paste0("new", .x), starts_with("."))

# merge the two sets of predictions and compare
compare <- bind_cols(preds, preds.new) 

# compare
select(compare, old.fitted, new.fitted) %>% View(.) # EXACTLY THE SAME!!!!

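When you call broom::augment, note that newdata= is the third argument; the second is data=. When you use purrr::map2, the values you iterate over are passed positionally as the first two arguments, so your new data is matched to data= rather than newdata=. It doesn't matter what the list you pass in is named. You need to put the new data in the newdata= argument explicitly.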

iris.nest.new <- iris.nest.new %>% 
  mutate(preds = map(model, broom::augment),
         preds.new = map2(model, newdata, ~broom::augment(.x, newdata=.y)))

You can see the difference by running these two commands:

broom::augment(iris.nest.new$model[[1]], iris.nest.new$newdata[[1]])
broom::augment(iris.nest.new$model[[1]], newdata=iris.nest.new$newdata[[1]])
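
To check this outside the nested pipeline, here is a minimal, self-contained sketch on a single species. The object names (setosa, fit, scaled, positional, named) are made up for illustration, and the argument order is read from whichever broom version is installed, so treat this as a sanity check rather than a definitive reference.

```r
library(broom)

# The first three formals of the lm method show where a positional
# second argument ends up
arg_order <- names(formals(getS3method("augment", "lm")))[1:3]

# Fit on one species, mirroring the per-group models in the question
setosa <- subset(iris, Species == "setosa")
fit <- lm(Sepal.Width ~ Sepal.Length, data = setosa)

# Illustrative new data: the same rows with Sepal.Length scaled by 5
scaled <- transform(setosa, Sepal.Length = Sepal.Length * 5)

positional <- augment(fit, scaled)            # second argument is matched to data=
named      <- augment(fit, newdata = scaled)  # predictions computed on the new data

# With newdata=, .fitted should agree with predict() on the new data
named_matches_predict <- isTRUE(all.equal(
  unname(named$.fitted),
  unname(predict(fit, newdata = scaled))
))

# With the positional call, .fitted should still be the training-data fit
positional_matches_fitted <- isTRUE(all.equal(
  unname(positional$.fitted),
  unname(fitted(fit))
))
```

If the explanation above holds for your installed version, arg_order should list data before newdata, and both comparison flags should be TRUE.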