Fit loess smoothers for multiple groups across multiple numeric variables

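I need to fit many loess smoothers, one per level of a grouping variable (Animal), across multiple numeric columns (Var1, Var2), and extract the fitted values.

I found code that does this for one variable at a time: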

# Create dataframe 1
OneVarDF <- data.frame(Day = c(replicate(1,sample(1:50,200,rep=TRUE))),
                 Animal = c(c(replicate(100,"Greyhound"), c(replicate(100,"Horse")))),
                 Var1 = c(c(replicate(1,sample(2:10,100,rep=TRUE))), c(replicate(1,sample(15:20,100,rep=TRUE)))))


library(dplyr)
library(tidyr)
library(purrr)

# Get fitted values from each model
Models <- OneVarDF %>%
  tidyr::nest(data = -Animal) %>%
  dplyr::mutate(m = purrr::map(data, loess, formula = Var1 ~ Day, span = 0.30),
                fitted = purrr::map(m, `[[`, "fitted")
  )

# Create prediction column
Results <- Models %>%
  dplyr::select(-m) %>%
  tidyr::unnest(cols = c(data, fitted))

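This "Results" data frame is essential for a downstream task (detrending many non-parametric distributions).

How can we do the same with a data frame that has multiple numeric columns (code below) and still extract the "Results" data frame? Thanks.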

# Create dataframe 2
TwoVarDF <- data.frame(Day = c(replicate(1,sample(1:50,200,rep=TRUE))),
                       Animal = c(c(replicate(100,"Greyhound"), c(replicate(100,"Horse")))),
                       Var1 = c(c(replicate(1,sample(2:10,100,rep=TRUE))), c(replicate(1,sample(15:20,100,rep=TRUE)))),
                       Var2 = c(c(replicate(1,sample(22:27,100,rep=TRUE))), c(replicate(1,sample(29:35,100,rep=TRUE)))))

We can reshape the data to long format with pivot_longer, then group_by Animal and the column name, and apply loess to each combination.

library(dplyr)
library(tidyr)

TwoVarDF %>%
  pivot_longer(cols = starts_with('Var')) %>%
  group_by(Animal, name) %>%
  mutate(model = loess(value~Day, span = 0.3)$fitted)
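If the fitted values are wanted next to the original columns rather than in long format, one alternative (not the approach shown above) is to group by Animal and fit each Var column with across(). This is a minimal sketch, assuming dplyr >= 1.0; the seed, the simplified data construction, the name WideResults and the "_fitted" suffix are choices made for this illustration only.

```r
library(dplyr)

set.seed(1)  # assumption: seed added only to make the sketch reproducible

# Same shape as TwoVarDF in the question, built with rep() and sample()
TwoVarDF <- data.frame(
  Day    = sample(1:50, 200, replace = TRUE),
  Animal = rep(c("Greyhound", "Horse"), each = 100),
  Var1   = c(sample(2:10, 100, replace = TRUE), sample(15:20, 100, replace = TRUE)),
  Var2   = c(sample(22:27, 100, replace = TRUE), sample(29:35, 100, replace = TRUE))
)

# WideResults is a hypothetical name: one loess fit per Animal and per Var column,
# with the fitted values stored in new columns Var1_fitted and Var2_fitted
WideResults <- TwoVarDF %>%
  group_by(Animal) %>%
  mutate(across(
    starts_with("Var"),
    # .x is the current Var column, Day is the current group's Day column
    ~ loess(y ~ d, data = data.frame(y = .x, d = Day), span = 0.3)$fitted,
    .names = "{.col}_fitted"
  )) %>%
  ungroup()

head(WideResults)
```

Because loess returns fitted values in the row order of its input, each fitted column lines up with the original rows inside its group.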

Add a gather() step first, then continue exactly as in your earlier code.

Models2 <- TwoVarDF %>%
  gather(varName, varVal, 3:4) %>% 
  tidyr::nest(data = -c(Animal, varName)) %>%
  dplyr::mutate(m = purrr::map(data, loess, formula = varVal ~ Day, span = 0.30),
                fitted = purrr::map(m, `[[`, "fitted")
  )
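To end up with a "Results"-style data frame for both variables, the model column can be dropped and the list-columns unnested, mirroring the single-variable code in the question. The sketch below is self-contained and makes a few assumptions: tidyr >= 1.0 (named nest() and unnest(cols = ...)), pivot_longer() in place of gather(), a fixed seed, and the illustrative name Results2.

```r
library(dplyr)
library(tidyr)
library(purrr)

set.seed(1)  # assumption: seed added only to make the sketch reproducible

# Same shape as TwoVarDF in the question
TwoVarDF <- data.frame(
  Day    = sample(1:50, 200, replace = TRUE),
  Animal = rep(c("Greyhound", "Horse"), each = 100),
  Var1   = c(sample(2:10, 100, replace = TRUE), sample(15:20, 100, replace = TRUE)),
  Var2   = c(sample(22:27, 100, replace = TRUE), sample(29:35, 100, replace = TRUE))
)

# Results2 is a hypothetical name for the two-variable analogue of "Results"
Results2 <- TwoVarDF %>%
  # Long format: one row per observation and variable
  pivot_longer(cols = starts_with("Var"),
               names_to = "varName", values_to = "varVal") %>%
  # One nested data frame per Animal / varName combination
  nest(data = c(Day, varVal)) %>%
  mutate(m      = map(data, ~ loess(varVal ~ Day, data = .x, span = 0.30)),
         fitted = map(m, ~ .x$fitted)) %>%
  # Drop the model objects and expand data and fitted values side by side
  select(-m) %>%
  unnest(cols = c(data, fitted))

head(Results2)
```

Results2 has one row per original observation and variable, with columns Animal, varName, Day, varVal and fitted, so a detrended value can be computed as varVal - fitted.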