Apply different start parameters to model using purrr::map within dplyr::mutate
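I was trying to answer someone's question on the ggplot2 mailing list, but I couldn't figure it out:
https://groups.google.com/forum/#!topic/ggplot2/YgCqQX8JbPM
The OP wants to apply different starting parameters to subsets of his data for an nls model. My thought was that he should read up on dplyr and purrr, but after a couple of hours of trying it myself I hit a wall. I'm not sure whether this is a bug or just my lack of experience with purrr.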
library(tidyverse)
# input dataset
df <- data.frame(Group = c(rep("A", 7), rep("B", 7), rep("C", 7)),
Time = c(rep(c(1:7), 3)),
Result = c(100, 96.9, 85.1, 62.0, 30.7, 15.2, 9.6,
10.2, 14.8, 32.26, 45.85, 56.25, 70.1, 100,
100, 55.61, 3.26, -4.77, -7.21, -3.2, -5.6))
# nest the datasets for computing models
df_p <-
df %>%
group_by(Group) %>%
nest
# add model parameters as rows/columns
df_p$starta = c(-3, 4,-3)
df_p$startb = c(85, 85, 85)
df_p$startc = c(4, 4, 4)
df_p$startd = c(10,10,10)
# compute models using nls
df_p %>%
mutate(model2 = map(data, ~nls(Result ~ a+(b-a)/(1+(Time/c)^d), data = ., start = c(a = starta, b = startb, c = startc, d = startd)))
)
#Error in mutate_impl(.data, dots) :
# parameters without starting value in 'data': a, b, d
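This feels related to the following bug, but that one has been fixed for a while now...
https://github.com/hadley/dplyr/issues/1447
As far as I can tell, it is looking for the variables in the scope of the nested tibble, but I want it to look in the scope of the mutate call. I don't know whether there is a way around that.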
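For what it's worth, part of the error message can be reproduced without dplyr or purrr at all. The sketch below assumes that, inside the map() lambda, starta refers to the whole length-3 column rather than to the current row's value. When a named argument to c() has more than one element, R numbers the names, so nls never sees a start value called a:

```r
# The start column as it exists on df_p (three values, one per group)
starta <- c(-3, 4, -3)

# Passing the whole column: c() expands the name to a1, a2, a3
start_all <- c(a = starta)
names(start_all)

# Passing a single value: the name stays "a", which is what nls expects
start_one <- c(a = starta[1])
names(start_one)
```

The same renaming would apply to b, c and d, which may be why nls reports parameters without a starting value.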
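I couldn't find a set of parameters that produces the model you set up, but as far as setting up the fitting process goes, here is what you can do. Basically, you can wrap all the parameters (starta, startb, etc.) into data alongside the Result and Time columns, and then access them with .$. Note that in this case you need unique() to pick a single value, because each start value is broadcast across the rows of its group when unnesting. With the simple model formula a + b*Time, this produces the models in the model2 column; you can follow the same route and adjust the initial parameters passed to nls to fit the more complex formula you specified: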
library(tidyverse)
df_p %>% unnest %>% group_by(Group) %>% nest %>%
mutate(model2 = map(data, ~nls(Result ~ a + b*Time, data = .,
start = c(a = unique(.$starta),
b = unique(.$startb))
)
)
)
# A tibble: 3 × 3
# Group data model2
# <fctr> <list> <list>
#1 A <tibble [7 × 6]> <S3: nls>
#2 B <tibble [7 × 6]> <S3: nls>
#3 C <tibble [7 × 6]> <S3: nls>
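A possible variation on the same idea, sketched here as an assumption rather than as part of the answer above: leave the start columns next to the nested data and let purrr::pmap() walk over them row by row, which avoids the unnest/nest round trip and the unique() calls. The argument names d, a0 and b0 are made up for illustration, and the simple a + b*Time formula is kept.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Same data as in the question
df <- data.frame(
  Group  = rep(c("A", "B", "C"), each = 7),
  Time   = rep(1:7, 3),
  Result = c(100, 96.9, 85.1, 62.0, 30.7, 15.2, 9.6,
             10.2, 14.8, 32.26, 45.85, 56.25, 70.1, 100,
             100, 55.61, 3.26, -4.77, -7.21, -3.2, -5.6)
)

# One row per group, with the group's rows in the list-column `data`
df_p <- df %>%
  group_by(Group) %>%
  nest() %>%
  ungroup()

# Per-group start values, one per row
df_p$starta <- c(-3, 4, -3)
df_p$startb <- c(85, 85, 85)

# pmap() passes the i-th element of each input to the function,
# so a0 and b0 are single numbers for the current group
fits <- df_p %>%
  mutate(model2 = pmap(
    list(data, starta, startb),
    function(d, a0, b0) {
      nls(Result ~ a + b * Time, data = d, start = list(a = a0, b = b0))
    }
  ))
```

The fitted coefficients for a given group can then be inspected with, for example, coef(fits$model2[[1]]).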
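The example data is tricky because group B is basically time-reversed. Finding good initial values for that is not my problem, so I made up new data for group B. Here is how to set up the data frame so that nls() can be applied inside map2().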
library(tidyverse)
df <- data.frame(Group = c(rep("A", 7), rep("B", 7), rep("C", 7)),
Time = c(rep(c(1:7), 3)),
Result = c(100, 96.9, 85.1, 62.0, 30.7, 15.2, 9.6,
## I replaced these values!!
## Group B initial values are NOT MY PROBLEM
105, 90, 82, 55, 40, 23, 7,
100, 55.61, 3.26, -4.77, -7.21, -3.2, -5.6))
## ggplot(df, aes(x = Time, y = Result, group = Group)) + geom_line()
df_p <-
df %>%
group_by(Group) %>%
nest() %>%
## init vals are all the same, but this shows how to make them different
mutate(start = list(
list(a = -3, b = 85, c = 4, d = 10),
list(a = -3, b = 85, c = 4, d = 10),
list(a = -3, b = 85, c = 4, d = 10)
)
)
df_p %>%
mutate(model2 = map2(data, start,
~ nls(Result ~ a+(b-a)/(1+(Time/c)^d),
data = .x, start = .y)))
#> # A tibble: 3 × 4
#> Group data start model2
#> <fctr> <list> <list> <list>
#> 1 A <tibble [7 × 2]> <list [4]> <S3: nls>
#> 2 B <tibble [7 × 2]> <list [4]> <S3: nls>
#> 3 C <tibble [7 × 2]> <list [4]> <S3: nls>
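Since nls() stops with an error when a fit fails (for example with poorly chosen start values), one failing group aborts the whole mutate() call. A minimal sketch of one way to guard against that, assuming purrr::possibly() is acceptable here; the names safe_nls and fitted_ok are hypothetical and not part of the answer above:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Data with the replaced group B values
df <- data.frame(
  Group  = rep(c("A", "B", "C"), each = 7),
  Time   = rep(1:7, 3),
  Result = c(100, 96.9, 85.1, 62.0, 30.7, 15.2, 9.6,
             105, 90, 82, 55, 40, 23, 7,
             100, 55.61, 3.26, -4.77, -7.21, -3.2, -5.6)
)

df_p <- df %>%
  group_by(Group) %>%
  nest() %>%
  ungroup() %>%
  mutate(start = list(
    list(a = -3, b = 85, c = 4, d = 10),
    list(a = -3, b = 85, c = 4, d = 10),
    list(a = -3, b = 85, c = 4, d = 10)
  ))

# possibly() returns NULL instead of raising an error when a fit fails
safe_nls <- possibly(
  function(d, s) {
    nls(Result ~ a + (b - a) / (1 + (Time / c)^d), data = d, start = s)
  },
  otherwise = NULL
)

res <- df_p %>%
  mutate(
    model2    = map2(data, start, safe_nls),
    fitted_ok = map_lgl(model2, ~ !is.null(.x))  # TRUE where a model was returned
  )
```

Rows where fitted_ok is FALSE are the groups whose start values need another look.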