Entering column arguments from list dataframes in a custom function using purrr::map
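I am writing a custom function that, with the help of purrr::map, fits a linear mixed-effects model to each element of a list. The code works fine as a standalone block, but once I turn it into a custom function I am not sure how to pass the arguments that correspond to individual columns of the list elements.
If the custom function works, I can reuse it for any number of variables. Otherwise I will have to copy and paste the same code for each variable.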
# libraries needed
library(purrr)
library(lmerTest)
data(mtcars)
# create a list of dataframes from mtcars based on a split
group_list <- split(mtcars, mtcars$am)
# goal: to do linear mixed effects model for each dataframe and combining the results neatly in a dataframe
# achieving this outside of a custom function
group_list %>%
  purrr::map(
    .x = (.),
    .f = ~ lmerTest::lmer(
      scale(mpg) ~ scale(wt) + (wt | cyl),
      data = (.),
      REML = FALSE
    )
  ) %>%
  # keep every fixed-effect row except the intercept
  purrr::map(.f = ~ coef(summary(.))[-c(1), ]) %>%
  base::do.call(what = cbind.data.frame, args = .) %>%
  # data is piped in positionally, so this does not depend on the argument's name
  tibble::rownames_to_column(var = "Effect")
#> Effect 0 1
#> 1 Estimate -0.3318711 -9.089148e-01
#> 2 Std. Error 0.2104268 1.156500e-01
#> 3 df 0.6084658 1.300000e+01
#> 4 t value -1.5771334 -7.859187e+00
#> 5 Pr(>|t|) 0.4558206 2.714599e-06
# preparing the custom function to do the same
lmer_group <- function(list, x, y) {
  list %>%
    purrr::map(
      .x = (.),
      .f = ~ lmerTest::lmer(
        scale(y) ~ scale(x) + (x | cyl),
        data = (.),
        REML = FALSE
      )
    ) %>%
    purrr::map(.f = ~ coef(summary(.))[-c(1), ]) %>%
    base::do.call(what = cbind.data.frame, args = .) %>%
    tibble::rownames_to_column(var = "Effect")
}
# doing the same analysis with a custom function
lmer_group(list = group_list, x = wt, y = mpg) # attempt 1
#> Error in scale(y): object 'mpg' not found
lmer_group(list = group_list, x = 'wt', y = 'mpg') # attempt 2
#> Error in colMeans(x, na.rm = TRUE): 'x' must be numeric
lmer_group(
list = group_list,
x = lapply(group_list, `[`, 'wt'),
y = lapply(group_list, `[`, 'mpg')
) # attempt 3
#> Error in colMeans(x, na.rm = TRUE): 'x' must be numeric
Created on 2018-01-28 by the reprex package (v0.1.1.9000).
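All of the indirection happens inside the formula, so I now think rlang is not needed at all.
You can pass the names of the desired variables as strings and paste them together into a formula string for lmer. Then use stats::as.formula() to convert that string into a formula that lmer accepts.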
lmer_group <- function(l, x_name, y_name) {
  # build the model formula as a string from the column names
  fx <- paste0("scale(", y_name, ") ~ scale(", x_name, ") + (", x_name, " | cyl)")
  print(paste("Evaluating: ", fx))
  l %>%
    purrr::map(
      .f = ~ lmerTest::lmer(
        as.formula(fx),
        data = (.),
        REML = FALSE
      )
    ) %>%
    purrr::map(.f = ~ coef(summary(.))[-c(1), ]) %>%
    base::do.call(what = cbind.data.frame, args = .) %>%
    tibble::rownames_to_column(var = "Effect")
}
# column names are passed as strings (the question's attempt 2, now working)
lmer_group(l = group_list, x_name = 'wt', y_name = 'mpg')
Result:
[1] "Evaluating: scale(mpg) ~ scale(wt) + (wt | cyl)"
Effect 0 1
1 Estimate -0.3318712 -9.089148e-01
2 Std. Error 0.2104267 1.156500e-01
3 df 0.6084632 1.300000e+01
4 t value -1.5771343 -7.859187e+00
5 Pr(>|t|) 0.4558213 2.714599e-06
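The string-to-formula step used above can be checked on its own, without fitting any model. The following is a minimal sketch; the helper name build_formula and its group_name argument are hypothetical and only serve the illustration.

```r
# Hypothetical helper: builds the same model formula as the answer above
# from column names given as strings.
build_formula <- function(x_name, y_name, group_name = "cyl") {
  fx <- paste0(
    "scale(", y_name, ") ~ scale(", x_name, ") + (",
    x_name, " | ", group_name, ")"
  )
  stats::as.formula(fx)
}

f <- build_formula("wt", "mpg")
class(f)     # "formula"
all.vars(f)  # "mpg" "wt" "cyl"
```

Because the result is an ordinary formula object, the data columns are looked up only when the model function evaluates it against its data argument, which is why passing bare strings to scale() directly failed in the question.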
I had bet this would need an rlang approach with quo(). If you take this solution, the question is essentially a duplicate of "Formula with dynamic number of variables".
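Here is a similar approach whose result is transposed. I think it is more useful to have all the t values in the same column rather than in the same row, because that makes querying and manipulating them easier.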
lmer_group <- function(l, x_name, y_name) {
  # build the model formula as a string from the column names
  fx <- glue::glue("scale({y_name}) ~ scale({x_name}) + ({x_name} | cyl)")
  cat(paste("Evaluating: ", fx, "\n"))
  # name of the fixed-effect term to keep in the tidied output
  filter_name <- glue::glue("scale({x_name})")
  l %>%
    purrr::map(
      .f = ~ lmerTest::lmer(
        as.formula(fx),
        data = (.),
        REML = FALSE
      )
    ) %>%
    # Note: newer broom releases moved the mixed-model tidiers to the broom.mixed package
    purrr::map_dfr(.f = ~ broom::tidy(.), .id = "am") %>%
    dplyr::filter(term == !!filter_name) %>%
    dplyr::select(
      am,
      estimate,
      std.error,
      t = statistic
    )
}
lmer_group(l = group_list, x_name = 'wt', y_name = 'mpg')
The df and p-values do not appear because, I think, the lme4 tidier does not report them. That may be a deal-breaker.
Evaluating: scale(mpg) ~ scale(wt) + (wt | cyl)
am estimate std.error t
1 0 -0.3318712 0.2104267 -1.577134
2 1 -0.9089148 0.1156500 -7.859187
For variety, I used glue instead of paste0().
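If the missing df and p-values matter, one possible workaround is to keep the one-row-per-group layout but take the row straight from coef(summary(.)), as the first answer does, instead of going through a tidier. This is only a sketch under assumptions: the function name lmer_group_long is hypothetical, and it relies on lmerTest adding the df and Pr(>|t|) columns to the coefficient table, as in the output shown earlier.

```r
library(purrr)
library(lmerTest)

# Hypothetical variant: one row per list element, all columns of the
# lmerTest coefficient table retained.
lmer_group_long <- function(l, x_name, y_name) {
  fx <- stats::as.formula(
    paste0("scale(", y_name, ") ~ scale(", x_name, ") + (", x_name, " | cyl)")
  )
  term <- paste0("scale(", x_name, ")")  # row name of the fixed effect to keep
  purrr::map_dfr(
    l,
    function(d) {
      fit <- lmerTest::lmer(fx, data = d, REML = FALSE)
      # select the single fixed-effect row by name, keeping it a data frame
      as.data.frame(coef(summary(fit)))[term, , drop = FALSE]
    },
    .id = "am"  # list names ("0", "1") become the am column
  )
}

res <- lmer_group_long(split(mtcars, mtcars$am), "wt", "mpg")
res
```

The column names stay as lmerTest prints them (for example "Std. Error"), so they need backticks or renaming before further dplyr work.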