Data transformation with the package "bestNormalize" on a list with multiple dataframes
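I want to transform my data, but I am not sure which method is best, so I am using the package "bestNormalize".
It works fine on a single column of a data frame, but I have a list of two data frames (each with 9 columns), and I want to apply the function "bestNormalize" to every column. I tried using map, but it does not work.
In addition, I would like to apply other functions from the package (data transformations, e.g. the function "yeojohnson") in the same way, i.e. to every column of every data frame.
Does anyone know how to do this? Thanks in advance.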
install.packages("bestNormalize")
library(bestNormalize)
install.packages("purrr")
library(purrr)
# Data
a <- data.frame(
met1 = rnorm(n = 100, mean = 0, sd = 1),
met2 = rnorm(n = 100, mean = 0, sd = 1),
met3 = rnorm(n = 100, mean = 0, sd = 1),
met4 = rnorm(n = 100, mean = 0, sd = 1),
met5 = rnorm(n = 100, mean = 0, sd = 1),
met6 = rnorm(n = 100, mean = 0, sd = 1),
met7 = rnorm(n = 100, mean = 0, sd = 1),
met8 = rnorm(n = 100, mean = 0, sd = 1),
met9 = rnorm(n = 100, mean = 0, sd = 1)
)
y <- data.frame(
met1 = rnorm(n = 100, mean = 0, sd = 1),
met2 = rnorm(n = 100, mean = 0, sd = 1),
met3 = rnorm(n = 100, mean = 0, sd = 1),
met4 = rnorm(n = 100, mean = 0, sd = 1),
met5 = rnorm(n = 100, mean = 0, sd = 1),
met6 = rnorm(n = 100, mean = 0, sd = 1),
met7 = rnorm(n = 100, mean = 0, sd = 1),
met8 = rnorm(n = 100, mean = 0, sd = 1),
met9 = rnorm(n = 100, mean = 0, sd = 1)
)
my_list <- list(a, y)
# Works:
bestNormalize::bestNormalize(my_list[[1]]$met1)
# Does not work:
stand_dat_men <- my_list %>% purrr::map(~mutate_at(.x, .vars = vars(met1:met9), ~bestNormalize(.)))
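bestNormalize returns an object of class "bestNormalize" rather than a plain numeric vector, so wrap the result in list() and store it in a list column. Also, you can use summarise here instead of mutate.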
library(dplyr)
library(bestNormalize)
output <- purrr::map(my_list, ~.x %>% summarise_at(vars(met1:met9),
~list(bestNormalize(.))))
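To get at the fitted objects afterwards, you can index into the list columns. The following is only a minimal sketch on a smaller, made-up data set (two columns, 60 rows, which are my assumption and not the data from the question); it uses explicit functions instead of the ~ shorthand to keep the nesting readable. predict() on a fitted object with no new data returns the transformed values, and the chosen_transform element holds the transformation that was selected.

```r
library(bestNormalize)
library(dplyr)
library(purrr)

set.seed(42)
# Hypothetical small example data, not the data from the question
my_list <- list(
  data.frame(met1 = rexp(60), met2 = rnorm(60)),
  data.frame(met1 = rexp(60), met2 = rnorm(60))
)

# One row per data frame; each column is a list column holding one fitted object
fits <- map(my_list, function(df) {
  summarise(df, across(met1:met2, function(col) list(bestNormalize(col))))
})

# Pull out the fitted object for met1 of the first data frame
bn <- fits[[1]]$met1[[1]]
class(bn)

# The transformation that bestNormalize selected for this column
bn$chosen_transform

# Transformed values for the original data
met1_transformed <- predict(bn)
head(met1_transformed)
```

Note that bestNormalize uses repeated cross-validation by default, so this can take a while on wide data frames.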
summarise_at has since been superseded by across:
output <- purrr::map(my_list, ~.x %>% summarise(across(met1:met9,
~list(bestNormalize(.)))))
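The same pattern should carry over to the other transformation functions in the package, such as yeojohnson. As a rough sketch, again on made-up data (the column count and row count are assumptions for illustration): lapply() keeps one fitted object per column, while mutate() with predict() replaces each column with its transformed values, which is probably what the mutate_at attempt in the question was aiming for.

```r
library(bestNormalize)
library(dplyr)
library(purrr)

set.seed(1)
# Hypothetical small example data, not the data from the question
my_list <- list(
  data.frame(met1 = rexp(50), met2 = rnorm(50)),
  data.frame(met1 = rexp(50), met2 = rnorm(50))
)

# Option 1: keep the fitted yeojohnson objects, one per column,
# as a named list for each data frame
yj_fits <- map(my_list, function(df) lapply(df, yeojohnson))

# Option 2: replace every column by its transformed values;
# predict() on a fitted object returns a numeric vector, so mutate works here
yj_data <- map(my_list, function(df) {
  mutate(df, across(met1:met2, function(col) predict(yeojohnson(col))))
})

str(yj_data[[1]])
```

Keeping the fitted objects (option 1) is useful if you later need to transform new data or back-transform with predict(fit, newdata = ..., inverse = TRUE).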