Data transformation with the package "bestNormalize" on a list with multiple dataframes

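I want to transform my data, but I am not sure which method is best, so I am using the package "bestNormalize".

It works fine on a single column of a data frame, but I have a list of two data frames (each with 9 columns), and I want to apply the function "bestNormalize" to every column. I tried using map, but it does not work.

In addition, I would like to apply other functions from the package (data transformations, e.g. the function "yeojohnson") in the same way, to every column of every data frame.

Does anyone know how to do this? Thanks in advance.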


# install.packages(c("bestNormalize", "purrr", "dplyr"))  # run once if needed
library(bestNormalize)
library(purrr)
library(dplyr)  # provides mutate_at() and vars() used below

# Data
a <- data.frame(
  met1 = rnorm(n = 100, mean = 0, sd = 1),
  met2 = rnorm(n = 100, mean = 0, sd = 1),
  met3 = rnorm(n = 100, mean = 0, sd = 1),
  met4 = rnorm(n = 100, mean = 0, sd = 1),
  met5 = rnorm(n = 100, mean = 0, sd = 1),
  met6 = rnorm(n = 100, mean = 0, sd = 1),
  met7 = rnorm(n = 100, mean = 0, sd = 1),
  met8 = rnorm(n = 100, mean = 0, sd = 1),
  met9 = rnorm(n = 100, mean = 0, sd = 1)
)


y <- data.frame(
  met1 = rnorm(n = 100, mean = 0, sd = 1),
  met2 = rnorm(n = 100, mean = 0, sd = 1),
  met3 = rnorm(n = 100, mean = 0, sd = 1),
  met4 = rnorm(n = 100, mean = 0, sd = 1),
  met5 = rnorm(n = 100, mean = 0, sd = 1),
  met6 = rnorm(n = 100, mean = 0, sd = 1),
  met7 = rnorm(n = 100, mean = 0, sd = 1),
  met8 = rnorm(n = 100, mean = 0, sd = 1),
  met9 = rnorm(n = 100, mean = 0, sd = 1)
)


my_list <- list(a, y)
 
# Works:
bestNormalize::bestNormalize(my_list[[1]]$met1)

# Does not work:
stand_dat_men <- my_list  %>% purrr::map(~mutate_at(.x, .vars = vars(met1:met9), ~bestNormalize(.)))





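bestNormalize returns an object of class "bestNormalize", not a plain numeric vector, so you can store it in a list. Also, you can use summarise here instead of mutate, since each column yields a single object.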

library(dplyr) 
library(bestNormalize)

output <- purrr::map(my_list, ~.x %>% summarise_at(vars(met1:met9), 
                               ~list(bestNormalize(.))))

summarise_at has since been superseded by across:

output <- purrr::map(my_list, ~.x %>% summarise(across(met1:met9, 
                               ~list(bestNormalize(.)))))
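
If what you want in the end is the transformed values rather than the fitted objects, one possible approach is to pull the `x.t` element (the transformed vector) out of each fit inside mutate. The following is only a sketch under some assumptions: `make_df` is a hypothetical helper that rebuilds example data like the one in the question, and `yeojohnson` is used with its defaults, which also standardize the result. Replacing `yeojohnson(col)` with `bestNormalize(col)` should work the same way, only slower.

```r
library(dplyr)
library(purrr)
library(bestNormalize)

set.seed(1)  # only so the sketch is reproducible

# Hypothetical helper: 100 rows, columns met1..met9, as in the question
make_df <- function() {
  setNames(as.data.frame(replicate(9, rnorm(100))), paste0("met", 1:9))
}
my_list <- list(make_df(), make_df())

# Replace every column by its transformed values.
# $x.t is a numeric vector of the same length as the input,
# so mutate() receives one value per row.
transformed <- map(my_list, function(df) {
  mutate(df, across(met1:met9, function(col) yeojohnson(col)$x.t))
})

# Keep the fitted objects as well (one per column, per data frame),
# e.g. to inspect the estimated lambda later.
fits <- map(my_list, function(df) map(df, yeojohnson))
fits[[1]]$met1$lambda
```

The original mutate_at call fails because each call returns a whole object rather than a vector of 100 values; extracting `x.t` (or wrapping the object in a list, as above) avoids that.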