使用另一个数据框在数据框中缩放变量
scale variables in dataframe using another dataframe
我有一个包含以下变量的数据框
dat <- data.frame(cell.ID = 1:10, cell.name = letters[1:10],
groupID = rep(1:2, each = 5),
x1 = rnorm(10), x2 = rnorm(10),
x3 = rnorm(10), x4= rnorm(10),
x5 = rnorm(10), x6 = rnorm(10))
我存储了一个平均值和 sd 以将另一个数据帧中的 x1 标准化为 x6
norm_fin <- data.frame(variable = paste0('x',1:6),
meanVar = rnorm(6),
SdVar = rnorm(6))
我想在将 x1 规范化为 x6 之后从 dat
创建一个新的数据帧。我做了一个循环解决方案
varVec <- paste0('x',1:6)
dat1 <- dat
for(i in varVec){
meanRef <- norm_fin$meanVar[norm_fin$variable == i]
sdRef <- norm_fin$SdVar[norm_fin$variable == i]
dat1[, i] <- (dat[, i] - meanRef)/sdRef
}
有没有不使用循环的其他解决方案?
我们可以把数据转成长格式然后left join norm_find
,计算出值然后取回宽格式的数据。
library(dplyr)
library(tidyr)
dat %>%
pivot_longer(cols = starts_with('x')) %>%
left_join(norm_fin, by = c('name' = 'variable')) %>%
mutate(val = (value - meanVar)/SdVar) %>%
select(-value, -meanVar, -SdVar) %>%
pivot_wider(names_from = name, values_from = val)
# A tibble: 10 x 9
# cell.ID cell.name groupID x1 x2 x3 x4 x5 x6
# <int> <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 a 1 -2.10 32.6 -0.797 0.705 -0.768 0.0217
# 2 2 b 1 -1.36 16.3 0.125 0.353 -1.76 0.144
# 3 3 c 1 2.63 17.0 -0.751 0.933 0.394 0.150
# 4 4 d 1 -0.690 11.6 -0.429 0.925 -6.60 -0.461
# 5 5 e 1 -0.559 -1.01 -0.316 0.898 -4.64 0.229
# 6 6 f 2 2.98 43.2 -1.47 0.833 0.105 -0.525
# 7 7 g 2 0.181 18.9 1.27 0.767 -1.36 0.802
# 8 8 h 2 -3.67 -27.6 0.528 0.467 -1.23 -0.122
# 9 9 i 2 -2.38 22.7 -0.873 0.348 -3.77 0.0778
#10 10 j 2 -1.84 0.557 1.72 0.311 -2.01 0.0379
数据
set.seed(123)
dat <- data.frame(cell.ID = 1:10, cell.name = letters[1:10],
groupID = rep(1:2, each = 5),
x1 = rnorm(10), x2 = rnorm(10),
x3 = rnorm(10), x4= rnorm(10),
x5 = rnorm(10), x6 = rnorm(10))
norm_fin <- data.frame(variable = paste0('x',1:6),
meanVar = rnorm(6),
SdVar = rnorm(6))
我有一个包含以下变量的数据框
dat <- data.frame(cell.ID = 1:10, cell.name = letters[1:10],
groupID = rep(1:2, each = 5),
x1 = rnorm(10), x2 = rnorm(10),
x3 = rnorm(10), x4= rnorm(10),
x5 = rnorm(10), x6 = rnorm(10))
我存储了一个平均值和 sd 以将另一个数据帧中的 x1 标准化为 x6
norm_fin <- data.frame(variable = paste0('x',1:6),
meanVar = rnorm(6),
SdVar = rnorm(6))
我想在将 x1 规范化为 x6 之后从 dat
创建一个新的数据帧。我做了一个循环解决方案
varVec <- paste0('x',1:6)
dat1 <- dat
for(i in varVec){
meanRef <- norm_fin$meanVar[norm_fin$variable == i]
sdRef <- norm_fin$SdVar[norm_fin$variable == i]
dat1[, i] <- (dat[, i] - meanRef)/sdRef
}
有没有不使用循环的其他解决方案?
我们可以把数据转成长格式然后left join norm_find
,计算出值然后取回宽格式的数据。
library(dplyr)
library(tidyr)
dat %>%
pivot_longer(cols = starts_with('x')) %>%
left_join(norm_fin, by = c('name' = 'variable')) %>%
mutate(val = (value - meanVar)/SdVar) %>%
select(-value, -meanVar, -SdVar) %>%
pivot_wider(names_from = name, values_from = val)
# A tibble: 10 x 9
# cell.ID cell.name groupID x1 x2 x3 x4 x5 x6
# <int> <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 a 1 -2.10 32.6 -0.797 0.705 -0.768 0.0217
# 2 2 b 1 -1.36 16.3 0.125 0.353 -1.76 0.144
# 3 3 c 1 2.63 17.0 -0.751 0.933 0.394 0.150
# 4 4 d 1 -0.690 11.6 -0.429 0.925 -6.60 -0.461
# 5 5 e 1 -0.559 -1.01 -0.316 0.898 -4.64 0.229
# 6 6 f 2 2.98 43.2 -1.47 0.833 0.105 -0.525
# 7 7 g 2 0.181 18.9 1.27 0.767 -1.36 0.802
# 8 8 h 2 -3.67 -27.6 0.528 0.467 -1.23 -0.122
# 9 9 i 2 -2.38 22.7 -0.873 0.348 -3.77 0.0778
#10 10 j 2 -1.84 0.557 1.72 0.311 -2.01 0.0379
数据
set.seed(123)
dat <- data.frame(cell.ID = 1:10, cell.name = letters[1:10],
groupID = rep(1:2, each = 5),
x1 = rnorm(10), x2 = rnorm(10),
x3 = rnorm(10), x4= rnorm(10),
x5 = rnorm(10), x6 = rnorm(10))
norm_fin <- data.frame(variable = paste0('x',1:6),
meanVar = rnorm(6),
SdVar = rnorm(6))