Multiply part of the data in a data.frame by values in another data.frame
Someone already kindly provided part of the code here:
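library(dplyr)

set.seed(12345)

# Example data: a grouping column `a` and three numeric columns
df1 = data.frame(a = c(rep("a", 8), rep("b", 5), rep("c", 7), rep("d", 10)),
                 b = rnorm(30, 6, 2),
                 c = rnorm(30, 12, 3.5),
                 d = rnorm(30, 8, 3)
)

# One weight per numeric column of df1
df2 = data.frame(b = 1.5,
                 c = 13,
                 d = 0.34
)

# z-score b, c and d within each group of `a`, then sum the z-scores per row
df1_z <- df1 %>%
  group_by(a) %>%
  mutate(across(b:d, list(zscore = ~as.numeric(scale(.))))) %>%
  ungroup() %>%
  mutate(total = rowSums(select(., ends_with('zscore'))))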
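This is exactly what I wanted at the time, but now I would like something slightly different. In df1_z, instead of the current value in the final column called "total", I would like the sum of the values in the _zscore columns, each multiplied by the corresponding value in df2, so: b_zscore x 1.5 + c_zscore x 13 + d_zscore x 0.34.
For example, the first value would be 0.6971403 x 1.5 + 0.100595417 x 13 + 0.01790090 x 0.34 = 2.359537177. Expected results for the new total column: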
total
2.359537177
16.04147765
13.64141872
9.146152274
-3.380574542
-5.55439223
etc...
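As a quick sanity check of the arithmetic in the worked example, the first expected value can be reproduced in base R as a weighted sum. This is only a sketch: the names z_row1 and w are illustrative, and the z-scores are copied from the example rather than recomputed from df1.

```r
# z-scores of the first row, copied from the worked example
z_row1 <- c(b = 0.6971403, c = 0.100595417, d = 0.01790090)

# weights, i.e. the single row of df2
w <- c(b = 1.5, c = 13, d = 0.34)

# element-wise product, then sum
total_row1 <- sum(z_row1 * w)
print(total_row1, digits = 10)
```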
How can I modify the code above to get this result in the new "total" column of df1_z?
You can use the crossprod function:
df1 %>%
  group_by(a) %>%
  mutate(across(b:d, list(zscore = ~as.numeric(scale(.))))) %>%
  ungroup() %>%
  mutate(total = c(crossprod(t(select(., ends_with('zscore'))), t(df2))))
# A tibble: 30 x 8
   a         b     c     d b_zscore c_zscore d_zscore   total
   <chr> <dbl> <dbl> <dbl>    <dbl>    <dbl>    <dbl>   <dbl>
 1 a      7.17 14.8   8.45    0.697   0.101    0.0179   2.36
 2 a      7.42 19.7   3.97    0.841   1.17    -1.14    16.0
 3 a      5.78 19.2   9.66   -0.108   1.05     0.332   13.6
 4 a      5.09 17.7  12.8    -0.508   0.732    1.14     9.15
 5 a      7.21 12.9   6.24    0.721  -0.329   -0.555   -3.38
 6 a      2.36 13.7   2.50   -2.09   -0.146   -1.52    -5.55
 7 a      7.26 10.9  10.7     0.749  -0.774    0.593   -8.74
 8 a      5.45  6.18 12.8    -0.302  -1.80     1.14   -23.5
 9 b      5.43 18.2   9.55   -0.445   1.12     1.34    14.4
10 b      4.16 12.1   4.11   -1.06    0.0776  -1.02    -0.933
# ... with 20 more rows
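Why this works: crossprod(x, y) computes t(x) %*% y, so crossprod(t(Z), t(df2)) is just the matrix product Z %*% w, where Z holds the z-scores (one row per observation) and w is the column of weights. A minimal base-R sketch of that equivalence follows; the three-row table z is hypothetical (only its first row is taken from the question), and the names total_crossprod and total_matmul are illustrative.

```r
# Hypothetical z-score table; only the first row comes from the question
z <- data.frame(b_zscore = c(0.6971403, 0.5, -1.0),
                c_zscore = c(0.100595417, -0.25, 2.0),
                d_zscore = c(0.01790090, 1.0, 0.0))
df2 <- data.frame(b = 1.5, c = 13, d = 0.34)

# Form used in the answer: t() converts each data.frame to a matrix,
# and crossprod(x, y) computes t(x) %*% y
total_crossprod <- c(crossprod(t(z), t(df2)))

# Equivalent plain matrix product: each row of z times the weight vector
total_matmul <- c(as.matrix(z) %*% unlist(df2))

all.equal(total_crossprod, total_matmul)
```

Both forms pair z-score columns with weights by position, so they assume the columns of df2 are in the same order as the _zscore columns.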
Another option:
library(tidyverse)

df1 %>%
  group_by(a) %>%
  mutate(across(b:d, list(zscore = ~as.numeric(scale(.))))) %>%
  ungroup() %>%
  mutate(total = rowSums(map2_dfc(select(., contains('zscore')), df2, `*`)))
Output:
# A tibble: 30 x 8
   a         b     c     d b_zscore c_zscore d_zscore   total
   <fct> <dbl> <dbl> <dbl>    <dbl>    <dbl>    <dbl>   <dbl>
 1 a      7.17 14.8   8.45    0.697   0.101    0.0179   2.36
 2 a      7.42 19.7   3.97    0.841   1.17    -1.14    16.0
 3 a      5.78 19.2   9.66   -0.108   1.05     0.332   13.6
 4 a      5.09 17.7  12.8    -0.508   0.732    1.14     9.15
 5 a      7.21 12.9   6.24    0.721  -0.329   -0.555   -3.38
 6 a      2.36 13.7   2.50   -2.09   -0.146   -1.52    -5.55
 7 a      7.26 10.9  10.7     0.749  -0.774    0.593   -8.74
 8 a      5.45  6.18 12.8    -0.302  -1.80     1.14   -23.5
 9 b      5.43 18.2   9.55   -0.445   1.12     1.34    14.4
10 b      4.16 12.1   4.11   -1.06    0.0776  -1.02    -0.933
# ... with 20 more rows
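Note that map2 pairs the z-score columns with the columns of df2 by position. If the column order might differ between the two tables, one possible variation is to look the weights up by name instead. The sketch below uses base R only and a hypothetical table z whose columns are deliberately out of order; stripping the _zscore suffix to recover the original column names is an assumption about the naming scheme, not part of the answer above.

```r
# Hypothetical z-score table with columns deliberately out of order;
# the first row uses the values from the question
z <- data.frame(d_zscore = c(0.01790090, 1.0),
                b_zscore = c(0.6971403, 0.5),
                c_zscore = c(0.100595417, -0.25))
df2 <- data.frame(b = 1.5, c = 13, d = 0.34)

# Recover the source column names ("d", "b", "c") and pick the
# matching weights from df2 by name rather than by position
w <- unlist(df2)[sub("_zscore$", "", names(z))]

# Multiply each column by its weight, then sum across each row
total <- rowSums(sweep(as.matrix(z), 2, w, `*`))
total
```

With this lookup, a weight missing from df2 shows up as NA in w instead of being silently paired with the wrong column.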