Function dependent on the value in the rows in R

Question

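I want the function to depend on each product's SD value in the data frame. I don't want to pass a constant sd to the function, because each product has a different SD.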

df <- data.frame(Product = c("A", "B"), 
                 Oct = c(33, 23),
                 Nov = c(23, 26),
                 SD = c(1, 5))

rand_vect_cont <- function(N, M, sd) {
  vec <- rnorm(N, M/N, sd)
  vec / sum(vec) * M
}

sapply(unlist(df[2:3]), rand_vect_cont, N = 4, df$SD)

I want to create a new table in which each column sums to the corresponding element of the data frame. Here is an example with a constant sd:

> sapply(unlist(df[2:3]), rand_vect_cont, N = 4, sd = 1)
         Oct1     Oct2     Nov1     Nov2
[1,] 7.492679 5.285553 4.716102 6.177153
[2,] 8.499570 6.008897 6.339937 6.638079
[3,] 9.301981 6.617405 5.262105 6.235205
[4,] 7.705770 5.088145 6.681856 6.949563

Answer 1

We can use Map to pair each row of monthly totals with that product's SD:

do.call(cbind, Map(function(x, y) sapply(x, rand_vect_cont, N = 4, sd = y),
          asplit(as.matrix(df[2:3]), 1), df$SD))
#          Oct      Nov       Oct       Nov
#[1,] 7.551047 5.053925  2.449044  3.174316
#[2,] 8.353440 5.853014  6.238516  6.992176
#[3,] 7.343861 4.847592  1.470566  2.188509
#[4,] 9.751653 7.245469 12.841873 13.644999
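
As a quick check on the Map approach, the sketch below rebuilds the same matrix step by step and confirms that each column sums to its target total. The `set.seed(1)` call, the intermediate name `rows`, and the `Product_month` column labels are illustrative assumptions, not part of the original answer.

```r
df <- data.frame(Product = c("A", "B"),
                 Oct = c(33, 23),
                 Nov = c(23, 26),
                 SD = c(1, 5))

rand_vect_cont <- function(N, M, sd) {
  vec <- rnorm(N, M / N, sd)
  vec / sum(vec) * M
}

set.seed(1)  # assumption: seeded here only so the sketch is repeatable

# One list element per product row, each holding that row's monthly totals
rows <- asplit(as.matrix(df[2:3]), 1)

# Pair row i with SD i; sapply then builds one column per month
res <- do.call(cbind, Map(function(x, y) sapply(x, rand_vect_cont, N = 4, sd = y),
                          rows, df$SD))

# Illustrative labels so the repeated month names can be told apart
colnames(res) <- paste(rep(df$Product, each = 2), colnames(res), sep = "_")

# Each column should sum to its target: 33 and 23 for A, 23 and 26 for B
colSums(res)
```

Labelling the columns is optional, but without it the result carries two columns named Oct and two named Nov.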

Or with the tidyverse:

library(dplyr)
library(tidyr)
df %>%
   pivot_longer(cols = -c(Product, SD), names_to = "month") %>% 
   group_by(Product, month, SD) %>% 
   summarise(value = list(rand_vect_cont(4, value, SD)))  %>% 
   unnest(c(value))
# A tibble: 16 x 4
# Groups:   Product, month [4]
#   Product month    SD value
#   <fct>   <chr> <dbl> <dbl>
# 1 A       Nov       1  5.05
# 2 A       Nov       1  5.85
# 3 A       Nov       1  4.85
# 4 A       Nov       1  7.25
# 5 A       Oct       1  7.55
# 6 A       Oct       1  8.35
# 7 A       Oct       1  7.34
# 8 A       Oct       1  9.75
# 9 B       Nov       5  3.17
#10 B       Nov       5  6.99
#11 B       Nov       5  2.19
#12 B       Nov       5 13.6 
#13 B       Oct       5  2.45
#14 B       Oct       5  6.24
#15 B       Oct       5  1.47
#16 B       Oct       5 12.8

Edit: used the same seed as shown in @Sathish's post.

Answer 2

Put set.seed() inside your function to get reproducible results.

df <- data.frame(Product = c("A", "B"), 
                 Oct = c(33, 23),
                 Nov = c(23, 26),
                 SD = c(1, 5))
rand_vect_cont <- function(N, M, sd) {
  set.seed(1)  # fixed seed: every call starts from the same random draws
  vec <- rnorm(N, M/N, sd)
  vec / sum(vec) * M
}
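
Seeding inside the function restarts the generator on every call, so two calls with the same arguments return identical vectors. If that is not wanted, one possible variant is sketched below. The function name `rand_vect_cont_seeded` and its `seed` argument are hypothetical and do not appear in the original answer.

```r
# Hypothetical variant: reseed only when a seed is explicitly supplied
rand_vect_cont_seeded <- function(N, M, sd, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  vec <- rnorm(N, M / N, sd)
  vec / sum(vec) * M
}

# Same seed and same arguments: the two results are identical
a <- rand_vect_cont_seeded(4, 33, 1, seed = 1)
b <- rand_vect_cont_seeded(4, 33, 1, seed = 1)
identical(a, b)

# Seed once outside the function: successive calls draw fresh values,
# yet the whole script is still reproducible from the single set.seed()
set.seed(1)
x <- rand_vect_cont_seeded(4, 23, 1)
y <- rand_vect_cont_seeded(4, 23, 1)
identical(x, y)
```

With the seed set once at the top of a script, the run as a whole stays reproducible while columns that share a total no longer come out as exact copies of each other.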

data.table solution

library(data.table)
setDT(df)
df <- melt(df, id.vars = c("Product", "SD"), variable.name = "month")
df[, rand_vect_cont(4, value, SD), by = .(Product, SD, month)]

#     Product SD month        V1
# 1:        A  1   Oct  7.551047
# 2:        A  1   Oct  8.353440
# 3:        A  1   Oct  7.343861
# 4:        A  1   Oct  9.751653
# 5:        B  5   Oct  2.449044
# 6:        B  5   Oct  6.238516
# 7:        B  5   Oct  1.470566
# 8:        B  5   Oct 12.841873
# 9:        A  1   Nov  5.053925
# 10:       A  1   Nov  5.853014
# 11:       A  1   Nov  4.847592
# 12:       A  1   Nov  7.245469
# 13:       B  5   Nov  3.174316
# 14:       B  5   Nov  6.992176
# 15:       B  5   Nov  2.188509
# 16:       B  5   Nov 13.644999

Comparison with your code (base R):

df <- data.frame(Product = c("A", "B"), 
                     Oct = c(33, 23),
                     Nov = c(23, 26),
                     SD = c(1, 5))

sapply(unlist(df[2:3]), rand_vect_cont, N = 4, sd = 1)
#          Oct1     Oct2     Nov1     Nov2
# [1,] 7.551047 5.053925 5.053925 5.802832
# [2,] 8.353440 5.853014 5.853014 6.603176
# [3,] 7.343861 4.847592 4.847592 5.596175
# [4,] 9.751653 7.245469 7.245469 7.997818

sapply(unlist(df[2:3]), rand_vect_cont, N = 4, sd = 5)
#           Oct1      Oct2      Nov1      Nov2
# [1,]  4.883302  2.449044  2.449044  3.174316
# [2,]  8.748246  6.238516  6.238516  6.992176
# [3,]  3.885336  1.470566  1.470566  2.188509
# [4,] 15.483117 12.841873 12.841873 13.644999
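
Each of the two runs above applies a single sd to all four totals. A base R sketch that instead pairs every total with its own product's SD in one call could use mapply. It assumes the column order produced by `unlist(df[2:3])` (Oct for A, Oct for B, Nov for A, Nov for B). The names `totals`, `sds`, and `res`, and the `set.seed(1)` call, are illustrative.

```r
df <- data.frame(Product = c("A", "B"),
                 Oct = c(33, 23),
                 Nov = c(23, 26),
                 SD = c(1, 5))

rand_vect_cont <- function(N, M, sd) {
  vec <- rnorm(N, M / N, sd)
  vec / sum(vec) * M
}

set.seed(1)  # assumption: seeded here only so the sketch is repeatable

totals <- unlist(df[2:3])                 # named Oct1, Oct2, Nov1, Nov2
sds    <- rep(df$SD, times = length(2:3)) # 1, 5, 1, 5: one SD per total

# mapply walks totals and sds in parallel; N is fixed through MoreArgs
res <- mapply(rand_vect_cont, M = totals, sd = sds, MoreArgs = list(N = 4))

colSums(res)  # should return the totals 33, 23, 23, 26
```

The result keeps the Oct1, Oct2, Nov1, Nov2 column names from the question, where the numeric suffix is the product's row number.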

Tags: r, function, dataframe, standard-deviation, sapply