R - 使用 ifelse 语句在不同的列上分配一个数字的份额

R - allocate a share of a number over different columns using an ifelse statement

我有以下数据集:

observation <- c(1:10)
pop.d.rank  <- c(1:10)
cost.1  <- c(101:110)
cost.2  <- c(102:111)
cost.3  <- c(103:112)
all <- data.frame(observation,pop.d.rank,cost.1, cost.2, cost.3) 

我想在三年内分配以下金额:

annual.investment <- 500

第一年我可以使用以下脚本执行此操作:

library(dplyr)

all <- all %>%  
 mutate(capital_allocated.5G = diff(c(0, pmin(cumsum(cost), annual.investment)))) %>%
 mutate(capital_percentage.5G = capital_allocated.5G / cost * 100) %>%
 mutate(year = ifelse(capital_percentage.5G >= 50, "Year.1",0))

但是当我第二年尝试这样做时,考虑到前一年的投入,代码不起作用。这是我尝试在 mutate 循环中放置一个 ifelse 语句,这样它就不会覆盖前一年分配的资金:

all <- all %>%  
 mutate(capital_allocated.5G = ifelse(year == 0, diff(c(0, pmin(cumsum(cost), annual.investment))), 0) %>%
 mutate(capital_percentage.5G = capital_allocated.5G / cost * 100) %>%
 mutate(year = ifelse(capital_percentage.5G >= 50, "Year.2",0))

我希望数据如下所示,其中分配的金额首先分配给上一年尚未 100% 完成的任何行。

capital_allocated.5G <- c(101, 102, 103, 104, 105, 106, 107, 108, 109, 55)
capital_percentage.5G <- c(100, 100, 100, 100, 100, 100, 100, 100, 100, 50)
year <- c("Year.1", "Year.1","Year.1", "Year.1","Year.1", "Year.2", "Year.2","Year.2", "Year.2","Year.2")
example.output <- data.frame(observation,pop.d.rank,cost,   capital_allocated.5G, capital_percentage.5G, year) 

编辑:cost.1 是第 1 年的成本变量,cost.2 是第 2 年的变量,cost.3 是第 3 年的成本变量

编辑:先前接受的答案有问题

我意识到这最终会为 capital_percentage.5G 变量分配超过 100 个。我创建了一个可重现的示例。我认为这与这样一个事实有关,即有些成本会随着时间的推移而降低,而有些成本会随着时间的推移而增加。

这背后的逻辑是,当在一年内进行投资时,5G 移动网络的部署成本是特定的,这就是成本列与该时间点相关的成本。一旦该投资在一年内完成,我希望该功能提供 capital_percentage.5G 100%,然后在未来几年不再分配任何资金。

如何才能使百分比值达到 100 的限制并且以后不会分配更多的资本分配给它?

observation <- c(1:10)
pop.d.rank  <- c(1:10)
cost.1  <- c(101:110)
cost.2  <- c(110:101)
cost.3  <- c(100:91)
all <- data.frame(observation,pop.d.rank,cost.1, cost.2, cost.3) 

capital_allocated.5G <- rep(0,10)   ## initialize to zero
capital_percentage.5G <- rep(0,10)  ## initialize to zero
year <- rep(NA,10)                  ## initialize to NA
all <- data.frame(observation,pop.d.rank,cost.1, cost.2, cost.3,   capital_allocated.5G,capital_percentage.5G,year) 

alloc.invest <- function(df, ann.invest, y) {
  df %>% mutate_(cost=paste0("cost.",y)) %>%
    mutate(capital_percentage.5G = capital_allocated.5G / cost * 100,
           year = ifelse(capital_percentage.5G < 50, NA, year),
           not.yet.alloc = ifelse(capital_percentage.5G < 100,cost-capital_allocated.5G,0),
           capital_allocated.5G = capital_allocated.5G +     ifelse(capital_percentage.5G < 100,diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))), 0),
       capital_percentage.5G = capital_allocated.5G / cost * 100,
       year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
select(-cost,-not.yet.alloc)
}

annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
all <- alloc.invest(all,annual.investment,2)
print(all)
all <- alloc.invest(all,annual.investment,3)
print(all)

第三年,这里最后的投资配置,capital_percentage.5G突然飙升到110%。

针对可能增加或减少的同比成本进行了更新

对于每年可能减少或增加的不同成本,我们根本不需要在更新 not.yet.alloccapital_allocated.5G 时检查 capital_percentage.5G 是否超过 100% :

library(dplyr)
alloc.invest <- function(df, ann.invest, y) {
  df %>% mutate_(cost=paste0("cost.",y)) %>%
    mutate(capital_percentage.5G = capital_allocated.5G / cost * 100,
           year = ifelse(capital_percentage.5G < 50, NA, year),
           not.yet.alloc = cost-capital_allocated.5G,
           capital_allocated.5G = capital_allocated.5G + diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))),
           capital_percentage.5G = capital_allocated.5G / cost * 100,
           year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
    select(-cost,-not.yet.alloc)
}

有了新的成本数据:

observation <- c(1:10)
pop.d.rank  <- c(1:10)
cost.1  <- c(101:110)
cost.2  <- c(110:101)
cost.3  <- c(100:91)

像以前一样使用初始值列进行扩充:

capital_allocated.5G <- rep(0,10)   ## initialize to zero
capital_percentage.5G <- rep(0,10)  ## initialize to zero
year <- rep(NA,10)                  ## initialize to NA
all <- data.frame(observation,pop.d.rank,cost.1, cost.2, cost.3, capital_allocated.5G,capital_percentage.5G,year) 

第 1 年:

annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    110    100                  101             100.00000 Year.1
##2            2          2    102    109     99                  102             100.00000 Year.1
##3            3          3    103    108     98                  103             100.00000 Year.1
##4            4          4    104    107     97                  104             100.00000 Year.1
##5            5          5    105    106     96                   90              85.71429 Year.1
##6            6          6    106    105     95                    0               0.00000   <NA>
##7            7          7    107    104     94                    0               0.00000   <NA>
##8            8          8    108    103     93                    0               0.00000   <NA>
##9            9          9    109    102     92                    0               0.00000   <NA>
##10          10         10    110    101     91                    0               0.00000   <NA>

第 2 年:

all <- alloc.invest(all,annual.investment,2)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    110    100                  110             100.00000 Year.1
##2            2          2    102    109     99                  109             100.00000 Year.1
##3            3          3    103    108     98                  108             100.00000 Year.1
##4            4          4    104    107     97                  107             100.00000 Year.1
##5            5          5    105    106     96                  106             100.00000 Year.1
##6            6          6    106    105     95                  105             100.00000 Year.2
##7            7          7    107    104     94                  104             100.00000 Year.2
##8            8          8    108    103     93                  103             100.00000 Year.2
##9            9          9    109    102     92                  102             100.00000 Year.2
##10          10         10    110    101     91                   46              45.54455   <NA>

第 3 年:

all <- alloc.invest(all,annual.investment,3)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    110    100                  100                   100 Year.1
##2            2          2    102    109     99                   99                   100 Year.1
##3            3          3    103    108     98                   98                   100 Year.1
##4            4          4    104    107     97                   97                   100 Year.1
##5            5          5    105    106     96                   96                   100 Year.1
##6            6          6    106    105     95                   95                   100 Year.2
##7            7          7    107    104     94                   94                   100 Year.2
##8            8          8    108    103     93                   93                   100 Year.2
##9            9          9    109    102     92                   92                   100 Year.2
##10          10         10    110    101     91                   91                   100 Year.3

您的代码的原始问题是 ifelse 只是根据条件而不是 cost 中使用的输入 cost 提供 输出 上的开关ifelseTRUE 分支。因此,cumsum(cost) 计算整个 costcumsum,而不仅仅是 ifelseTRUE 分支部分。为了解决这个问题,我们可以定义以下函数,然后每年依次执行该函数。

library(dplyr)
alloc.invest <- function(df, ann.invest, y) {
  df %>% mutate(not.yet.alloc = ifelse(capital_percentage.5G < 100,cost-capital_allocated.5G,0),
                capital_allocated.5G = capital_allocated.5G + ifelse(capital_percentage.5G < 100,diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))), 0),
                capital_percentage.5G = capital_allocated.5G / cost * 100,
                year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
         select(-not.yet.alloc)
}

注:

  1. 创建一个新的临时not.yet.alloc,我们从中计算年度分配的结果cumsum
  2. 不需要单独的 mutate 语句。
  3. 在设置 year 之前还需要检查 is.na(year)。否则,之前已经标注的year将被覆盖。

要使用此函数,我们必须首先使用 capital_allocated.5Gcapital_percentage.5Gyear:

的一些初始值扩充输入数据
capital_allocated.5G <- rep(0,10)   ## initialize to zero
capital_percentage.5G <- rep(0,10)  ## initialize to zero
year <- rep(NA,10)                  ## initialize to NA
all <- data.frame(observation,pop.d.rank,cost,capital_allocated.5G,capital_percentage.5G,year) 

然后第 1 年:

annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
##   observation pop.d.rank cost capital_allocated.5G capital_percentage.5G   year
##1            1          1  101                  101             100.00000 Year.1
##2            2          2  102                  102             100.00000 Year.1
##3            3          3  103                  103             100.00000 Year.1
##4            4          4  104                  104             100.00000 Year.1
##5            5          5  105                   90              85.71429 Year.1
##6            6          6  106                    0               0.00000   <NA>
##7            7          7  107                    0               0.00000   <NA>
##8            8          8  108                    0               0.00000   <NA>
##9            9          9  109                    0               0.00000   <NA>
##10          10         10  110                    0               0.00000   <NA>

第 2 年:

all <- alloc.invest(all,annual.investment,2)
print(all)
##   observation pop.d.rank cost capital_allocated.5G capital_percentage.5G   year
##1            1          1  101                  101                   100 Year.1
##2            2          2  102                  102                   100 Year.1
##3            3          3  103                  103                   100 Year.1
##4            4          4  104                  104                   100 Year.1
##5            5          5  105                  105                   100 Year.1
##6            6          6  106                  106                   100 Year.2
##7            7          7  107                  107                   100 Year.2
##8            8          8  108                  108                   100 Year.2
##9            9          9  109                  109                   100 Year.2
##10          10         10  110                   55                    50 Year.2  

更新每年更改成本的新要求

如果每年的成本不同,则函数需要首先重新调整 capital_percentage.5G 和可能的 year 列:

library(dplyr)
alloc.invest <- function(df, ann.invest, y) {
  df %>% mutate_(cost=paste0("cost.",y)) %>%
         mutate(capital_percentage.5G = capital_allocated.5G / cost * 100,
                year = ifelse(capital_percentage.5G < 50, NA, year),
                not.yet.alloc = ifelse(capital_percentage.5G < 100,cost-capital_allocated.5G,0),
                capital_allocated.5G = capital_allocated.5G + ifelse(capital_percentage.5G < 100,diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))), 0),
                capital_percentage.5G = capital_allocated.5G / cost * 100,
                year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
         select(-cost,-not.yet.alloc)
}

请注意,使用 mutate_ 创建另一个 临时 cost 只是为了方便,因为需要根据输入动态选择成本列 y(否则,我们需要使用 mutate_ 进行所有计算,这会有些混乱)。

更新后的数据同样增加了 capital_allocated.5Gcapital_percentage.5Gyear 的初始值,第 1 年:

annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    102    103                  101             100.00000 Year.1
##2            2          2    102    103    104                  102             100.00000 Year.1
##3            3          3    103    104    105                  103             100.00000 Year.1
##4            4          4    104    105    106                  104             100.00000 Year.1
##5            5          5    105    106    107                   90              85.71429 Year.1
##6            6          6    106    107    108                    0               0.00000   <NA>
##7            7          7    107    108    109                    0               0.00000   <NA>
##8            8          8    108    109    110                    0               0.00000   <NA>
##9            9          9    109    110    111                    0               0.00000   <NA>
##10          10         10    110    111    112                    0               0.00000   <NA>

第 2 年:请注意,最后一项资产的分配少于 50%,因此其 year 仍为 NA

all <- alloc.invest(all,annual.investment,2)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    102    103                  102             100.00000 Year.1
##2            2          2    102    103    104                  103             100.00000 Year.1
##3            3          3    103    104    105                  104             100.00000 Year.1
##4            4          4    104    105    106                  105             100.00000 Year.1
##5            5          5    105    106    107                  106             100.00000 Year.1
##6            6          6    106    107    108                  107             100.00000 Year.2
##7            7          7    107    108    109                  108             100.00000 Year.2
##8            8          8    108    109    110                  109             100.00000 Year.2
##9            9          9    109    110    111                  110             100.00000 Year.2
##10          10         10    110    111    112                   46              41.44144   <NA>

第 3 年:

all <- alloc.invest(all,annual.investment,3)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    102    103                  103                   100 Year.1
##2            2          2    102    103    104                  104                   100 Year.1
##3            3          3    103    104    105                  105                   100 Year.1
##4            4          4    104    105    106                  106                   100 Year.1
##5            5          5    105    106    107                  107                   100 Year.1
##6            6          6    106    107    108                  108                   100 Year.2
##7            7          7    107    108    109                  109                   100 Year.2
##8            8          8    108    109    110                  110                   100 Year.2
##9            9          9    109    110    111                  111                   100 Year.2
##10          10         10    110    111    112                  112                   100 Year.3