R - allocate a share of a number over different columns using an ifelse statement

Question

I have the following dataset:

observation <- c(1:10)
pop.d.rank  <- c(1:10)
cost.1  <- c(101:110)
cost.2  <- c(102:111)
cost.3  <- c(103:112)
all <- data.frame(observation,pop.d.rank,cost.1, cost.2, cost.3)

I want to allocate the following annual amount over three years:

annual.investment <- 500

For the first year I can do this with the following script:

library(dplyr)

all <- all %>%  
 mutate(capital_allocated.5G = diff(c(0, pmin(cumsum(cost), annual.investment)))) %>%
 mutate(capital_percentage.5G = capital_allocated.5G / cost * 100) %>%
 mutate(year = ifelse(capital_percentage.5G >= 50, "Year.1",0))

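But when I try to do the same for the second year, taking the previous year's investment into account, the code does not work. Here is my attempt at putting an ifelse statement inside the mutate chain so that it does not overwrite the money allocated in the previous year: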

all <- all %>%  
 mutate(capital_allocated.5G = ifelse(year == 0, diff(c(0, pmin(cumsum(cost), annual.investment))), 0)) %>%
 mutate(capital_percentage.5G = capital_allocated.5G / cost * 100) %>%
 mutate(year = ifelse(capital_percentage.5G >= 50, "Year.2",0))

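I would like the data to look as follows, where the allocated amount goes first to any rows that were not 100% complete in the previous year.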

capital_allocated.5G <- c(101, 102, 103, 104, 105, 106, 107, 108, 109, 55)
capital_percentage.5G <- c(100, 100, 100, 100, 100, 100, 100, 100, 100, 50)
year <- c("Year.1", "Year.1","Year.1", "Year.1","Year.1", "Year.2", "Year.2","Year.2", "Year.2","Year.2")
example.output <- data.frame(observation,pop.d.rank,cost,   capital_allocated.5G, capital_percentage.5G, year)

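Edit: cost.1 is the cost variable for year 1, cost.2 is the cost variable for year 2, and cost.3 is the cost variable for year 3.

Edit: problem with the previously accepted answer

I realised that this ends up assigning more than 100 to the capital_percentage.5G variable. I have created a reproducible example. I think it has to do with the fact that some costs decrease over time while others increase.

The logic behind this is that when an investment is made in a given year, the cost of deploying a 5G mobile network is specific to that year, which is why each cost column holds the cost relevant to that point in time. Once the investment has been completed in a year, I want the function to give a capital_percentage.5G of 100% and then allocate no more money to that row in future years.

How can I cap the percentage at 100 so that no further capital is allocated to it later on?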

observation <- c(1:10)
pop.d.rank  <- c(1:10)
cost.1  <- c(101:110)
cost.2  <- c(110:101)
cost.3  <- c(100:91)
all <- data.frame(observation,pop.d.rank,cost.1, cost.2, cost.3) 

capital_allocated.5G <- rep(0,10)   ## initialize to zero
capital_percentage.5G <- rep(0,10)  ## initialize to zero
year <- rep(NA,10)                  ## initialize to NA
all <- data.frame(observation,pop.d.rank,cost.1, cost.2, cost.3,   capital_allocated.5G,capital_percentage.5G,year) 

alloc.invest <- function(df, ann.invest, y) {
  df %>% mutate_(cost=paste0("cost.",y)) %>%
    mutate(capital_percentage.5G = capital_allocated.5G / cost * 100,
           year = ifelse(capital_percentage.5G < 50, NA, year),
           not.yet.alloc = ifelse(capital_percentage.5G < 100,cost-capital_allocated.5G,0),
           capital_allocated.5G = capital_allocated.5G + ifelse(capital_percentage.5G < 100,diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))), 0),
           capital_percentage.5G = capital_allocated.5G / cost * 100,
           year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
    select(-cost,-not.yet.alloc)
}

annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
all <- alloc.invest(all,annual.investment,2)
print(all)
all <- alloc.invest(all,annual.investment,3)
print(all)

In year 3, the final investment allocation here, capital_percentage.5G suddenly jumps to 110%.

Answer 1

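Updated for year-on-year costs that may increase or decrease

For costs that may decrease or increase from year to year, we simply do not need to check whether capital_percentage.5G is below 100% when updating not.yet.alloc and capital_allocated.5G: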

library(dplyr)
alloc.invest <- function(df, ann.invest, y) {
  df %>% mutate_(cost=paste0("cost.",y)) %>%
    mutate(capital_percentage.5G = capital_allocated.5G / cost * 100,
           year = ifelse(capital_percentage.5G < 50, NA, year),
           not.yet.alloc = cost-capital_allocated.5G,
           capital_allocated.5G = capital_allocated.5G + diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))),
           capital_percentage.5G = capital_allocated.5G / cost * 100,
           year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
    select(-cost,-not.yet.alloc)
}

With the new cost data:

observation <- c(1:10)
pop.d.rank  <- c(1:10)
cost.1  <- c(101:110)
cost.2  <- c(110:101)
cost.3  <- c(100:91)

Augment it with the initial-value columns as before:

capital_allocated.5G <- rep(0,10)   ## initialize to zero
capital_percentage.5G <- rep(0,10)  ## initialize to zero
year <- rep(NA,10)                  ## initialize to NA
all <- data.frame(observation,pop.d.rank,cost.1, cost.2, cost.3, capital_allocated.5G,capital_percentage.5G,year)

Year 1:

annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    110    100                  101             100.00000 Year.1
##2            2          2    102    109     99                  102             100.00000 Year.1
##3            3          3    103    108     98                  103             100.00000 Year.1
##4            4          4    104    107     97                  104             100.00000 Year.1
##5            5          5    105    106     96                   90              85.71429 Year.1
##6            6          6    106    105     95                    0               0.00000   <NA>
##7            7          7    107    104     94                    0               0.00000   <NA>
##8            8          8    108    103     93                    0               0.00000   <NA>
##9            9          9    109    102     92                    0               0.00000   <NA>
##10          10         10    110    101     91                    0               0.00000   <NA>
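The year 1 figures above come from the cap-and-difference step inside alloc.invest. As a rough illustration only, that step can be traced in base R on the cost.1 values; the intermediate names running_total, capped and allocated are introduced here for the sketch and do not appear in the function.

```r
cost.1 <- c(101:110)
annual.investment <- 500

# Running total of what it would cost to fund rows 1..i in full
running_total <- cumsum(cost.1)

# Cap the running total at the annual budget
capped <- pmin(running_total, annual.investment)

# Differences of the capped totals give the amount each row receives
allocated <- diff(c(0, capped))

print(allocated)
```

Rows 1 to 4 are funded in full, row 5 receives the remaining 90, and the rest receive nothing, which matches the capital_allocated.5G column in the year 1 output.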

Year 2:

all <- alloc.invest(all,annual.investment,2)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    110    100                  110             100.00000 Year.1
##2            2          2    102    109     99                  109             100.00000 Year.1
##3            3          3    103    108     98                  108             100.00000 Year.1
##4            4          4    104    107     97                  107             100.00000 Year.1
##5            5          5    105    106     96                  106             100.00000 Year.1
##6            6          6    106    105     95                  105             100.00000 Year.2
##7            7          7    107    104     94                  104             100.00000 Year.2
##8            8          8    108    103     93                  103             100.00000 Year.2
##9            9          9    109    102     92                  102             100.00000 Year.2
##10          10         10    110    101     91                   46              45.54455   <NA>

Year 3:

all <- alloc.invest(all,annual.investment,3)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    110    100                  100                   100 Year.1
##2            2          2    102    109     99                   99                   100 Year.1
##3            3          3    103    108     98                   98                   100 Year.1
##4            4          4    104    107     97                   97                   100 Year.1
##5            5          5    105    106     96                   96                   100 Year.1
##6            6          6    106    105     95                   95                   100 Year.2
##7            7          7    107    104     94                   94                   100 Year.2
##8            8          8    108    103     93                   93                   100 Year.2
##9            9          9    109    102     92                   92                   100 Year.2
##10          10         10    110    101     91                   91                   100 Year.3
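Row 1 drops from 110 to 100 between year 2 and year 3 because not.yet.alloc is negative whenever the new cost is below what has already been allocated, so the surplus is taken back. The sketch below replays only that step in base R, starting from the allocation shown in the year 2 output; the variable change is a name introduced here for illustration.

```r
# Allocation at the end of year 2, copied from the year 2 output above
capital_allocated.5G <- c(110, 109, 108, 107, 106, 105, 104, 103, 102, 46)
cost.3 <- c(100:91)
annual.investment <- 500

# Negative where the year 3 cost is below what was already allocated
not.yet.alloc <- cost.3 - capital_allocated.5G

# Same cap-and-difference step as in alloc.invest
change <- diff(c(0, pmin(cumsum(not.yet.alloc), annual.investment)))

capital_allocated.5G <- capital_allocated.5G + change
capital_percentage.5G <- capital_allocated.5G / cost.3 * 100

print(data.frame(not.yet.alloc, change, capital_allocated.5G, capital_percentage.5G))
```

Rows 1 to 9 each give back 10 and row 10 receives 45, so every row ends at exactly 100%.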

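The original issue with your code is that ifelse only provides a switch on the output based on the condition; it does not restrict the input cost used in the TRUE branch. Therefore, cumsum(cost) computes the cumulative sum over all of cost, not just over the rows selected by the TRUE branch of the ifelse. To fix this, we can define the following function and then execute it for each year in succession.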

library(dplyr)
alloc.invest <- function(df, ann.invest, y) {
  df %>% mutate(not.yet.alloc = ifelse(capital_percentage.5G < 100,cost-capital_allocated.5G,0),
                capital_allocated.5G = capital_allocated.5G + ifelse(capital_percentage.5G < 100,diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))), 0),
                capital_percentage.5G = capital_allocated.5G / cost * 100,
                year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
         select(-not.yet.alloc)
}

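Notes:

Create a new temporary column not.yet.alloc, from which we compute the cumsum for the annual allocation.
There is no need for separate mutate statements.
We also need to check is.na(year) before setting year. Otherwise, a year that has already been labelled would be overwritten.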
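The point about ifelse can be seen on a small vector. This is only a sketch with made-up values; cost, done and budget here are illustrative and are not taken from the question's data. Masking the output leaves the cumulative sum running over every row, so rows that are already funded still consume the budget, whereas masking the input removes them from the running total.

```r
cost   <- c(100, 100, 100, 100)        # illustrative costs
done   <- c(TRUE, TRUE, FALSE, FALSE)  # rows 1-2 assumed already funded
budget <- 150

# Masking the output: cumsum still counts rows 1-2, so the budget is
# used up on them before rows 3-4 are reached
masked_output <- ifelse(!done, diff(c(0, pmin(cumsum(cost), budget))), 0)

# Masking the input: funded rows contribute 0 to the running total
still_needed <- ifelse(!done, cost, 0)
masked_input <- diff(c(0, pmin(cumsum(still_needed), budget)))

print(masked_output)
print(masked_input)
```

With the output masked, rows 3 and 4 receive nothing; with the input masked, they receive 100 and 50.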

To use this function, we must first augment the input data with some initial values for capital_allocated.5G, capital_percentage.5G and year:

cost <- c(101:110)                  ## single cost column, as shown in the output below
capital_allocated.5G <- rep(0,10)   ## initialize to zero
capital_percentage.5G <- rep(0,10)  ## initialize to zero
year <- rep(NA,10)                  ## initialize to NA
all <- data.frame(observation,pop.d.rank,cost,capital_allocated.5G,capital_percentage.5G,year)

Then, for year 1:

annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
##   observation pop.d.rank cost capital_allocated.5G capital_percentage.5G   year
##1            1          1  101                  101             100.00000 Year.1
##2            2          2  102                  102             100.00000 Year.1
##3            3          3  103                  103             100.00000 Year.1
##4            4          4  104                  104             100.00000 Year.1
##5            5          5  105                   90              85.71429 Year.1
##6            6          6  106                    0               0.00000   <NA>
##7            7          7  107                    0               0.00000   <NA>
##8            8          8  108                    0               0.00000   <NA>
##9            9          9  109                    0               0.00000   <NA>
##10          10         10  110                    0               0.00000   <NA>

Year 2:

all <- alloc.invest(all,annual.investment,2)
print(all)
##   observation pop.d.rank cost capital_allocated.5G capital_percentage.5G   year
##1            1          1  101                  101                   100 Year.1
##2            2          2  102                  102                   100 Year.1
##3            3          3  103                  103                   100 Year.1
##4            4          4  104                  104                   100 Year.1
##5            5          5  105                  105                   100 Year.1
##6            6          6  106                  106                   100 Year.2
##7            7          7  107                  107                   100 Year.2
##8            8          8  108                  108                   100 Year.2
##9            9          9  109                  109                   100 Year.2
##10          10         10  110                   55                    50 Year.2

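Update for the new requirement of costs that change each year

If the cost differs from year to year, the function first needs to re-adjust the capital_percentage.5G column and possibly the year column: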

library(dplyr)
alloc.invest <- function(df, ann.invest, y) {
  df %>% mutate_(cost=paste0("cost.",y)) %>%
         mutate(capital_percentage.5G = capital_allocated.5G / cost * 100,
                year = ifelse(capital_percentage.5G < 50, NA, year),
                not.yet.alloc = ifelse(capital_percentage.5G < 100,cost-capital_allocated.5G,0),
                capital_allocated.5G = capital_allocated.5G + ifelse(capital_percentage.5G < 100,diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))), 0),
                capital_percentage.5G = capital_allocated.5G / cost * 100,
                year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
         select(-cost,-not.yet.alloc)
}

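Note that using mutate_ to create another temporary column cost is only for convenience, because the cost column has to be selected dynamically based on the input y (otherwise we would need to use mutate_ for all the computations, which would get somewhat messy).

With the updated data, likewise augmented with initial values for capital_allocated.5G, capital_percentage.5G and year, year 1: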
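mutate_ is deprecated in current dplyr releases. One possible alternative, offered here only as a sketch, is to pick the column with the .data pronoun inside an ordinary mutate. add.cost is a hypothetical helper name introduced for illustration and is not part of the function above.

```r
library(dplyr)

# Hypothetical helper: copy the cost column for year y into a temporary
# column called cost, without using the deprecated mutate_()
add.cost <- function(df, y) {
  df %>% mutate(cost = .data[[paste0("cost.", y)]])
}

df <- data.frame(cost.1 = c(101:103), cost.2 = c(102:104))
print(add.cost(df, 2))
```

The rest of the pipeline could then stay as it is, since it only refers to the temporary cost column.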

annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    102    103                  101             100.00000 Year.1
##2            2          2    102    103    104                  102             100.00000 Year.1
##3            3          3    103    104    105                  103             100.00000 Year.1
##4            4          4    104    105    106                  104             100.00000 Year.1
##5            5          5    105    106    107                   90              85.71429 Year.1
##6            6          6    106    107    108                    0               0.00000   <NA>
##7            7          7    107    108    109                    0               0.00000   <NA>
##8            8          8    108    109    110                    0               0.00000   <NA>
##9            9          9    109    110    111                    0               0.00000   <NA>
##10          10         10    110    111    112                    0               0.00000   <NA>

Year 2: note that the last asset has less than 50% allocated, so its year remains NA.

all <- alloc.invest(all,annual.investment,2)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    102    103                  102             100.00000 Year.1
##2            2          2    102    103    104                  103             100.00000 Year.1
##3            3          3    103    104    105                  104             100.00000 Year.1
##4            4          4    104    105    106                  105             100.00000 Year.1
##5            5          5    105    106    107                  106             100.00000 Year.1
##6            6          6    106    107    108                  107             100.00000 Year.2
##7            7          7    107    108    109                  108             100.00000 Year.2
##8            8          8    108    109    110                  109             100.00000 Year.2
##9            9          9    109    110    111                  110             100.00000 Year.2
##10          10         10    110    111    112                   46              41.44144   <NA>

Year 3:

all <- alloc.invest(all,annual.investment,3)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    102    103                  103                   100 Year.1
##2            2          2    102    103    104                  104                   100 Year.1
##3            3          3    103    104    105                  105                   100 Year.1
##4            4          4    104    105    106                  106                   100 Year.1
##5            5          5    105    106    107                  107                   100 Year.1
##6            6          6    106    107    108                  108                   100 Year.2
##7            7          7    107    108    109                  109                   100 Year.2
##8            8          8    108    109    110                  110                   100 Year.2
##9            9          9    109    110    111                  111                   100 Year.2
##10          10         10    110    111    112                  112                   100 Year.3
