Conditional cumulative sum and grouping in R

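I am trying to create a group variable based on the cumulative sum of another variable. I want to constrain the cumulative sum: whenever it would exceed the limit (15000000), the group variable should change. This is the code I am working with: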

myDat = data.frame(Seg = c("A","B","C","D","F","G","H"),
                       Freq =c(4558848, 10926592, 15783936,8266496,7729349,13234562,9873456))

myDat$csum <- ceiling(ave(myDat$Freq,FUN=cumsum)/15000000)

# Seg     Freq csum
# A  4558848    1
# B 10926592    2
# C 15783936    3
# D  8266496    3
# F  7729349    4
# G 13234562    5
# H  9873456    5

myDat1 <- aggregate(Freq~csum, data=myDat, FUN = sum)

# csum     Freq
# 1  4558848
# 2 10926592
# 3 24050432
# 4  7729349
# 5 23108018

Some of the groups exceed the 15000000 limit. Can anyone help me with this code?
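The over-limit groups can be confirmed with a quick check. The sketch below is only illustrative: `limit`, `totals`, and `over` are names introduced here and are not part of the original code. It also shows why `ceiling(cumsum / limit)` falls short: it buckets the global running total into fixed windows of 15000000 and never restarts the sum at a group boundary.

```r
# Data as posted in the question
myDat <- data.frame(Seg  = c("A", "B", "C", "D", "F", "G", "H"),
                    Freq = c(4558848, 10926592, 15783936, 8266496,
                             7729349, 13234562, 9873456))

limit <- 15000000  # illustrative name for the threshold

# Same grouping as the ceiling() approach above
csum <- ceiling(cumsum(myDat$Freq) / limit)

# Per-group totals, then the labels of groups whose total exceeds the limit
totals <- tapply(myDat$Freq, csum, sum)
over   <- names(totals)[as.vector(totals > limit)]
over
```

For this data the check flags groups 3 and 5, the same two groups that stand out in the aggregate output above.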

# Desired Results:-

# Seg     Freq csum  Desired csum
# A  4558848    1    1  
# B 10926592    2    2
# C 15783936    3    3
# D  8266496    3    4
# F  6229349    4    4
# G 13234562    4    5
# H  9873456    5    6

I was able to find an answer, thanks to this link.

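library(dplyr)
library(purrr)

# cumsum_15: running total that restarts at the current value when the limit would be exceeded
# group_15:  increments on every row where the running total restarted
myDat %>% mutate(cumsum_15 = accumulate(Freq, ~ifelse(.x + .y <= 15000000, .x + .y, .y)),
                 group_15 = cumsum(Freq == cumsum_15))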
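The same idea can be sketched in base R with `Reduce(..., accumulate = TRUE)`, for readers who prefer not to load dplyr and purrr. The helper name `group_by_limit` is hypothetical and introduced only for illustration. The sketch assumes strictly positive values, because a new group is detected where the running total equals the row's own value. The `freq` vector uses the frequencies from the "Desired Results" table (6229349 for segment F), which is an assumption about the intended input.

```r
# Hypothetical helper: start a new group whenever adding the next value
# would push the running total past `limit`.
group_by_limit <- function(x, limit) {
  # Running total that restarts at the current value on overflow
  running <- Reduce(function(acc, val) if (acc + val <= limit) acc + val else val,
                    x, accumulate = TRUE)
  # A row opens a new group exactly when the running total equals its own value
  # (valid only for strictly positive x)
  cumsum(running == x)
}

# Frequencies as listed in the "Desired Results" table (assumed input)
freq <- c(4558848, 10926592, 15783936, 8266496, 6229349, 13234562, 9873456)
groups <- group_by_limit(freq, 15000000)
groups
# 1 2 3 4 4 5 6
```

With the frequencies from the `data.frame` as posted (7729349 for segment F), D + F is 15995845, which exceeds the limit, so every row ends up in its own group. A single value above the limit, such as segment C, still forms a group of its own that exceeds the limit.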

I believe you want cumsum(Freq > 1e7):

with(myDat, aggregate(list(Freq=Freq), list(csum=cumsum(Freq > 1e7) + 1), sum))
#   csum     Freq
# 1    1  4558848
# 2    2 10926592
# 3    3 31779781
# 4    4 23108018

transform(myDat, csum=cumsum(Freq > 1e7) + 1)
#   Seg     Freq csum
# 1   A  4558848    1
# 2   B 10926592    2
# 3   C 15783936    3
# 4   D  8266496    3
# 5   F  7729349    3
# 6   G 13234562    4
# 7   H  9873456    4

Data:

myDat <- structure(list(Seg = c("A", "B", "C", "D", "F", "G", "H"), Freq = c(4558848, 
10926592, 15783936, 8266496, 7729349, 13234562, 9873456)), class = "data.frame", row.names = c(NA, 
-7L))