Conditional cumulative sum and grouping in R
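I am trying to create a group variable based on the cumulative sum of another variable. I want to cap the cumulative sum at a limit (15000000): whenever the sum would exceed the limit, the group variable should change. This is the code I am working with: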
myDat = data.frame(Seg = c("A","B","C","D","F","G","H"),
Freq =c(4558848, 10926592, 15783936,8266496,7729349,13234562,9873456))
myDat$csum <- ceiling(ave(myDat$Freq,FUN=cumsum)/15000000)
# Seg Freq csum
# A 4558848 1
# B 10926592 2
# C 15783936 3
# D 8266496 3
# F 7729349 4
# G 13234562 5
# H 9873456 5
myDat1 <- aggregate(Freq~csum, data=myDat, FUN = sum)
# csum Freq
# 1 4558848
# 2 10926592
# 3 24050432
# 4 7729349
# 5 23108018
Some of the groups exceed the limit of 15000000. Can anyone help me with this code?
# Desired Results:-
# Seg Freq csum Desired csum
# A 4558848 1 1
# B 10926592 2 2
# C 15783936 3 3
# D 8266496 3 4
# F 6229349 4 4
# G 13234562 4 5
# H 9873456 5 6
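I was able to find the answer, thanks to this link.
library(dplyr)  # mutate(), %>%
library(purrr)  # accumulate()
# cumsum_15: running sum that restarts at the current Freq when the limit would be exceeded
# group_15: increments wherever the running sum was reset to the row's own Freq
myDat %>% mutate(cumsum_15 = accumulate(Freq, ~ifelse(.x + .y <= 15000000, .x + .y, .y)),
                 group_15 = cumsum(Freq == cumsum_15))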
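The same reset-on-overflow running sum can also be sketched in base R with Reduce(..., accumulate = TRUE). This is only an illustration, not the linked answer's code: the helper name group_by_limit is hypothetical, and it assumes all values are positive. The second call uses 6229349 for segment F, the value shown in the desired-results table, rather than the 7729349 in the data.

```r
# Hypothetical helper: assign group ids so that a group's running total
# restarts whenever adding the next value would exceed `limit`.
# Assumes x is a non-empty numeric vector of positive values.
group_by_limit <- function(x, limit) {
  running <- Reduce(
    function(acc, val) if (acc + val <= limit) acc + val else val,
    x, accumulate = TRUE
  )
  # A new group starts wherever the running total was reset to the value itself.
  cumsum(running == x)
}

freq <- c(4558848, 10926592, 15783936, 8266496, 7729349, 13234562, 9873456)
group_by_limit(freq, 15000000)
# 1 2 3 4 5 6 7

# Same data, but with F = 6229349 as in the desired-results table
freq_desired <- c(4558848, 10926592, 15783936, 8266496, 6229349, 13234562, 9873456)
group_by_limit(freq_desired, 15000000)
# 1 2 3 4 4 5 6
```

With the Freq values from the data, no two neighbouring rows fit under the limit together, so every row ends up in its own group. A single value above the limit (such as C) still gets a group of its own, since it cannot be split.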
I believe you want cumsum(Freq > 1e7).
with(myDat, aggregate(list(Freq=Freq), list(csum=cumsum(Freq > 1e7) + 1), sum))
# csum Freq
# 1 1 4558848
# 2 2 10926592
# 3 3 31779781
# 4 4 23108018
transform(myDat, csum=cumsum(Freq > 1e7) + 1)
# Seg Freq csum
# 1 A 4558848 1
# 2 B 10926592 2
# 3 C 15783936 3
# 4 D 8266496 3
# 5 F 7729349 3
# 6 G 13234562 4
# 7 H 9873456 4
Data:
myDat <- structure(list(Seg = c("A", "B", "C", "D", "F", "G", "H"), Freq = c(4558848,
10926592, 15783936, 8266496, 7729349, 13234562, 9873456)), class = "data.frame", row.names = c(NA,
-7L))
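A quick way to check any candidate grouping against the 15000000 limit is to sum Freq per group and compare the totals with the limit. The sketch below applies that check to the cumsum(Freq > 1e7) grouping; the variable names limit, group_totals and over_limit are only illustrative.

```r
myDat <- data.frame(
  Seg  = c("A", "B", "C", "D", "F", "G", "H"),
  Freq = c(4558848, 10926592, 15783936, 8266496, 7729349, 13234562, 9873456)
)
limit <- 15000000

# Grouping that starts a new group at every row whose Freq exceeds 1e7
csum <- cumsum(myDat$Freq > 1e7) + 1

# Total Freq per group, and whether each total is above the limit
group_totals <- tapply(myDat$Freq, csum, sum)
over_limit <- group_totals > limit

data.frame(csum = names(group_totals),
           Freq = as.vector(group_totals),
           over_limit = as.vector(over_limit))
```

Here groups 3 and 4 come out above the limit (31779781 and 23108018), so this grouping splits on large individual values rather than capping the group totals.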