Compute New Occurrences in R?
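The data I'm working with gives a running total count for each of several groups on a given date. I'd like to add a new column that computes the new count for each group, compared with the previously recorded date.
For example, this is the structure of the data I have: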
library(tidyverse)
library(lubridate)
df <- data.frame(date = as.Date(rep(c("2021-12-24","2021-12-27","2021-12-28","2021-12-31"),each=3),"%Y-%m-%d"),
group = rep(c("A","B","C"), 4),
count = c(5,8,10,7,15,11,10,15,13,10,25,15))
which gives the following:
date group count
2021-12-24 A 5
2021-12-24 B 8
2021-12-24 C 10
2021-12-27 A 7
2021-12-27 B 15
2021-12-27 C 11
2021-12-28 A 10
2021-12-28 B 15
2021-12-28 C 13
2021-12-31 A 10
2021-12-31 B 25
2021-12-31 C 15
I'd like to be able to produce something with the following values:
df2 <- data.frame(date = as.Date(rep(c("2021-12-24","2021-12-27","2021-12-28","2021-12-31"),each=3),"%Y-%m-%d"),
group = rep(c("A","B","C"), 4),
count = c(5,8,10,7,15,11,10,15,13,10,25,15),
new = c(0,0,0,2,7,1,3,0,2,0,10,2))
Or,
date group count new
2021-12-24 A 5 0
2021-12-24 B 8 0
2021-12-24 C 10 0
2021-12-27 A 7 2
2021-12-27 B 15 7
2021-12-27 C 11 1
2021-12-28 A 10 3
2021-12-28 B 15 0
2021-12-28 C 13 2
2021-12-31 A 10 0
2021-12-31 B 25 10
2021-12-31 C 15 2
I can't think of any strategy that works. Thanks very much.
tidyverse
df2 <- data.frame(date = as.Date(rep(c("2021-12-24","2021-12-27","2021-12-28","2021-12-31"),each=3),"%Y-%m-%d"),
group = rep(c("A","B","C"), 4),
count = c(5,8,10,7,15,11,10,15,13,10,25,15),
new = c(0,0,0,2,7,1,3,0,2,0,10,2))
library(tidyverse)
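df2 %>%
  arrange(date) %>%                       # make sure rows are in date order
  group_by(group) %>%                     # difference within each group only
  mutate(new = c(0, diff(count))) %>%     # first date per group gets 0
  ungroup()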
#> # A tibble: 12 x 4
#> date group count new
#> <date> <chr> <dbl> <dbl>
#> 1 2021-12-24 A 5 0
#> 2 2021-12-24 B 8 0
#> 3 2021-12-24 C 10 0
#> 4 2021-12-27 A 7 2
#> 5 2021-12-27 B 15 7
#> 6 2021-12-27 C 11 1
#> 7 2021-12-28 A 10 3
#> 8 2021-12-28 B 15 0
#> 9 2021-12-28 C 13 2
#> 10 2021-12-31 A 10 0
#> 11 2021-12-31 B 25 10
#> 12 2021-12-31 C 15 2
Created on 2022-01-17 by the reprex package (v2.0.1)
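As a variation on the tidyverse answer, here is a minimal sketch that starts from the question's `df` (without the `new` column already filled in) and uses `dplyr::lag()` instead of `diff()`. The name `res` is just a placeholder for illustration, and the sketch assumes each group has at most one row per date.

```r
library(dplyr)

# The question's input data, without the target column
df <- data.frame(
  date  = as.Date(rep(c("2021-12-24", "2021-12-27", "2021-12-28", "2021-12-31"), each = 3)),
  group = rep(c("A", "B", "C"), 4),
  count = c(5, 8, 10, 7, 15, 11, 10, 15, 13, 10, 25, 15)
)

res <- df %>%
  arrange(date) %>%
  group_by(group) %>%
  # lag() returns the previous row's count within the group; using the group's
  # first count as the default makes the first difference 0
  mutate(new = count - lag(count, default = first(count))) %>%
  ungroup()

res
```

The `default = first(count)` argument plays the same role as the leading `0` in `c(0, diff(count))`.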
data.table
library(data.table)
setDT(df2)[order(date), new := c(0, diff(count)), by = group][]
#> date group count new
#> 1: 2021-12-24 A 5 0
#> 2: 2021-12-24 B 8 0
#> 3: 2021-12-24 C 10 0
#> 4: 2021-12-27 A 7 2
#> 5: 2021-12-27 B 15 7
#> 6: 2021-12-27 C 11 1
#> 7: 2021-12-28 A 10 3
#> 8: 2021-12-28 B 15 0
#> 9: 2021-12-28 C 13 2
#> 10: 2021-12-31 A 10 0
#> 11: 2021-12-31 B 25 10
#> 12: 2021-12-31 C 15 2
Created on 2022-01-17 by the reprex package (v2.0.1)
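The same idea can probably also be written with `data.table::shift()`. This is only a sketch under the same assumption (one row per group and date); `dt` is a placeholder name, and the data are rebuilt from the question so the block runs on its own.

```r
library(data.table)

dt <- data.table(
  date  = as.Date(rep(c("2021-12-24", "2021-12-27", "2021-12-28", "2021-12-31"), each = 3)),
  group = rep(c("A", "B", "C"), 4),
  count = c(5, 8, 10, 7, 15, 11, 10, 15, 13, 10, 25, 15)
)

# Sort by group and date so that shift() really sees the previous date
setorder(dt, group, date)

# shift() lags count by one row within each group; filling with the group's
# first count makes the first difference 0
dt[, new := count - shift(count, fill = count[1L]), by = group]

# Restore the date-major order used in the question
setorder(dt, date, group)
dt[]
```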
base R
df2 <- df2[order(df2$date), ]
df2$new <- with(df2, ave(x = count, list(group), FUN = function(x) c(0, diff(x))))
df2
#> date group count new
#> 1 2021-12-24 A 5 0
#> 2 2021-12-24 B 8 0
#> 3 2021-12-24 C 10 0
#> 4 2021-12-27 A 7 2
#> 5 2021-12-27 B 15 7
#> 6 2021-12-27 C 11 1
#> 7 2021-12-28 A 10 3
#> 8 2021-12-28 B 15 0
#> 9 2021-12-28 C 13 2
#> 10 2021-12-31 A 10 0
#> 11 2021-12-31 B 25 10
#> 12 2021-12-31 C 15 2
Created on 2022-01-17 by the reprex package (v2.0.1)
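One way to sanity-check any of these approaches is to confirm that each group's starting count plus the running sum of `new` gives back the original `count`. A minimal base R sketch, where `start` and `rebuilt` are names made up for illustration:

```r
df <- data.frame(
  date  = as.Date(rep(c("2021-12-24", "2021-12-27", "2021-12-28", "2021-12-31"), each = 3)),
  group = rep(c("A", "B", "C"), 4),
  count = c(5, 8, 10, 7, 15, 11, 10, 15, 13, 10, 25, 15)
)

# Order by date, then take per-group differences as in the base R answer
df <- df[order(df$date), ]
df$new <- ave(df$count, df$group, FUN = function(x) c(0, diff(x)))

# Rebuild the totals: first count of each group plus cumulative new counts
start   <- ave(df$count, df$group, FUN = function(x) x[1])
rebuilt <- start + ave(df$new, df$group, FUN = cumsum)

stopifnot(isTRUE(all.equal(rebuilt, df$count)))
```

If the rows were not sorted by date within each group, this check would still pass, so it only guards against mistakes in the differencing step and not against a wrong sort order.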