Count the Number of Changes in a Date Column Based on Another Attribute
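Hi everyone, I've run into a problem during data validation. For each unique value in the name column, I need the number of changes in the date column. For example: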
student.data <- data.frame(
  student_id   = 1:7,
  student_name = c("Rick", "Rick", "Michelle", "Michelle", "Rick", "Michelle", "John"),
  mark         = c(623.3, 515.2, 611.0, 729.0, 843.25, 459.4, 846.65),
  date_of_exam = as.Date(c("2014-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
                           "2014-01-01", "2016-04-14", "2015-05-12")))
I know this is a bit complicated, but the result must be:
>table
>"Rick"
1
>"Michelle"
2
>"John"
0
Thanks in advance for your help.
You can group by student, count the number of distinct dates, and subtract one:
library(dplyr)
student.data %>%
  group_by(student_name) %>%
  summarise(cnt = n_distinct(date_of_exam) - 1)
# A tibble: 3 x 2
#   student_name   cnt
#   <fct>        <dbl>
# 1 John             0
# 2 Michelle         2
# 3 Rick             1
The data.table way:
library(data.table)
setDT(student.data)
student.data[, .(change = uniqueN(date_of_exam) - 1), student_name]
# student_name change
#1: Rick 1
#2: Michelle 2
#3: John 0
Or in base R:
aggregate(date_of_exam~student_name,student.data, function(x) length(unique(x)) - 1)
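One thing that may be worth checking: all three approaches count distinct dates minus one, which matches the expected output in the question. If a "change" instead meant a switch between consecutive rows for the same student, the numbers could differ. Rick's dates, in row order, are 2014-01-01, 2013-09-23, 2014-01-01, which is two switches but only two distinct dates. Below is a minimal base R sketch of both readings, assuming the row order is meaningful. The helper names `distinct_changes` and `consecutive_changes` are hypothetical and do not come from the answers above.

```r
# Rebuild the example data from the question
student.data <- data.frame(
  student_id   = 1:7,
  student_name = c("Rick", "Rick", "Michelle", "Michelle", "Rick", "Michelle", "John"),
  mark         = c(623.3, 515.2, 611.0, 729.0, 843.25, 459.4, 846.65),
  date_of_exam = as.Date(c("2014-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
                           "2014-01-01", "2016-04-14", "2015-05-12"))
)

# Hypothetical helper: number of distinct values minus one
# (the definition used in the answers above)
distinct_changes <- function(x) length(unique(x)) - 1

# Hypothetical helper: number of times a value differs from the
# previous one, following the row order within each student
consecutive_changes <- function(x) sum(x[-1] != x[-length(x)])

# tapply() splits the dates by student and applies the helper to each group
distinct_res    <- tapply(student.data$date_of_exam, student.data$student_name,
                          distinct_changes)
consecutive_res <- tapply(student.data$date_of_exam, student.data$student_name,
                          consecutive_changes)

print(distinct_res)
print(consecutive_res)
```

On this data the two definitions agree for Michelle and John and differ only for Rick, so the expected output in the question corresponds to the distinct-dates reading.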