Data Manipulation / Imputation for a Timeline Using R
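I am trying to create a timeline using the vistime package in R. The problem I am running into is creating rows for the periods where no data exists, so that the timeline is continuous.
Doing this by hand could get very tedious, so I would like a way to automatically fill in a default label for the periods with no data.
Here is a sample of the data and the current output: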
library(vistime)
syst <- data.frame(Position = rep(c("DOWN"), each = 5),
                   Name = c("SYS2", "SYS2", "SYS4", "SYS4", "SYS6"),
                   start = c("2018-10-16", "2018-12-06", "2018-10-24", "2018-12-05", "2018-11-09"),
                   end = c("2018-11-26", "2018-12-31", "2018-11-23", "2018-12-31", "2018-12-31"),
                   color = rep(c("#FF0000"), each = 5),
                   fontcolor = rep(c("white"), each = 5))
vistime(syst, events = "Position", groups = "Name")
Desired output:
syst2 <- data.frame(Position = rep(c("UP", "DOWN"), 5),
                    Name = rep(c("SYS2", "SYS2", "SYS4", "SYS4", "SYS6"), each = 2),
                    start = c("2018-10-01", "2018-10-16", "2018-11-26", "2018-12-06", "2018-10-01", "2018-10-24", "2018-11-23", "2018-12-05", "2018-10-01", "2018-11-09"),
                    end = c("2018-10-16", "2018-11-26", "2018-12-06", "2018-12-31", "2018-10-24", "2018-11-23", "2018-12-05", "2018-12-31", "2018-11-09", "2018-12-31"),
                    color = rep(c("#008000", "#FF0000"), 5),
                    fontcolor = rep(c("white"), 10))
vistime(syst2, events = "Position", groups = "Name")
We can do it as follows. First, let
rng <- c("2018-10-01", "2018-12-31")
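be the vector holding the start and end dates of the period you are considering. I also added stringsAsFactors = FALSE to the definition of syst, to avoid problems when new dates are added (in R 4.0.0 and later this is already the default).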
Then we have
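library(tidyverse)
# For each Name, append "UP" rows that cover the gaps between the "DOWN" periods
syst2 <- syst %>%
  group_by(Name) %>%
  do({
    bind_rows(., data.frame(Position = "UP", Name = .$Name[1],
                            start = c(rng[1], .$end),  # a gap begins at the range start and at each DOWN end
                            end = c(.$start, rng[2]),  # a gap ends at each DOWN start and at the range end
                            color = "#008000",
                            fontcolor = "white",
                            stringsAsFactors = FALSE))
  }) %>%
  filter(start != end)  # drop zero-length gaps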
vistime(syst2, events = "Position", groups = "Name")
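So we group by Name and, for each group, bind the existing rows to a new data frame in which everything is specified as expected; the only trick is in start and end. Each "UP" period starts either at the beginning of the range or where a "DOWN" period ends, and it ends either where the next "DOWN" period starts or at the end of the range. This relies on the rows within each group being sorted by start and not overlapping. Finally, I filter out the rows whose start and end dates coincide.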
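For anyone who would rather not depend on dplyr, the same idea can be sketched in base R. This is only a minimal sketch under a few assumptions: fill_gaps() is a hypothetical helper name (it is not part of vistime or dplyr), the dates are ISO-formatted strings, and the "DOWN" periods within one Name do not overlap. Unlike the answer above, it sorts each group by start date first and drops gaps with start >= end rather than only start == end.

```r
# Sample data from the question, with stringsAsFactors = FALSE as in the answer
syst <- data.frame(Position = rep("DOWN", 5),
                   Name = c("SYS2", "SYS2", "SYS4", "SYS4", "SYS6"),
                   start = c("2018-10-16", "2018-12-06", "2018-10-24", "2018-12-05", "2018-11-09"),
                   end = c("2018-11-26", "2018-12-31", "2018-11-23", "2018-12-31", "2018-12-31"),
                   color = rep("#FF0000", 5),
                   fontcolor = rep("white", 5),
                   stringsAsFactors = FALSE)
rng <- c("2018-10-01", "2018-12-31")

# fill_gaps() is a hypothetical helper, not a vistime or dplyr function.
# It assumes non-overlapping periods within each Name.
fill_gaps <- function(df, rng, label = "UP", color = "#008000", fontcolor = "white") {
  pieces <- lapply(split(df, df$Name), function(g) {
    g <- g[order(as.Date(g$start)), ]              # sort the group's periods by start date
    up <- data.frame(Position = label,
                     Name = g$Name[1],
                     start = c(rng[1], g$end),     # gaps begin at the range start and at each period end
                     end = c(g$start, rng[2]),     # gaps end at each period start and at the range end
                     color = color,
                     fontcolor = fontcolor,
                     stringsAsFactors = FALSE)
    up <- up[as.Date(up$start) < as.Date(up$end), ]  # keep only gaps with positive length
    out <- rbind(g, up)
    out[order(as.Date(out$start)), ]               # interleave the filler rows chronologically
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

syst_filled <- fill_gaps(syst, rng)
```

The result can then be passed to vistime(syst_filled, events = "Position", groups = "Name") exactly as in the answer. Comparing dates with as.Date() instead of comparing the raw strings is a design choice that avoids relying on locale-dependent string ordering.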