R: complete values for missing dates
In R, if I have this data
date.hour temp
2014-01-05 20:00:00 16
2014-01-06 20:00:00 14
2014-01-06 22:00:00 18
and use seq, I can get a sequence of datetimes
begin <- as.POSIXct('2014-1-5')
end <- as.POSIXct('2014-1-7')
seq(begin, end, by=2*3600)
How can I complete the data to something like this?
date.hour temp
2014-01-05 00:00:00 NA
2014-01-05 02:00:00 NA
...
2014-01-05 18:00:00 NA
2014-01-05 20:00:00 16
2014-01-05 22:00:00 NA
...
2014-01-06 20:00:00 14
2014-01-06 22:00:00 18
...
2014-01-07 00:00:00 NA
If this is your sample data frame
dd <- data.frame(
  # written as date-time strings so the values do not depend on the session time zone
  date.hour = as.POSIXct(c("2014-01-05 20:00:00", "2014-01-06 20:00:00", "2014-01-06 22:00:00")),
  temp = c(16L, 14L, 18L)
)
then you can merge() it with your sequence
begin <- as.POSIXct('2014-1-5')
end <- as.POSIXct('2014-1-7')
comp<-seq(begin, end, by=2*3600)
merge(data.frame(date.hour=comp), dd, all.x=TRUE)
Setting all.x=TRUE keeps every row of the sequence, so times with no observation get NA in temp.
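To see what all.x changes, here is a minimal self-contained sketch comparing the merge with and without it. The names obs, grid, left and inner are made up for illustration, and the explicit tz = "UTC" is an assumption added so that both sides of the join use the same time zone regardless of the session settings.

```r
# Observations at three time points (tz = "UTC" is an assumption for reproducibility)
obs <- data.frame(
  date.hour = as.POSIXct(c("2014-01-05 20:00:00",
                           "2014-01-06 20:00:00",
                           "2014-01-06 22:00:00"), tz = "UTC"),
  temp = c(16, 14, 18)
)

# Full two-hourly grid from 2014-01-05 00:00 to 2014-01-07 00:00
grid <- data.frame(
  date.hour = seq(as.POSIXct("2014-01-05", tz = "UTC"),
                  as.POSIXct("2014-01-07", tz = "UTC"),
                  by = "2 hours")
)

left  <- merge(grid, obs, all.x = TRUE)  # keep every grid row, NA where no match
inner <- merge(grid, obs)                # default: keep only rows present in both

nrow(left)              # 25
sum(is.na(left$temp))   # 22
nrow(inner)             # 3
```

Without all.x the unmatched grid rows are dropped, which is why the default merge does not fill the gaps.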
Or, similarly, with data.table
Your data (from @rawr)
df <- read.table(header = TRUE, text = "date.hour temp
'2014-01-05 20:00:00' 16
'2014-01-06 20:00:00' 14
'2014-01-06 22:00:00' 18", colClasses = c('POSIXct','numeric'))
Solution
library(data.table)
dt <- data.table(date.hour = seq(begin, end, by=2*3600))
setkey(setDT(df), date.hour)
df[dt]
# date.hour temp
# 1: 2014-01-05 00:00:00 NA
# 2: 2014-01-05 02:00:00 NA
# 3: 2014-01-05 04:00:00 NA
# 4: 2014-01-05 06:00:00 NA
# 5: 2014-01-05 08:00:00 NA
# 6: 2014-01-05 10:00:00 NA
# 7: 2014-01-05 12:00:00 NA
# 8: 2014-01-05 14:00:00 NA
# 9: 2014-01-05 16:00:00 NA
# 10: 2014-01-05 18:00:00 NA
# 11: 2014-01-05 20:00:00 16
# 12: 2014-01-05 22:00:00 NA
# 13: 2014-01-06 00:00:00 NA
# 14: 2014-01-06 02:00:00 NA
# 15: 2014-01-06 04:00:00 NA
# 16: 2014-01-06 06:00:00 NA
# 17: 2014-01-06 08:00:00 NA
# 18: 2014-01-06 10:00:00 NA
# 19: 2014-01-06 12:00:00 NA
# 20: 2014-01-06 14:00:00 NA
# 21: 2014-01-06 16:00:00 NA
# 22: 2014-01-06 18:00:00 NA
# 23: 2014-01-06 20:00:00 14
# 24: 2014-01-06 22:00:00 18
# 25: 2014-01-07 00:00:00 NA