xts: how to keep control of data types after as.xts?

Consider the following data frame:

time <-c('2016-04-13 23:07:45','2016-04-13 23:07:55','2016-04-13 23:08:45','2016-04-13 23:08:45'
         ,'2016-04-13 23:08:45','2016-04-13 23:07:50','2016-04-13 23:07:51')
group <-c('A','A','A','B','B','B','B')
value<- c(5,10,2,2,NA,1,4)
df=data.frame(time,group,value)

> df
                 time group value
1 2016-04-13 23:07:45     A     5
2 2016-04-13 23:07:55     A    10
3 2016-04-13 23:08:45     A     2
4 2016-04-13 23:08:45     B     2
5 2016-04-13 23:08:45     B    NA
6 2016-04-13 23:07:50     B     1
7 2016-04-13 23:07:51     B     4

Note the missing value in row 5. Now I convert to xts, after using lubridate to turn my timestamps into a proper POSIXct type:

> library(lubridate)
> library(xts)
> df$time = ymd_hms(df$time)
> df<-as.xts(df,order.by=df$time)
> df
                    time                  group value
2016-04-13 23:07:45 "2016-04-13 23:07:45" "A"   " 5" 
2016-04-13 23:07:50 "2016-04-13 23:07:50" "B"   " 1" 
2016-04-13 23:07:51 "2016-04-13 23:07:51" "B"   " 4" 
2016-04-13 23:07:55 "2016-04-13 23:07:55" "A"   "10" 
2016-04-13 23:08:45 "2016-04-13 23:08:45" "A"   " 2" 
2016-04-13 23:08:45 "2016-04-13 23:08:45" "B"   " 2" 
2016-04-13 23:08:45 "2016-04-13 23:08:45" "B"   NA   

My nice numeric column value is now character.

How can I avoid this?

Thanks!

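The data underlying an xts object is a matrix, which can be numeric or character, but not both (unlike a data.frame, which is a list in which each column can be any atomic type in R). Because the time and group columns are character, the value column is coerced to character as well. A rough check to see this happening is to convert the original data frame (before as.xts) to a matrix: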

> as.matrix(df)
     time                  group value
[1,] "2016-04-13 23:07:45" "A"   " 5" 
[2,] "2016-04-13 23:07:55" "A"   "10" 
[3,] "2016-04-13 23:08:45" "A"   " 2" 
[4,] "2016-04-13 23:08:45" "B"   " 2" 
[5,] "2016-04-13 23:08:45" "B"   NA   
[6,] "2016-04-13 23:07:50" "B"   " 1" 
[7,] "2016-04-13 23:07:51" "B"   " 4"

This character matrix is also what coredata returns once you create the xts object:

x.df<- xts(df,order.by=df$time)
> coredata(x.df)
     time                  group value
[1,] "2016-04-13 23:07:45" "A"   " 5" 
[2,] "2016-04-13 23:07:50" "B"   " 1" 
[3,] "2016-04-13 23:07:51" "B"   " 4" 
[4,] "2016-04-13 23:07:55" "A"   "10" 
[5,] "2016-04-13 23:08:45" "A"   " 2" 
[6,] "2016-04-13 23:08:45" "B"   " 2" 
[7,] "2016-04-13 23:08:45" "B"   NA   
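
To see the coercion in isolation, here is a minimal base R sketch. The names mixed and numeric_only are made up for illustration and are not part of the question's data:

```r
# A matrix holds a single atomic type, so one character column
# forces every column to character when a data frame is converted.
mixed <- data.frame(label = c("A", "B"),
                    value = c(5, NA),
                    stringsAsFactors = FALSE)

# Keep only the numeric column (still a data frame).
numeric_only <- mixed["value"]

typeof(as.matrix(mixed))         # "character"
typeof(as.matrix(numeric_only))  # "double"
```

The second call stays numeric because no character column is present to trigger the coercion, and the NA is kept as a numeric NA.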

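Drop the time and group columns when you create the xts object to get the numeric data you expect. You can map the group column to integers instead. You also should not include time in the x argument when creating the xts object, because order.by already carries the time information.

For example: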

df$group_idx <- as.numeric(as.factor(df$group))
x.df<- xts(df[, c("group_idx", "value")],order.by=df$time)
> x.df
                    group_idx value
2016-04-13 23:07:45         1     5
2016-04-13 23:07:50         2     1
2016-04-13 23:07:51         2     4
2016-04-13 23:07:55         1    10
2016-04-13 23:08:45         1     2
2016-04-13 23:08:45         2     2
2016-04-13 23:08:45         2    NA
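
If you later need the original group labels back, one option is to keep the factor around as a lookup. This is only a sketch under some assumptions: group_f and labels are hypothetical names, and as.POSIXct with tz = "UTC" stands in for the ymd_hms call used in the question.

```r
library(xts)

time <- as.POSIXct(c("2016-04-13 23:07:45", "2016-04-13 23:07:55",
                     "2016-04-13 23:08:45", "2016-04-13 23:08:45",
                     "2016-04-13 23:08:45", "2016-04-13 23:07:50",
                     "2016-04-13 23:07:51"), tz = "UTC")
group <- c("A", "A", "A", "B", "B", "B", "B")
value <- c(5, 10, 2, 2, NA, 1, 4)
df <- data.frame(time, group, value)

# Keep the factor so its levels can serve as a code -> label lookup.
group_f <- factor(df$group)
df$group_idx <- as.integer(group_f)

# Only numeric columns go into the xts object; time goes to order.by.
x.df <- xts(df[, c("group_idx", "value")], order.by = df$time)

# The underlying matrix is numeric, so value keeps its type.
is.numeric(coredata(x.df))

# Translate the integer codes back to labels, in time order.
labels <- levels(group_f)[coredata(x.df)[, "group_idx"]]
```

The labels vector lines up row by row with x.df, so it can be used for subsetting, for example x.df[labels == "A", ].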