How to merge xts objects with slightly different columns?
Given various single-row xts objects:
z1 = xts(t(c("9902"=0,"9903"=0,"9904"=0,"9905"=2,"9906"=2)),as.Date("2015-01-01"))
z2 = xts(t(c("9902"=3,"9903"=4,"9905"=6,"9906"=5,"9908"=8)),as.Date("2015-01-02"))
z3 = xts(t(c("9901"=1,"9903"=3,"9905"=5,"9906"=6,"9907"=7,"9909"=9)),as.Date("2015-01-03"))
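I want to merge them into a single xts object. But cbind(z1,z2,z3) gives:
           X9902 X9903 X9904 X9905 X9906 X9902.1 X9903.1 X9905.1 X9906.1 X9908 X9901 X9903.2 X9905.2 X9906.2 X9907 X9909
2015-01-01     0     0     0     2     2      NA      NA      NA      NA    NA    NA      NA      NA      NA    NA    NA
2015-01-02    NA    NA    NA    NA    NA       3       4       6       5     8    NA      NA      NA      NA    NA    NA
2015-01-03    NA    NA    NA    NA    NA      NA      NA      NA      NA    NA     1       3       5       6     7     9
Whereas what I expect is:
           9901 9902 9903 9904 9905 9906 9907 9908 9909
2015-01-01    0    0    0    0    2    2    0    0    0
2015-01-02    0    3    4    0    6    5    0    8    0
2015-01-03    1    0    3    0    5    6    7    0    9
(I can change the NAs to zeros by passing fill=0, i.e. cbind(z1,z2,z3,fill=0).)
rbind(z1,z2,z3) complains that the rows have different numbers of columns. Still, I believe rbind would be the right approach if the missing columns were first added to each xts object. Is that so?
The real data may have thousands of rows and a few hundred columns (after merging), so I am mainly concerned with efficiency.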
library(xts)
library(plyr)
z1df <- as.data.frame(z1)
z2df <- as.data.frame(z2)
z3df <- as.data.frame(z3)
res <- rbind.fill(z1df, z2df, z3df)
res[is.na(res)] <- 0
res
#  9902 9903 9904 9905 9906 9908 9901 9907 9909
#1    0    0    0    2    2    0    0    0    0
#2    3    4    0    6    5    8    0    0    0
#3    0    3    0    5    6    0    1    7    9
This is similar to the following Stack Overflow post: "combining two data frames of different lengths".
Add the date column:
res$Date <- c("2015-01-01", "2015-01-02", "2015-01-03") # the appropriate values
res$Date <- as.Date(res$Date)
and convert to an xts object:
xts(res[,-10], order.by=res[,10])
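The steps above type the dates by hand and refer to the date column by position (10). As a minimal sketch of the same rbind.fill idea, and assuming the inputs are kept in a list (the names `zs`, `dates` and `out` are illustrative, not from the answer), the index can be taken from the xts objects themselves and the columns sorted by name:

```r
library(xts)
library(plyr)

z1 <- xts(t(c("9902"=0,"9903"=0,"9904"=0,"9905"=2,"9906"=2)), as.Date("2015-01-01"))
z2 <- xts(t(c("9902"=3,"9903"=4,"9905"=6,"9906"=5,"9908"=8)), as.Date("2015-01-02"))
z3 <- xts(t(c("9901"=1,"9903"=3,"9905"=5,"9906"=6,"9907"=7,"9909"=9)), as.Date("2015-01-03"))
zs <- list(z1, z2, z3)  # illustrative name

# Row-bind the data frames; columns missing from an input become NA
res <- rbind.fill(lapply(zs, as.data.frame))
res[is.na(res)] <- 0

# Take the dates from the inputs instead of typing them in
dates <- do.call(c, lapply(zs, index))

# Sort the columns by name (character order) and rebuild the xts object
out <- xts(as.matrix(res[, sort(names(res))]), order.by = dates)
out
```

This avoids hard-coding the column position, so it keeps working when the set of columns changes. The round trip through data frames is convenient rather than fast.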
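As I mentioned in my comment, merge.xts (and merge.zoo) merge by index only, so you can't use merge (or cbind). So it looks like you do need rbind, but (as you said) it requires all objects to have the same number of columns, in the same order.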
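A quick check of this behaviour, as a small sketch with two cut-down objects (`a` and `b` are illustrative names): columns that share a name are kept side by side rather than combined, and the `X9902` / `X9902.1` style names seen in the question are what base R's make.names() produces for non-syntactic, repeated names.

```r
library(xts)

a <- xts(t(c("9902"=0, "9903"=0)), as.Date("2015-01-01"))
b <- xts(t(c("9902"=3, "9908"=8)), as.Date("2015-01-02"))

# merge() aligns rows on the index only; the shared "9902" column is not combined
m <- merge(a, b)
dim(m)  # 2 rows (one per date), 4 columns

# how non-syntactic, repeated names are made syntactic and unique
nm <- make.names(c("9902", "9902"), unique = TRUE)
nm  # "X9902" "X9902.1"
```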
I've created two functions below to help process the objects, so that you can rbind them to create the result you want.
# put all xts objects in a list for easier processing
x <- list(z1, z2, z3)
# function to create template xts object
template <- function(xlist) {
  # find set of unique column names from all objects
  cn <- unique(unlist(lapply(xlist, colnames)))
  # create template xts object
  # using a date that doesn't occur in the actual data
  minIndex <- do.call(min, lapply(xlist, function(x) index(x[1L,])))
  # template object
  xts(matrix(0,1,length(cn)), minIndex-1, dimnames=list(NULL, sort(cn)))
}
# function to apply to each xts object
proc <- function(x, template) {
  # columns we need to add
  neededCols <- !(colnames(template) %in% colnames(x))
  # merge this object with template object, filling w/zeros
  out <- merge(x, template[,neededCols], fill=0)
  # reorder columns (NB: merge.xts always uses make.names)
  # and remove first row (from template)
  out <- out[-1L,make.names(colnames(template))]
  # set column names back to desired values
  # (using attr<- because dimnames<-.xts copies)
  attr(out, "dimnames") <- list(NULL, colnames(template))
  # return object
  out
}
(res <- do.call(rbind, lapply(x, proc, template=template(x))))
#            9901 9902 9903 9904 9905 9906 9907 9908 9909
# 2015-01-01    0    0    0    0    2    2    0    0    0
# 2015-01-02    0    3    4    0    6    5    0    8    0
# 2015-01-03    1    0    3    0    5    6    7    0    9
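Since efficiency was the stated concern, here is an alternative sketch that is not taken from either answer: allocate one matrix for the full set of columns, copy each object's values into it by column name, and build a single xts object at the end. `fill_rbind` is a hypothetical helper name. The sketch assumes every object has column names, that names are unique within each object, and that all indexes are of the same class (Date here).

```r
library(xts)

z1 <- xts(t(c("9902"=0,"9903"=0,"9904"=0,"9905"=2,"9906"=2)), as.Date("2015-01-01"))
z2 <- xts(t(c("9902"=3,"9903"=4,"9905"=6,"9906"=5,"9908"=8)), as.Date("2015-01-02"))
z3 <- xts(t(c("9901"=1,"9903"=3,"9905"=5,"9906"=6,"9907"=7,"9909"=9)), as.Date("2015-01-03"))

# Hypothetical helper (not part of xts): row-bind xts objects with differing columns
fill_rbind <- function(xlist, fill = 0) {
  # union of all column names, in character sort order
  cn <- sort(unique(unlist(lapply(xlist, colnames))))
  # number of rows contributed by each object
  nr <- vapply(xlist, nrow, integer(1))
  # one pre-filled matrix for the whole result
  m <- matrix(fill, sum(nr), length(cn), dimnames = list(NULL, cn))
  last <- cumsum(nr)
  first <- last - nr + 1L
  for (i in seq_along(xlist)) {
    # copy this object's values into its rows, matching columns by name
    m[first[i]:last[i], colnames(xlist[[i]])] <- coredata(xlist[[i]])
  }
  # combine the indexes in the same order as the rows
  xts(m, order.by = do.call(c, lapply(xlist, index)))
}

res <- fill_rbind(list(z1, z2, z3))
res
```

For the three example objects this gives the same 3 x 9 result as shown above. The design choice is to avoid one merge per object and do a single allocation instead; whether that is actually faster on thousands of rows would need to be measured on the real data.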