Mixed date datafile

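I have the following data file as input: three date/price pairs (plus a column index number). The problem is that each price series has different national holidays, so the UK and US prices end up misaligned. Is there a good way to push the dates into xts/zoo format and fill in NA where a price does not exist (market closed)?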

ColNumb  Date1       UK2Y    Date2       US2Y    Date3       GBPUSD
1        09/07/2012  0.9330  09/07/2012  0.5210  09/07/2012  1.552554
2        10/07/2012  0.9401  10/07/2012  0.5235  10/07/2012  1.551831
3        11/07/2012  0.9122  11/07/2012  0.5003  11/07/2012  1.550388
4        12/07/2012  0.8732  12/07/2012  0.4805  12/07/2012  1.542972

UK2y <- as.xts(data[1:1033,1:2])
US2y <- as.xts(data[,3:4])
GBPUSD <- data[,5:6]

I have tried using A <- strptime(UK2y$Date1, format = "%d/%m/%Y"), but this results in an invalid zoo object. I end up with correctly formatted dates in 'A', as a POSIX class, but I cannot cbind them in zoo ("error in structure"):

UK2y <- cbind(UK2y, A)

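You can see an additional problem above: each pair of columns has a different length. Some sort of "date match" function would alleviate this, or perhaps there is a solution within zoo/xts?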

Here is a solution using merge:

# subset your data
UK2Y = data[,c("Date1", "UK2Y")]
US2Y = data[,c("Date2", "US2Y")]
GBPUSD = data[,c("Date3", "GBPUSD")]

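# rename them to have the same Date column
names(UK2Y)[names(UK2Y) == "Date1"] <- "Date"
names(US2Y)[names(US2Y) == "Date2"] <- "Date"
names(GBPUSD)[names(GBPUSD) == "Date3"] <- "Date"

# parse the dd/mm/yyyy strings so the rows merge and sort as real dates
UK2Y$Date = as.Date(UK2Y$Date, format = "%d/%m/%Y")
US2Y$Date = as.Date(US2Y$Date, format = "%d/%m/%Y")
GBPUSD$Date = as.Date(GBPUSD$Date, format = "%d/%m/%Y")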

# Test: remove one observation
US2Y = US2Y[-4,] # market closed in US this day

# Merge the data frames
group = merge(UK2Y, US2Y, by = "Date", all = TRUE) # "all = TRUE" keeps every date and shows missing data as NA
group = merge(group, GBPUSD, by = "Date", all = TRUE)

print(group)

        Date   UK2Y   US2Y   GBPUSD
1 2012-07-09 0.9330 0.5210 1.552554
2 2012-07-10 0.9401 0.5235 1.551831
3 2012-07-11 0.9122 0.5003 1.550388
4 2012-07-12 0.8732     NA 1.542972
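
Since the question asks about zoo/xts specifically, the same outer join can also be sketched with zoo's merge method. This is only a rough sketch: it assumes the zoo package is installed, that the shorter column pairs are padded with NA, and it uses a small stand-in for `data`; `to_zoo` is a hypothetical helper, not part of zoo.

```r
library(zoo)

# Small stand-in for the real file; the US series is one day short
data <- data.frame(
  Date1  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  UK2Y   = c(0.9330, 0.9401, 0.9122, 0.8732),
  Date2  = c("09/07/2012", "10/07/2012", "11/07/2012", NA),
  US2Y   = c(0.5210, 0.5235, 0.5003, NA),
  Date3  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  GBPUSD = c(1.552554, 1.551831, 1.550388, 1.542972),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build one zoo series from a date/value column pair,
# dropping padding rows that have no date
to_zoo <- function(dates, values) {
  keep <- !is.na(dates) & dates != ""
  zoo(values[keep], order.by = as.Date(dates[keep], format = "%d/%m/%Y"))
}

uk <- to_zoo(data$Date1, data$UK2Y)
us <- to_zoo(data$Date2, data$US2Y)
fx <- to_zoo(data$Date3, data$GBPUSD)

# all = TRUE keeps the union of the dates; gaps are filled with NA
merged <- merge(UK2Y = uk, US2Y = us, GBPUSD = fx, all = TRUE)
print(merged)
```

If an xts object is preferred, as.xts() from the xts package can be applied to the merged zoo object.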

Edit

You can create an empty data frame containing the correct dates, generated in the order you want, and then merge:

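# make sure Date is of class Date (the file stores dd/mm/yyyy)
UK2Y$Date = as.Date(UK2Y$Date, format = "%d/%m/%Y")
US2Y$Date = as.Date(US2Y$Date, format = "%d/%m/%Y")
GBPUSD$Date = as.Date(GBPUSD$Date, format = "%d/%m/%Y")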

# create empty dataframe with correct dates
dates = data.frame(Date = seq(as.Date("2012-07-01"), as.Date("2012-07-20"), by = '1 day'))

US2Y = US2Y[-4,] # Test: market closed in US this day

group = merge(dates, UK2Y, by = "Date", all = TRUE)
group = merge(group, US2Y, by = "Date", all = TRUE)
group = merge(group, GBPUSD, by = "Date", all = TRUE)

print(group)
         Date   UK2Y   US2Y   GBPUSD
1  2012-07-01     NA     NA       NA
2  2012-07-02     NA     NA       NA
3  2012-07-03     NA     NA       NA
4  2012-07-04     NA     NA       NA
5  2012-07-05     NA     NA       NA
6  2012-07-06     NA     NA       NA
7  2012-07-07     NA     NA       NA
8  2012-07-08     NA     NA       NA
9  2012-07-09 0.9330 0.5210 1.552554
10 2012-07-10 0.9401 0.5235 1.551831
11 2012-07-11 0.9122 0.5003 1.550388
12 2012-07-12 0.8732     NA 1.542972
13 2012-07-13     NA     NA       NA
14 2012-07-14     NA     NA       NA
15 2012-07-15     NA     NA       NA
16 2012-07-16     NA     NA       NA
17 2012-07-17     NA     NA       NA
18 2012-07-18     NA     NA       NA
19 2012-07-19     NA     NA       NA
20 2012-07-20     NA     NA       NA
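
A daily sequence also includes weekends, which will always be NA for market prices. If that is not wanted, one possible variation is to build the calendar from weekdays only. This is just a sketch using the same illustrative date range as above; `all_days` and `is_weekday` are names made up for the example.

```r
# Every calendar day in the illustrative range
all_days <- seq(as.Date("2012-07-01"), as.Date("2012-07-20"), by = "1 day")

# "%u" is the weekday number as a string: "1" = Monday ... "7" = Sunday
is_weekday <- !(format(all_days, "%u") %in% c("6", "7"))

# Calendar of weekdays only, ready to be merged with the price data frames
dates <- data.frame(Date = all_days[is_weekday])
print(dates)
```

Merging the price data frames onto this `dates` frame with all.x = TRUE instead of all = TRUE would keep exactly these rows, so holidays still show up as NA while weekends are left out.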