Mixed date datafile
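I have imported the following data file: three date/price pairs (plus a column of index numbers). The problem is that each price series has different national holidays, so the UK and US prices end up misaligned. Is there a good way to push the dates into xts/zoo format and fill in NA where a price does not exist (market closed)?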
ColNumb Date1 UK2Y Date2 US2Y Date3 GBPUSD
1 09/07/2012 0.9330 09/07/2012 0.5210 09/07/2012 1.552554
2 10/07/2012 0.9401 10/07/2012 0.5235 10/07/2012 1.551831
3 11/07/2012 0.9122 11/07/2012 0.5003 11/07/2012 1.550388
4 12/07/2012 0.8732 12/07/2012 0.4805 12/07/2012 1.542972
etc.
UK2y <- as.xts(data[1:1033,1:2])
US2y <- as.xts(data[,3:4])
GBPUSD <- data[,5:6]
I tried using A <- strptime(UK2y$Date1, format = "%d/%m/%Y"), but this results in an invalid zoo object. I end up with correctly formatted dates in A, as a POSIX class, but I cannot cbind them into the zoo object ("error in structure"):
UK2y <- cbind(UK2y, A)
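You can see an additional problem above: each pair of columns has a different length. Some kind of "date match" function would alleviate this, or perhaps there is a solution in zoo/xts?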
Here is a solution using merge:
# subset your data
UK2Y = data[,c("Date1", "UK2Y")]
US2Y = data[,c("Date2", "US2Y")]
GBPUSD = data[,c("Date3", "GBPUSD")]
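# rename them to have the same Date column
names(UK2Y)[names(UK2Y) == "Date1"] <- "Date"
names(US2Y)[names(US2Y) == "Date2"] <- "Date"
names(GBPUSD)[names(GBPUSD) == "Date3"] <- "Date"
# parse the dd/mm/yyyy strings so the merge sorts chronologically
# (assumes the dates were read in as text, as in the question)
UK2Y$Date = as.Date(UK2Y$Date, format = "%d/%m/%Y")
US2Y$Date = as.Date(US2Y$Date, format = "%d/%m/%Y")
GBPUSD$Date = as.Date(GBPUSD$Date, format = "%d/%m/%Y")
# Test: remove one row
US2Y = US2Y[-4,] # pretend the US market was closed this day
# Merge the data frames
group = merge(UK2Y, US2Y, by = "Date", all = TRUE) # all = TRUE keeps every date and shows missing data as NA
group = merge(group, GBPUSD, by = "Date", all = TRUE)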
print(group)
Date UK2Y US2Y GBPUSD
1 2012-07-09 0.9330 0.5210 1.552554
2 2012-07-10 0.9401 0.5235 1.551831
3 2012-07-11 0.9122 0.5003 1.550388
4 2012-07-12 0.8732 NA 1.542972
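The two-step merge above extends to any number of series with Reduce. The following is only a minimal sketch in base R: it rebuilds the four sample rows from the question, assumes the dates arrive as dd/mm/yyyy text, and uses a hypothetical helper named to_series that is not part of the original answer.

```r
# Sample rows from the question; the real file is assumed to have the same layout.
data <- data.frame(
  Date1  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  UK2Y   = c(0.9330, 0.9401, 0.9122, 0.8732),
  Date2  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  US2Y   = c(0.5210, 0.5235, 0.5003, 0.4805),
  Date3  = c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
  GBPUSD = c(1.552554, 1.551831, 1.550388, 1.542972),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pull one date/price pair out as a two-column data frame
# with a parsed Date column, dropping rows where the date is missing
# (shorter series are assumed to be padded with empty cells in the file).
to_series <- function(df, date_col, value_col) {
  out <- data.frame(
    Date  = as.Date(df[[date_col]], format = "%d/%m/%Y"),
    value = df[[value_col]]
  )
  names(out)[2] <- value_col
  out[!is.na(out$Date), ]
}

series <- list(
  to_series(data, "Date1", "UK2Y"),
  to_series(data, "Date2", "US2Y"),
  to_series(data, "Date3", "GBPUSD")
)
series[[2]] <- series[[2]][-4, ]  # simulate a US holiday on 2012-07-12

# Full outer join of all series on Date; dates missing from a series become NA.
group <- Reduce(function(x, y) merge(x, y, by = "Date", all = TRUE), series)
print(group)
```

Because each pair is parsed and joined on its own Date column, the pairs do not need to have the same number of rows.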
Edit
You can create an empty data frame containing the full range of dates you want, in order, and then merge everything onto it:
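# dates in the file are dd/mm/yyyy, so the format must be given explicitly
UK2Y$Date = as.Date(UK2Y$Date, format = "%d/%m/%Y")
US2Y$Date = as.Date(US2Y$Date, format = "%d/%m/%Y")
GBPUSD$Date = as.Date(GBPUSD$Date, format = "%d/%m/%Y")
# create empty dataframe with correct dates
dates = data.frame(Date = seq(as.Date("2012-07-01"), as.Date("2012-07-20"), by = '1 day'))
US2Y = US2Y[-4,] # test row removal; skip if this row was already removed above
group = merge(dates, UK2Y, by = "Date", all = TRUE)
group = merge(group, US2Y, by = "Date", all = TRUE)
group = merge(group, GBPUSD, by = "Date", all = TRUE)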
print(group)
Date UK2Y US2Y GBPUSD
1 2012-07-01 NA NA NA
2 2012-07-02 NA NA NA
3 2012-07-03 NA NA NA
4 2012-07-04 NA NA NA
5 2012-07-05 NA NA NA
6 2012-07-06 NA NA NA
7 2012-07-07 NA NA NA
8 2012-07-08 NA NA NA
9 2012-07-09 0.9330 0.5210 1.552554
10 2012-07-10 0.9401 0.5235 1.551831
11 2012-07-11 0.9122 0.5003 1.550388
12 2012-07-12 0.8732 NA 1.542972
13 2012-07-13 NA NA NA
14 2012-07-14 NA NA NA
15 2012-07-15 NA NA NA
16 2012-07-16 NA NA NA
17 2012-07-17 NA NA NA
18 2012-07-18 NA NA NA
19 2012-07-19 NA NA NA
20 2012-07-20 NA NA NA
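Since the question asks about xts/zoo specifically, the same alignment can also be sketched with zoo, whose merge method joins series on their index. This assumes the zoo package is installed and reuses the sample values from the question; the missing US row is simulated, as in the answer above.

```r
library(zoo)

# Parse the dd/mm/yyyy strings once; the format must be given explicitly.
dates <- as.Date(c("09/07/2012", "10/07/2012", "11/07/2012", "12/07/2012"),
                 format = "%d/%m/%Y")

# One zoo series per instrument. The dates go in order.by (the index),
# not in a data column, which is what the cbind attempt in the question ran into.
UK2Y   <- zoo(c(0.9330, 0.9401, 0.9122, 0.8732), order.by = dates)
US2Y   <- zoo(c(0.5210, 0.5235, 0.5003), order.by = dates[-4])  # US closed on the 12th (simulated)
GBPUSD <- zoo(c(1.552554, 1.551831, 1.550388, 1.542972), order.by = dates)

# merge on zoo objects takes the union of the indexes and fills the gaps with NA.
prices <- merge(UK2Y, US2Y, GBPUSD, all = TRUE)
print(prices)
```

If an xts object is needed, xts::as.xts(prices) should convert the merged result, assuming the xts package is also installed.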