R xts quantmod access to data at a specific time
I'm learning R (including the xts and quantmod packages). Here is my dataset:
str(h2)
‘zoo’ series from 2016-06-15 11:00:00 to 2016-09-15 14:00:00
Data: num [1:928, 1:5] 67842 67486 67603 67465 67457 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:5] "X.OPEN." "X.HIGH." "X.LOW." "X.CLOSE." ...
Index: POSIXct[1:928], format: "2016-06-15 11:00:00" "2016-06-15 12:00:00" "2016-06-15 13:00:00" ...
first(h2, '1 day')
X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
2016-06-15 11:00:00 67842 67842 67122 67488 262740
2016-06-15 12:00:00 67486 67610 67420 67603 288875
2016-06-15 13:00:00 67603 67608 67381 67466 323498
2016-06-15 14:00:00 67465 67484 67356 67455 168991
2016-06-15 15:00:00 67457 67460 67289 67361 174965
2016-06-15 16:00:00 67363 67381 67202 67317 195579
2016-06-15 17:00:00 67320 67465 67288 67397 230255
2016-06-15 18:00:00 67397 67436 67084 67099 469379
2016-06-15 19:00:00 67096 67198 66900 67058 264430
2016-06-15 20:00:00 67040 67094 66944 67092 110503
2016-06-15 21:00:00 67092 67158 66877 66992 83041
2016-06-15 22:00:00 66993 67110 66680 66909 386905
2016-06-15 23:00:00 66909 67269 66884 67126 143373
dput(first(h2, '1 day'))
structure(c(67842, 67486, 67603, 67465, 67457, 67363, 67320,
67397, 67096, 67040, 67092, 66993, 66909, 67842, 67610, 67608,
67484, 67460, 67381, 67465, 67436, 67198, 67094, 67158, 67110,
67269, 67122, 67420, 67381, 67356, 67289, 67202, 67288, 67084,
66900, 66944, 66877, 66680, 66884, 67488, 67603, 67466, 67455,
67361, 67317, 67397, 67099, 67058, 67092, 66992, 66909, 67126,
262740, 288875, 323498, 168991, 174965, 195579, 230255, 469379,
264430, 110503, 83041, 386905, 143373), .Dim = c(13L, 5L), .Dimnames = list(
NULL, c("X.OPEN.", "X.HIGH.", "X.LOW.", "X.CLOSE.", "X.VOL."
)), index = structure(c(1465977600, 1465981200, 1465984800,
1465988400, 1465992000, 1465995600, 1465999200, 1466002800, 1466006400,
1466010000, 1466013600, 1466017200, 1466020800), class = c("POSIXct",
"POSIXt"), tzone = ""), class = "zoo")
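I can't figure out how to solve a task like this: for every day in the sample, compare the difference (X.CLOSE - X.OPEN) at 11:00 with the difference (X.CLOSE(13:00) - X.OPEN(12:00)).
While working on it I ran into two issues:
1) How do I get the data at a specific time? For example, how do I get X.OPEN at 12:00 on a day of my choosing? I tried several combinations (see the code below), but none returned any rows (only the column headers of the dataset).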
h2["T11:00:00/12:00:00"]
X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
h2["2016-06-15 11:00:00 MSK"]
X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
x <- h2["2016-06-15 11:00:00 MSK"] #'zoo' series (without observations)
x
X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
h2["T12:00:00.000/T12:00:00.001"]
X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
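2) I previously implemented a similar algorithm in Excel and solved it by brute force (stepping through the dataset row by row). But R has vector data types, so there should be a faster and more convenient way to solve this task.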
To get the data at a specific time from a time series, use window.
window(h2, start = as.POSIXct('2016-06-15 04:00'), end = as.POSIXct('2016-06-15 04:59'))
# X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
# 2016-06-15 04:00:00 67842 67842 67122 67488 262740
window(h2, start = as.POSIXct('2016-06-15 06:00'), end = as.POSIXct('2016-06-15 08:00'))
# X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
# 2016-06-15 06:00:00 67603 67608 67381 67466 323498
# 2016-06-15 07:00:00 67465 67484 67356 67455 168991
# 2016-06-15 08:00:00 67457 67460 67289 67361 174965
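Two side notes on the output above. First, the index in the question has tzone = "", so it prints in the local time zone of whoever runs the code; that is most likely why the same bars show as 04:00 here and as 11:00 in the question. Second, the character subsets tried in the question (such as "T11:00:00/12:00:00") are an xts feature; on a plain zoo object a character index is matched literally against the index values, which would explain the header-only results. The following is a minimal sketch of the xts route, assuming the timestamps are Moscow time and using only the first three bars from the dput() output. The names h2_small, h2x, at12 and one_bar are made up for this example.

```r
library(zoo)
library(xts)

# First three hourly bars from the question's dput() output.
# Assumption: the times are Moscow time, so the zone is set explicitly
# instead of relying on the session's local zone (tzone = "").
idx <- as.POSIXct(c(1465977600, 1465981200, 1465984800),
                  origin = "1970-01-01", tz = "Europe/Moscow")
h2_small <- zoo(cbind(X.OPEN.  = c(67842, 67486, 67603),
                      X.CLOSE. = c(67488, 67603, 67466)),
                order.by = idx)

# Convert to xts so that ISO 8601 style character subsetting is available.
h2x <- as.xts(h2_small)

# Time-of-day range: every bar stamped from 12:00 through 12:59, on all days.
at12 <- h2x["T12:00/T12:59"]

# One calendar hour on one specific day.
one_bar <- h2x["2016-06-15 11"]

print(at12)
print(one_bar)
```

Time-of-day subsetting has changed between xts releases, so it is worth checking the result on a few known rows before relying on it.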
You can also access rows by number.
h2[1,]
# X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
# 2016-06-15 04:00:00 67842 67842 67122 67488 262740
h2[4:6,]
# X.OPEN. X.HIGH. X.LOW. X.CLOSE. X.VOL.
# 2016-06-15 07:00:00 67465 67484 67356 67455 168991
# 2016-06-15 08:00:00 67457 67460 67289 67361 174965
# 2016-06-15 09:00:00 67363 67381 67202 67317 195579
To take the difference between X.CLOSE. and X.OPEN. at every time point, you can do:
h2_diff <- h2[,'X.CLOSE.'] - h2[,'X.OPEN.']
# 2016-06-15 04:00:00 2016-06-15 05:00:00 2016-06-15 06:00:00 2016-06-15 07:00:00
# -354 117 -137 -10
# 2016-06-15 08:00:00 2016-06-15 09:00:00 2016-06-15 10:00:00 2016-06-15 11:00:00
# -96 -46 77 -298
# 2016-06-15 12:00:00 2016-06-15 13:00:00 2016-06-15 14:00:00 2016-06-15 15:00:00
# -38 52 -100 -84
# 2016-06-15 16:00:00
# 217
To take lagged differences, you can use the lag function.
h2[1:3,'X.CLOSE.']
# 2016-06-15 04:00:00 2016-06-15 05:00:00 2016-06-15 06:00:00
# 67488 67603 67466
lag(h2[1:3,'X.CLOSE.'])
# 2016-06-15 04:00:00 2016-06-15 05:00:00
# 67603 67466
lag(h2[1:3,'X.CLOSE.'], -1)
# 2016-06-15 05:00:00 2016-06-15 06:00:00
# 67488 67603
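The sign of k is easy to get wrong, because zoo and xts use opposite conventions: in zoo a negative k brings the earlier value forward, while in xts a positive k does. The sketch below shows both on the three closing prices used above. The UTC time zone is an arbitrary choice for the example, and the variable names are made up.

```r
library(zoo)
library(xts)

# Three closing prices from the question; UTC is an arbitrary choice here.
idx <- as.POSIXct(c("2016-06-15 11:00", "2016-06-15 12:00", "2016-06-15 13:00"),
                  tz = "UTC")
close_zoo <- zoo(c(67488, 67603, 67466), order.by = idx)
close_xts <- as.xts(close_zoo)

# zoo: k = -1 pairs each timestamp with the previous value;
# the first timestamp is dropped because na.pad defaults to FALSE.
prev_zoo <- lag(close_zoo, -1)

# xts: k = 1 pairs each timestamp with the previous value;
# the first row is kept and filled with NA.
prev_xts <- lag(close_xts, 1)

print(prev_zoo)
print(prev_xts)
```

If h2 is converted to xts at some point, the lag calls in this answer would need their sign flipped.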
So, for X.CLOSE(n) - X.OPEN(n - 1):
h2[,'X.CLOSE.'] - lag(h2[,'X.OPEN.'], -1)
# 2016-06-15 05:00:00 2016-06-15 06:00:00 2016-06-15 07:00:00 2016-06-15 08:00:00
# -239 -20 -148 -104
# 2016-06-15 09:00:00 2016-06-15 10:00:00 2016-06-15 11:00:00 2016-06-15 12:00:00
# -140 34 -221 -339
# 2016-06-15 13:00:00 2016-06-15 14:00:00 2016-06-15 15:00:00 2016-06-15 16:00:00
# -4 -48 -183 133
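To tie this back to the original task (compare, for every day, the 11:00 bar's close minus open with the 13:00 close minus the 12:00 open), one possible approach is to pick rows by clock time and re-index them by calendar date, so that the two differences line up day by day. This is only a sketch under stated assumptions: the timestamps are taken to be Moscow time, the 2016-06-16 values are invented for illustration, and at_time is a hypothetical helper, not a zoo function.

```r
library(zoo)

# Toy two-day series: the 2016-06-15 bars come from the question,
# the 2016-06-16 values are made up purely for illustration.
tz  <- "Europe/Moscow"  # assumption: the question's timestamps are Moscow time
idx <- as.POSIXct(c("2016-06-15 11:00", "2016-06-15 12:00", "2016-06-15 13:00",
                    "2016-06-16 11:00", "2016-06-16 12:00", "2016-06-16 13:00"),
                  tz = tz)
h2_toy <- zoo(cbind(X.OPEN.  = c(67842, 67486, 67603, 100, 110, 120),
                    X.CLOSE. = c(67488, 67603, 67466, 105, 118, 131)),
              order.by = idx)

# Hypothetical helper: one column at a given clock time, re-indexed by date.
at_time <- function(z, col, hhmm) {
  keep <- format(index(z), "%H:%M") == hhmm
  zoo(coredata(z)[keep, col],
      order.by = as.Date(format(index(z)[keep], "%Y-%m-%d")))
}

# Close minus open of the 11:00 bar, one value per day.
d11 <- at_time(h2_toy, "X.CLOSE.", "11:00") - at_time(h2_toy, "X.OPEN.", "11:00")

# Close of the 13:00 bar minus open of the 12:00 bar, one value per day.
d12_13 <- at_time(h2_toy, "X.CLOSE.", "13:00") - at_time(h2_toy, "X.OPEN.", "12:00")

# Side-by-side table aligned on date, plus a per-day comparison.
cmp <- merge(d11, d12_13)
cmp_flag <- cmp[, "d11"] > cmp[, "d12_13"]

print(cmp)
print(cmp_flag)
```

Because zoo arithmetic aligns on the index, a day that lacks one of the bars simply drops out of the corresponding difference and shows up as NA in the merged table.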