Rolling column subsetting of a data frame
I have a data frame like this:
library(lubridate)
set.seed(23)
date_list = seq(ymd('2000-01-15'),ymd('2010-09-18'),by='day')
testframe = data.frame(Date = date_list)
testframe$Day = substr(testframe$Date, start = 6, stop = 10)
testframe$ABC = rnorm(3900)
testframe$DEF = rnorm(3900)
testframe$GHI = seq(from = 10, to = 25, length.out = 3900)
testframe$JKL = seq(from = 5, to = 45, length.out = 3900)
I would like automatic rolling subsets of this data frame, which should look like this:
testframe_ABC = testframe[,c("Date","Day","ABC")]
testframe_DEF = testframe[,c("Date","Day","DEF")]
testframe_GHI = testframe[,c("Date","Day","GHI")]
testframe_JKL = testframe[,c("Date","Day","JKL")]
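The Date and Day columns should always be kept, and each of the other columns should be added one at a time. The name of the variable column should be appended to the data frame name to get the new df. If possible, all the data frames could also go into a list of data frames.
Any idea how to do this?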
I assume you want 4 data frames whose components are named ABC, DEF, etc. It is best to keep them in a list:
L <- Map(function(nm) testframe[c("Date", "Day", nm)], names(testframe)[-(1:2)])
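In that case L$ABC or L[[1]] refers to the ABC data frame, but if you want them as separate objects in the global environment, this copies the list components there: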
list2env(L, .GlobalEnv)
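Note that list2env(L, .GlobalEnv) creates objects named ABC, DEF, and so on, not testframe_ABC as in the question. A minimal sketch of one way to get the prefixed names, assuming the 3-row sample data shown under "Data used" and a hypothetical scratch environment target_env (swap in .GlobalEnv to create real globals):

```r
# Small sample data (values rounded from the dput() output in the thread).
testframe <- data.frame(
  Date = as.Date("2000-01-15") + 0:2,
  Day  = c("01-15", "01-16", "01-17"),
  ABC  = c(0.1932123, -0.4346821, 0.9132671),
  DEF  = c(1.7933881, 0.9966051, 1.1074905),
  GHI  = c(10, 17.5, 25),
  JKL  = c(5, 25, 45)
)

# One data frame per value column; Map() names the list after the column names.
value_cols <- setdiff(names(testframe), c("Date", "Day"))
L <- Map(function(nm) testframe[c("Date", "Day", nm)], value_cols)

# Prefix the list names so the objects come out as testframe_ABC, testframe_DEF, ...
L_prefixed <- setNames(L, paste0("testframe_", names(L)))

# target_env is a hypothetical scratch environment used only for illustration.
target_env <- new.env()
invisible(list2env(L_prefixed, envir = target_env))
ls(target_env)
```

Keeping the list around is usually still more convenient, since it can be looped over with lapply() instead of looking objects up by name.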
I would not use the term rolling in this context. That term usually refers to a sliding window, for example:
library(zoo)
rollmeanr(1:10, 3) # 2 is mean of 1:3, 3 is mean of 2:4, etc.
## [1] 2 3 4 5 6 7 8 9
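To make the sliding-window idea concrete without zoo, here is a rough base R sketch of the same right-aligned rolling mean. The names x, k, windows and roll_mean are illustrative only:

```r
x <- 1:10
k <- 3  # window width

# embed() puts each run of k consecutive values into one row
# (most recent value first), so each row is one window.
windows <- embed(x, k)

# The mean of each row is the rolling mean over that window.
roll_mean <- rowMeans(windows)
roll_mean
```

Each result value depends on k neighbouring rows of the input, which is different from picking out whole columns as in the question.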
You can use split.default to split off each column, then cbind the first two columns onto each element, i.e.
lapply(split.default(testframe[-c(1, 2)], seq_along(testframe)[-c(1, 2)]), function(i)
cbind.data.frame(testframe[c(1, 2)], i))
which gives a list,
$`3`
        Date   Day        ABC
1 2000-01-15 01-15  0.1932123
2 2000-01-16 01-16 -0.4346821
3 2000-01-17 01-17  0.9132671

$`4`
        Date   Day       DEF
1 2000-01-15 01-15 1.7933881
2 2000-01-16 01-16 0.9966051
3 2000-01-17 01-17 1.1074905

$`5`
        Date   Day  GHI
1 2000-01-15 01-15 10.0
2 2000-01-16 01-16 17.5
3 2000-01-17 01-17 25.0

$`6`
        Date   Day JKL
1 2000-01-15 01-15   5
2 2000-01-16 01-16  25
3 2000-01-17 01-17  45
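The list above is named by column position ("3", "4", ...). If names such as ABC and DEF are preferred, one possible variation is to split on the column names instead. This is a sketch on the 3-row sample data; key_cols, value_cols, pieces and L2 are names introduced here for illustration:

```r
# Small sample data (values rounded from the dput() output in the thread).
testframe <- data.frame(
  Date = as.Date("2000-01-15") + 0:2,
  Day  = c("01-15", "01-16", "01-17"),
  ABC  = c(0.1932123, -0.4346821, 0.9132671),
  DEF  = c(1.7933881, 0.9966051, 1.1074905),
  GHI  = c(10, 17.5, 25),
  JKL  = c(5, 25, 45)
)

key_cols   <- c("Date", "Day")
value_cols <- setdiff(names(testframe), key_cols)

# Split the value columns by name; explicit factor levels keep the
# original column order instead of sorting the names alphabetically.
pieces <- split.default(testframe[value_cols],
                        factor(value_cols, levels = value_cols))

# Attach the key columns to every single-column piece.
L2 <- lapply(pieces, function(piece) cbind.data.frame(testframe[key_cols], piece))
names(L2)
```

With this variation, L2$ABC can be used the same way as L$ABC from the Map() answer.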
Data used
dput(testframe)
structure(list(Date = structure(c(10971, 10972, 10973), class = "Date"),
Day = c("01-15", "01-16", "01-17"), ABC = c(0.193212333898146,
-0.434682108206693, 0.913267096589322), DEF = c(1.79338809206353,
0.996605106833546, 1.10749048744809), GHI = c(10, 17.5, 25
), JKL = c(5, 25, 45)), row.names = c(NA, -3L), class = "data.frame")