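Best way to create multiple time-difference variables
I want to create several variables that each hold the time difference between one of several variables and a single reference variable (V0). I want the absolute difference (i.e. the sign of the difference is ignored). All of my variables are in date format.
The code below works, but I suspect there is a neater way to do this in fewer lines of code. I have tried a few things without success.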
df$V1_timediff <- (abs(as.numeric(difftime(df$V0, df$V1, units = "days"))))
df$V2_timediff <- (abs(as.numeric(difftime(df$V0, df$V2, units = "days"))))
df$V3_timediff <- (abs(as.numeric(difftime(df$V0, df$V3, units = "days"))))
df$V4_timediff <- (abs(as.numeric(difftime(df$V0, df$V4, units = "days"))))
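I'll demonstrate using mtcars. Since it has no POSIXt objects, I'll use simple subtraction (-); this works just as well in your case without change, so difftime is not technically required and the results should be the same. However, the premise of both solutions below also works if they are adapted to use difftime.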
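As a quick check of the claim that subtraction and difftime agree, here is a minimal sketch on two made-up POSIXct vectors (a and b are illustrative names, not from the question). Subtracting date-times returns a difftime whose units are picked automatically, so passing units = "days" to as.numeric is what keeps the result in days.

```r
# Hypothetical date-times, for illustration only
a <- as.POSIXct(c("2022-03-01 12:00:00", "2022-03-05 00:00:00"), tz = "UTC")
b <- as.POSIXct(c("2022-03-01 18:00:00", "2022-03-02 00:00:00"), tz = "UTC")

# Plain subtraction: the difftime units are chosen automatically,
# so they are converted to days explicitly
via_minus <- abs(as.numeric(a - b, units = "days"))

# Explicit difftime with the units fixed up front
via_difftime <- abs(as.numeric(difftime(a, b, units = "days")))

all.equal(via_minus, via_difftime)
```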
dplyr
library(dplyr)
mtcars %>%
mutate(across(vs:carb, list(timediff = ~ abs(as.numeric(cyl - ., units = "days"))))) %>%
head()
# mpg cyl disp hp drat wt qsec vs am gear carb vs_timediff am_timediff gear_timediff carb_timediff
# Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 6 5 2 2
# Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 6 5 2 2
# Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 3 3 0 3
# Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 5 6 3 5
# Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 8 8 5 6
# Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 5 6 3 5
base R
tmp <- lapply(mtcars$cyl - subset(mtcars, select = vs:carb),
function(z) abs(as.numeric(z, units = "days")))
names(tmp) <- paste0(names(tmp), "_timediff")
head(cbind(mtcars, tmp))
# mpg cyl disp hp drat wt qsec vs am gear carb vs_timediff am_timediff gear_timediff carb_timediff
# Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 6 5 2 2
# Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 6 5 2 2
# Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 3 3 0 3
# Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 5 6 3 5
# Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 8 8 5 6
# Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 5 6 3 5
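To show how the base R approach might be adapted to real dates with difftime, here is a minimal sketch. The data frame is made up; only the column names V0, V1 and V2 follow the question, and cols is a hypothetical helper variable listing the columns to compare.

```r
# Made-up dates using the question's column names (V0 is the reference date)
df <- data.frame(
  V0 = as.Date(c("2022-01-10", "2022-02-01")),
  V1 = as.Date(c("2022-01-01", "2022-02-11")),
  V2 = as.Date(c("2022-01-15", "2022-01-25"))
)

cols <- c("V1", "V2")  # columns to compare against V0

# One absolute difference in days per column, assigned in a single step
df[paste0(cols, "_timediff")] <- lapply(
  df[cols],
  function(z) abs(as.numeric(difftime(df$V0, z, units = "days")))
)

df
```

Adding further columns such as V3 or V4 would only require extending cols.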
In base R, we can define a user-defined function (UDF) and loop over the columns:
time_diff <- function(df, v0, vn) {
abs(as.numeric(difftime(df[[v0]], df[[vn]], units = "days")))
}
lapply(c("t2", "t3"), function(tn) time_diff(test, "t1", tn))
#> [[1]]
#> [1] 0.05208333 0.05208333 0.05208333 0.05208333 0.05208333
#>
#> [[2]]
#> [1] 0.1041667 0.1041667 0.1041667 0.1041667 0.1041667
Data:
test <- structure(list(t1 = structure(c(1014919200, 1014920100, 1014921000,
1014921900, 1014922800),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
t2 = structure(c(1014923700, 1014924600, 1014925500,
1014926400, 1014927300),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
t3 = structure(c(1014928200, 1014929100, 1014930000,
1014930900, 1014931800),
class = c("POSIXct", "POSIXt"), tzone = "UTC")),
class = "data.frame", row.names = c(NA, -5L))
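The lapply call above returns an unnamed list rather than new columns. One possible way to attach the results to the data frame is sketched below; it reuses time_diff and the first two rows of the sample data, while targets, diffs and result are names introduced here for illustration.

```r
# First two rows of the sample data, rebuilt so the block runs on its own
test <- data.frame(
  t1 = as.POSIXct(c(1014919200, 1014920100), origin = "1970-01-01", tz = "UTC"),
  t2 = as.POSIXct(c(1014923700, 1014924600), origin = "1970-01-01", tz = "UTC"),
  t3 = as.POSIXct(c(1014928200, 1014929100), origin = "1970-01-01", tz = "UTC")
)

# Same helper as in the answer: absolute difference in days between two columns
time_diff <- function(df, v0, vn) {
  abs(as.numeric(difftime(df[[v0]], df[[vn]], units = "days")))
}

targets <- c("t2", "t3")  # columns to compare against t1
diffs <- lapply(targets, function(tn) time_diff(test, "t1", tn))
names(diffs) <- paste0(targets, "_timediff")  # name each result after its source column

# Bind the named list to the original data as new columns
result <- cbind(test, diffs)
result
```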