Subtract a cell in one column from the next row's cell in a different column
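I have a data frame of stage start times and stage end times. I want to subtract the previous stage's end time from each stage's start time. That is, I want to subtract the (n-1)th cell of one column from the nth cell of another column. This is easy to do in Excel, but I'm not sure how to do it in R.
Here is what the data looks like. I have also included the dput() of the data: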
#test data frame
Stage StartTime EndTime
102 2021-07-19 17:23:00 2021-07-19 21:53:24
103 2021-07-19 21:54:00 2021-07-19 23:00:14
104 2021-07-19 23:01:00 2021-07-20 00:50:10
105 2021-07-20 00:51:00 2021-07-20 01:50:58
106 2021-07-20 01:51:00 2021-07-20 03:28:22
107 2021-07-20 03:29:00 2021-07-20 04:28:00
108 2021-07-20 05:38:00 2021-07-20 08:19:26
> dput(test[1:7,])
structure(list(Stage = c(102, 103, 104, 105, 106, 107, 108),
StartTime = structure(c(1626733380, 1626749640, 1626753660,
1626760260, 1626763860, 1626769740, 1626777480), tzone = "", class = c("POSIXct",
"POSIXt")), EndTime = structure(c(1626749604, 1626753614,
1626760210, 1626763858, 1626769702, 1626773280, 1626787166
), tzone = "", class = c("POSIXct", "POSIXt"))), row.names = c(NA, -7L), class = c("tbl_df",
"tbl", "data.frame"))
For example, I know I can manually do the following to get the time difference between the end of stage 102 and the start of stage 103:
as.numeric(difftime("2021-07-19 21:54:00", "2021-07-19 21:53:24", units='sec'))
> 36
I tried to make this more general, but it does not work. The idea is to increase the row index by 1 each time:
#does not work
# want to subtract row 2, col 2 - the later starting time
# from row 1, col 3 - the earlier end time
as.numeric(difftime(test[1,3], test[2,2], units='sec'))
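It doesn't matter where the time differences are saved, whether in a new column within the data frame or in an entirely new data frame. Anything works. I really don't know how to do this. Maybe a loop? Any help would be appreciated. Thanks.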
You can use the lag function, which treats the dates as a vector, for example:
library(tidyverse)
data_example <-
structure(list(
Stage = c(102, 103, 104, 105, 106, 107, 108),
StartTime = structure(
c(
1626733380,
1626749640,
1626753660,
1626760260,
1626763860,
1626769740,
1626777480
),
tzone = "",
class = c("POSIXct",
"POSIXt")
),
EndTime = structure(
c(
1626749604,
1626753614,
1626760210,
1626763858,
1626769702,
1626773280,
1626787166
),
tzone = "",
class = c("POSIXct", "POSIXt")
)
))
tibble_df <- data_example |> as_tibble()
tibble_df |>
mutate(time_diff = difftime(StartTime,lag(EndTime)))
#> # A tibble: 7 x 4
#> Stage StartTime EndTime time_diff
#> <dbl> <dttm> <dttm> <drtn>
#> 1 102 2021-07-19 19:23:00 2021-07-19 23:53:24 NA secs
#> 2 103 2021-07-19 23:54:00 2021-07-20 01:00:14 36 secs
#> 3 104 2021-07-20 01:01:00 2021-07-20 02:50:10 46 secs
#> 4 105 2021-07-20 02:51:00 2021-07-20 03:50:58 50 secs
#> 5 106 2021-07-20 03:51:00 2021-07-20 05:28:22 2 secs
#> 6 107 2021-07-20 05:29:00 2021-07-20 06:28:00 38 secs
#> 7 108 2021-07-20 07:38:00 2021-07-20 10:19:26 4200 secs
Created on 2021-10-18 by the reprex package (v2.0.1)
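For comparison, here is a minimal base R sketch of the same idea without dplyr: drop the first StartTime and the last EndTime so the two vectors line up, then subtract. The column name gap_secs and the tz = "UTC" setting are assumptions made for this illustration and are not part of the original question. The attempt in the question most likely fails because single-bracket indexing on a tibble (test[1,3]) returns a one-cell tibble rather than a date-time vector; extracting the column first, as in test$StartTime[2], avoids that.

```r
# Same epoch values as the dput() in the question. tz = "UTC" is an
# assumption made here so the printed times do not depend on the machine.
test <- data.frame(
  Stage = c(102, 103, 104, 105, 106, 107, 108),
  StartTime = as.POSIXct(c(1626733380, 1626749640, 1626753660, 1626760260,
                           1626763860, 1626769740, 1626777480),
                         origin = "1970-01-01", tz = "UTC"),
  EndTime = as.POSIXct(c(1626749604, 1626753614, 1626760210, 1626763858,
                         1626769702, 1626773280, 1626787166),
                       origin = "1970-01-01", tz = "UTC")
)

n <- nrow(test)

# Row i gets StartTime[i] - EndTime[i - 1], in seconds.
# The first stage has no predecessor, so its gap is NA.
test$gap_secs <- c(
  NA_real_,
  as.numeric(difftime(test$StartTime[-1], test$EndTime[-n], units = "secs"))
)

# A single gap, pulling out the column vectors before indexing:
first_gap <- as.numeric(
  difftime(test$StartTime[2], test$EndTime[1], units = "secs")
)

print(test[, c("Stage", "gap_secs")])
```

Passing units = "secs" explicitly keeps the result in seconds even when the gaps are large; without it, difftime chooses a unit automatically.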