Find the number of days by condition from another column in R
I have a data frame like this:
Ptt Date Area
88734 2016-10-23 05:39:18 BA
88734 2016-10-23 06:53:13 BA
88734 2016-11-09 08:32:18 MI
88734 2016-11-19 06:45:27 MI
88734 2016-12-20 12:30:43 MI
88734 2016-12-29 02:45:35 FA
129041 2017-10-05 04:55:24 BA
129041 2016-10-23 06:45:30 MI
129041 2016-11-16 07:10:32 FA
129041 2016-11-29 03:43:54 FA
120941 2017-01-02 14:54:39 FA
...
I want to count how many days each Ptt spent in each Area, but I don't know how to do it. Does anyone know?
This is what I expect:
Ptt Date Area Days
88734 2016-10-23 05:39:18 BA 1
88734 2016-10-23 06:53:13 BA 1
88734 2016-11-09 08:32:18 MI 1
88734 2016-11-19 06:45:27 MI 2
88734 2016-12-20 12:30:43 MI 3
88734 2016-12-29 02:45:35 FA 1
129041 2017-10-05 04:55:24 BA 1
129041 2016-10-23 06:45:30 MI 1
129041 2016-11-16 07:10:32 FA 1
129041 2016-11-29 03:43:54 FA 2
120941 2017-01-02 14:54:39 FA 3
...
library(data.table)

dt <- data.table(
  Ptt = c("88734", "88734", "88734", "88734", "88734", "88734",
          "120941", "120941", "120941", "120941", "120941"),
  date = c("2016-10-23 05:39:18",
           "2016-10-23 06:53:13",
           "2016-11-09 08:32:18",
           "2016-11-19 06:45:27",
           "2016-12-20 12:30:43",
           "2016-12-29 02:45:35",
           "2017-10-05 04:55:24",
           "2016-10-23 06:45:30",
           "2016-11-16 07:10:32",
           "2016-11-29 03:43:54",
           "2017-01-02 14:54:39"),
  Area = c("BA", "BA", "MI", "MI", "MI", "FA", "BA", "MI", "FA", "FA", "FA")
)
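Edit
I didn't explain this clearly.
What I want to know is how many days each Ptt spent in each Area.
For example: 88734 spent 1 day in BA, 3 days in MI, 1 day in FA, and so on.
This is what I want: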
Ptt Area Days
88734 BA 1
88734 MI 3
88734 FA 1
129041 BA 1
129041 MI 1
120941 FA 3
Thanks!
You can convert the date column to POSIXct and extract the calendar date from it. Then, within each Ptt and Area, you can assign each distinct date its own number.
library(dplyr)

dt %>%
  mutate(date = lubridate::ymd_hms(date),
         date1 = as.Date(date)) %>%
  group_by(Ptt, Area) %>%
  mutate(Days = dense_rank(date1)) %>%
  ungroup() %>%
  select(-date1)
# Ptt date Area Days
# <chr> <dttm> <chr> <int>
# 1 88734 2016-10-23 05:39:18 BA 1
# 2 88734 2016-10-23 06:53:13 BA 1
# 3 88734 2016-11-09 08:32:18 MI 1
# 4 88734 2016-11-19 06:45:27 MI 2
# 5 88734 2016-12-20 12:30:43 MI 3
# 6 88734 2016-12-29 02:45:35 FA 1
# 7 120941 2017-10-05 04:55:24 BA 1
# 8 120941 2016-10-23 06:45:30 MI 1
# 9 120941 2016-11-16 07:10:32 FA 1
#10 120941 2016-11-29 03:43:54 FA 2
#11 120941 2017-01-02 14:54:39 FA 3
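If what is needed is the one-row-per-group table from the question's edit rather than a running number on every row, one possible sketch is to count the distinct calendar dates within each Ptt and Area. The data frame df below is an illustrative subset of the question's rows (Ptt 88734 only), and the names day and summary_days are made up for this example.

```r
library(dplyr)

# Illustrative subset of the question's data (rows for Ptt 88734 only)
df <- data.frame(
  Ptt  = c("88734", "88734", "88734", "88734", "88734", "88734"),
  date = c("2016-10-23 05:39:18", "2016-10-23 06:53:13", "2016-11-09 08:32:18",
           "2016-11-19 06:45:27", "2016-12-20 12:30:43", "2016-12-29 02:45:35"),
  Area = c("BA", "BA", "MI", "MI", "MI", "FA")
)

# Parse the timestamps as UTC, drop the time of day,
# then count distinct dates per Ptt and Area
summary_days <- df %>%
  mutate(day = as.Date(as.POSIXct(date, tz = "UTC"))) %>%
  group_by(Ptt, Area) %>%
  summarise(Days = n_distinct(day), .groups = "drop")
```

For 88734 this gives 1 day in BA, 3 days in MI and 1 day in FA, which matches the example in the question. Note that summarise() returns the groups sorted by Ptt and Area, not in their original order.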
Since you have a data.table, you can also do this with data.table syntax:
library(data.table)
dt[, date := lubridate::ymd_hms(date)]
dt[, date1 := as.Date(date)]
dt[, Days := match(date1, unique(date1)), .(Ptt, Area)]
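One caveat that may be worth checking on real data: match(date1, unique(date1)) numbers the dates in order of first appearance, while dense_rank numbers them chronologically. The two agree only when the rows are already sorted by date within each group. The sketch below uses a made-up, unsorted vector d to show the difference, with sort() added as one way to make the match version chronological.

```r
# Hypothetical unsorted dates, for illustration only
d <- as.Date(c("2016-11-19", "2016-11-09", "2016-11-09", "2016-12-20"))

# Numbered in order of first appearance: 1 2 2 3
by_appearance <- match(d, unique(d))

# Numbered chronologically, like dense_rank: 2 1 1 3
by_date <- match(d, sort(unique(d)))
```

In the data.table call above, this would amount to writing match(date1, sort(unique(date1))) if the rows are not guaranteed to be in date order.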