R: collapse rows by year within each ID
I have a dataset with multiple rows per ID, like this:
ID From To State
1 2004 2005 MD
1 2005 2005 MD
1 2005 2012 DC
1 2012 2015 DC
1 2015 2020 DC
1 2012 2013 MD
1 2013 2016 MD
1 2016 2019 MD
1 2019 2020 MD
2 2003 2004 OR
2 2004 2008 OR
2 2008 2013 AZ
2 2013 2015 AZ
My goal is to collapse the multiple 'From' and 'To' values into a smooth timeline like this:
ID From To State
1 2004 2005 MD
1 2005 2020 DC
1 2012 2020 MD
2 2003 2008 OR
2 2008 2015 AZ
I'm not sure how to do this. Any help would be much appreciated. Thanks.
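Group by 'ID', 'State', and the run-length id of 'State', then take the first 'From' and the last 'To' in each group. Note that the result comes back sorted by 'ID' and 'State' rather than in the original row order.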
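library(dplyr)
library(data.table)  # for rleid()
df1 %>%
  # grp changes whenever State changes between consecutive rows
  group_by(ID, State, grp = rleid(State)) %>%
  # first 'From' and last 'To' within each run
  summarise(From = first(From), To = last(To), .groups = 'drop') %>%
  select(-grp)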
-output
# A tibble: 5 × 4
     ID State  From    To
  <int> <chr> <int> <int>
1     1 DC     2005  2020
2     1 MD     2004  2005
3     1 MD     2012  2020
4     2 AZ     2008  2015
5     2 OR     2003  2008
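To see why the run-length id matters, here is a small illustration using only the 'State' values for ID 1, typed in by hand in row order. Grouping by 'State' alone would merge the two separate MD spells; rleid() gives each consecutive run its own id, so they stay apart.

```r
library(data.table)

# State column for ID 1, in the row order shown in the question
state <- c("MD", "MD", "DC", "DC", "DC", "MD", "MD", "MD", "MD")

# rleid() increments every time the value differs from the previous row
run_id <- rleid(state)
print(run_id)
# [1] 1 1 2 2 2 3 3 3 3
```

The first MD run gets id 1 and the later one gets id 3, which is what keeps them as two rows in the output.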
data
df1 <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L), From = c(2004L, 2005L, 2005L, 2012L, 2015L, 2012L,
2013L, 2016L, 2019L, 2003L, 2004L, 2008L, 2013L), To = c(2005L,
2005L, 2012L, 2015L, 2020L, 2013L, 2016L, 2019L, 2020L, 2004L,
2008L, 2013L, 2015L), State = c("MD", "MD", "DC", "DC", "DC",
"MD", "MD", "MD", "MD", "OR", "OR", "AZ", "AZ")),
class = "data.frame", row.names = c(NA,
-13L))
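If the original row and column order matters, a data.table-only variant may be worth a try. This is a sketch, not a drop-in replacement: it assumes the rows are already ordered so that each spell is contiguous, and the names dt and res are just illustrative. Grouping with by= keeps groups in order of first appearance, so the rows come back in the order asked for in the question.

```r
library(data.table)

# Same data as df1 above, rebuilt here so the snippet runs on its own
df1 <- data.frame(
  ID    = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  From  = c(2004L, 2005L, 2005L, 2012L, 2015L, 2012L, 2013L, 2016L, 2019L,
            2003L, 2004L, 2008L, 2013L),
  To    = c(2005L, 2005L, 2012L, 2015L, 2020L, 2013L, 2016L, 2019L, 2020L,
            2004L, 2008L, 2013L, 2015L),
  State = c("MD", "MD", "DC", "DC", "DC", "MD", "MD", "MD", "MD",
            "OR", "OR", "AZ", "AZ"),
  stringsAsFactors = FALSE
)

dt <- as.data.table(df1)

# One group per consecutive run of the same (ID, State) pair;
# take the first 'From' and the last 'To' of each run
res <- dt[, .(From = From[1L], To = To[.N]),
          by = .(ID, State, grp = rleid(ID, State))]

res[, grp := NULL]                                # drop the helper column
setcolorder(res, c("ID", "From", "To", "State"))  # column layout from the question
print(res)
```

Passing both ID and State to rleid() also starts a new run when the ID changes, in case two neighbouring IDs happen to share a state.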