Determining number of changes within groups along with observations at the time of transition
Suppose I have the following panel data frame (reproducible toy example below):
ID <- c(12232,12232,12232,12232,12232,14452,14452,14452)
Time <- c(1,2,3,4,5,1,2,3)
y1 <- c(2.3,7.8,4.5,3.4,2.3,1.2,0.5,1.9)
State <- c("a","a","a","b","a","c","c","b")
DataFrame <- data.frame(ID, Time, y1, State)  # cbind() would coerce every column to character
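I would like to know whether there is a way to identify the individuals that transition between states (State), together with the observations at the time each transition occurs.
Desired output: a data frame giving the ID of each individual that changes State, the transition itself, and y1 at the time of the transition, for instance something like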
ID transition y1
12232 a -> b 4.5
12232 b -> a 3.4
14452 c -> b 0.5
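Of course, the transition field does not need to be in that format; ab and bc would be fine too. What matters is that it works by group (ID, since this is panel data) and matches each transition between states with the characteristics observed when it occurs.
Thanks a lot, this site has saved me many times :)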
A quick answer using dplyr is:
library(dplyr)
DataFrame <- tibble(ID, Time, y1, State)  # tibble() replaces the deprecated data_frame()
output <- DataFrame %>%
  group_by(ID) %>%                                      # group the data by ID
  mutate(StateL = lead(State)) %>%                      # lead variable: State at the next observation
  filter(State != StateL) %>%                           # keep rows where the state changes at t+1
  mutate(transState = paste(State, "->", StateL)) %>%   # create the transition label
  select(ID, transState, y1)                            # select the variables to keep
output
## # A tibble: 3 x 3
## # Groups:   ID [2]
##      ID transState    y1
##   <dbl> <chr>      <dbl>
## 1 12232 a -> b       4.5
## 2 12232 b -> a       3.4
## 3 14452 c -> b       0.5
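The title also asks for the number of changes within each group, which the pipeline above does not report directly. A minimal sketch of one way to get it, assuming the same toy data; the names `n_changes` and `n_transitions` are introduced here for illustration and are not part of the original answer:

```r
library(dplyr)

# Toy panel data from the question
DataFrame <- tibble(
  ID    = c(12232, 12232, 12232, 12232, 12232, 14452, 14452, 14452),
  Time  = c(1, 2, 3, 4, 5, 1, 2, 3),
  y1    = c(2.3, 7.8, 4.5, 3.4, 2.3, 1.2, 0.5, 1.9),
  State = c("a", "a", "a", "b", "a", "c", "c", "b")
)

# Count, per ID, how often State differs from the State of the next observation.
# na.rm = TRUE ignores the last row of each group, where lead() returns NA.
n_changes <- DataFrame %>%
  group_by(ID) %>%
  arrange(Time, .by_group = TRUE) %>%   # make sure rows are in time order within each ID
  summarise(n_transitions = sum(State != lead(State), na.rm = TRUE))

n_changes
```

Unlike counting the rows of the filtered `output`, summarising the full data keeps IDs that never change state, with a count of 0.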
Using data.table:
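library(data.table)
df <- data.frame(DataFrame)
setDT(df)                                                      # convert to data.table by reference
df[, lead := shift(State, type = "lead"), by = ID]             # next State within each ID
df[State != lead, transition := paste0(State, " -> ", lead)]   # label rows where the state changes
df <- df[!(is.na(transition)), ]                               # keep only the transition rows
df <- df[, c("ID", "transition", "y1")]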
Output:
ID transition y1
1: 12232 a -> b 4.5
2: 12232 b -> a 3.4
3: 14452 c -> b 0.5
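Both answers depend on the rows already being sorted by Time within each ID, because `lead()` and `shift()` work on row order. A hedged sketch of the data.table approach that sorts explicitly first and also counts the transitions per ID; the names `dt`, `next_state`, `transitions`, `counts` and `n_transitions` are my own and not from the original answer:

```r
library(data.table)

# Toy panel data from the question
dt <- data.table(
  ID    = c(12232, 12232, 12232, 12232, 12232, 14452, 14452, 14452),
  Time  = c(1, 2, 3, 4, 5, 1, 2, 3),
  y1    = c(2.3, 7.8, 4.5, 3.4, 2.3, 1.2, 0.5, 1.9),
  State = c("a", "a", "a", "b", "a", "c", "c", "b")
)

# shift() follows row order, so sort by ID and Time before taking the lead
setorder(dt, ID, Time)
dt[, next_state := shift(State, type = "lead"), by = ID]

# Rows where the state changes at the next observation.
# The last row of each ID has next_state = NA and is dropped by the filter.
transitions <- dt[State != next_state,
                  .(ID, transition = paste(State, "->", next_state), y1)]

# Number of transitions per ID (IDs without any transition do not appear here)
counts <- transitions[, .(n_transitions = .N), by = ID]
```

Using a column name other than `lead` avoids any confusion with `dplyr::lead()` when both packages are loaded.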