Determining number of changes within groups along with observations at the time of transition

Question

Suppose I have the following panel data frame (a reproducible toy example is below):

ID <- c(12232,12232,12232,12232,12232,14452,14452,14452)
Time <- c(1,2,3,4,5,1,2,3)
y1 <- c(2.3,7.8,4.5,3.4,2.3,1.2,0.5,1.9)
State <- c("a","a","a","b","a","c","c","b")
DataFrame <- data.frame(ID, Time, y1, State)

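I would like to know whether there is a way to identify the individuals that transition between states (State), together with the observations at the time each transition occurs. Desired output: a data frame giving the ID of each individual that changes State, the transition itself, and the value of y1 at the time of the transition, for example something like: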

ID     transition y1
12232   a -> b    4.5
12232   b -> a    3.4
14452   c -> b    0.5

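Of course, the transition field does not need to be in that format; ab and bc would be fine too. What matters is that it works by group (ID, since this is panel data) and matches each transition between states with the characteristics observed when it occurs.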

Thanks a lot, this site has saved me many times :)

Answer 1

A quick answer using dplyr:

library(dplyr)
DataFrame <- tibble(ID, Time, y1, State) # tibble() replaces the deprecated data_frame()
output <- DataFrame %>% 
    group_by(ID) %>% # group the data by ID
    mutate(StateL = lead(State)) %>% # create a lead variable called StateL
    filter(State != StateL) %>% # keep the rows where the state changes at t+1
    mutate(transState = paste(State, "->", StateL)) %>% # create a variable transState
    select(ID, transState, y1) # select the variables to keep
output
##  # A tibble: 3 x 3
##  # Groups:   ID [2]
##       ID transState    y1
##    <dbl>      <chr> <dbl>
##  1 12232     a -> b   4.5
##  2 12232     b -> a   3.4
##  3 14452     c -> b   0.5
##

Answer 2

Using data.table:

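    library(data.table)
    df <- data.frame(DataFrame)
    setDT(df)  # convert to a data.table by reference
    # next State within each ID (NA for the last observation of an ID)
    df[, lead := shift(State, type = "lead"), by = ID]
    # label only the rows where the state changes at t+1
    df[State != lead, transition := paste0(State, " -> ", lead)]
    # drop rows without a transition and keep the columns of interest
    df <- df[!(is.na(transition)), ]
    df <- df[, c("ID", "transition", "y1")]
    df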

Output:

      ID transition  y1
1: 12232     a -> b 4.5
2: 12232     b -> a 3.4
3: 14452     c -> b 0.5
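For completeness, the same idea can be sketched in base R, with a per-ID count of state changes added since the title asks for the number of changes. This is only a sketch under some assumptions: the names `panel`, `next_state`, `transitions` and `n_changes` are made up for illustration, and the rows are sorted by ID and Time first so that the "next" row really is the next time point.

```r
# Toy data from the question, built as a real data frame
ID    <- c(12232, 12232, 12232, 12232, 12232, 14452, 14452, 14452)
Time  <- c(1, 2, 3, 4, 5, 1, 2, 3)
y1    <- c(2.3, 7.8, 4.5, 3.4, 2.3, 1.2, 0.5, 1.9)
State <- c("a", "a", "a", "b", "a", "c", "c", "b")
panel <- data.frame(ID, Time, y1, State, stringsAsFactors = FALSE)

# Sort so that consecutive rows are consecutive time points within each ID
panel <- panel[order(panel$ID, panel$Time), ]

# Next state within the same ID (NA for the last observation of each ID)
panel$next_state <- ave(panel$State, panel$ID,
                        FUN = function(s) c(s[-1], NA))

# Rows whose state differs from the state at the following time point
changed <- !is.na(panel$next_state) & panel$State != panel$next_state

transitions <- data.frame(
  ID         = panel$ID[changed],
  transition = paste(panel$State[changed], "->", panel$next_state[changed]),
  y1         = panel$y1[changed]
)

# Number of state changes per ID (IDs that never change are not listed)
n_changes <- table(transitions$ID)
```

Here `transitions` holds the same three rows as the outputs above, and `n_changes` gives 2 changes for ID 12232 and 1 for ID 14452.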


r

panel-data