How to calculate transition probabilities in R

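I want to calculate how often a value changes between person-year combinations (panel data). This mimics Stata's xttrans command. The transition between index 6 and 7 (rows 7 and 8 in R's 1-based numbering, where id switches from 1 to 2) should not be included, because it is not a transition within one person.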

df = data.frame(id=c(1,1,1,1,1,1,1,2,2,2,2,2,2,2), 
                year=seq(from=2003, to=2009, by=1), 
                health=c(3,1,2,2,5,1,1,1,2,3,2,1,1,2))
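
To make the constraint concrete, here is a small base R sketch that flags which consecutive row pairs belong to the same person. The names n, same_person and boundary are introduced here for illustration and are not part of the question, and the sketch assumes the rows are sorted by id and then year, as in the example data. Only the pairs flagged TRUE should count as transitions.

```r
df <- data.frame(id = rep(1:2, each = 7),
                 year = rep(2003:2009, times = 2),
                 health = c(3,1,2,2,5,1,1,1,2,3,2,1,1,2))

n <- nrow(df)
# TRUE where row i and row i + 1 share an id (a within-person pair)
same_person <- df$id[-n] == df$id[-1]

# Position of the pair that crosses from one person to the next
boundary <- which(!same_person)
boundary           # 7, i.e. the pair formed by rows 7 and 8
sum(same_person)   # 12 within-person transitions remain
```

So of the 13 consecutive pairs in the data, 12 are valid transitions and one has to be dropped.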

Here is a base R solution that computes the transition counts within each id group:

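# Build one from/to table per id, then sum the tables across ids.
# Reduce() is used so that this also works with more than two ids.
with(df, Reduce(`+`, tapply(health, id, function(x){
  # Shared factor levels give every per-id table the same dimensions
  x <- factor(x, levels = min(health, na.rm = TRUE):max(health, na.rm = TRUE))
  # Rows: state at time t; columns: state at time t + 1
  table(x[-length(x)], x[-1])
})))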

#    1 2 3 4 5
#  1 2 3 0 0 0
#  2 1 1 1 0 1
#  3 1 1 0 0 0
#  4 0 0 0 0 0
#  5 1 0 0 0 0
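
The table above holds counts, whereas the question asks for transition probabilities (xttrans reports them as row percentages). A minimal sketch of the conversion, assuming the counts are stored in a variable called counts (a name introduced here for illustration): prop.table() with margin = 1 divides each row by its row total. State 4 is never observed as a starting state, so its row is 0/0 and comes out as NaN.

```r
df <- data.frame(id = rep(1:2, each = 7),
                 health = c(3,1,2,2,5,1,1,1,2,3,2,1,1,2))

# Counts of from -> to transitions within each id (same idea as the answer above)
counts <- with(df, Reduce(`+`, tapply(health, id, function(x) {
  x <- factor(x, levels = min(health):max(health))
  table(x[-length(x)], x[-1])
})))

# Row-wise proportions: each row with at least one transition sums to 1
probs <- prop.table(counts, margin = 1)
round(probs, 2)
```

For example, 3 of the 5 transitions that start in state 1 end in state 2, so that cell is 0.6. Multiply by 100 if you want percentages.
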

A tidyverse alternative that takes the previous health value within each id:

library(tidyverse)

# Calculate the last health status for each id
df <- df %>% 
         group_by(id) %>% 
         mutate(lastHealth=lag(health)) %>%  
         ungroup()
# Count the number of observed transitions
transitions <- df %>% 
                  group_by(health, lastHealth) %>%  
                  summarise(N=n()) %>% 
                  ungroup()
# Fill in the transition grid to include possible transitions that weren't observed
transitions <- transitions %>% 
                 complete(health=1:5, lastHealth=1:5, fill=list(N=0))
# Present the transitions in the required format
transitions %>% 
  pivot_wider(names_from="health", values_from="N", names_prefix="health") %>%
  filter(!is.na(lastHealth))