How to calculate transition probabilities in R
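I want to count how often the value changes between consecutive person-year observations (panel data). This mimics Stata's xttrans command. The transition between index 6 and 7 (that is, the step where id switches from 1 to 2; rows 7 and 8 in R's 1-based numbering) should not be included, because it is not a transition within one person.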
df <- data.frame(id = c(1,1,1,1,1,1,1,2,2,2,2,2,2,2),
                 year = seq(from = 2003, to = 2009, by = 1),
                 health = c(3,1,2,2,5,1,1,1,2,3,2,1,1,2))
Here is a base R solution that computes the transition counts within each id group:
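# Reduce() sums the per-id tables; do.call(`+`, ...) only works with exactly two ids
with(df, Reduce(`+`, tapply(health, id, function(x){
  # Shared factor levels so every per-id table has the same dimensions
  x <- factor(x, levels = min(health, na.rm = TRUE):max(health, na.rm = TRUE))
  # Cross-tabulate each value (rows) against the value that follows it (columns)
  table(x[-length(x)], x[-1])
})))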
#     1 2 3 4 5
#   1 2 3 0 0 0
#   2 1 1 1 0 1
#   3 1 1 0 0 0
#   4 0 0 0 0 0
#   5 1 0 0 0 0
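The table above holds counts, while the title asks for probabilities. One way to get there is to divide each row by its total, which is roughly what Stata's xttrans reports as row percentages. The following is a minimal sketch under that assumption; the names `lev`, `counts` and `probs` are introduced here for illustration only. A state that is never the origin of a transition (state 4 in this data) has a row total of zero, so its row comes out as NaN.

```r
# Rebuild the example data from the question
df <- data.frame(id = rep(1:2, each = 7),
                 year = rep(2003:2009, times = 2),
                 health = c(3,1,2,2,5,1,1,1,2,3,2,1,1,2))

# Transition counts within each id (rows = from, columns = to)
lev <- min(df$health):max(df$health)
counts <- Reduce(`+`, lapply(split(df$health, df$id), function(x) {
  x <- factor(x, levels = lev)
  table(from = x[-length(x)], to = x[-1])
}))

# Divide each row by its row total to turn counts into probabilities
probs <- prop.table(counts, margin = 1)
round(probs, 2)
```

Whether to leave the NaN row as it is or replace it with zeros depends on how the matrix will be used afterwards.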
library(tidyverse)
# Calculate the previous health status within each id
df <- df %>%
  group_by(id) %>%
  mutate(lastHealth=lag(health)) %>%
  ungroup()
# Count number of existing transitions
transitions <- df %>%
  group_by(health, lastHealth) %>%
  summarise(N=n()) %>%
  ungroup()
# Fill in the transition grid to include possible transitions that weren't observed
transitions <- transitions %>%
  complete(health=1:5, lastHealth=1:5, fill=list(N=0))
# Present the transitions in the required format
# (rows with a missing lastHealth are each person's first year and are dropped)
transitions %>%
  pivot_wider(names_from="health", values_from="N", names_prefix="health") %>%
  filter(!is.na(lastHealth))
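The same pipeline style can be pushed one step further to give within-origin proportions rather than counts. This is only a sketch, not part of the original answer: `probs_long` and the `prob` column are hypothetical names, and unlike the code above it does not call complete(), so states that never occur as an origin or destination (state 4 here) are simply absent from the result.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question
df <- data.frame(id = rep(1:2, each = 7),
                 year = rep(2003:2009, times = 2),
                 health = c(3,1,2,2,5,1,1,1,2,3,2,1,1,2))

probs_long <- df %>%
  group_by(id) %>%
  mutate(lastHealth = lag(health)) %>%      # previous state within the same id
  ungroup() %>%
  filter(!is.na(lastHealth)) %>%            # drop each person's first year
  count(lastHealth, health, name = "N") %>% # observed transitions only
  group_by(lastHealth) %>%
  mutate(prob = N / sum(N)) %>%             # share of each destination per origin
  ungroup()

# Wide layout: one row per origin state, one column per destination state
probs_long %>%
  select(-N) %>%
  pivot_wider(names_from = health, values_from = prob,
              names_prefix = "health", values_fill = 0)
```

Filtering out the missing lastHealth values before counting keeps each person's first observation out of the denominators.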