Estimating probabilities for model subsets after grouping
The data used are available here (the file is named "figshare.txt").
I estimated the transition probabilities of a Markov model in which the observations are grouped by location (group_by(km)).
data <- data %>% group_by(km) %>% summarize(pp_chain=list(pp)) %>% as.data.frame
pp_chains <- data$pp_chain; names(pp_chains) <- data$km
fit <- markovchainFit(pp_chains)
The output (summarized here) shows the probability estimates for the model as a whole:
print(fit$estimate)
0 1
0 0.9116832 0.08831677
1 0.5250852 0.47491476
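Hypothetically, the output I am after would be more specific and give the probabilities for each location (km). It would look something like this: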
km = 80
0 1
0 0.7116832 0.28831677
1 0.1250852 0.17491476
km = 81
0 1
0 0.8116832 0.18831677
1 0.4250852 0.37491476
km = 83
0 1
0 0.6116832 0.38831677
1 0.3250852 0.27491476
Does anyone know how to extract the Markov model estimates for each location (km) individually after the model is run?
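Would a simple lapply() solution be sufficient? As far as I understand, each sequence is processed separately, i.e. there are no complicated interdependencies between them?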
library(dplyr)
library(markovchain)
data <- read.table(paste0("https://ndownloader.figshare.com/files",
"/10412271?private_link=ace5b44bc12394a7c46d"), header=TRUE, sep="\t")
data <- data %>% group_by(km) %>% summarize(pp_chain=list(pp)) %>% as.data.frame
pp_chains <- data$pp_chain; names(pp_chains) <- data$km
est <- lapply(pp_chains, function(x) markovchainFit(x)$estimate)
head(est, 3)
# $`80`
# 0 1
# 0 0.8470588 0.1529412
# 1 0.7222222 0.2777778
# $`81`
# 0 1
# 0 0.6976378 0.3023622
# 1 0.2107574 0.7892426
# $`83`
# 0 1
# 0 0.9706840 0.02931596
# 1 0.4210526 0.57894737
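To see what fitting each sequence on its own amounts to, here is a minimal base-R sketch that counts transitions within a single sequence and row-normalizes the counts. It assumes a first-order chain and a plain maximum-likelihood estimate, which is my understanding of what markovchainFit() does by default. The helper transition_mle() and the toy sequences are hypothetical and are not part of the markovchain package or of the figshare data.

```r
# Hypothetical helper (not part of markovchain): maximum-likelihood estimate
# of a first-order transition matrix from a single observed sequence.
transition_mle <- function(x, states = sort(unique(x))) {
  from <- factor(head(x, -1), levels = states)  # state at time t
  to   <- factor(tail(x, -1), levels = states)  # state at time t + 1
  counts <- table(from, to)                     # transition counts
  prop.table(counts, margin = 1)                # row-normalize to probabilities
}

# Made-up toy sequences, named by location in the same way as pp_chains
toy_chains <- list(`80` = c(0, 0, 1, 0, 1, 1, 0, 0),
                   `81` = c(1, 1, 0, 1, 1, 1, 0, 0))

# One independent estimate per location
est_toy <- lapply(toy_chains, transition_mle)

# Print one matrix per location, labelled by km
for (km in names(est_toy)) {
  cat("km =", km, "\n")
  print(round(est_toy[[km]], 4))
}
```

Because each list element is estimated from its own counts only, no information is shared between locations. A state that never occurs as a starting state in a given sequence would produce a row of NaN in this sketch, so short sequences deserve a check.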