Fitting a VLMC to very long sequences
I am trying to fit a VLMC to a dataset in which the longest sequence has 296 states. This is what I do:
# Load libraries
library(PST)
library(RCurl)
library(TraMineR)
# Load and transform data
x <- getURL("https://gist.githubusercontent.com/aronlindberg/08228977353bf6dc2edb3ec121f54a29/raw/241ef39125ecb55a85b43d7f4cd3d58f617b2ecf/challenge_level.csv")
data <- read.csv(text = x)
data.seq <- seqdef(data[,2:ncol(data)], missing = NA, right = NA, nr = "*")
S1 <- pstree(data.seq, ymin = 0.01, lik = TRUE, with.missing = TRUE, nmin = 2)
However, this produces the following error:
Error in res[i, , drop = FALSE] : subscript out of bounds
How can I fit the model to data with sequences this long? Is there a good reason to limit the length in the model?
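The problem comes from your data. By not setting L in pstree(), you are asking for a model of maximal order. The fitting process fails at L=8 because you set nmin=2, but only one context at this order reaches n=2: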
> cprob(data.seq, L=8, nmin=2)
[>] 21 sequences, min/max length: 19/296
[>] computing prob., L=8, 2043 distinct context(s)
[>] removing 1894 context(s) where n<2
[>] total time: 0.156 secs
                        EX  FA I1  I2 I3 N1 N2 N3 NR QU TR [n]
I2-I3-FA-I3-EX-I3-EX-I2  0 0.5  0 0.5  0  0  0  0  0  0  0   2
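To see why long contexts stop meeting nmin, it can help to count how often each context of a given length occurs. The sketch below uses only base R on made-up toy sequences. The data and the helper count_contexts() are hypothetical and are not part of PST; it is a simplified illustration of the counting idea, not a reproduction of what cprob() does internally.

```r
# Toy sequences (hypothetical data, not challenge_level.csv)
seqs <- list(
  c("A", "B", "A", "B", "A", "B", "C"),
  c("A", "B", "A", "C", "A", "B", "A"),
  c("B", "A", "B", "A")
)

# Hypothetical helper: count every run of L consecutive states that is
# followed by at least one more state, i.e. every context of length L
# that could be used to predict a next state.
count_contexts <- function(seqs, L) {
  ctx <- unlist(lapply(seqs, function(s) {
    n <- length(s)
    if (n <= L) return(character(0))  # sequence too short for this L
    vapply(seq_len(n - L),
           function(i) paste(s[i:(i + L - 1)], collapse = "-"),
           character(1))
  }))
  table(as.character(ctx))
}

# For each context length, report how many distinct contexts exist
# and how many of them occur at least nmin times.
nmin <- 2
for (L in 1:5) {
  counts <- count_contexts(seqs, L)
  cat(sprintf("L=%d: %d distinct context(s), %d with n >= %d\n",
              L, length(counts), sum(counts >= nmin), nmin))
}
```

As L grows, fewer contexts are repeated, so fewer of them pass the nmin threshold. This also bears on the question of limiting the length: with the 11 states in the alphabet above there are 11^8 = 214,358,881 possible contexts of length 8, while the cprob() output reports only 2043 distinct ones in the data.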
Fitting the model with L=8 works fine:
S1 <- pstree(data.seq, ymin = 0.01, lik = TRUE, nmin = 2, L=8)
[>] 21 sequence(s) - min/max length: 19/296
[>] max. depth L=8, nmin=2, ymin=0.01
[L] [nodes]
0 1
1 11
2 99
3 368
4 340
5 126
6 34
7 4
8 1
[>] computing sequence(s) likelihood ... (0.804 secs)
[>] total time: 2.968 secs
Likewise, you do not need any of the 'missing', 'right' or 'nr' options in seqdef(), nor in pstree() (the call above drops with.missing = TRUE).
Best,
Alexis