"Data is too long" error in the R flexmixNL package
I tried searching online for this but could not pin down exactly what my problem is. Here is my code:
n <- 10000
x1 <- runif(n, 0, 100)
x2 <- runif(n, 0, 100)
y1 <- 10 * sin(x1 / 10) + 10 + rnorm(n, sd = 1)
y2 <- x2 * cos(x2) - 2 * rnorm(n, sd = 2)
x <- c(x1, x2)
y <- c(x1, x2)  # as posted; c(y1, y2) was presumably intended
start1 <- list(a = 10, b = 5)
start2 <- list(a = 30, b = 5)
library(flexmix)
library(flexmixNL)
modelNL <- flexmix(y ~ x, k = 2,
                   model = FLXMRnlm(formula = y ~ a * x / (b + x),
                                    family = "gaussian",
                                    start = list(start1, start2)))
plot(x, y, col = clusters(modelNL))
Before it gets to the plot, it gives me this error:
Error in matrix(1, nrow = sum(groups$groupfirst)) : data is too long
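I looked up similar errors on Google, but I do not really understand what in my own code is causing this one.
As you can tell, I am new to R, so please explain in the simplest terms possible. Thank you in advance.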
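Ironically (given that the error message says the data is "too long"), I think the proximate cause of that error is the missing data argument. If you supply data as a data frame, you still get an error, but not the same one you are seeing. When you plot your data, you get a rather odd set of values, at least from a statistical-distribution standpoint, and it is not clear why you are trying to model it with this formula. Nonetheless, with those starting values and a data frame passed as the data argument, one does see results.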
> modelNL <- flexmix(y~x, k =2, data=data.frame(x=x,y=y),
+ model = FLXMRnlm(formula = y ~ a*x/(b+x),
+ family = "gaussian",
+ start = list(start1, start2)))
> modelNL
Call:
flexmix(formula = y ~ x, data = data.frame(x = x, y = y), k = 2, model = FLXMRnlm(formula = y ~
a * x/(b + x), family = "gaussian", start = list(start1, start2)))
Cluster sizes:
1 2
6664 13336
convergence after 20 iterations
> summary(modelNL)
Call:
flexmix(formula = y ~ x, data = data.frame(x = x, y = y), k = 2, model = FLXMRnlm(formula = y ~
a * x/(b + x), family = "gaussian", start = list(start1, start2)))
prior size post>0 ratio
Comp.1 0.436 6664 20000 0.333
Comp.2 0.564 13336 16306 0.818
'log Lik.' -91417.03 (df=7)
AIC: 182848.1 BIC: 182903.4
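Most R regression functions first look in the data= argument for names matching those in the formula. Apparently this function fails when it has to go to the global environment to match the formula tokens.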
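To make that lookup rule concrete, here is a minimal base R sketch. It uses `model.frame()`, which `lm()` calls internally, rather than flexmixNL itself, so it only illustrates the usual behaviour and says nothing about flexmixNL internals. The toy variables `x`, `y` and `dat` are made up for the illustration.

```r
set.seed(1)
x <- runif(20, 0, 100)
y <- 2 * x + rnorm(20)

# No data argument: the names in the formula are looked up in the
# formula's environment (here, the workspace where x and y live).
mf_env <- model.frame(y ~ x)

# With a data argument: the names are looked up in the data frame first.
dat <- data.frame(x = x, y = y)
mf_dat <- model.frame(y ~ x, data = dat)

# Both routes find the same values for standard modelling functions.
same_values <- identical(mf_env$y, mf_dat$y) && identical(mf_env$x, mf_dat$x)
print(same_values)
```

Bundling the variables into a data frame and passing it explicitly, as in the flexmix calls in this answer, avoids relying on the environment lookup at all.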
I tried the formula suggested by a plot of the data and got a convergent result:
> modelNL <- flexmix(y~x, k =2, data=data.frame(x=x,y=y),
+ model = FLXMRnlm(formula = y ~ a*x*cos(x+b),
+ family = "gaussian",
+ start = list(start1, start2)))
> modelNL
Call:
flexmix(formula = y ~ x, data = data.frame(x = x, y = y), k = 2, model = FLXMRnlm(formula = y ~
a * x * cos(x + b), family = "gaussian", start = list(start1, start2)))
Cluster sizes:
1 2
9395 10605
convergence after 17 iterations
> summary(modelNL)
Call:
flexmix(formula = y ~ x, data = data.frame(x = x, y = y), k = 2, model = FLXMRnlm(formula = y ~
a * x * cos(x + b), family = "gaussian", start = list(start1, start2)))
prior size post>0 ratio
Comp.1 0.521 9395 18009 0.522
Comp.2 0.479 10605 13378 0.793
'log Lik.' -78659.85 (df=7)
AIC: 157333.7 BIC: 157389
The drop in AIC relative to the first formula looks substantial (182848.1 versus 157333.7).
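As a quick check on that comparison, the AIC and BIC values printed by the two summaries can be recomputed from the reported log-likelihoods and df = 7, using the standard definitions AIC = -2 logLik + 2 df and BIC = -2 logLik + df log(n). The sketch below assumes n = 20000 observations (the two vectors of length 10000 concatenated); the names `loglik`, `n_obs`, `aic` and `bic` are just for illustration.

```r
# Log-likelihoods copied from the two summary() outputs
loglik <- c(first = -91417.03, second = -78659.85)
df <- 7          # degrees of freedom reported for both fits
n_obs <- 20000   # assumed: length of c(x1, x2)

aic <- -2 * loglik + 2 * df
bic <- -2 * loglik + log(n_obs) * df

print(round(aic, 1))
print(round(bic, 1))

# Difference in AIC between the two formulas
aic_drop <- unname(aic["first"] - aic["second"])
print(aic_drop)
```

Since both fits have the same df, the AIC difference is simply twice the difference in log-likelihood, roughly 25514 here.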