Cubist regression under rpy2: "subscript out of bounds" error
I am running a Cubist regression through rpy2 and I get this error:
Error in strsplit(tmp, "\"")[[1]] : subscript out of bounds
I tried using as.matrix to change the data format, but it still fails.
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
from rpy2.robjects.vectors import FloatVector
from rpy2.robjects import pandas2ri
Cubist = importr('Cubist')
lattice = importr('lattice')
r = robjects.r
# Prepare the sample data
dt = r('mtcars')
Z = FloatVector(dt[3])
X = FloatVector(dt[5])
X1 = FloatVector(dt[6])
T = r['cbind'](X,X1)
regr = r['cubist'](x=T,y=Z,committees=10)
When x is a matrix, cubist() appears to require that it carries a dimnames attribute.
Setup in R:
library(Cubist)
dt = mtcars
Z = dt[, 4]
X = dt[, 6]
X1 = dt[, 7]
Now compare this (which reproduces your error):
> T = cbind(dt[, 6], dt[, 7])
> str(T)
num [1:32, 1:2] 2.62 2.88 2.32 3.21 3.44 ...
> cubist(x=T, y=Z, committees=10)
cubist code called exit with value 1
Error in strsplit(tmp, "\"")[[1]] : subscript out of bounds
with this:
> T = cbind(X, X1)
> str(T)
num [1:32, 1:2] 2.62 2.88 2.32 3.21 3.44 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "X" "X1"
> cubist(x=T, y=Z, committees=10)
Call:
cubist.default(x = T, y = Z, committees = 10)
Number of samples: 32
Number of predictors: 2
Number of committees: 10
Number of rules per committee: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
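There are several ways to make sure the dimnames are attached when going through rpy2. A simple one that stays close to your code is to name the variables explicitly: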
In [15]: T = r['cbind'](X=X,X1=X1)
In [16]: print(r['str'](T))
num [1:32, 1:2] 2.62 2.88 2.32 3.21 3.44 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "X" "X1"
<rpy2.rinterface.NULLType object at 0x7f0d7c0f5608> [RTYPES.NILSXP]
In [17]: print(r['cubist'](x=T,y=Z,committees=10))
Call:
cubist.default(x = structure(c(2.62, 2.875, 2.32, 3.215, 3.44, 3.46,
205, 215, 230, 66, 52, 65, 97, 150, 150, 245, 175, 66, 91, 113, 264, 175,
335, 109), committees = 10L)
Number of samples: 32
Number of predictors: 2
Number of committees: 10
Number of rules per committee: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
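If the matrix has already been built without names, another option is to attach column names afterwards through R's `colnames<-` replacement function. The following is only a minimal sketch, not something taken from the Cubist or rpy2 documentation: the helper names `default_colnames` and `ensure_colnames` and the "V1", "V2" placeholder names are assumptions made for illustration.

```python
def default_colnames(ncol, prefix="V"):
    """Build placeholder names such as ['V1', 'V2'] (hypothetical helper)."""
    return [f"{prefix}{i}" for i in range(1, ncol + 1)]


def ensure_colnames(matrix, names=None):
    """Return an R matrix that has column names (hypothetical helper).

    If R reports that `matrix` already has column names it is returned
    unchanged; otherwise `names` (or generated placeholders) are attached.
    rpy2 is imported lazily so default_colnames() works without R installed.
    """
    import rpy2.robjects as robjects
    from rpy2.robjects.vectors import StrVector

    r = robjects.r
    # Ask R itself whether colnames(matrix) is NULL.
    if not r['is.null'](r['colnames'](matrix))[0]:
        return matrix
    if names is None:
        names = default_colnames(r['ncol'](matrix)[0])
    # `colnames<-`(x, value) returns a copy of x with the names set.
    return r['colnames<-'](matrix, StrVector(names))
```

With the objects from the question this would be used as `T = ensure_colnames(r['cbind'](X, X1), ["X", "X1"])` before calling `r['cubist'](x=T, y=Z, committees=10)`. Naming the arguments in the cbind call, as above, is shorter; a helper like this is mainly useful when the matrix comes from somewhere else.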