How can I predict using an AFT model with the survival package in R?
I am using an accelerated failure time (AFT) model with a Weibull distribution to predict data, using the survival package in R. I split my data into training and test sets, fit the model on the training set, and then try to predict values for the test set. To do that, I pass the test set as the newdata parameter, as stated in the references. I get an error saying that newdata does not have the same size as the training data (obviously!). The function then seems to compute the predictions for the training set instead.
How can I predict values for the new data?
# get data
library(KMsurv)
library(survival)
data("kidtran")
n = nrow(kidtran)
kidtran <- kidtran[sample(n),] # shuffle row-wise
kidtran.train = kidtran[1:(n * 0.8),]
kidtran.test = kidtran[(n * 0.8):n,]
# create model
aftmodel <- survreg(kidtransurv~kidtran.train$gender+kidtran.train$race+kidtran.train$age, dist = "weibull")
predicted <- predict(aftmodel, newdata = kidtran.test)
Edit: As Hack-R pointed out, this line of code was missing:
kidtransurv <- Surv(kidtran.train$time, kidtran.train$delta)
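The problem seems to be in your specification of the dependent variable.
The data and the code defining the dependent variable were missing from your question, so I can't see what the specific mistake was, but it did not appear to be a proper Surv() survival object (see ?survreg).
This variant of your code fixes that, makes a few minor formatting improvements, and runs fine: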
library(KMsurv)
library(survival)
data("kidtran")
n = nrow(kidtran)
kidtran <- kidtran[sample(n),]
kidtran.train <- kidtran[1:(n * 0.8),]
kidtran.test <- kidtran[(n * 0.8):n,]
# Whatever kidtransurv was supposed to be is missing from your question,
# so I will replace it with something not-missing
# and I will make it into a proper survival object with Surv()
aftmodel <- survreg(Surv(time, delta) ~ gender + race + age, dist = "weibull", data = kidtran.train)
predicted <- predict(aftmodel, newdata = kidtran.test)
head(predicted)
302 636 727 121 85 612
33190.413 79238.898 111401.546 16792.180 4601.363 17698.895
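A likely reason the original call ignored newdata is that the formula referred to columns as kidtran.train$gender and so on. predict() looks up the formula's variable names in newdata, and terms written with $ are not columns of the test set, so R evaluates them from the training frame again. The sketch below illustrates this under some assumptions: it uses the lung dataset that ships with survival as a stand-in for kidtran (so it runs without KMsurv), and the names train, test, bad and good are only illustrative. It also uses a split in which the two sets share no rows, since 1:(n * 0.8) and (n * 0.8):n both include the boundary row.

```r
library(survival)

set.seed(1)  # only to make the shuffle repeatable

# Assumption: `lung` (shipped with survival) stands in for kidtran.
n <- nrow(lung)
shuffled <- lung[sample(n), ]

# Non-overlapping 80/20 split: training rows stop at n_train,
# test rows start at n_train + 1.
n_train <- floor(0.8 * n)
train <- shuffled[seq_len(n_train), ]
test  <- shuffled[(n_train + 1):n, ]

# Formula written with `$`: its terms are tied to the `train` object,
# so newdata cannot supply them. R usually warns about the row mismatch;
# the warning is suppressed here only to keep the output short.
bad <- survreg(Surv(train$time, train$status) ~ train$age + train$sex,
               dist = "weibull")
bad_pred <- suppressWarnings(predict(bad, newdata = test))

# Formula written with bare column names plus `data =`:
# predict() can now find `age` and `sex` inside newdata.
good <- survreg(Surv(time, status) ~ age + sex,
                data = train, dist = "weibull")
good_pred <- predict(good, newdata = test)

# Compare how many predictions each model returned with the test set size.
print(c(test_rows = nrow(test),
        bad_predictions = length(bad_pred),
        good_predictions = length(good_pred)))
```

Writing the formula with bare column names and passing data = keeps the model independent of any particular data frame object, which is what lets newdata take effect.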