Error in if (any(co)) { : missing value where TRUE/FALSE needed
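I have been training several models, and when I try to use a support vector machine with a radial basis function kernel, I get the following error (my R session uses a Spanish locale: the error reads "missing value where TRUE/FALSE needed" and the warnings read "NAs introduced by coercion"):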
> svmRFit <- train(x = Fraud_trainX,
+ y = Fraud_trainY,
+ method = "svmRadial",
+ metric = "ROC",
+ preProc = c("center", "scale"),
+ tuneLength = 15,
+ trControl = ctrl)
Error in if (any(co)) { : valor ausente donde TRUE/FALSE es necesario
Además: Warning messages:
1: In FUN(newX[, i], ...) : NAs introducidos por coerción
2: In FUN(newX[, i], ...) : NAs introducidos por coerción
3: In FUN(newX[, i], ...) : NAs introducidos por coerción
4: In FUN(newX[, i], ...) : NAs introducidos por coerción
5: In FUN(newX[, i], ...) : NAs introducidos por coerción
Called from: .local(x, ...)
Browse[1]>
Here is a summary of my data:
summary(Fraud_trainX)
        Make       AccidentArea                PolicyType   VehicleCategory
 Pontiac  :1412    Rural: 597    SedC                :2109   Sedan  :3660
 Toyota   :1177    Urban:5186    SedL                :1857   Sport  :1994
 Honda    :1054                  SedA                :1551   Utility: 129
 Mazda    : 883                  SpoC                : 126
 Chevrolet: 637                  Utility - All Perils: 113
 Accura   : 183                  UtiCL               :  16
 (Other)  : 437                  (Other)             :  11
 BasePolicy WeekOfMonthClaimed      Age         PolicyNumber     RepNumber
 AP:1675    Min.   :1.000      Min.   :16.00   Min.   :    2   Min.   : 1.000
 C :2246    1st Qu.:2.000      1st Qu.:31.00   1st Qu.: 3866   1st Qu.: 4.000
 L :1862    Median :3.000      Median :38.00   Median : 7757   Median : 9.000
            Mean   :2.703      Mean   :40.71   Mean   : 7754   Mean   : 8.473
            3rd Qu.:4.000      3rd Qu.:49.00   3rd Qu.:11556   3rd Qu.:12.000
            Max.   :5.000      Max.   :80.00   Max.   :15420   Max.   :16.000
                               NA's   :130
Deductible DriverRating ClaimSize Month
Min. :400.0 Min. :1.000 Min. : 0 Min. : 1.000
1st Qu.:400.0 1st Qu.:1.000 1st Qu.: 4112 1st Qu.: 3.000
Median :400.0 Median :3.000 Median : 8150 Median : 6.000
Mean :407.3 Mean :2.488 Mean : 22921 Mean : 6.384
3rd Qu.:400.0 3rd Qu.:3.000 3rd Qu.: 43446 3rd Qu.: 9.000
Max. :700.0 Max. :4.000 Max. :141394 Max. :12.000
NA's :4
WeekOfMonth DayOfWeek DayOfWeekClaimed MonthClaimed
Min. :1.000 Min. :1.000 Min. :1.000 Min. : 1.000
1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.: 3.000
Median :3.000 Median :4.000 Median :3.000 Median : 6.000
Mean :2.776 Mean :3.844 Mean :2.824 Mean : 6.345
3rd Qu.:4.000 3rd Qu.:5.000 3rd Qu.:4.000 3rd Qu.: 9.000
Max. :5.000 Max. :7.000 Max. :7.000 Max. :12.000
Sex MaritalStatus Fault VehiclePrice
Min. :0.0000 Min. :1.000 Min. :0.0000 Min. :1.000
1st Qu.:1.0000 1st Qu.:1.000 1st Qu.:0.0000 1st Qu.:2.000
Median :1.0000 Median :2.000 Median :0.0000 Median :2.000
Mean :0.8406 Mean :1.698 Mean :0.2722 Mean :2.783
3rd Qu.:1.0000 3rd Qu.:2.000 3rd Qu.:1.0000 3rd Qu.:3.000
Max. :1.0000 Max. :3.000 Max. :1.0000 Max. :6.000
Days_Policy_Accident Days_Policy_Claim PastNumberOfClaims AgeOfVehicle
Min. :0.000 Min. :1.000 Min. :0.000 Min. :0.000
1st Qu.:4.000 1st Qu.:3.000 1st Qu.:0.000 1st Qu.:6.000
Median :4.000 Median :3.000 Median :1.000 Median :7.000
Mean :3.971 Mean :2.993 Mean :1.333 Mean :6.592
3rd Qu.:4.000 3rd Qu.:3.000 3rd Qu.:2.000 3rd Qu.:8.000
Max. :4.000 Max. :3.000 Max. :3.000 Max. :8.000
AgeOfPolicyHolder PoliceReportFiled WitnessPresent AgentType
Min. :1.00 Min. :0.00000 Min. :0.00000 Min. :0.00000
1st Qu.:5.00 1st Qu.:0.00000 1st Qu.:0.00000 1st Qu.:0.00000
Median :6.00 Median :0.00000 Median :0.00000 Median :0.00000
Mean :5.89 Mean :0.02957 Mean :0.00536 Mean :0.01504
3rd Qu.:7.00 3rd Qu.:0.00000 3rd Qu.:0.00000 3rd Qu.:0.00000
Max. :9.00 Max. :1.00000 Max. :1.00000 Max. :1.00000
NumberOfSuppliments AddressChange_Claim NumberOfCars
Min. :0.000 Min. :0.0000 Min. :0.0000
1st Qu.:0.000 1st Qu.:0.0000 1st Qu.:0.0000
Median :1.000 Median :0.0000 Median :0.0000
Mean :1.163 Mean :0.1757 Mean :0.1027
3rd Qu.:2.000 3rd Qu.:0.0000 3rd Qu.:0.0000
Max. :3.000 Max. :3.0000 Max. :3.0000
Structure of the data:
str(Fraud_trainX)
'data.frame': 5783 obs. of 32 variables:
$ Make : Factor w/ 19 levels "Accura","BMW",..: 7 18 6 7 6 6 6 3 10 7 ...
$ AccidentArea : Factor w/ 2 levels "Rural","Urban": 2 1 2 1 2 2 2 2 2 2 ...
$ PolicyType : Factor w/ 8 levels "SedA","SedC",..: 5 3 3 2 3 3 1 2 3 2 ...
$ VehicleCategory : Factor w/ 3 levels "Sedan","Sport",..: 2 2 2 1 2 2 1 1 2 1 ...
$ BasePolicy : Factor w/ 3 levels "AP","C","L": 2 3 3 2 3 3 1 2 3 2 ...
$ WeekOfMonthClaimed : num 4 1 3 1 1 5 1 1 1 4 ...
$ Age : num 34 65 28 NA 61 38 41 28 40 21 ...
$ PolicyNumber : num 2 4 13 14 15 16 17 18 21 27 ...
$ RepNumber : num 15 4 11 12 3 16 15 6 3 1 ...
$ Deductible : num 400 400 400 400 400 400 400 400 400 400 ...
$ DriverRating : num 4 2 1 3 1 1 4 1 1 2 ...
$ ClaimSize : num 59294 7584 59748 82212 59552 ...
$ Month : int 1 6 1 1 1 8 4 7 4 3 ...
$ WeekOfMonth : int 3 2 3 5 5 4 4 5 2 3 ...
$ DayOfWeek : int 3 6 5 5 1 2 4 7 5 4 ...
$ DayOfWeekClaimed : int 1 5 5 3 4 1 3 3 2 4 ...
$ MonthClaimed : int 1 7 1 2 2 8 5 8 5 6 ...
$ Sex : int 1 1 1 1 1 1 1 0 1 1 ...
$ MaritalStatus : int 1 2 2 1 2 1 2 2 2 2 ...
$ Fault : int 0 1 0 1 0 0 0 1 0 0 ...
$ VehiclePrice : int 6 2 6 6 6 6 6 2 2 3 ...
$ Days_Policy_Accident: int 4 4 4 4 4 4 4 4 4 4 ...
$ Days_Policy_Claim : int 3 3 3 3 3 3 3 3 3 3 ...
$ PastNumberOfClaims : int 0 1 1 0 0 0 0 0 1 3 ...
$ AgeOfVehicle : int 6 8 7 0 8 6 7 7 8 5 ...
$ AgeOfPolicyHolder : int 5 8 5 1 8 6 6 5 6 4 ...
$ PoliceReportFiled : int 1 1 0 0 0 0 0 0 0 0 ...
$ WitnessPresent : int 0 0 0 0 0 0 0 0 0 0 ...
$ AgentType : int 0 0 0 0 0 0 0 0 0 0 ...
$ NumberOfSuppliments : int 0 3 0 0 0 0 0 1 3 3 ...
$ AddressChange_Claim : int 0 0 0 0 0 0 0 0 0 0 ...
$ NumberOfCars : int 0 0 0 0 0 0 0 0 0 0 ...
Response variable:
summary(Fraud_trainY)
No Yes
5440 343
Here are the resampling indices and the control object I use for model training:
indx <- createMultiFolds(Fraud_trainY, k = 5, times = 2)
str(indx)
ctrl <- trainControl(method = "repeatedcv",index = indx,
summaryFunction = twoClassSummary,
sampling = "up",
classProbs = TRUE)
And here is the model call:
svmRFit <- train(x = Fraud_trainX,
y = Fraud_trainY,
method = "svmRadial",
metric = "ROC",
preProc = c("center", "scale"),
tuneLength = 15,
trControl = ctrl)
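I have already tried loading the pROC library, but it did not help. I have removed the rows containing NA from all variables, and the response variable already has the levels "No" and "Yes". I have also trained C5.0 ("C5.0"), neural network ("nnet"), and logistic regression ("multinom") models; all of them work with this data and return results. This is the only model that gives me an error.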
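As @AlvaroMartinez commented, the problem was that some of my predictors were stored as factor. Once I changed those variables to integer, the model trained without errors.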
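The fix above can be sketched in base R. This is only an illustration under stated assumptions: `toy` is a small made-up stand-in for `Fraud_trainX`, and the names `factor_cols`, `toy_int`, and `toy_dummy` are hypothetical. It shows why factor columns are a problem for a routine that expects a numeric matrix (the "NAs introduced by coercion" warnings are consistent with this), and two ways to make every predictor numeric: the integer conversion used here, and indicator (dummy) columns as a common alternative.

```r
# Toy stand-in for Fraud_trainX (hypothetical values, not the real data).
toy <- data.frame(
  Make         = factor(c("Honda", "Toyota", "Honda", "Mazda")),
  AccidentArea = factor(c("Urban", "Rural", "Urban", "Urban")),
  Age          = c(34, 65, 28, 61),
  Deductible   = c(400, 400, 700, 400)
)

# With factor columns present, as.matrix() returns a character matrix,
# so numeric code downstream has to coerce strings and can end up with NAs.
is.character(as.matrix(toy))

# 1. Find the factor columns.
factor_cols <- names(toy)[vapply(toy, is.factor, logical(1))]

# 2a. Option used in this thread: replace each factor by its integer codes.
#     Note that this imposes an arbitrary ordering on the levels.
toy_int <- toy
toy_int[factor_cols] <- lapply(toy_int[factor_cols], as.integer)

# 2b. Alternative: expand each factor into 0/1 indicator columns.
toy_dummy <- as.data.frame(model.matrix(~ . - 1, data = toy))

# Both results contain numeric columns only.
all(vapply(toy_int, is.numeric, logical(1)))
all(vapply(toy_dummy, is.numeric, logical(1)))
```

Either `toy_int` or `toy_dummy` could then play the role of `x` in the `train()` call. Integer codes are the simplest change, while indicator columns avoid implying an order among levels such as car makes. Keep in mind that `model.matrix()` drops rows containing NA under the default `na.action`, so the response would need to be subset to the same rows.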