as(x, class(k)) 中的错误:没有将“NULL”强制为“data.frame”的方法或默认值
Error in as(x, class(k)) : no method or default for coercing “NULL” to “data.frame”
我目前遇到下面提到的错误,该错误与将 NULL 值强制转换为数据框有关。数据集确实包含空值,但是我已经尝试使用 is.na() 和 is.null() 函数将空值替换为其他值。数据存储在hdfs上,存储格式为pig.hive。我还附上了下面的代码。如果我从密钥中删除 v[25],代码就可以正常工作。
代码:
AM = c("AN");
UK = c("PP");
sample.map <- function(k,v){
key <- data.frame(acc = v[!which(is.na(v[,1],1],
year = substr(v[!which(is.na(v[,1]),2],1,4),
month = substr(v[!which(is.na(v[,1]),2],5,6))
value <- data.frame(v[,3],count=1)
keyval(key,value)
}
sample.reduce <- function(key,v){
AT <- sum(v[which(v[,1] %in% AM=="TRUE"),2])
UnknownT <- sum(v[which(v[,1] %in% UK=="TRUE"),2])
Total <- AT + UnknownT
d <- data.frame(AT,UnknownT,Total)
keyval(key,d)
}
out <- mapreduce(input ="/user/hduser/input",
output = "/user/hduser/output",
input.format = make.input.format("pig.hive", sep = "\u0001")
output.format = make.output.format("csv", sep = ","),
map= sample.map)
reduce = sample.reduce)
错误:
Warning in asMethod(object) : NAs introduced by coercion
Warning in split.default(1:rmr.length(y), unique(ind), drop = TRUE) : data length is not a multiple of split variable
Warning in rmr.split(x, x, FALSE, keep.rownames = FALSE) : number of items to replace is not a multiple of replacement length Warning in split.default(1:rmr.length(y), unique(ind), drop = TRUE) :
data length is not a multiple of split variable
Warning in rmr.split(v, ind, lossy = lossy, keep.rownames = TRUE) : number of items to replace is not a multiple of replacement length
Error in as(x, class(k)) :
no method or default for coercing “NULL” to “data.frame”
Calls: <Anonymous> ... apply.reduce -> c.keyval -> reduce.keyval -> lapply -> FUN -> as No traceback available
更新
我已经添加了示例数据并编辑了上面的代码。希望这对您有所帮助!
示例数据:
NULL,"2014-03-14","PP"
345689202,"2014-03-14","AN"
234539390,"2014-03-14","PP"
123125444,"2014-03-14","AN"
NULL,"2014-03-14","AN"
901828393,"2014-03-14","AN"
最近发现了一些 和 as
。我不明白为什么 as
默认情况下无法处理此问题,但您可以修改 coerce
,它使用 S4 方法处理转换以调用 as.data.frame
.
setMethod("coerce",c("NULL","data.frame"), function(from, to, strict=TRUE) as.data.frame(from))
[1] "coerce"
as(NULL,"data.frame")
data frame with 0 columns and 0 rows
我目前遇到下面提到的错误,该错误与将 NULL 值强制转换为数据框有关。数据集确实包含空值,但是我已经尝试使用 is.na() 和 is.null() 函数将空值替换为其他值。数据存储在hdfs上,存储格式为pig.hive。我还附上了下面的代码。如果我从密钥中删除 v[25],代码就可以正常工作。
代码:
AM = c("AN");
UK = c("PP");
sample.map <- function(k,v){
key <- data.frame(acc = v[!which(is.na(v[,1],1],
year = substr(v[!which(is.na(v[,1]),2],1,4),
month = substr(v[!which(is.na(v[,1]),2],5,6))
value <- data.frame(v[,3],count=1)
keyval(key,value)
}
sample.reduce <- function(key,v){
AT <- sum(v[which(v[,1] %in% AM=="TRUE"),2])
UnknownT <- sum(v[which(v[,1] %in% UK=="TRUE"),2])
Total <- AT + UnknownT
d <- data.frame(AT,UnknownT,Total)
keyval(key,d)
}
out <- mapreduce(input ="/user/hduser/input",
output = "/user/hduser/output",
input.format = make.input.format("pig.hive", sep = "\u0001")
output.format = make.output.format("csv", sep = ","),
map= sample.map)
reduce = sample.reduce)
错误:
Warning in asMethod(object) : NAs introduced by coercion
Warning in split.default(1:rmr.length(y), unique(ind), drop = TRUE) : data length is not a multiple of split variable
Warning in rmr.split(x, x, FALSE, keep.rownames = FALSE) : number of items to replace is not a multiple of replacement length Warning in split.default(1:rmr.length(y), unique(ind), drop = TRUE) :
data length is not a multiple of split variable
Warning in rmr.split(v, ind, lossy = lossy, keep.rownames = TRUE) : number of items to replace is not a multiple of replacement length
Error in as(x, class(k)) :
no method or default for coercing “NULL” to “data.frame”
Calls: <Anonymous> ... apply.reduce -> c.keyval -> reduce.keyval -> lapply -> FUN -> as No traceback available
更新 我已经添加了示例数据并编辑了上面的代码。希望这对您有所帮助!
示例数据:
NULL,"2014-03-14","PP"
345689202,"2014-03-14","AN"
234539390,"2014-03-14","PP"
123125444,"2014-03-14","AN"
NULL,"2014-03-14","AN"
901828393,"2014-03-14","AN"
最近发现了一些 as
。我不明白为什么 as
默认情况下无法处理此问题,但您可以修改 coerce
,它使用 S4 方法处理转换以调用 as.data.frame
.
setMethod("coerce",c("NULL","data.frame"), function(from, to, strict=TRUE) as.data.frame(from))
[1] "coerce"
as(NULL,"data.frame")
data frame with 0 columns and 0 rows