如何修复 R 中的 "Hist only applies to single numeric columns." 错误
How to fix "Hist only applies to single numeric columns." error in R
我正在尝试从某本书中学习 h2o 和 R。当我尝试执行作者的代码时,出现了我提到的错误。
这是代码:
我正在使用这个数据集;
https://github.com/DarrenCook/h2o/blob/bk/datasets/ENB2012_data.csv
seed = 999
library(h2o)
h2o.init(nthreads = -1)
data <- h2o.importFile("../datasets/ENB2012_data.csv")
factorsList <- c("X6", "X8")
data[,factorsList] <- as.factor(data[,factorsList])
splits <- h2o.splitFrame(data, 0.8, seed = seed)
train <- splits[[1]]
test <- splits[[2]]
x <- c("X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8")
y <- "Y2" #Or "Y1"
######
numericColumns <- setdiff(colnames(train),c("X6","X8"))
d <- round( h2o.cor(train[,numericColumns]) ,2)
rownames(d) <- colnames(d)
d
#####
#breaks defaults to sturges
# "rice" gives more, thinner bars
# "doane" crashes it
# "fd" gives a square for X1, X7 is chunky, X4 are peaks.
# "scott": the bars always touch, but are of different widths.
#
# I've added a bit more formatting than described in the book,
# just to get the plot shown in the book.
par(mar=c(5.1, 6.0, 4.1, 2.1)) #Changed 4.1 to 6.0
par(oma=c(1,0,0,0)) #Def was all zeroes
par(mfrow = c(2,5))
#ylim <- c(0,350)
ylim <- NULL
dummy <- lapply(colnames(train), function(col){
h <- h2o.hist(train[,col], breaks=30, plot = FALSE)
plot(h, main = col, xlab = "", ylim = ylim,
ylab = ifelse(col %in% c("X1","X6"), "Frequency", ""),
cex.lab=2.0, cex.axis=2.0, cex.main=2.5, cex.sub=2.0, cex=2.0
)
})
# If curious, here is how it looks on all data
dummy <- lapply(colnames(data), function(col){
h <- h2o.hist(data[,col], plot = FALSE)
plot(h, main = col, xlab = "", ylim = ylim)
})
您报告的错误 "how to fix "hist 仅适用于 R 中的单个数字列错误”。不是可以 "fixed" 的错误,因为函数 h2o.hist()
仅设计用于应用于单个列。
看起来使用 hist h <- h2o.hist(train[,col], breaks=30, plot = FALSE)
的唯一一行代码正在将 h2o.hist()
应用于单个列并且应该可以正常运行。
我正在尝试从某本书中学习 h2o 和 R。当我尝试执行作者的代码时,出现了我提到的错误。
这是代码:
我正在使用这个数据集; https://github.com/DarrenCook/h2o/blob/bk/datasets/ENB2012_data.csv
seed = 999
library(h2o)
h2o.init(nthreads = -1)
data <- h2o.importFile("../datasets/ENB2012_data.csv")
factorsList <- c("X6", "X8")
data[,factorsList] <- as.factor(data[,factorsList])
splits <- h2o.splitFrame(data, 0.8, seed = seed)
train <- splits[[1]]
test <- splits[[2]]
x <- c("X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8")
y <- "Y2" #Or "Y1"
######
numericColumns <- setdiff(colnames(train),c("X6","X8"))
d <- round( h2o.cor(train[,numericColumns]) ,2)
rownames(d) <- colnames(d)
d
#####
#breaks defaults to sturges
# "rice" gives more, thinner bars
# "doane" crashes it
# "fd" gives a square for X1, X7 is chunky, X4 are peaks.
# "scott": the bars always touch, but are of different widths.
#
# I've added a bit more formatting than described in the book,
# just to get the plot shown in the book.
par(mar=c(5.1, 6.0, 4.1, 2.1)) #Changed 4.1 to 6.0
par(oma=c(1,0,0,0)) #Def was all zeroes
par(mfrow = c(2,5))
#ylim <- c(0,350)
ylim <- NULL
dummy <- lapply(colnames(train), function(col){
h <- h2o.hist(train[,col], breaks=30, plot = FALSE)
plot(h, main = col, xlab = "", ylim = ylim,
ylab = ifelse(col %in% c("X1","X6"), "Frequency", ""),
cex.lab=2.0, cex.axis=2.0, cex.main=2.5, cex.sub=2.0, cex=2.0
)
})
# If curious, here is how it looks on all data
dummy <- lapply(colnames(data), function(col){
h <- h2o.hist(data[,col], plot = FALSE)
plot(h, main = col, xlab = "", ylim = ylim)
})
您报告的错误 "how to fix "hist 仅适用于 R 中的单个数字列错误”。不是可以 "fixed" 的错误,因为函数 h2o.hist()
仅设计用于应用于单个列。
看起来使用 hist h <- h2o.hist(train[,col], breaks=30, plot = FALSE)
的唯一一行代码正在将 h2o.hist()
应用于单个列并且应该可以正常运行。