Error using h2o.glm in R
I am new to the h2o implementation in R. I have a data frame like this (df):
df<-structure(list(v1 = c(5.24823, 0.839, 3.57348, 1.47869, 2.75093,
1.69665, 0.46366, 1.53827, 2.0149, 2.32103, 1.87223, 2.3392,
2.10579, 1.7236, 1.13056, 1.09144, 3.52515, 1.16248, 1.77885,
0.9991, 0.47375, 2.91148, 1.237, 1.18971, 1.23953, 1.07049, 1.46971,
1.65649, 3.3021, 1.04816), v100 = c(19.60784, 9.27047, 0.5523,
15.05735, 0.93231, 11.73979, 19.53795, 6.22754, 4.54464, 17.0922,
3.60958, 18.23052, 0.06395, 17.17605, 5.52724, 17.85276, 15.57143,
0.05825, 19.85401, 14.51163, 6.64372, 19.60284, 16.40279, 16.89205,
19.6748, 14.64446, 19.34747, 9.04215, 11.37993, 16.81159), v101 = c(10.71683,
7.13707, 3.61956, 9.75558, 4.21413, 8.49785, 6.79572, 5.19486,
7.39523, 6.05496, 2.91676, 9.82552, 5.5107, 5.40719, 10.82138,
12.37154, 5.56351, 3.8549, 9.87455, 5.37746, 3.57747, 8.11406,
6.61883, 7.3667, 7.74248, 12.44785, 12.38174, 5.99648, 7.10452,
8.27756)), .Names = c("v1", "v100", "v101"), row.names = c(85671L,
92268L, 44249L, 68218L, 3250L, 105583L, 4874L, 94393L, 83502L,
61414L, 42987L, 50200L, 80887L, 9321L, 39565L, 79644L, 26265L,
75272L, 104819L, 72782L, 57101L, 59037L, 78810L, 88619L, 21564L,
39198L, 55030L, 44193L, 6116L, 101448L), class = "data.frame")
I want to fit a GLM using the h2o package, so I have the following code:
library(h2o)
library(h2oEnsemble)
modellm<-h2o.glm(y="v1", x="v100",training_frame=df ,family="gaussian",
nfolds = 0, alpha = 0.1, lambda_search = FALSE)
However, running the code produces the following error:
Error in value[[3L]](cond) :
argument "training_frame" must be a valid H2OFrame or ID
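I tried the following thread:
However, it did not solve my problem. After running the solution recommended in the link above, I get the following result: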
> library(devtools)
> install_github("h2oai/h2o-3/h2o-r/ensemble/h2oEnsemble-package")
Downloading github repo h2oai/h2o-3@master
Installing h2oEnsemble
"C:/PROGRA~1/R/R-32~1.4R~/bin/x64/R" --no-site-file --no-environ \
--no-save --no-restore CMD INSTALL \
"C:/Users/ozgur/AppData/Local/Temp/RtmpAfGU5K/devtools8f064866e23/h2oai-h2o-3-30ef929/h2o-r/ensemble/h2oEnsemble-package" \
--library="C:/Users/ozgur/Documents/R/win-library/3.2" \
--install-tests
* installing *source* package 'h2oEnsemble' ...
** R
** tests
** preparing package for lazy loading
Warning: package 'h2o' was built under R version 3.2.5
Warning: package 'statmod' was built under R version 3.2.5
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
*** arch - i386
Warning: package 'h2o' was built under R version 3.2.5
Warning: package 'statmod' was built under R version 3.2.5
*** arch - x64
Warning: package 'h2o' was built under R version 3.2.5
Warning: package 'statmod' was built under R version 3.2.5
* DONE (h2oEnsemble)
Reloading installed h2oEnsemble
h2oEnsemble (beta) for H2O >=3.0
Version: 0.1.8
Package created on 2016-03-29
I would be very grateful for any help. Thanks a lot.
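If you just want to train an H2O GLM, you don't need the h2oEnsemble package, so you can remove library(h2oEnsemble) from your code. After library(h2o), you must also add the line h2o.init(nthreads = -1) to your code. This starts an H2O cluster in the background; the "H2O cluster" is the optimized Java code that executes in parallel.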
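As a rough sketch of the startup step described above, the following starts a local cluster and confirms it is reachable before any modeling is attempted. It assumes the h2o package and a compatible Java runtime are installed. The calls to h2o.clusterIsUp() and h2o.clusterInfo() are only sanity checks added for illustration.

```r
library(h2o)

# Start (or connect to) a local H2O cluster; nthreads = -1 uses all available cores.
h2o.init(nthreads = -1)

# Illustrative sanity checks: is the cluster reachable, and what does it look like?
cluster_up <- h2o.clusterIsUp()
print(cluster_up)
h2o.clusterInfo()
```

When you are finished, h2o.shutdown(prompt = FALSE) stops the cluster.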
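The problem you are running into is related to your training_frame. In H2O, the training_frame argument must be an "H2OFrame", not a typical R data.frame. For scalability reasons, H2O uses distributed data frames called "H2OFrames" rather than standard in-memory data.frame objects.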
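To make the distinction concrete, here is a minimal sketch that checks the class of an object before and after conversion. The names small_df and small_hdf are hypothetical, and the data is just the first five rows of v1 and v100 from the question. It assumes a local H2O cluster can be started.

```r
library(h2o)
h2o.init(nthreads = -1)

# Small illustrative data frame (first five rows of v1 and v100 from the question).
small_df <- data.frame(
  v1   = c(5.24823, 0.839, 3.57348, 1.47869, 2.75093),
  v100 = c(19.60784, 9.27047, 0.5523, 15.05735, 0.93231)
)

# An ordinary data.frame lives in R's memory and is not an H2OFrame.
is_h2o_before <- inherits(small_df, "H2OFrame")

# as.h2o() uploads the data to the cluster and returns an H2OFrame handle.
small_hdf <- as.h2o(small_df)
is_h2o_after <- inherits(small_hdf, "H2OFrame")

print(c(before = is_h2o_before, after = is_h2o_after))
print(dim(small_hdf))  # dimensions are reported by the cluster
```

Running the same inherits() check on your own training_frame is a quick way to tell whether h2o.glm will accept it.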
To convert df to an H2OFrame and train a GLM, do the following:
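library(h2o)
h2o.init(nthreads = -1)  # start a local H2O cluster using all available cores
hdf <- as.h2o(df)        # convert the data.frame to an H2OFrame
modellm <- h2o.glm(y = "v1", x = "v100", training_frame = hdf, family = "gaussian",
                   nfolds = 0, alpha = 0.1, lambda_search = FALSE)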
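Once a model like the one above is trained, you will probably want to look at it. The sketch below is not part of the original answer. It fits the same kind of GLM on a small illustrative subset (the first eight rows of v1 and v100 from the question, under the hypothetical names demo_df, demo_hdf, and fit), then pulls out the coefficients and predictions.

```r
library(h2o)
h2o.init(nthreads = -1)

# Illustrative subset: first eight rows of v1 and v100 from the question.
demo_df <- data.frame(
  v1   = c(5.24823, 0.839, 3.57348, 1.47869, 2.75093, 1.69665, 0.46366, 1.53827),
  v100 = c(19.60784, 9.27047, 0.5523, 15.05735, 0.93231, 11.73979, 19.53795, 6.22754)
)
demo_hdf <- as.h2o(demo_df)

# Same call shape as in the answer, applied to the H2OFrame.
fit <- h2o.glm(y = "v1", x = "v100", training_frame = demo_hdf, family = "gaussian",
               nfolds = 0, alpha = 0.1, lambda_search = FALSE)

# Coefficients come back as a named numeric vector (intercept plus predictors).
coefs <- h2o.coef(fit)
print(coefs)

# Predictions are returned as an H2OFrame; convert to a data.frame to view in R.
preds <- h2o.predict(fit, newdata = demo_hdf)
print(head(as.data.frame(preds)))
```

Note that predictions stay on the cluster as an H2OFrame until you call as.data.frame() on them.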
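Alternatively, if your data is in a CSV file, for example, you can use the h2o.importFile() function to import the data directly into the H2O cluster, and then you don't need to convert it from an R data.frame to an H2OFrame.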
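A minimal sketch of the CSV route follows. It assumes a local cluster running on the same machine as R, so the cluster can read the file path. The temporary file and the names csv_path and hdf_from_csv are hypothetical and exist only to keep the example self-contained.

```r
library(h2o)
h2o.init(nthreads = -1)

# Write a small illustrative CSV (first five rows of v1 and v100 from the question).
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    v1   = c(5.24823, 0.839, 3.57348, 1.47869, 2.75093),
    v100 = c(19.60784, 9.27047, 0.5523, 15.05735, 0.93231)
  ),
  csv_path,
  row.names = FALSE
)

# Import the file straight into the cluster; the result is already an H2OFrame.
hdf_from_csv <- h2o.importFile(path = normalizePath(csv_path, winslash = "/"))
print(class(hdf_from_csv))
print(dim(hdf_from_csv))
```

The path passed to h2o.importFile() is resolved by the H2O cluster, so with a remote cluster the file would have to be visible from the server side.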
Since you are new to H2O, I recommend taking a look at this Jupyter R notebook that I created to teach people how to use H2O. Welcome to H2O!