Random discrete value generation for an H2O object in R

I want to generate random discrete values for my H2O object (3 GB of data), as in the example below.

Example:

C1  d_rand       d_status
1   0.886581278  1
2   0.117570381  0
3   0.824350102  1
4   0.356774692  0
5   0.995249866  1

I wrote the following R code with h2o, but I cannot get the result I want.

> rand_num <- h2o.runif(sample_3gb, seed = 123)
> sample_3gb$d_rand = rand_num
> sample_3gb$d_rand
H2OFrame with 9227049 rows and 1 column

First 10 rows:
       d_rand
1  0.06254423
2  0.15162557
3  0.18380040
4  0.66398323
5  0.92064923
6  0.54746199
7  0.45642585
8  0.69650692
9  0.54063600
10 0.77103990
> sample_3gb$d_status = 1
> sample_3gb$d_status[sample_3gb$d_rand <= 0.3] <- 0
Error in `[<-`(`*tmp*`, sample_3gb$d_rand <= 0.3, value = 0) : 
  `i` must be missing or a numeric vector

Here are the details of my H2O cluster:

R is connected to H2O cluster:
    H2O cluster uptime:         3 minutes 57 seconds 
    H2O cluster version:        3.0.0.30 
    H2O cluster name:           H2O_60331 
    H2O cluster total nodes:    2 
    H2O cluster total memory:   9.58 GB 
    H2O cluster total cores:    24 
    H2O cluster allowed cores:  24 
    H2O cluster healthy:        TRUE 

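I think this is caused by a data type mismatch between R and H2O objects, i.e. R does not read the values of the H2O object as numeric. I run into the same problem with some other conditional operations.

I found the answer myself: build the column with ifelse() on the random values instead of assigning through a logical index.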

> rand_num <- h2o.runif(sample_3gb, seed = 123)
> sample_3gb[,"status"] <- ifelse(rand_num > 0.3, 1, 0)
> sample_3gb[,"status"]
H2OFrame with 9227049 rows and 1 column

First 10 rows:
   status
1       0
2       0
3       0
4       1
5       1
6       1
7       1
8       1
9       1
10      1
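
The thresholding logic can be checked on a small in-memory vector before running it on the cluster. This is a minimal sketch in base R only (no H2O involved); the names n, d_rand, d_status and example are illustrative and not part of the original code, and the 0.3 cutoff is taken from the question.

```r
# Base R stand-in for the H2O workflow: draw uniform random numbers,
# then map them to a discrete 0/1 status with the same 0.3 cutoff.
set.seed(123)   # fixed seed so the draw is reproducible
n <- 10         # illustrative row count (the real frame has 9227049 rows)

d_rand <- runif(n)                       # uniform values in [0, 1]
d_status <- ifelse(d_rand > 0.3, 1, 0)   # 0 when d_rand <= 0.3, otherwise 1

# Assemble a small table shaped like the example in the question.
example <- data.frame(C1 = seq_len(n), d_rand = d_rand, d_status = d_status)
print(example)
```

With a uniform draw, roughly 70% of the rows should end up with status 1 on a large sample, which gives a quick sanity check for the column built on the H2OFrame.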