PredictionIO: very large task size

I am using the recommendation engine and have modified my dataset. A few rows from my dataset are shown below:

4695::132687::5
4695::132688::5
4835::132689::5
3691::132690::5

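I can build, train, and deploy the engine successfully. However, when I run pio train, I get a lot of "very large task size" warnings. I don't think this is a serious problem, because I can deploy the engine and use the REST API without any issues. Part of the output is pasted below.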

[INFO] [Engine$] Data santiy check is on.
[INFO] [Engine$] com.marlabs.TrainingData does not support data sanity check. Skipping check.
[INFO] [Engine$] com.marlabs.PreparedData does not support data sanity check. Skipping check.
[WARN] [BLAS] Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
[WARN] [BLAS] Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
[WARN] [TaskSetManager] Stage 16 contains a task of very large size (611 KB). The maximum recommended task size is 100 KB.
[Stage 17:>                                                         (0 + 0) / 4][WARN] [TaskSetManager] Stage 17 contains a task of very large size (614 KB). The maximum recommended task size is 100 KB.
[WARN] [LAPACK] Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
[WARN] [LAPACK] Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK
[WARN] [TaskSetManager] Stage 18 contains a task of very large size (615 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 19 contains a task of very large size (615 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 20 contains a task of very large size (616 KB). The maximum recommended task size is 100 KB.
[Stage 21:>                                                         (0 + 0) / 4][WARN] [TaskSetManager] Stage 21 contains a task of very large size (617 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 22 contains a task of very large size (618 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 23 contains a task of very large size (619 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 24 contains a task of very large size (619 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 25 contains a task of very large size (620 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 26 contains a task of very large size (621 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 27 contains a task of very large size (622 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 28 contains a task of very large size (623 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 29 contains a task of very large size (624 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 30 contains a task of very large size (624 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 31 contains a task of very large size (625 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 32 contains a task of very large size (626 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 33 contains a task of very large size (627 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 34 contains a task of very large size (628 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 35 contains a task of very large size (628 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 36 contains a task of very large size (629 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 37 contains a task of very large size (630 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 38 contains a task of very large size (631 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 39 contains a task of very large size (632 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 40 contains a task of very large size (633 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 41 contains a task of very large size (633 KB). The maximum recommended task size is 100 KB.

Also, does the URL http://localhost:7070/events.json?accessKey=<Access_Key> show all events or only some of them? I imported more than 20k events, but the URL shows only about 50.

As described here, it should be safe to ignore this warning for ALS.

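If you are interested in digging into the details behind these warnings, you can start a Spark standalone cluster, then enable event logging, configure the log directory, and run "pio train". For example: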

pio train -- --master <YOUR spark master URL> --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=/your_spark_event_log_directory/event_log

You can then go to the Spark UI (http://localhost:8080/ by default) and look at the stage details of the job.
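For a quick look at how the task size changes from stage to stage without opening the Spark UI, the warnings can also be summarized from the captured training output. This is only a minimal sketch: the function name task_sizes is hypothetical, and it assumes the warning text has exactly the form shown in the question.

```python
import re

# Matches the TaskSetManager warning text as it appears in the question's log.
WARNING_RE = re.compile(
    r"Stage (\d+) contains a task of very large size \((\d+) KB\)"
)

def task_sizes(log_text):
    """Return a dict mapping stage number to reported task size in KB."""
    return {int(stage): int(kb) for stage, kb in WARNING_RE.findall(log_text)}

if __name__ == "__main__":
    sample = (
        "[WARN] [TaskSetManager] Stage 16 contains a task of very large size "
        "(611 KB). The maximum recommended task size is 100 KB.\n"
        "[Stage 17:>      (0 + 0) / 4][WARN] [TaskSetManager] Stage 17 contains "
        "a task of very large size (614 KB). The maximum recommended task size "
        "is 100 KB.\n"
    )
    for stage, kb in sorted(task_sizes(sample).items()):
        print(f"stage {stage}: {kb} KB")
```

Because the pattern is searched across the whole text, lines that start with a progress bar fragment are handled the same way as plain warning lines.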

It shows only some of them. Querying the event server at http://localhost:7070/events.json?accessKey=<Access_Key> returns 20 events by default. You can pass the limit parameter to get more events.

For example, to get 100 events, use "http://localhost:7070/events.json?accessKey=<Access_Key>&limit=100". See here for more details.
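The same query can be built and sent from a script. The sketch below is illustrative only: the helper names events_url and fetch_events are hypothetical, the base URL is the local default used above, and the access key is a placeholder you must replace with your own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def events_url(access_key, limit=None, base="http://localhost:7070"):
    """Build the events.json query URL, adding limit only when it is given."""
    params = {"accessKey": access_key}
    if limit is not None:
        params["limit"] = limit
    return f"{base}/events.json?{urlencode(params)}"

def fetch_events(access_key, limit=None, base="http://localhost:7070"):
    """Fetch events from a running event server and decode the JSON response."""
    with urlopen(events_url(access_key, limit, base)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Placeholder key; only the URL is printed here, nothing is sent.
    print(events_url("YOUR_ACCESS_KEY", limit=100))
```

As far as I recall, the event server also accepts limit=-1 to return all events, but treat that as an assumption and check it against the documentation for your PredictionIO version.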