Reinforcement Learning - Custom environment implementation in Java for Python RL framework

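I have a bunch of Java code that makes up my environment and agent. I want to use one of the Python reinforcement learning libraries (stable-baselines, tf-agents, rllib, etc.) to train a policy for the Java agent/environment, and then deploy the policy on the Java side for production. Is there a standard practice for incorporating other languages into Python RL libraries? I am considering one of the following solutions: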

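Wrap the Java env/agent code in a REST API, and implement a custom environment in Python that calls the API to step through the environment.
Use Py4J to call Java from Python and implement a custom environment.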

Which one is better? Are there any other approaches?
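For reference, the Py4J option could look roughly like the sketch below. It is only an illustration under assumptions: the Java entry point is assumed to expose hypothetical methods `reset()` and `step(action)`, with `step` returning an object that has `getObservation()`, `getReward()` and `isDone()`. None of these names come from Py4J itself. The Java process is also assumed to have started a Py4J `GatewayServer` on the default port.

```python
class Py4JEnv:
    """Gym-style wrapper that forwards calls to a Java object through Py4J.

    Assumption: the Java object has the hypothetical methods reset() and
    step(action); step returns an object with getObservation(),
    getReward() and isDone().
    """

    def __init__(self, java_env):
        self._java_env = java_env

    def reset(self):
        # Copy the Java collection into a plain Python list of floats.
        return [float(x) for x in self._java_env.reset()]

    def step(self, action):
        result = self._java_env.step(action)
        observation = [float(x) for x in result.getObservation()]
        reward = float(result.getReward())
        done = bool(result.isDone())
        return observation, reward, done, {}


def connect_to_java_env():
    """Connect to an already running Py4J GatewayServer (default port)."""
    # Imported lazily so the wrapper can be tested without py4j installed.
    from py4j.java_gateway import JavaGateway

    gateway = JavaGateway()
    return Py4JEnv(gateway.entry_point)
```

Every `reset` and `step` here is a round trip to the JVM, so the per-step overhead is worth measuring before committing to this route.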

Edit: I ended up going with the former: deploying a web server that encapsulates the environments. It works quite well for me. I am leaving the question open in case there is a better practice for handling this kind of situation.
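A minimal sketch of what the Python side of such a setup might look like, using only the standard library. The endpoint paths (`/reset`, `/step`) and the JSON field names are assumptions made up for this example; the real ones depend on how the Java web server is written.

```python
import json
import urllib.request


def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


class RemoteJavaEnv:
    """Gym-style wrapper around a Java environment exposed over HTTP.

    Hypothetical endpoints (assumptions, not part of any library):
      POST {base_url}/reset -> {"observation": [...]}
      POST {base_url}/step  -> {"observation": [...], "reward": 0.0,
                                "done": false, "info": {...}}
    """

    def __init__(self, base_url, post=post_json):
        self.base_url = base_url.rstrip("/")
        self._post = post  # injectable so the class can be tested offline

    def reset(self):
        reply = self._post(self.base_url + "/reset", {})
        return reply["observation"]

    def step(self, action):
        reply = self._post(self.base_url + "/step", {"action": action})
        return (
            reply["observation"],
            float(reply["reward"]),
            bool(reply["done"]),
            reply.get("info", {}),
        )
```

To plug this into a library such as stable-baselines, the class would additionally need to subclass `gym.Env` and declare `observation_space` and `action_space`, and the observations would typically be converted to NumPy arrays.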

The first approach is fine. RLlib implements the same idea with PolicyServerInput, which is used for external environments. https://github.com/ray-project/ray/blob/82465f9342cf05d86880e7542ffa37676c2b7c4f/rllib/env/policy_server_input.py

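So take a look at their implementation. It uses Python data serialization, so I would guess that writing your own implementation is the better way to connect to Java.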
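If you do write your own, one option is to exchange plain JSON, which any Java JSON library can read and write. The sketch below shows only the Python half of such a contract; the field names (`action`, `observation`, `reward`, `done`) are assumptions for illustration and are not taken from RLlib.

```python
import json

REQUIRED_FIELDS = {"observation", "reward", "done"}


def encode_action(action):
    """Serialize an action as UTF-8 JSON that a Java server can parse."""
    return json.dumps({"action": action}).encode("utf-8")


def decode_transition(body):
    """Parse a JSON transition sent by the Java side into a Python tuple."""
    message = json.loads(body.decode("utf-8"))
    missing = REQUIRED_FIELDS - set(message)
    if missing:
        raise ValueError("transition is missing fields: %s" % sorted(missing))
    return message["observation"], float(message["reward"]), bool(message["done"])
```

Validating the fields on arrival makes a mismatch between the Java and Python sides fail loudly instead of surfacing later as a confusing training error.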
