Calling Env State Tuple

Question

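I am new to OpenAI Gym and am currently running reinforcement learning (RL) in the Taxi environment. My research requires me to be able to call the state tuple (the "state space" in the taxi.py file) for some data mining and state-action pair operations.

Is there a function call for this?

For example: State(123) = (taxi_row, taxi_col, passenger_location, destination)

In RL, states and actions are represented in matrix form, with column = state and row = action.

In the source code (taxi.py) it is described as "state space is represented by (taxi_row, taxi_col, passenger_location, destination)".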

Answer 1

You can do it like this:

>>> import gym
>>> env = gym.make('Taxi-v2')
>>> from gym.envs.toy_text.taxi import TaxiEnv
>>> x = TaxiEnv()
>>> random_state = 123
>>> taxi_row, taxi_col, passenger_index, destination_index = x.decode(random_state)
>>> taxi_row
1
>>> taxi_col
1
>>> passenger_index
0
>>> destination_index
3
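
If you want every state tuple at once (for example, to build a lookup table for data mining), here is a minimal standalone sketch of the index arithmetic. It assumes the state index packs the components as 5 rows x 5 columns x 5 passenger slots x 4 destinations = 500 states, which is consistent with the decoded values for state 123 above. The function names `encode_state` and `decode_state` are my own and are not part of gym.

```python
# Standalone sketch of the index <-> tuple arithmetic; it does not import gym.
# Assumed sizes: 5 rows, 5 columns, 5 passenger slots, 4 destinations.

def encode_state(taxi_row, taxi_col, passenger_index, destination_index):
    """Pack the four components into a single integer in [0, 500)."""
    i = taxi_row
    i = i * 5 + taxi_col
    i = i * 5 + passenger_index
    i = i * 4 + destination_index
    return i


def decode_state(state):
    """Unpack a state integer into
    (taxi_row, taxi_col, passenger_index, destination_index)."""
    state, destination_index = divmod(state, 4)
    state, passenger_index = divmod(state, 5)
    taxi_row, taxi_col = divmod(state, 5)
    return taxi_row, taxi_col, passenger_index, destination_index


# Lookup table: all_states[s] is the tuple for state number s.
all_states = [decode_state(s) for s in range(500)]
print(all_states[123])  # (1, 1, 0, 3)
```

With a real environment you would normally just call the environment's own `decode`, as in the session above; the sketch is only meant to show what that call computes.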

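In your question you asked for passenger_location and destination, but the code I used returns passenger_index and destination_index. If you know the environment map, you can easily turn those indices into locations.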

Here is the simple map used in the environment:

MAP = [
    "+---------+",
    "|R: | : :G|",
    "| : | : : |",
    "| : : : : |",
    "| | : | : |",
    "|Y| : |B: |",
    "+---------+",
]

This map has four marked locations (R, G, Y, B). Knowing the indices, you can look up the passenger location and the destination like this:

Passenger locations:
- 0: R (red)
- 1: G (green)
- 2: Y (yellow)
- 3: B (blue)
- 4: in the taxi

Destinations:
- 0: R (red)
- 1: G (green)
- 2: Y (yellow)
- 3: B (blue)
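
To turn the indices into names and grid coordinates in code, something like the following should work. It is only a sketch: it reads the coordinates off the MAP shown above, assuming each grid cell sits at every second character between the border characters. The helper names (`find_locations`, `location_name`) are hypothetical and not part of gym.

```python
MAP = [
    "+---------+",
    "|R: | : :G|",
    "| : | : : |",
    "| : : : : |",
    "| | : | : |",
    "|Y| : |B: |",
    "+---------+",
]

# Index order taken from the lists above: 0=R, 1=G, 2=Y, 3=B.
LOCATION_NAMES = ["R", "G", "Y", "B"]


def find_locations(grid):
    """Scan the ASCII map and return {letter: (row, col)}."""
    coords = {}
    for row, line in enumerate(grid[1:-1]):      # skip top and bottom borders
        for col, ch in enumerate(line[1::2]):    # cells are at odd positions
            if ch in LOCATION_NAMES:
                coords[ch] = (row, col)
    return coords


def location_name(index):
    """Map a passenger or destination index to a readable name."""
    return "in taxi" if index == 4 else LOCATION_NAMES[index]


coords = find_locations(MAP)
# For state 123 decoded earlier: passenger_index = 0, destination_index = 3.
print(location_name(0), coords[location_name(0)])  # R (0, 0)
print(location_name(3), coords[location_name(3)])  # B (4, 3)
```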

Hope this answers your question!

python

reinforcement-learning

openai-gym