In OpenAI Gym environments, is the initial state random or specific?

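Is the initial state chosen randomly in reinforcement learning environments such as OpenAI Gym? In other words, does calling env.reset() produce a randomly chosen initial state or a specific one?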

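Usually yes, it is random. However, you should check the environment's source code to be sure, because each environment defines its own reset distribution. For example, in the classic-control implementations the pendulum's initial angle is drawn uniformly from [-π, π] and its angular velocity from [-1, 1] (a subset of the full velocity range), while for the mountain car the position is drawn uniformly from [-0.6, -0.4] and the velocity is always 0.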
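To make the mountain car case concrete, here is a minimal sketch of what such a reset does. It is a toy stand-in written with the standard library only, not Gym's actual code: the class name `ToyMountainCar` and its `seed` argument are assumptions made for illustration.

```python
import random


class ToyMountainCar:
    """Hypothetical stand-in that mimics only the mountain car reset rule."""

    def __init__(self, seed=None):
        # A private generator, so seeding it makes resets reproducible.
        self.rng = random.Random(seed)
        self.state = None

    def reset(self):
        # Position is drawn uniformly from [-0.6, -0.4]; velocity is always 0.
        position = self.rng.uniform(-0.6, -0.4)
        velocity = 0.0
        self.state = (position, velocity)
        return self.state


env = ToyMountainCar(seed=0)
starts = [env.reset() for _ in range(5)]
for position, velocity in starts:
    print(f"position={position:.3f}, velocity={velocity}")

# Two environments built with the same seed begin in the same state.
same_seed_a = ToyMountainCar(seed=42).reset()
same_seed_b = ToyMountainCar(seed=42).reset()
```

With a real Gym environment the same check applies: call env.reset() several times and compare the returned observations, or fix the seed if you need a repeatable starting state.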