
ML-Agents Cooperative Push Block not returning rewards

I am using the Cooperative Push Block environment (exported in order to use the Python API) with the latest stable version. The issue is that I'm not getting any reward, positive or negative: it is always 0. If I export the single-agent Push Block environment, I receive the rewards correctly. Below is the code I'm using, taken from the Colab example:

decision_steps, terminal_steps = env.get_steps(behavior_name)
if tracked_agent in decision_steps:
    episode_rewards += decision_steps[tracked_agent].reward

print('REWARD', decision_steps.reward) # Always 0
# Each decision_steps[tracked_agent].reward also returns 0

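According to the documentation, I should receive either a small negative penalty (-0.0001) or a positive reward of +1, +2, or +3. Even when the agents push a block by chance, I still receive 0 as the reward.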


I received this answer from the Unity ml-agents GitHub issues section:

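DecisionStep also has a group_reward field, which is separate from the reward field. The group rewards given to the Cooperative Pushblock agents should be there. Sorry that the Colab doesn't make this clear; I'll update it.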
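
To make the answer concrete, here is a minimal sketch of how the tracking code could add the group reward to the individual one. The helper name combined_reward is hypothetical, and the step object is a stand-in with made-up values so the snippet runs without Unity. It assumes only that each per-agent step exposes reward and group_reward, as the answer above describes.

```python
from types import SimpleNamespace


def combined_reward(step):
    """Return the individual reward plus the group reward for one agent step.

    `step` is assumed to expose `reward` and `group_reward`, as described
    for DecisionStep in the answer above. This helper is hypothetical and
    is not part of the ML-Agents API.
    """
    return float(step.reward) + float(step.group_reward)


# Stand-in for decision_steps[tracked_agent]; the values are invented
# purely for illustration (individual reward 0, group reward +1).
fake_step = SimpleNamespace(reward=0.0, group_reward=1.0)

episode_rewards = 0.0
episode_rewards += combined_reward(fake_step)
print("REWARD", episode_rewards)
```

In the original loop, this would presumably mean replacing decision_steps[tracked_agent].reward with combined_reward(decision_steps[tracked_agent]), and doing the same for the agent's entry in terminal_steps when the episode ends.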