How does DeepMind reduce the computation of Q-values for Atari games?

We know that Q-learning requires a lot of computation:

for a game AI, it needs many more Q-values than a tic-tac-toe (OX) game or even a Go game.

How are all of these Q-values computed?

Thanks.

MCTS does not actually reduce the computation of Q-values at all.

Even a very simple Atari game AI needs far more than 3^(19x19) Q-values.
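
To see why a lookup table is hopeless, here is a quick back-of-the-envelope check (a minimal sketch in Python; the Atari bound assumes the standard DQN preprocessing of four stacked 84x84 8-bit grayscale frames, which is an assumption, not something stated in the question):

```python
import math

# Naive upper bounds on state counts (illustrative assumptions: 3 symbols
# per intersection for Go; four stacked 84x84 8-bit frames for Atari).
go_digits = (19 * 19) * math.log10(3)            # Go: ~10^172 states
atari_digits = (84 * 84 * 4) * math.log10(256)   # Atari pixels: ~10^67970 states
print(f"Go upper bound:    ~10^{go_digits:.0f} states")
print(f"Atari upper bound: ~10^{atari_digits:.0f} states")
# A Q-table over either state space cannot fit in memory, which is the
# motivation for approximating Q with a function approximator instead.
```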

Take a look at deep Q-networks (DQN); they address exactly this problem.

We could represent our Q-function with a neural network that takes the state (four game screens) and an action as input and outputs the corresponding Q-value. Alternatively, we could take only the game screens as input and output the Q-value for each possible action. This approach has the advantage that if we want to perform a Q-value update or pick the action with the highest Q-value, we only have to do one forward pass through the network and immediately have the Q-values for all actions available.

https://neuro.cs.ut.ee/demystifying-deep-reinforcement-learning/
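
As an illustration of that second architecture, here is a minimal sketch assuming PyTorch (the layer sizes follow the original DQN paper's convolutional stack; `num_actions=18` and the random input tensor are just for demonstration):

```python
import torch
import torch.nn as nn

class DQN(nn.Module):
    """Maps a stack of four 84x84 game screens to one Q-value per action."""

    def __init__(self, num_actions: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, num_actions),  # one output per action
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))

net = DQN(num_actions=18)           # Atari exposes at most 18 discrete actions
state = torch.rand(1, 4, 84, 84)    # a batch of one stacked observation
q_values = net(state)               # shape (1, 18): every action in one pass
greedy_action = q_values.argmax(dim=1)
```

A single forward pass yields the Q-values for every action at once, so both picking the greedy action and computing the max over actions for the Q-learning target cost one network evaluation instead of one per action.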