Description: Combines deep Q-learning with model-based methods to solve continuous-action problems.
Model-free reinforcement learning has been successfully applied to a range of challenging problems, and has recently been extended to handle large neural network policies and value functions. However, the sample …
Uploaded by: oldxacorn | Size: 1 MB