A multi-step on-policy deep reinforcement learning method assisted by off-policy policy evaluation

Huaqing Zhang, Hongbin Ma*, Bemnet Wondimagegnehu Mersha, Ying Jin

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Fingerprint

Explore the research topics of 'A multi-step on-policy deep reinforcement learning method assisted by off-policy policy evaluation'. Together they form a unique fingerprint.

Computer Science

Social Sciences

Chemical Engineering

Engineering