Key points: This paper proposes the Policy Transfer Framework (PTF), an algorithm for policy transfer. The main idea is to learn automatically when to use which source policy as the learning target for the target policy, and when to terminate that source policy and switch to another one. To do this, PTF models multi-policy transfer as an option learning problem: during target task learning it adaptively selects a suitable source policy and uses it as a complementary optimization objective of the target policy.
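The three pieces described above (choosing a source policy, deciding when to stop following it, and adding it as an extra objective) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the epsilon-greedy selection rule, the fixed termination probability, and the cross-entropy form of the extra term are all assumptions made for illustration. In PTF the option values and termination function are learned.

```python
import math
import random


def select_option(option_values, epsilon, rng):
    """Pick a source policy (option) index, epsilon-greedy over option values.

    Assumption: option_values[i] estimates how useful source policy i is
    in the current state. In practice this would be a learned function.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(option_values))
    return max(range(len(option_values)), key=lambda i: option_values[i])


def should_terminate(termination_prob, rng):
    """Stop following the current source policy with the given probability.

    Assumption: termination_prob stands in for a learned termination function.
    """
    return rng.random() < termination_prob


def cross_entropy(source_probs, target_probs, eps=1e-12):
    """Cross-entropy between source and target action distributions."""
    return -sum(p * math.log(q + eps) for p, q in zip(source_probs, target_probs))


def combined_loss(rl_loss, source_probs, target_probs, weight):
    """Target policy objective: the usual RL loss plus a weighted term that
    pulls the target policy toward the selected source policy.

    Assumption: cross-entropy and a scalar weight are one plausible choice.
    """
    return rl_loss + weight * cross_entropy(source_probs, target_probs)
```

In a training loop, one would call `select_option` whenever no source policy is active or `should_terminate` returns `True`, then use `combined_loss` for the target policy update. Lowering `weight` over time would let the target policy rely less on the source policies as it improves, though whether and how to schedule it is a design choice this sketch does not settle.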