**Cross-Domain Imitation Learning via Optimal Transport**

Arnaud Fickinger, Samuel Cohen, Stuart Russell, Brandon Amos

Paper

![](images/gromov.png)

*__tldr__: We provably achieve strong transfer in non-trivial continuous control domains by minimizing the Gromov-Wasserstein distance with deep reinforcement learning.*

(#) Videos of Learned Policies

Below, we visualize examples of the behavior learned by our method. The videos on the left show optimal trajectories in the expert's domain. The videos on the right show transferred behaviors in the agent's domain, learned from a single demonstration in the expert's domain and without any external reward. Videos of different agents correspond to different seeds.

**From pendulum to cartpole**

![Expert](videos/pendulum_expert.mp4 width="100%") ![Agent](videos/cartpole_imitation.mp4 width="100%")

**From cheetah to walker**

![Expert](videos/cheetah_expert_floor.mp4 width="100%") ![Agent](videos/walker_forward_imitation.mp4 width="100%") ![Agent](videos/walker_backward_imitation.mp4 width="100%")

**Isometric mazes**

![Expert](videos/maze_expert.mp4 width="150%") ![Agent](videos/maze_imitation.mp4 width="150%")

--------------------------
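At the core of the method is the Gromov-Wasserstein distance between the expert's and the agent's trajectories: it compares the *intra-domain* distance structure of each trajectory, so the two state spaces never need to be aligned or embedded in a common space. Below is a minimal sketch (not the authors' code) of this comparison using the POT library; the random trajectory data, state dimensions, and uniform weights are illustrative assumptions.

```python
# Minimal sketch: Gromov-Wasserstein discrepancy between two trajectories
# that live in different state spaces, using POT (pip install pot).
# The trajectories here are random placeholders, not real rollouts.
import numpy as np
import ot

rng = np.random.default_rng(0)

# Expert trajectory in a 3-D state space, agent trajectory in a 5-D state space.
expert_states = rng.normal(size=(100, 3))
agent_states = rng.normal(size=(120, 5))

# GW only needs pairwise distances *within* each trajectory, so no
# correspondence between the two state spaces is required.
C_expert = ot.dist(expert_states, expert_states)  # squared Euclidean by default
C_agent = ot.dist(agent_states, agent_states)

# Uniform weights over the states of each trajectory.
p = ot.unif(len(expert_states))
q = ot.unif(len(agent_states))

# Scalar GW discrepancy; in the paper, a quantity of this form is what the
# agent's policy (approximately) minimizes via reinforcement learning.
gw = ot.gromov.gromov_wasserstein2(C_expert, C_agent, p, q, 'square_loss')
print(f"Gromov-Wasserstein discrepancy: {gw:.4f}")
```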