Zero-Shot Adaptation of Behavioral Foundation Models to Unseen Dynamics

1AIRI, 2Skoltech, 3MIPT, 4MSU, 5Innopolis University

TLDR

Zero-shot reinforcement learning aims to deploy agents in new environments without test-time fine-tuning. Behavioral Foundation Models (BFMs) offer a promising framework, but existing methods fail under dynamics shifts due to interference: they average over different environment dynamics, which entangles the learned policy representations.

We identify this limitation in Forward-Backward (FB) representations, analyze it theoretically, and propose two solutions: Belief-FB (BFB), which infers the latent dynamics context via belief estimation and conditions the model on it, and Rotation-FB (RFB), which further disentangles the policy space by adjusting the prior over \(z_{FB}\).
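As a rough illustration of the Forward-Backward machinery this builds on, the sketch below shows standard FB zero-shot reward inference, where \(Q(s,a) \approx F(s,a,z)^\top z\) and the task latent is estimated as \(z_r = \mathbb{E}[r(s)\,B(s)]\). The network sizes, names, and dimensions are illustrative assumptions, not the paper's architecture; BFB would additionally condition these networks on an inferred dynamics belief, and RFB would adjust the prior from which \(z\) is drawn.

```python
# Minimal sketch of FB zero-shot reward inference (illustrative, not the paper's exact setup).
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, Z_DIM = 4, 2, 8

class Backward(nn.Module):
    """B(s'): embeds states into the latent task space."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, Z_DIM))
    def forward(self, s):
        return self.net(s)

class Forward(nn.Module):
    """F(s, a, z): forward embedding conditioned on the task latent z."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM + Z_DIM, 64), nn.ReLU(), nn.Linear(64, Z_DIM))
    def forward(self, s, a, z):
        return self.net(torch.cat([s, a, z], dim=-1))

B, F_net = Backward(), Forward()

# Zero-shot task inference: given rewards labelled on a batch of states,
# the task latent is z_r = E[r(s) * B(s)] (here projected onto the prior's sphere).
states = torch.randn(256, STATE_DIM)
rewards = torch.randn(256, 1)                 # stand-in for a test-time reward signal
z_r = (rewards * B(states)).mean(dim=0)
z_r = Z_DIM ** 0.5 * z_r / z_r.norm()

# Q-value of a candidate action under the inferred task: Q(s, a) = F(s, a, z_r)^T z_r.
s, a = torch.randn(1, STATE_DIM), torch.randn(1, ACTION_DIM)
q = (F_net(s, a, z_r.unsqueeze(0)) * z_r).sum(dim=-1)
print(q.item())
```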

Main Figure
Our method achieves strong zero-shot performance across both seen and unseen dynamics, significantly outperforming prior approaches.

Interference Problem

We found that zero-shot failures in BFMs stem from interference: training on mixed dynamics entangles policies, because the model cannot distinguish between environment variations.

Belief Estimation

We mitigate interference by inferring latent dynamics from trajectories. Conditioning on this belief lets BFMs separate policies and generalize across environments.
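A minimal sketch of this belief-conditioning idea is shown below, assuming a simple permutation-invariant encoder over context transitions; the architecture, dimensions, and names are illustrative assumptions rather than the exact Belief-FB implementation.

```python
# Sketch: summarize recent (s, a, s') transitions into a dynamics belief and
# feed it to the forward network alongside z (illustrative, not the paper's code).
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, Z_DIM, BELIEF_DIM = 4, 2, 8, 16

class BeliefEncoder(nn.Module):
    """Maps a set of (s, a, s') transitions to a belief over the dynamics."""
    def __init__(self):
        super().__init__()
        self.phi = nn.Sequential(
            nn.Linear(2 * STATE_DIM + ACTION_DIM, 64), nn.ReLU(), nn.Linear(64, BELIEF_DIM))
    def forward(self, s, a, s_next):
        # Averaging over the transition set makes the belief order-invariant.
        return self.phi(torch.cat([s, a, s_next], dim=-1)).mean(dim=0)

class ConditionedForward(nn.Module):
    """F(s, a, z, belief): forward embedding conditioned on the inferred dynamics."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM + Z_DIM + BELIEF_DIM, 64),
            nn.ReLU(), nn.Linear(64, Z_DIM))
    def forward(self, s, a, z, belief):
        belief = belief.unsqueeze(0).expand(s.shape[0], -1)
        return self.net(torch.cat([s, a, z, belief], dim=-1))

# Usage: infer the belief from a short context collected in the test environment,
# then evaluate the conditioned forward embedding under that belief.
enc, f_cond = BeliefEncoder(), ConditionedForward()
ctx_s, ctx_a, ctx_s2 = torch.randn(32, STATE_DIM), torch.randn(32, ACTION_DIM), torch.randn(32, STATE_DIM)
belief = enc(ctx_s, ctx_a, ctx_s2)

s, a, z = torch.randn(1, STATE_DIM), torch.randn(1, ACTION_DIM), torch.randn(1, Z_DIM)
print(f_cond(s, a, z, belief).shape)  # torch.Size([1, 8])
```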

Learned Q-functions

Our findings show that vanilla Forward-Backward fails to adapt its Q-functions to changed dynamics, unlike our method, which produces layout-aware Q-functions and policies that avoid obstacles.

Different Layouts

Environments

We evaluate our method in both discrete and continuous settings with procedurally generated layouts. To test generalization, a subset of environments is held out entirely during training and used only at test time. Here we present some of the layouts.

Different Layouts
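The held-out protocol above can be pictured as a split over procedural-generation seeds: some layouts are used for training, the rest are reserved for test time only. The sketch below is illustrative; the seed count and split ratio are arbitrary choices, not the values used in the paper.

```python
# Illustrative split of procedurally generated layouts into train and held-out test sets.
import random

def split_layouts(num_layouts=100, test_fraction=0.2, seed=0):
    rng = random.Random(seed)
    seeds = list(range(num_layouts))
    rng.shuffle(seeds)
    cut = int(num_layouts * (1 - test_fraction))
    return seeds[:cut], seeds[cut:]

train_layouts, test_layouts = split_layouts()
assert not set(train_layouts) & set(test_layouts)  # test layouts are fully held out
print(len(train_layouts), len(test_layouts))  # 80 20
```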

Citation:

@article{bobrin2025zero,
  title={Zero-Shot Adaptation of Behavioral Foundation Models to Unseen Dynamics},
  author={Bobrin, Maksim and Zisman, Ilya and Nikulin, Alexander and Kurenkov, Vladislav and Dylov, Dmitry},
  journal={arXiv preprint arXiv:2505.13150},
  year={2025}
}