Abstract: Existing model-based value expansion (MVE) methods typically leverage a world model for value estimation with a fixed rollout horizon to assist policy learning. However, a proper horizon ...