OpenViGA: Video Generation for Automotive Driving Scenes by Streamlining and Fine-Tuning Open Source Models with Public Data
Björn Möller, Zhengyang Li, Malte Stelzer, Thomas Graave, Fabian Bettels, Muaaz Ataya and Tim Fingscheidt
This repo contains the official implementation of OpenViGA, an open video generation system for automotive driving scenes.
OpenViGA consists of three components: an image tokenizer that encodes input frames into a latent representation of discrete tokens, a world model that predicts the subsequent latent image tokens, and a video decoder that generates the output frames.
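Since the full implementation is not yet published, the data flow through the three components can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the class names `ToyTokenizer`, `ToyWorldModel`, and `ToyDecoder`, the function `generate_video`, the codebook size, and the trivial quantization and "dynamics" rules are hypothetical stand-ins and do not reflect the actual OpenViGA models or API. The sketch only shows the order of operations: encode context frames to discrete tokens, predict the tokens of later frames autoregressively, then decode the predicted tokens back to frames.

```python
from typing import List

Frame = List[int]   # toy stand-in for an image: flat list of pixel values in 0..255
Tokens = List[int]  # discrete latent tokens for one frame

VOCAB_SIZE = 16     # hypothetical codebook size, chosen only for this sketch


class ToyTokenizer:
    """Hypothetical stand-in for the image tokenizer."""

    def encode(self, frame: Frame) -> Tokens:
        # Quantize each pixel into one of VOCAB_SIZE discrete bins.
        return [pixel * VOCAB_SIZE // 256 for pixel in frame]


class ToyWorldModel:
    """Hypothetical stand-in for the world model."""

    def predict_next(self, history: List[Tokens]) -> Tokens:
        # Placeholder dynamics: shift every token of the latest frame by one.
        # A real world model would be a learned sequence model.
        return [(token + 1) % VOCAB_SIZE for token in history[-1]]


class ToyDecoder:
    """Hypothetical stand-in for the video decoder."""

    def decode(self, tokens: Tokens) -> Frame:
        # Map each token back to the centre of its quantization bin.
        bin_width = 256 // VOCAB_SIZE
        return [token * bin_width + bin_width // 2 for token in tokens]


def generate_video(context_frames: List[Frame], num_new_frames: int) -> List[Frame]:
    """Encode context frames, roll the world model forward, decode the result."""
    tokenizer, world_model, decoder = ToyTokenizer(), ToyWorldModel(), ToyDecoder()

    # 1) Tokenize the input frames into discrete latent tokens.
    history = [tokenizer.encode(frame) for frame in context_frames]

    # 2) Autoregressively predict the tokens of the subsequent frames.
    predicted: List[Tokens] = []
    for _ in range(num_new_frames):
        next_tokens = world_model.predict_next(history)
        history.append(next_tokens)
        predicted.append(next_tokens)

    # 3) Decode the predicted tokens into output frames.
    return [decoder.decode(tokens) for tokens in predicted]


if __name__ == "__main__":
    context = [[0, 128, 255]]  # one tiny three-pixel "frame"
    for frame in generate_video(context, num_new_frames=2):
        print(frame)
```

In this sketch each predicted frame is decoded independently for simplicity; a video decoder may instead process several frames jointly, which this toy example does not attempt to model.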
This repository is under construction.
Full implementation and documentation will be available soon.
@misc{moller_openviga_2025,
title = {{OpenViGA}: Video Generation for Automotive Driving Scenes by Streamlining and Fine-Tuning Open Source Models with Public Data},
author = {Möller, Björn and Li, Zhengyang and Stelzer, Malte and Graave, Thomas and Bettels, Fabian and Ataya, Muaaz and Fingscheidt, Tim},
year = {2025},
month = sep,
eprint = {2509.15479},
archivePrefix = {arXiv},
primaryClass = {cs.CV},
note = {arXiv preprint arXiv:2509.15479}
}