1st Foundation Models for Autonomous Driving Workshop

The Workshop

The integration of foundation models into Autonomous Driving (AD) systems has the potential to revolutionize the field. Built on architectures with significant capacity, foundation models are capable of utilizing vast data collections through self-learning approaches. This enables them to achieve remarkable performance across diverse sets of tasks in different domains. Modern models such as DALL-E, CLIP, and SAM stand to be cornerstones for a wide range of solutions in the AD domain and have initiated new research directions.

The application of AD systems provides the opportunity to collect large amounts of data, as they are equipped with high-quality, multi-modal sensor suites. While manual annotation of this data is costly and time-intensive, foundation models not only make it possible to tap into this data source but can also adapt rapidly to new tasks. Among other challenges, integrating different sensor modalities, considering the spatial and temporal nature of the data, and determining how to use it for planning and prediction remain open research questions. Additionally, issues like limited availability of computational resources and the importance of safety considerations are inherent to AD platforms and need to be addressed.

Scope and Topics

The topics of interest of the workshop include, but are not limited to:

Trends in model architectures: Examine the latest advancements in large vision models, multi-modal foundation models, and their customization for the AD domain.
Adaptation to AD sensor modalities: Typical perception data is spatially and temporally distributed and needs to be integrated with other inputs, which may include map-based information, language, and more.
Usage of large data sources: Explore the vast data streams generated by AD and their incorporation into the training and adaptation of foundation models.
Self-learning methods for AD: Investigate and propose different self-learning methods, such as contrastive learning and reconstruction-based learning, for their use in the field of AD.
Identification of AD tasks: Take a closer look at the best practices for adding known and new downstream tasks to the foundation model and how they are trained.
Interpretability and trust: Investigate techniques for understanding and explaining foundation model decisions and incorporating safety guarantees.

Program

Time	Event
08:30	Greetings and Introduction
08:35	Zhixiang Wei (University of Science and Technology of China): "Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Generalized Urban Scene Segmentation"
08:50	Yun Li (University of Tokyo): "Large Language Models for Human-like Autonomous Driving: A Survey"
09:20	Rares Ambrus (Toyota Research Institute): "Visual Foundation Models for Embodied Applications"
09:50	Coffee break
10:20	Aleksandr Petiushko (Gatik): "Middle Mile and Foundation Models"
10:50	Gilles Puy (Valeo.ai): "Leveraging image foundation models to pretrain lidar networks"
11:20	Short break
11:30	Long Chen (Wayve): "Building Foundation Models for Autonomous Driving"
12:00	Royden Wagner (Karlsruhe Institute of Technology): "Representation Learning for Motion Forecasting"

Contributions

We welcome researchers in the field to submit papers to be presented in pitch talks. Submitted manuscripts can be at most 4 pages (excluding references), formatted according to ITSC standards using the Paper Template (two-column format) downloadable from the IEEE ITSC 2024 website. We accept previously published research and double submissions from the main conference. However, please note that submissions will not be listed in the official conference proceedings. Submissions will be reviewed and selected based on their originality, relevance to the workshop topics, contributions, technical clarity, and presentation.

We accept submissions through OpenReview.

Dates

Submission Deadline: Sep 11, 2024
Author Notification: Sep 14, 2024
Workshop: Sep 24, 2024

We aim to review papers on a rolling basis and will notify authors as soon as possible. We look forward to receiving your submissions!

P.S. New OpenReview profiles created with an institutional email are activated automatically; profiles created without one undergo a moderation process that can take up to two weeks. Contact us if this poses an issue.