The Workshop
The integration of foundation models into Autonomous Driving (AD) systems has the potential to revolutionize the field. Built on high-capacity architectures, foundation models can exploit vast data collections through self-supervised learning. This enables them to achieve remarkable performance across diverse tasks in different domains. Modern models such as DALL-E, CLIP, and SAM stand to become cornerstones for a wide range of solutions in the AD domain and have already opened new research directions.
Deployed AD systems offer the opportunity to collect large amounts of data, as they are equipped with high-quality, multi-modal sensor suites. While manual annotation of this data is costly and time-intensive, foundation models make it possible not only to tap into this data source but also to adapt rapidly to new tasks. Several research questions remain open, among them how to integrate different sensor modalities, how to account for the spatial and temporal nature of the data, and how to use it for planning and prediction. Additionally, issues such as limited computational resources and the importance of safety considerations are inherent to AD platforms and need to be addressed.