Towards Federated Fleet Learning Leveraging Unannotated Data

Thesis Project created by Alexander Viala Bellander


In the quest for safer and more efficient autonomous driving systems, the artificial intelligence community grapples with a significant challenge: the requirement for vast amounts of labelled data, which often cannot leave the client device due to privacy and regulatory concerns. This issue formed the crux of our Master's thesis at Zenseact, "Towards Federated Fleet Learning Leveraging Unannotated Data".

Under the guidance of our supervisors, Johan Östman, PhD, and Mina Alibeigi, PhD, we explored Federated Learning (FL). FL is a collective approach to learning in which machine learning models are built across multiple data sources without centralising the data. Many studies that apply FL to autonomous driving assume that ample labelled data is available on client devices. In practice, client data in autonomous driving applications is largely unlabelled.
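To make the idea of learning without centralising data concrete, here is a minimal sketch of one common aggregation rule, federated averaging. It is an illustration only: this page does not state which aggregation rule the thesis used, and the names federated_average, client_weights and client_sizes are hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Average client parameters, weighted by each client's sample count.

    Only model parameters reach the server; the raw data stays on the clients.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Two vehicles with 1 and 3 local samples respectively.
global_weights = federated_average(
    [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}],
    [1, 3],
)
```

In a full training loop the server would send global_weights back to the vehicles, each vehicle would train locally for a few steps, and the cycle would repeat.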

Recognising this gap, our study applied semi-supervised learning to ego-road segmentation and imitation learning to trajectory prediction, both in the context of FL. These approaches address the scarcity of labelled data that is prevalent in many autonomous driving systems.

Our results showed the potential of FL in autonomous driving when it leverages on-vehicle-generated labels or employs semi-supervised or unsupervised learning methods. They underscore the viability of FL within the autonomous driving landscape, even when labelled data is scarce.

FL also opens the possibility for integrating privacy-preserving methodologies, a key concern when handling data from client cars. While FL doesn't inherently ensure privacy, the structure of its framework can be leveraged to implement privacy-preserving measures, making it a more feasible and attractive solution for real-world autonomous driving applications.
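As one hedged illustration of a privacy-preserving measure that fits into the FL structure, a client can clip its model update and add random noise before sending it to the server. This is a sketch under assumptions, not the approach taken in the thesis: the function privatise_update and its parameters are hypothetical, and the noise level below is not calibrated to any formal privacy guarantee.

```python
import math
import random
from typing import List, Optional


def privatise_update(update: List[float],
                     clip_norm: float = 1.0,
                     noise_std: float = 0.1,
                     rng: Optional[random.Random] = None) -> List[float]:
    """Clip an update to a maximum L2 norm, then add Gaussian noise.

    Clipping bounds how much any single client can influence the model;
    the noise masks the exact values that leave the vehicle.
    """
    rng = rng or random.Random()
    norm = math.sqrt(sum(v * v for v in update))
    # Scale down only if the update is longer than clip_norm.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * scale + rng.gauss(0.0, noise_std) for v in update]


# With the noise switched off, only the clipping step is visible.
clipped = privatise_update([3.0, 4.0], clip_norm=1.0, noise_std=0.0)
```

Turning this into a differential privacy guarantee would require choosing noise_std relative to clip_norm and tracking a privacy budget across rounds, which is outside the scope of this sketch.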

In conclusion, our study not only highlighted the potential of FL in addressing key challenges within the autonomous driving sector but also called for continued innovation and research into utilising unlabelled data effectively. With its potential and versatility, FL can be a significant game-changer in the race towards fully autonomous vehicles.


This thesis was carried out in collaboration with Zenseact and uses Zenseact's Open Dataset.


Connect with us:
Yazan Ghafir
Alexander Viala Bellander


Our Master's thesis set out with the following key objectives:

Exploring Federated Learning (FL) in the Context of Autonomous Driving: Our primary aim was to understand the potential of FL in the context of autonomous driving. Given the inherently decentralised nature of vehicular data, FL provides an intriguing proposition for learning across fleet data while preserving data privacy.

Addressing the Scarcity of Labelled Data: A key challenge in applying FL in autonomous driving lies in the limited availability of labelled data on client devices. We aimed to investigate how this problem could be mitigated, paving the way for more effective utilisation of FL in autonomous driving.

Leveraging Semi-Supervised Learning Techniques: We sought to apply semi-supervised learning for ego-road segmentation and imitation learning for trajectory prediction within the FL framework. The objective was to maximise the utility of available unlabelled data, turning a potential roadblock into an advantage.

Evaluating the Effectiveness of FL with Unlabelled Data: Our study intended to assess the effectiveness of FL when using on-vehicle generated labels or employing semi-supervised or unsupervised learning methods. The goal was to evaluate whether FL can still provide meaningful results in the face of scarce labelled data.

Highlighting the Potential for Privacy-Preserving Methodologies within FL: While FL doesn't inherently ensure privacy, we wanted to highlight its potential for integrating privacy-preserving measures. This focus was driven by the importance of data privacy in handling data from client vehicles.

Through these objectives, our study sought to shed light on the unique challenges and opportunities associated with applying FL in autonomous driving. Our hope was to contribute to the broader conversation around the potential for FL to transform this rapidly advancing field.
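To illustrate how labels can be generated on the vehicle for trajectory prediction, the sketch below builds training pairs from a logged ego path: the positions the vehicle actually drove next serve as the target, so no human annotation is needed. This is an assumption about how such labels could be formed, not a description of the thesis pipeline; make_trajectory_samples, history and horizon are hypothetical names.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def make_trajectory_samples(ego_positions: List[Point],
                            history: int = 2,
                            horizon: int = 3) -> List[Tuple[List[Point], List[Point]]]:
    """Slide a window over a logged ego path.

    Each sample pairs `history` past positions (model input) with the
    `horizon` positions that followed (the self-generated label).
    """
    samples = []
    for t in range(history, len(ego_positions) - horizon + 1):
        past = ego_positions[t - history:t]
        future = ego_positions[t:t + horizon]
        samples.append((past, future))
    return samples


# A short straight-line drive logged at six time steps.
path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
samples = make_trajectory_samples(path, history=2, horizon=3)
```

In a real system the model input would typically include sensor data such as camera images rather than past positions alone; positions are used here only to keep the example self-contained.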


In our work on Federated Learning for autonomous driving, we developed new approaches to the challenges we faced. Here is how we addressed our primary objectives:

Harnessing the Power of Unlabelled Data: To overcome the scarcity of labelled data, we made use of on-vehicle generated labels, coupled with semi-supervised and unsupervised learning methods. This allowed us to utilise the vast amounts of data produced by vehicles in a practical way, demonstrating that FL can be effective even in environments where labelled data is sparse.

Using Semi-Supervised Learning Techniques: We applied semi-supervised learning to ego-road segmentation and imitation learning to trajectory prediction. This unlocked the wealth of unlabelled data at our disposal. By extrapolating from limited labelled data, we were able to make accurate predictions, illustrating the effectiveness of these techniques in the context of FL.

Privacy-Preserving Approaches: Although FL does not inherently guarantee privacy, it opens up opportunities for implementing privacy-preserving techniques. We highlighted these possibilities in our work, contributing to the ongoing discourse around data privacy in FL and autonomous driving.

Validating the Effectiveness of FL: Through our work, we discovered the compelling potential of FL in autonomous driving. When using on-vehicle generated labels or employing semi-supervised or unsupervised learning methods, FL showed promise as an effective learning methodology for autonomous driving, even in the absence of abundant labelled data.

In addressing these challenges, we have underlined the viability of FL as a practical methodology in autonomous driving. While hurdles remain, our work contributes to a growing body of knowledge demonstrating that with innovative approaches and careful implementation, FL can be a powerful tool in the field of autonomous vehicles.
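As a concrete example of a semi-supervised technique for segmentation, the sketch below shows confidence-thresholded pseudo-labelling: a model trained on the limited labelled data predicts on unlabelled images, and only its confident pixels are kept as training targets. This is one common technique chosen for illustration; this page does not specify which semi-supervised method the thesis used, and pseudo_label, IGNORE and threshold are hypothetical names.

```python
from typing import List

IGNORE = -1  # pixels the model is unsure about are excluded from the loss


def pseudo_label(road_probs: List[List[float]],
                 threshold: float = 0.9) -> List[List[int]]:
    """Turn per-pixel ego-road probabilities into pseudo-labels.

    1 = confidently ego-road, 0 = confidently not ego-road, IGNORE = uncertain.
    """
    labels = []
    for row in road_probs:
        labelled_row = []
        for p in row:
            if p >= threshold:
                labelled_row.append(1)
            elif p <= 1.0 - threshold:
                labelled_row.append(0)
            else:
                labelled_row.append(IGNORE)
        labels.append(labelled_row)
    return labels


# One image row with a confident road pixel, an uncertain one and a confident non-road one.
labels = pseudo_label([[0.95, 0.5, 0.02]], threshold=0.9)
```

In an FL setting each vehicle could compute such pseudo-labels locally with the current global model, so neither the images nor the labels would need to leave the client.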

Outcomes and Future Directions

Our endeavour has produced insightful findings, contributing to the growing body of knowledge on Federated Learning in the context of autonomous driving. We have validated the potential of FL when utilising on-vehicle generated labels or employing semi-supervised or unsupervised learning methods. This emphasises FL's promise as a robust learning methodology, even when faced with sparse labelled data.

We are thrilled to have our work open-sourced on Zenseact's GitHub, allowing others to learn from and build upon our research. The repository will be polished and made more accessible by July.

In a significant affirmation of our work, we're proud to announce that our research has in part inspired a large-scale FL project, which has been awarded 5.5 MSEK in funding from Vinnova. This team effort, including participants from Volvo Cars, Zenseact, and AI Sweden, is a testament to the promising future of Federated Learning. The project is a potential game-changer, contributing to over six of the UN's Sustainable Development Goals and offering a way forward for data privacy, regulatory compliance, and cost efficiency in the realm of autonomous vehicles.

Our journey has been both challenging and rewarding, and we are excited to see where these initial steps into the world of Federated Learning in autonomous driving will lead us next.


Better Customer Experience
Optimization, Vision
CNN, DNN, Federated, Machine Learning, Multimodal, Self/Unsupervised, Transformer