Efficient deep learning in space. Knowledge distillation and optimization of resource usage in a satellite
Chalmers University of Technology, Department of Mechanics and Maritime Sciences
Authors: Ebaa Asaad and Sara Larsson
Abstract
The development of micro-satellites and machine learning (ML) has accelerated sharply in recent years, which has unlocked new possibilities in the field of Earth observation. One application is the tracking of maritime vessels, since current tracking systems such as the automatic identification system (AIS) can be evaded simply by switching them off. This project therefore investigates the possibilities of applying ML on board a satellite, specifically with the aim of detecting maritime vessels.
An object detector (YOLOv5) was chosen for testing due to its speed, small size, and user-friendly framework. For comparison with simpler models, two classifiers were chosen: ShuffleNetV1 and a custom-built CNN model. Thereafter, knowledge distillation and different methods for reducing resource usage were tested for the purpose of optimization.
The results show that it is feasible to implement ML on board a satellite to detect maritime vessels. The best result for YOLOv5 was a processing time of 2.1 min per 10,000×10,000-pixel RGB image on the target hardware using the GPU. Using the CPU with multiple threads achieved 2.2 min for the same image. Increasing the batch size did not yield better results. ShuffleNetV1 was not supported by the TFLite framework because of a network structure called group convolutions. Quantization was likewise not supported by the target hardware, although it did halve the file size of the model.
Knowledge distillation gave strong results for the classifiers: using ShuffleNetV1 as the teacher to train the simpler CNN model yielded an increase of 12 % in accuracy. This also shows that it is possible to make use of a non-supported network on the target device by distilling its knowledge into a supported network. Distilling knowledge using YOLOv5 was more difficult, due to the complexity of the network and of the object detection task. Two methods were therefore tested: using the teacher's output (logits) as a target, and using the feature maps within the network as targets (feature imitation). However, both produced only minimal improvements.
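The two distillation targets mentioned above can be illustrated with a minimal sketch. It assumes a common temperature-softened formulation of logit distillation and a plain mean squared error for feature imitation; the function names, the temperature, and the weighting factor alpha are illustrative assumptions and not necessarily the losses used in this work.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=4.0, alpha=0.5):
    """Logit distillation (assumed formulation, hypothetical defaults).

    Combines the KL divergence between the softened teacher and student
    distributions with the ordinary cross-entropy on the true label.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Cross-entropy of the student against the ground-truth class.
    hard = -math.log(softmax(student_logits)[true_label])
    # The T^2 factor keeps the soft term's gradient scale comparable.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


def feature_imitation_loss(student_features, teacher_features):
    """Feature imitation (assumed formulation): mean squared error
    between flattened feature maps of identical length."""
    n = len(student_features)
    return sum((s - t) ** 2
               for s, t in zip(student_features, teacher_features)) / n
```

In this sketch the student is penalized both for disagreeing with the teacher's softened predictions and for misclassifying the labelled example. In practice the student and teacher feature maps often differ in shape, so an adaptation layer would be needed before the feature imitation loss can be applied.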