Transformer-based architectures represent the state of the art in sequence modeling tasks such as machine translation and language understanding. The Transformer has an encoder-decoder structure in which both the encoder and the decoder are built from self-attention mechanisms and point-wise, fully connected layers. Since their introduction, these models have been applied not only to natural language but also to images and videos. A key concern in their use is the quadratic time and memory complexity of the attention mechanism in the sequence length [2, 3]. Therefore, new variants of these architectures have been proposed that aim at improving the efficiency of Transformers.
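To see where the quadratic cost comes from, the following is a minimal NumPy sketch of scaled dot-product attention (illustrative only, not taken from any particular implementation): for a sequence of length n, the score matrix Q·Kᵀ has shape (n, n), so both time and memory grow quadratically with n.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (n, d) arrays for a sequence of length n.
    d = Q.shape[-1]
    # The (n, n) score matrix below is the source of the quadratic
    # time and memory cost in the sequence length n.
    scores = Q @ K.T / np.sqrt(d)
    # Row-wise softmax over the scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (8, 4), but an (8, 8) score matrix was materialized
```

Efficient Transformer variants attack exactly this (n, n) intermediate, e.g. by sparsifying or approximating the attention matrix.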
The goal of this thesis is to study the variants of Transformer architectures that have recently been proposed in the literature, with particular emphasis on analyzing the efficiency of their implementations and their possible application to an industrial problem.
- Initial research on state-of-the-art Transformer architectures;
- Research on how Transformers handle text, images and videos;
- Research on the efficient implementation of these models;
- Application of efficient Transformer architectures to an industrial problem.
Who we’re looking for
Students who are about to obtain their Master's Degree in: computer science, computer engineering, mechatronic engineering, mathematical engineering, mathematics, physics, or informatics.
- Proficiency in at least one programming language (Python, C++); Python is preferred;
- Basic knowledge of machine learning and Deep Learning algorithms (CNN, RNN);
- Basic knowledge of at least one of these Deep Learning frameworks (TensorFlow, PyTorch);
- Good mathematical and analytical skills.
Duration of this project: 6–8 months.