The Human Genome Project, an international scientific research project with the goal of determining the sequence of nucleotide base pairs that make up human DNA, lasted roughly 15 years and cost $5 billion (adjusted for inflation). With recent advances in genome sequencing technology, that cost has dropped to a few hundred dollars, and sequencing can be done overnight.
Access to this kind of information may have a deep impact on the way complex diseases are treated: physicians will shift from general-purpose treatments to specific ones, tailored to the individual patient’s genomic features. This approach is referred to as precision medicine.
There are, however, several caveats. First, due to the nature of the problem, knowledge of both the biomedical and the computer science domains is required to approach it correctly. Second, unlike more classical scenarios such as image classification or object detection, it is much more difficult to determine the accuracy of the system, because diseases such as cancer and neurodegenerative disorders are complex and multifactorial.
Moreover, a black-box solution is unlikely to be of any use for legal and ethical reasons: interpretability of the model is more crucial than ever.
The goal of this thesis is to explore the possibilities and the limits of techniques based on deep neural networks for the analysis of biomolecular data, experimenting with publicly available datasets.
4-6 weeks for study of literature
Identification of the tasks / questions to be answered.
Review of classical approaches
State of the art of DL algorithms used in genomics
4 weeks for dataset retrieval / exploratory analysis
8 weeks implementation
Competencies to be acquired:
- Expertise on recent Deep Learning algorithms;
- Application of DL techniques to
- Experience in algorithm design, analysis and comparison with respect to a real application.
Duration of this Project: 5-6 months.
Who we’re looking for
Students who are about to obtain their Master's degree in: mathematics, physics, computer science, mathematical engineering, bioinformatics, computer engineering, mechatronic engineering, physics of complex systems.
- Good knowledge of machine learning from a probability perspective;
- Basic knowledge of at least one programming language (preferably Python);
- Good knowledge of the English language.
Optional skills, considered a plus:
- Proficiency in at least one programming language (Python, Lua, Matlab, C++, Java);
- Some background in basic biology.