Ph.D. Student
Contacts
Dipartimento di Informatica, Sistemistica e Comunicazione
Università degli Studi di Milano-Bicocca
Room 1004, building U14
Viale Sarca 336, 20126 Milano, Italy
Telephone number: +39 02 6448 7879
Email address: citrolo@disco.unimib.it
Research areas
Combinatorial Optimization, Evolutionary Algorithms, Protein Folding Prediction.
Synopsis of research activities
One of the main challenges in computational biology is to predict the 3D structure of a protein using only the information contained in its amino acid sequence. This is known as the protein structure prediction (PSP) problem. My research is divided into two major units that will eventually be combined to introduce a new protein structure prediction tool. The first unit focuses on the study of a new sampling strategy that better addresses the peculiarities of PSP. The second unit focuses on the development of functions for the evaluation of structural models and is based on genetic programming (GP) and statistical potentials. I am also collaborating on the definition and development of a new method for the Molecular Distance Geometry (MDG) problem, that is, the problem of identifying the three-dimensional structure of a molecule given a sparse set of the internal distances between its atoms.
Protein folding prediction is a nonconvex optimization problem of great relevance in biology. It consists in predicting the three-dimensional structure of a protein molecule starting from the sequence information alone. The complexity of this problem arises from several sources. First, the search space is huge, because an enormous number of different conformations have to be considered in order to find the correct one. Second, evaluating each model is hard, since thousands of atomic interactions have to be computed and the energy description currently available is only an approximation of the true energy. Many large-scale projects have been established to develop PSP tools and to perform predictions of real protein structures using both supercomputers and distributed computing; nevertheless, the key to achieving reliable predictions is yet to be found. Simplified representation models such as the HP model (bottom right) preserve considerable complexity (PSP in the HP model is NP-hard) and many aspects of the original problem without the need for astronomical computational resources; today these models allow us to test new sampling strategies.
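To make the HP model concrete, the following is a minimal sketch of how a conformation can be scored, assuming the common 2D square-lattice variant in which each contact between two hydrophobic (H) residues that are not consecutive in the chain contributes -1 to the energy. The move encoding (U, D, L, R) and the function names are illustrative assumptions, not part of any specific tool.

```python
from typing import List, Optional, Tuple

# Hypothetical move encoding: each letter is a unit step on the square lattice.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold_to_coords(moves: str) -> List[Tuple[int, int]]:
    """Turn a string of lattice moves into the coordinates of each residue."""
    coords = [(0, 0)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    return coords


def hp_energy(sequence: str, moves: str) -> Optional[int]:
    """Return minus the number of H-H contacts, or None if the walk overlaps."""
    if len(moves) != len(sequence) - 1:
        raise ValueError("a chain of n residues needs n - 1 moves")
    coords = fold_to_coords(moves)
    if len(set(coords)) != len(coords):
        return None  # the conformation is not self-avoiding
    index_at = {coord: i for i, coord in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each lattice pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts
```

For example, the four-residue chain HPPH folded with the moves RUL closes into a square and brings its two H residues into contact. Even with such a cheap energy function, the number of self-avoiding walks grows exponentially with chain length, which is why the sampling strategy matters.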
Statistical potentials are functions used to evaluate protein models in many PSP tools. A statistical potential is the ratio between the observed distribution of a given property (i.e. a compositional and/or geometrical descriptor) and a neutral model distribution. In the case of protein structure evaluation, the observed distributions are extracted from a representative set of all the known protein structures, while the neutral distributions are defined to model the expected value of the considered property based only on compositional and geometrical considerations. A statistical potential is primarily assessed by its ability to rank a set of structural models for a target sequence (with known structure) consistently with a structural metric that measures the distance between each model and the real structure of the target.
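The ratio described above can be sketched in a few lines. The example below assumes the widely used negative log-ratio (inverse Boltzmann) form, which is a convention chosen here for illustration and not necessarily the one used in this research; the bin labels, the pseudocount and the function name are likewise hypothetical.

```python
import math
from typing import Dict, Hashable


def statistical_potential(
    observed_counts: Dict[Hashable, float],
    reference_counts: Dict[Hashable, float],
    pseudocount: float = 1.0,
) -> Dict[Hashable, float]:
    """Return -ln(P_observed / P_reference) for every bin of a descriptor.

    The pseudocount (an assumption of this sketch) keeps empty bins from
    producing a division by zero or a logarithm of zero.
    """
    bins = set(observed_counts) | set(reference_counts)
    n_obs = sum(observed_counts.get(b, 0.0) + pseudocount for b in bins)
    n_ref = sum(reference_counts.get(b, 0.0) + pseudocount for b in bins)
    potential = {}
    for b in bins:
        p_obs = (observed_counts.get(b, 0.0) + pseudocount) / n_obs
        p_ref = (reference_counts.get(b, 0.0) + pseudocount) / n_ref
        # Negative values mark bins seen more often than the neutral model expects.
        potential[b] = -math.log(p_obs / p_ref)
    return potential
```

Under this convention a structural model is scored by summing the potential over the bins its descriptors fall into, so lower totals indicate more native-like features.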


Genetic Programming is a stochastic optimization heuristic for functional regression that allows solutions to be complex entities such as programs. During a GP run, a population of solutions is exposed to evolutionary pressure and iteratively perturbed using genetic-inspired operators (crossover and mutation). The result of this procedure is a progressive increase in the fitness (optimality) of the solutions in the population.
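The loop described above can be illustrated with a toy example. The sketch below evolves small arithmetic expression trees to fit sample points; the tree encoding, the tournament selection, the elitism, the depth limit and all parameter values are assumptions made for illustration and do not reproduce the setup used for statistical potential optimization.

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0, 2.0]
MAX_DEPTH = 6  # illustrative limit that keeps trees (and values) small


def random_tree(rng, depth):
    """Grow a random expression: a terminal or an (op, left, right) tuple."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def depth(tree):
    return 1 + max(depth(tree[1]), depth(tree[2])) if isinstance(tree, tuple) else 0


def error(tree, samples):
    """Sum of squared errors; lower error means higher fitness."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in samples)


def paths(tree, path=()):
    """Yield the position of every node as a tuple of child indices."""
    yield path
    if isinstance(tree, tuple):
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))


def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)


def crossover(rng, a, b):
    """Copy a random subtree of b into a random position of a."""
    return replace(a, rng.choice(list(paths(a))), get(b, rng.choice(list(paths(b)))))


def mutate(rng, tree):
    """Replace a random subtree with a freshly grown one."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))


def tournament(rng, population, samples, k=3):
    return min(rng.sample(population, k), key=lambda t: error(t, samples))


def evolve(samples, pop_size=60, generations=20, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []  # best error found in each generation
    for _ in range(generations):
        best = min(population, key=lambda t: error(t, samples))
        history.append(error(best, samples))
        next_population = [best]  # elitism: the best individual always survives
        while len(next_population) < pop_size:
            parent = tournament(rng, population, samples)
            child = crossover(rng, parent, tournament(rng, population, samples))
            if rng.random() < 0.2:
                child = mutate(rng, child)
            # Oversized children are discarded in favour of their parent.
            next_population.append(child if depth(child) <= MAX_DEPTH else parent)
        population = next_population
    best = min(population, key=lambda t: error(t, samples))
    history.append(error(best, samples))
    return best, history
```

Calling evolve on a handful of points sampled from a target function returns the best tree found together with the best error per generation. Because the best individual is always carried over, that error never increases from one generation to the next, which is the progressive fitness improvement described above.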


Publications (papers and proceedings)
Citrolo, Andrea G., and Giancarlo Mauri. "A local landscape mapping method for protein structure prediction in the HP model." Natural Computing 13.3 (2014): 309-319.
Nobile, M.S., Citrolo, A.G., Cazzaniga, P., Besozzi, D., and Mauri, G. "A memetic hybrid method for the Molecular Distance Geometry Problem with incomplete information." Evolutionary Computation (CEC), 2014 IEEE Congress on. IEEE, 2014.
Citrolo, Andrea G., and Giancarlo Mauri. "A Hybrid Monte Carlo Ant Colony Optimization Approach for Protein Structure Prediction in the HP Model." EPTCS 130: 61-69.
Simone Bianco, Andrea G. Citrolo. High Contrast Color Sets under Multiple Illuminants. Lecture Notes in Computer Science Volume 7786, 2013, pp. 133-142.
Education
2011
Università degli Studi di Milano-Bicocca
Master's degree in: "Bioinformatics"
Dissertation title: "Genetic programming applied to statistical potential optimization for protein structure evaluation"
Advisors: Prof. Luca De Gioia, Prof. Leonardo Vanneschi
Grading: 110/110 magna cum laude
2008
Università degli Studi di Milano-Bicocca
Bachelor's degree in: "Molecular Biotechnologies"
Dissertation Title: "Application of computational design to protein stability and binding specificity"
Advisor: Prof. Luca De Gioia
Grading: 110/110
Teaching
2015
"C programming": teaching activity, Bachelor's program in Physics, University of Milano-Bicocca.
2014
"Computational Biology": lecturer, Master's program in Computer Science, University of Milano-Bicocca.
"Introduction to Computer Science": teaching activity, Bachelor's program in Languages and Foreign Literatures, University of Bergamo.
2013
"Computational Biology": lecturer, Master's program in Computer Science, University of Milano-Bicocca.
"Introduction to Computer Science": teaching activity, Bachelor's program in Languages and Foreign Literatures, University of Bergamo.
2012
"Laboratory of Computational Biology": lecturer, Master's program in Biology, University of Milano-Bicocca.