COMPUTATIONAL BIOPHYSICS LABORATORY
Learning outcomes of the course unit
At the end of the course, the student should have learnt and understood the main features of protein structures and the fundamental physical and biological principles that underlie the methodologies and techniques explained. In particular, the student must be able to:
1. consult the main biological databanks, particularly of protein sequences and structures
2. starting from a protein sequence, obtain all the available structural and functional information
3. analyze the structural features of a protein by means of molecular graphics software
4. run a molecular dynamics simulation of a protein system
5. run a molecular docking simulation of a protein system
6. relate all the useful data to understand the fundamental properties of the system under study, in particular regarding the structure-activity relationship
7. use the specific terminology
8. consult and understand the scientific literature
In addition, the student should be able to choose the best approach for studying a specific problem concerning a protein system and to verify its efficacy and usefulness, finding solutions independently and deepening their knowledge.
Finally, the student should be able to communicate the results of their analyses and studies in a clear, complete and incisive manner. This ability will be practised throughout the course, as a written report is required for each practical performed in the lab.
Course contents summary
Protein structure. Covalent and non-covalent interactions present in a biomolecular structure. Primary structure. Secondary structure. Topological representation. Super-secondary structures. Ramachandran plot. Tertiary structure and protein folding. Theoretical and experimental methods for determining secondary and tertiary protein structure. Comparative modeling and fold recognition methods. Fold classification. Quaternary structure.
Biological databases. Characteristics and search engines of the main databases of protein sequences and structures.
Protein and nucleic acid sequence analysis: similarity and sequence alignment tools (pairwise and multiple alignment). Recognition of patterns and conserved motifs. Protein physico-chemical profiles. Secondary structure prediction from the protein sequence.
Analysis of structural and functional features of proteins and protein complexes by means of molecular graphics software and web servers.
Computational techniques for the study of protein structure and dynamics. Potential energy and force fields. Solvent simulation. Energy minimization. Molecular dynamics simulations.
Molecular recognition. Molecular interaction simulations: docking and drug design.
Lab practicals accompany each part of the course.
Structure of biological molecules. The nitrogenous bases. The genetic code and its translation. Structural levels in proteins. Primary structure. The amino acids and their characteristics. The peptide bond and the backbone torsion angles. Covalent and non-covalent interactions present in a biomolecular structure: disulphide bridges, salt bridges, conventional and non-conventional H-bonds, cation-pi interaction, amino-aromatic interaction, aromatic stacking, hydrophobic interaction. Protein secondary structure: alpha-helix, beta-sheet, 3-10 helix, pi-helix, polyproline helix, turns, omega-loop, random coil and intrinsically disordered proteins. Topological representation. Super-secondary structures. Ramachandran plot. Secondary structure determination methods: DSSP and Stride. Protein tertiary structure. Anfinsen’s thermodynamic hypothesis. The folding problem. Homologous proteins and structure conservation. Comparative modelling and fold recognition methods. Tertiary structure classification: SCOP and CATH. Protein quaternary structure.
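As an illustration of the backbone torsion angles that make up a Ramachandran plot, the following minimal sketch computes a dihedral angle from four atomic positions. It assumes the usual definitions (phi from C(i-1), N(i), CA(i), C(i); psi from N(i), CA(i), C(i), N(i+1)) and Cartesian coordinates given as 3-tuples; the function names are illustrative and not taken from any course software.

```python
import math

def _sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Torsion angle (degrees, range -180..180) defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond p1 -> p2.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

Applying `dihedral` to the four atoms of each residue's phi and psi definitions gives the pair of angles plotted for that residue.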
Biological databases. Databases of sequences and structures of biological molecules and search methods. Characteristics of the main databases of nucleotide sequences. Example of an entry of the EMBL Nucleotide Sequence Database. Characteristics of the main databases of protein sequences. Example of an entry of the UniProt database. The protein structure database PDB. Some other structural databases (DSSP, HSSP, PDBePISA...).
Sequence analysis. Homology and similarity. Basic protocol for sequence analysis. DNA sequence translation and protein sequence identification. Similarity search. Sequence alignment. Gap insertion. Dot matrix and substitution matrix. PAM and BLOSUM. Global and local alignment algorithms (Needleman and Wunsch; Smith and Waterman). Similarity search against a sequence database: heuristic methods. FASTA, BLAST, PSI-BLAST, PHI-BLAST: characteristics and use. Alignment significance: Z-score, probability, expectation value. Multiple sequence alignment. Clustal Omega. Multiple alignment profile. Pattern recognition. Prosite, BLOCKS, PRINTS: characteristics and use.
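To make the idea of global alignment concrete, here is a minimal sketch of the Needleman and Wunsch dynamic-programming scheme. It is a simplification under stated assumptions: a flat match/mismatch score stands in for a PAM or BLOSUM substitution matrix, the gap penalty is linear, and the scoring values and function name are illustrative only.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative
    assumptions, not a real substitution matrix.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Traceback from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

A local (Smith and Waterman) variant would differ mainly in clamping each cell at zero and starting the traceback from the highest-scoring cell.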
Analysis of physico-chemical profiles of a protein sequence. Helical wheels. Secondary structure prediction methods.
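One common physico-chemical profile is a sliding-window hydropathy plot. The sketch below assumes the Kyte-Doolittle hydropathy scale and a plain unweighted window average; the choice of scale, the default window length and the function name are assumptions made for illustration, not a prescription of the course.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=9):
    """Mean hydropathy over each window of consecutive residues.

    Returns one value per window position, i.e. len(sequence) - window + 1
    values. Raises KeyError on non-standard residue codes.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]
```

Stretches with a consistently high average in such a profile are candidates for buried or membrane-spanning segments, which is how these plots are typically read.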
Molecular graphics analysis. Characteristics of the PDB databank. The PDB file. Molecular graphics software and the analysis of a protein structure: structure-function relationship. RasMol, VMD, Swiss-PdbViewer: characteristics and use.
Computational techniques. Usefulness, limitations and possible applications. Main assumptions of classical molecular mechanics. Potential energy and force field. Ewald summation and PME. Solvent simulation. Global and local potential energy minima. Energy minimization.
Molecular dynamics. Review of statistical mechanics. Ergodic hypothesis. Calculation of atomic trajectories. Standard simulation protocol. Main methods for analysing the results. Characteristics and use of a simple MD software package. Main Unix commands. The GROMACS software: characteristics and use.
Molecular recognition and binding theory. Molecular docking. Virtual screening and drug design. Lamarckian Genetic Algorithm. The AutoDock4 software: characteristics and use.
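Because the legacy PDB file is a fixed-column text format, its atom records can be read with plain string slicing. The following is a minimal sketch that assumes well-formed ATOM/HETATM lines following the standard column layout; the function names and the returned dictionary keys are illustrative, and real files are better handled with a dedicated library.

```python
def parse_atom_line(line):
    """Parse one ATOM/HETATM record of a PDB file into a dict.

    Returns None for any other record type. Column positions follow the
    fixed-width PDB format (1-based columns given in the comments).
    """
    if line[0:6].strip() not in ("ATOM", "HETATM"):
        return None
    return {
        "serial": int(line[6:11]),       # columns 7-11
        "name": line[12:16].strip(),     # columns 13-16
        "res_name": line[17:20].strip(), # columns 18-20
        "chain": line[21],               # column 22
        "res_seq": int(line[22:26]),     # columns 23-26
        "x": float(line[30:38]),         # columns 31-38
        "y": float(line[38:46]),         # columns 39-46
        "z": float(line[46:54]),         # columns 47-54
        "element": line[76:78].strip(),  # columns 77-78 (may be blank)
    }

def ca_trace(lines):
    """Coordinates of all alpha-carbon (CA) atoms, in file order."""
    coords = []
    for line in lines:
        atom = parse_atom_line(line)
        if atom is not None and atom["name"] == "CA":
            coords.append((atom["x"], atom["y"], atom["z"]))
    return coords
```

Passing an open file object to `ca_trace` yields the backbone trace that molecular graphics programs display in their simplest ribbon-like views.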
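Energy minimization can be illustrated on the simplest possible system: two particles interacting through a Lennard-Jones potential, minimized by steepest descent along their separation r. This is a toy sketch in reduced units (epsilon = sigma = 1) with a fixed step size, which is an assumption made for brevity; production codes adapt the step and work on thousands of coordinates.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy U(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """Derivative dU/dr of the Lennard-Jones pair energy."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r

def steepest_descent(r0, step=0.01, tol=1e-8, max_iter=10000):
    """Move r against the gradient until |dU/dr| < tol or max_iter is hit.

    The fixed step is a naive choice: it works for starting points on the
    attractive side of the well but can overshoot from the repulsive wall.
    """
    r = r0
    for _ in range(max_iter):
        g = lj_gradient(r)
        if abs(g) < tol:
            break
        r -= step * g
    return r
```

Starting from r0 = 1.5, the search settles at the analytical minimum r = 2^(1/6) sigma, where the energy equals -epsilon. Since this potential has a single well, the local and global minima coincide here, which is not the case for a real protein energy surface.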
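The calculation of an atomic trajectory amounts to integrating Newton's equations step by step. The sketch below uses the velocity Verlet scheme on a one-dimensional harmonic oscillator; both the choice of integrator and the test system are assumptions made for illustration (MD packages offer several integrators), and all names are hypothetical.

```python
def velocity_verlet(x0, v0, force, mass=1.0, dt=0.01, n_steps=1000):
    """Integrate one particle in 1D; returns a list of (x, v) per step."""
    x, v = x0, v0
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass  # half-step velocity update
        x = x + dt * v_half               # full-step position update
        f = force(x)                      # force at the new position
        v = v_half + 0.5 * dt * f / mass  # complete the velocity update
        trajectory.append((x, v))
    return trajectory

def harmonic_force(x, k=1.0):
    """Restoring force of a spring with constant k."""
    return -k * x

def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the harmonic oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x

if __name__ == "__main__":
    traj = velocity_verlet(1.0, 0.0, harmonic_force)
    energies = [total_energy(x, v) for x, v in traj]
    print("energy range:", min(energies), max(energies))
```

Monitoring the total energy in this way is a simple check on the time step: with a suitable dt the energy fluctuates slightly around its initial value instead of drifting.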
The listed books are indicative and cover only some of the course topics. The slides used for the lessons and some review articles will be provided. The software manuals are freely available on the web.
A.M. Lesk, "Introduction to protein science", Oxford University Press.
A.M. Lesk, "Introduzione alla Bioinformatica", McGraw-Hill Ed.
M. Helmer Citterich, F. Ferrè, G. Pavesi, C. Romualdi, G. Pesole, "Fondamenti di Bioinformatica", Zanichelli Ed.
G. Valle, M. Helmer Citterich, M. Attimonelli, G. Pesole, "Introduzione alla Bioinformatica", Zanichelli Ed.
D.E. Krane, M.L. Raymer, "Fondamenti di Bioinformatica", Pearson Education Ed.
Lectures and computer practicals will be held in person. Attendance at the practicals is mandatory. Lectures will be recorded and the recordings will remain available to students for two weeks. However, these arrangements may change during the semester depending on the Covid-19 emergency status.
For the lectures the teacher will use slides that form part of the educational material. These slides will be uploaded to the "Elly" platform a few days before the start of each new topic. In addition, some review articles will be provided. To download the educational material, students must register online for the course on the "Elly" platform.
Each topic covered is accompanied by a practical exercise, performed on a computer and, in some cases, on the university's high-performance computing cluster. The practical exercises make use of databanks, web servers and several software packages for analyses and calculations. This software is free to download, and students who follow the lessons online must install it on their own computers, following the teacher's instructions. Each practical exercise is first led by the teacher on an example system and then carried out individually by the students, so that they acquire advanced techniques and methodologies in computational biophysics, in particular for the determination, prediction and analysis of the structure and dynamics of protein systems.
A written report is required for each practical performed in the lab, in which the student explains the system studied, the methods used, the results obtained, the conclusions and any comments. Reports may be handed in during the course (in which case they will be checked by the teacher); in any case, they are due a couple of days before the exam, and will be corrected and evaluated. Submission of the reports is therefore required for admission to the exam.
Students must check the "Elly" platform for the availability of educational material and for notices from the teacher.
Assessment methods and criteria
Oral exam in person (unless the Covid-19 emergency status changes). A written report is required for each practical performed at the computer, in which the student explains the system studied, the methods used, the results obtained, the conclusions and any comments. Reports may be handed in during the course (in which case they will be checked by the teacher); in any case, they are due a couple of days before the exam, and will be corrected and evaluated. Submission of the reports is therefore required for admission to the exam. The exam starts with a discussion of the reports, followed by some questions on the fundamental concepts of the techniques studied. These two parts carry equal weight in the final evaluation.