Learning outcomes of the course unit
Students should acquire the tools needed for in silico analysis of DNA and protein sequences, including the use of information available from complete genomes.
Prerequisites
Basic knowledge of molecular biology and biochemistry.
Good working knowledge of computers and the Internet.
Course contents summary
Introduction: sequences and structures, data management and analysis, central dogma, evolutionary history of sequences and genomes.
Evolution of DNA and proteins: neutral theory, homology, orthology, paralogy, similarity, metrics for sequence comparison, PAM, divergence, molecular clock, Ks, Ka, accelerated evolution, convergent evolution.
Biochemical predictions: predictable biochemical properties, patterns and signals, convergent evolution of patterns, Prosite, search for patterns, degradation signals, PEST, protein sorting, signal sequences, anchor sequences, glycosylation, phosphorylation, ProtParam.
Structure of RNA and proteins: types of RNA secondary structures, hairpin, bulge, loop, pseudoknots, minimum free energy, stacking energy, covariance analysis, prediction of secondary structures, PHD, solvent accessibility, classes, architecture, fold, CATH, SCOP, homology modelling, threading, coiled coils, membrane proteins, transmembrane topology.
Pairwise alignment: combinatorial alignment, dot plots, repeated sequences, algorithm, dynamic programming, Needleman-Wunsch, Smith-Waterman, gap penalty, significance.
Multiple alignment: uses of multiple alignment, MSA, progressive alignment, CLUSTALW, iterative alignment, PRRP, BAliBASE, profiles, hidden Markov models, HMM profiles, Pfam, sequence logos.
Databases and search for homology: entry, GenBank, SwissProt, PDB, Expressed Sequence Tags, IMAGE, SRS, Entrez, FASTA, BLAST, PSI-BLAST, significance, sensitivity, selectivity, coverage.
Trees and networks: structure and properties of biological networks.
Phylogenetic analysis: tree of life, nomenclature of phylogenetic trees, cladograms, phylograms, ultrametric trees, rooted and unrooted trees, amino acid and nucleotide distances, UPGMA, neighbour-joining, maximum likelihood, parsimony, bootstrap.
Genomes: physical maps and genetic maps, DNA fingerprinting, BAC, genomic sequencing methods: clone by clone and WGS, assembly, contig, scaffold, draft and finished sequences, ORF, gene-finding. Comparative genomics. Functional associations between proteins inferred from complete genomes.
Recommended readings
Introduzione alla bioinformatica. G. Valle et al., Zanichelli, 2003
Bioinformatics: Sequence and Genome Analysis. D. W. Mount, CSHL Press, 2001
Protein Evolution. L. Patthy, Blackwell Science, 1999
Teaching methods
Lectures accompanied by practical exercises on the use of sequence analysis programs in Windows and Linux environments.
Assessment methods
The student must submit a written report on the computational characterization of a hypothetical protein found in a complete genome. An oral examination follows the evaluation of the report.