Learning outcomes of the course unit
The course provides the foundations of chemometrics, illustrating its application and that of multivariate data
analysis in general, in the following contexts: optimization and experimental design;
exploratory and descriptive statistical data analysis; methodologies for quality control of
products and industrial processes (QbD, PAT); management of data, signals and images
from hyphenated techniques; applied aspects of regression and classification, multivariate
calibration; validation of chemometric models.
Lectures, discussions, and study materials provide the basic knowledge of chemometrics and its
application to: optimization and experimental design; quality control of products and
industrial processes (QbD, PAT); management of data, signals, and images from
hyphenated techniques; applied regression and classification; multivariate calibration.
Computer exercises and group discussions develop the ability to apply the methods studied:
i) plan a factorial experimental design, and calculate and interpret the response surface
to identify the experimental conditions corresponding to optimal values; ii) carry out the
exploratory analysis of two-dimensional data tables; iii) calculate, represent, interpret and
validate regression and classification models; iv) use chemometric software in MATLAB.
Seminars and/or company visits give the student the ability to identify non-invasive
techniques for the non-destructive analysis of materials, foodstuffs, and environmental
matrices, and for process monitoring.
Written reports and practical exercises develop the ability to understand and critically discuss
the results. Group discussions help the student acquire the ability to suggest effective
methods of data analysis suited to the problem at hand and to discuss the results in terms of chemical
Written exercise reports train the student to write reports that present the results in a
manner understandable to novice users. Classroom discussions and preparation for the final
exam develop the ability to communicate the results with appropriate vocabulary and using
The activities described develop the ability to identify sources of information
and effective software for improving knowledge of data analysis. The student also acquires the
ability to learn autonomously about aspects complementary to, or synergistic with, those covered.
Basic knowledge of General Chemistry and Analytical Chemistry, covering the basics of
sampling, measurement uncertainty, and an overview of the main instrumental analysis
techniques. Some univariate statistical concepts: the distribution of a variable,
mean, standard deviation, correlation between pairs of variables. Ability to use a
computer, a spreadsheet and a text editor.
Course contents summary
Introduction to experimental design and optimization techniques: exploration and
screening (full and fractional factorial designs, Plackett-Burman designs); optimization
(central composite designs); D-optimal designs; designs for the study of formulations.
Introduction to the computation of response surfaces. Model diagnostics through the analysis
of residuals and normal probability plots.
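As an illustration of the screening designs listed above, the following minimal Python sketch builds a two-level full factorial design in coded units and estimates the main effects. The function names, the choice of three factors, and the simulated response are assumptions made for this example only; they are not taken from the course material, where the exercises use MATLAB software.

```python
from itertools import product

def full_factorial(k):
    """Return all 2**k runs of a two-level design in coded units (-1, +1)."""
    return [list(run) for run in product((-1, 1), repeat=k)]

def main_effects(design, response):
    """Main effect of each factor: mean response at +1 minus mean response at -1."""
    n_factors = len(design[0])
    effects = []
    for j in range(n_factors):
        high = [y for run, y in zip(design, response) if run[j] == 1]
        low = [y for run, y in zip(design, response) if run[j] == -1]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects

# Hypothetical, noise-free response simulated from a known model
# (not experimental data): y = 50 + 4*x1 - 2*x2 + 1.5*x1*x2, x3 inactive.
design = full_factorial(3)
response = [50 + 4 * x1 - 2 * x2 + 1.5 * x1 * x2 for x1, x2, x3 in design]
effects = main_effects(design, response)
print(effects)
```

In coded units each main effect is twice the corresponding model coefficient, so the inactive third factor shows an effect of zero. Because the design is balanced, the x1*x2 interaction does not bias the main-effect estimates.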
Exploratory data analysis: univariate methods (frequency histograms, box plots, scatter
plots); multivariate methods: meaning, definition and calculation of latent variables.
Principal component analysis, PCA (definition, derivation, application); graphical
representation (scores, loadings, biplot); cluster analysis. Pretreatment of multivariate data:
single-valued variables, instrumental signals, images. Introduction to multivariate methods for
process monitoring: multivariate control charts. Introduction to NIR spectroscopy and
pretreatment of NIR signals. Outline of spectroscopic techniques that can be implemented
at-, on-, or in-line in the PAT context.
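To make the notions of latent variable, scores and loadings concrete, here is a minimal Python sketch that extracts the first principal component of a small mean-centered data table by power iteration. The data table is invented for illustration, and the function names are assumptions. A practical analysis would use a dedicated chemometrics package and would normally compute several components.

```python
import math

def mean_center(X):
    """Subtract the column means from every row of X (a list of rows)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def first_pc(Xc, n_iter=500):
    """Loadings and scores of PC1 via power iteration on Xc^T Xc."""
    n, p = len(Xc), len(Xc[0])
    w = [1.0] * p  # arbitrary start; must not be orthogonal to PC1
    for _ in range(n_iter):
        t = [sum(row[j] * w[j] for j in range(p)) for row in Xc]      # t = Xc w
        w = [sum(t[i] * Xc[i][j] for i in range(n)) for j in range(p)]  # w = Xc^T t
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]  # keep the loading vector at unit length
    scores = [sum(row[j] * w[j] for j in range(p)) for row in Xc]
    return w, scores

# Invented table: 5 samples x 3 variables; the first two variables are correlated.
X = [[1.0, 2.0, 0.5],
     [2.0, 4.1, 0.4],
     [3.0, 5.9, 0.6],
     [4.0, 8.2, 0.5],
     [5.0, 9.9, 0.5]]
Xc = mean_center(X)
loading, scores = first_pc(Xc)

# Fraction of the total sum of squares captured by PC1.
total_ss = sum(v * v for row in Xc for v in row)
explained = sum(t * t for t in scores) / total_ss
print(loading, explained)
```

The scores are the coordinates of the samples along the latent variable, and the loadings are the weights of the original variables in it. Plotting scores and loadings together gives the biplot mentioned above.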
Methods for univariate and multivariate regression: MLR, PCR and PLS. Multivariate calibration.
Illustration of some variable ranking methods.
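The univariate case that MLR, PCR and PLS generalize can be sketched in a few lines of Python: an ordinary least-squares calibration line is fitted to standards of known concentration and then inverted to predict an unknown sample. The concentrations, signals and function names below are hypothetical values chosen for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

def predict_concentration(signal, intercept, slope):
    """Invert the calibration line to estimate concentration from a signal."""
    return (signal - intercept) / slope

# Hypothetical calibration standards (concentration, instrument signal).
concentration = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.1, 2.1, 4.1, 6.1, 8.1]

intercept, slope = fit_line(concentration, signal)
unknown = predict_concentration(5.1, intercept, slope)
print(intercept, slope, unknown)
```

Multivariate calibration replaces the single signal with many measured variables (for example a whole spectrum), which is where PCR and PLS become necessary.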
Introduction to classification: SIMCA, LDA, PLS-DA; differences and contexts of application.
Each topic is complemented by exercises involving the analysis (and in some
cases the acquisition) of data sets from chemical experiments. Software used:
OpenOffice, PLS_Toolbox for MATLAB.
PLS_Toolbox manual, Eigenvector Research: www.eigenvector.com
Several tutorial papers suggested by the teacher.
K. Varmuza, P. Filzmoser, Introduction to Multivariate Statistical Analysis in Chemometrics,
CRC Press 2009, Print ISBN: 978-1-4200-5947-2, eBook ISBN: 978-1-4200-5949-6
R. G. Brereton, Chemometrics: Data Analysis for the Laboratory and Chemical Plant, Wiley
2003, ISBN: 0-471-48978-6
R. Wehrens, Chemometrics with R, Springer 2011, www.springer.com/life+science
L. Eriksson, E. Johansson, et al., Multi- and Megavariate Data Analysis Part I: Basic
Principles and Applications, Second edition, Umetrics Academy:
www.umetrics.com/services/literature, ISBN-10: 91-973730-2-8
M. Forina, Fondamenta di Chimica Analitica, e-book:
www.sisnir.org/452/index.html (download at the bottom of the page)
R. Todeschini, Introduzione alla Chemiometria, EdiSES 1998, ISBN: 8879591460
D.L. Massart and B. Vandeginste, Chemometrics: a textbook, Elsevier 1988, ISBN:
Lectures with PowerPoint slides. Interactive exercises with chemometrics software.
Independent exercises by the students on real data sets. Discussion of the topics presented.
Comments on and correction of the students' exercise reports. Seminars illustrating
applications of chemometrics in industry.
Assessment methods and criteria
Evaluation during the course: a written report is required for each exercise session. Each
report is evaluated on a numerical scale from 0 to 10, according to these criteria: organization,
language and ability to synthesize (0-3); selection of appropriate methods of analysis (0-2);
correct application (0-2); ability to describe and interpret the results (0-3). This evaluation
does not count towards the final exam mark but serves as a
qualification to take the exam (average >= 6).
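The report rubric above amounts to simple arithmetic, sketched here in Python. The criterion keys and function names are labels invented for this example. The point ranges and the qualification threshold of 6 come from the text, and the "average" is assumed to be the arithmetic mean over all reports.

```python
# Maximum points per criterion, as stated in the rubric (keys are illustrative labels).
MAX_POINTS = {
    "organization_language_synthesis": 3,
    "method_selection": 2,
    "correct_application": 2,
    "description_and_interpretation": 3,
}

def report_score(points):
    """Sum the criterion points after checking each lies within its allowed range."""
    for name, maximum in MAX_POINTS.items():
        if not 0 <= points[name] <= maximum:
            raise ValueError(f"{name} must be between 0 and {maximum}")
    return sum(points[name] for name in MAX_POINTS)

def qualifies_for_exam(report_scores, threshold=6):
    """Assumption: the qualifying 'average' is the arithmetic mean of all reports."""
    return sum(report_scores) / len(report_scores) >= threshold

best = report_score({
    "organization_language_synthesis": 3,
    "method_selection": 2,
    "correct_application": 2,
    "description_and_interpretation": 3,
})
print(best, qualifies_for_exam([5, 7, 6]), qualifies_for_exam([5, 5, 7]))
```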
Final examination: students may choose between two formats:
1) Towards the end of the course, each student is assigned a data set to be processed with
some of the methods used during the exercises, following the instructions given. The student
then prepares a presentation of the processed data, in electronic format, to be
discussed during the final exam. During the discussion, some questions are asked that take
their cue from the student's presentation. In addition, two questions are posed which may relate to
other course topics not discussed previously, in order to ascertain the degree of knowledge.
2) The student is asked 3 broad questions on the three basic parts of the program (DoE,
exploratory analysis, modeling) and a further 2-3 questions on more specific topics, aimed at
evaluating the ability to apply the methodologies studied.
The final assessment evaluates: the correct selection/description and
application/discussion of the processing methods used (30%); the ability to apply the acquired
knowledge (30%); communication abilities (10%); the level of theoretical knowledge (30%).
The final score is expressed out of 30, with honours (lode) where merited.
Non-attending students who have not submitted exercise reports must answer further
questions to ascertain their knowledge and judgment in the analysis of data sets.
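The weights stated for the final assessment (30%, 30%, 10%, 30%) could be combined as in the Python sketch below. The syllabus does not say how the components are marked or combined, so this is an assumption: each component is marked on the same 0-30 scale and the final mark is their weighted mean. The component keys, the function name and the example marks are hypothetical.

```python
# Weights in percent, as listed for the final assessment (keys are illustrative labels).
WEIGHTS = {
    "method_selection_and_discussion": 30,
    "application_of_knowledge": 30,
    "communication": 10,
    "theoretical_knowledge": 30,
}

def final_mark(component_marks):
    """Weighted mean of component marks; each mark is ASSUMED to be on a 0-30 scale."""
    total = sum(WEIGHTS[name] * component_marks[name] for name in WEIGHTS)
    return total / sum(WEIGHTS.values())

# Hypothetical component marks for one student.
example = final_mark({
    "method_selection_and_discussion": 28,
    "application_of_knowledge": 26,
    "communication": 30,
    "theoretical_knowledge": 24,
})
print(example)
```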