ELEMENTI DI ANALISI MULTIVARIATA E MODELLISTICA PER LA CHIMICA E L'AMBIENTE

Degree course: 
Second cycle degree in ENVIRONMENTAL SCIENCES
Academic year when starting the degree: 
2021/2022
Year: 
1
Academic year in which the course will be held: 
2021/2022
Course type: 
Compulsory subjects, characteristic of the class
Credits: 
6
Period: 
Second semester
Standard lecture hours: 
56
Breakdown of lecture hours: 
Lectures (40 hours), Exercises (16 hours)
Requirements: 

Basic knowledge of the main functions of Excel. Elements of statistics.

Final Examination: 
Oral

All topics covered in the lessons, including any in-depth material presented as articles or seminars, are subject to evaluation.
The final exam consists of a written test (2 hours) with 3 questions: two questions, each on one of the main topics addressed in the course, and one exercise.
The exam is marked out of thirty, and is passed with a mark of at least 18/30. The evaluation considers the overall performance on the test and, in particular, the following criteria:
• Relevance and correctness of the answers.
• Ability to present, argue, and synthesize the different topics using appropriate language.
• Ability to recognize and find solutions for a problem based on the topics addressed in the course.

The final exam will be the same for attending and non-attending students.

Assessment: 
Final grade

The course aims to provide students with the knowledge and skills needed to explore and model complex data using the main multivariate analysis and modeling techniques (regression and classification).
At the end of the course the student will be able to:
• recognize and understand complex data structures and the principal techniques to explore and model data of interest for chemistry and environmental sciences.
• apply the gained knowledge in a multidisciplinary context, and in particular to problems arising from the impact of chemicals on the environment.

As a result, the student will develop the following skills:
• ability to explore and manage complex data systems using quantitative methodologies
• ability to identify appropriate models based on the problem under investigation
• ability to develop and validate qualitative and quantitative predictive models

Finally, the student must develop adequate communication skills for presenting the identified problems, the methods used, and the results achieved in appropriate language, as well as the ability to formulate a judgment and draw conclusions from the information available or obtained through multivariate analysis and modeling.

Introduction to the course and evaluation of basic knowledge.
Introduction to chemometrics and its utility in multiple fields of application. Elements of descriptive statistics. Analysis of the structure of data and pre-treatment methods: missing data, variable transformation and scaling. Association between variables. Basic concepts of matrix algebra.
Methods of exploratory analysis: Principal Component Analysis and Cluster Analysis.
General introduction to data modeling. Fitting and predictive ability. Validation techniques: cross-validation, external validation. Variable selection methods.
Multivariate regression methods: Ordinary Least Squares (OLS). Diagnostic methods.
Classification methods: k-NN as an example of methods based on minimum distance. CART as an example of tree-based classification methods. Discriminant analysis. Parameters for evaluating classification performance.
In silico alternatives to animal testing. QSAR modeling with examples of application for the prediction of properties and activities of organic environmental pollutants.

Exam preparation should follow the PowerPoint presentations provided by the teacher and must be supported by the recommended textbook. Recommended textbook:
Roberto Todeschini, Introduzione alla chemiometria. 1998, Edises, Milano.

PowerPoint slides and additional material will be made available on the e-learning website.

The course comprises 56 hours: 40 hours of lectures (lecturer in Varese, video link with Como) and 16 hours of computational laboratory in Varese. Attendance is mandatory for 75% of the computational laboratory hours.
The computational laboratory supports the theory with practical examples, giving students the opportunity to use appropriate software.

Attendance at lectures is optional but recommended. The final exam will be the same for attending and non-attending students.

The teacher is available by appointment, arranged in advance by e-mail or telephone (Via Dunant, 3, Piano Rosso).