Learning objectives
Knowledge of useful tools for analyzing and processing heterogeneous data and, more generally, for developing a big data analytics process.
Big Data is often described as the new oil. Big Data Analytics is the process of collecting and analyzing large volumes of data (big data) to extract hidden information that helps shape an effective strategy in the decision-making processes of companies and of society in general.
Prerequisites
No prerequisite courses. However, students should have knowledge of programming (especially Python) and databases.
Course unit content
1 Introduction (2 hours)
2 Business Intelligence and descriptive analysis (2 hours)
3 Data science (10 hours)
4 Python and data analysis (10 hours)
5 Technologies for Big Data (4 hours)
6 Storage and Data Processing in a company (8 hours)
7 Common processing in Big Data (6 hours)
8 Case studies (6 hours)
Full programme
1 Introduction (2 hours)
1.1 Big Data Definitions
2 Business Intelligence and descriptive analysis (2 hours)
3 Data science (10 hours)
3.1 Methodology
3.2 Data exploration (basic statistics)
3.3 Predictive Algorithms (Machine Learning)
3.4 Communication of results
4 Python and data analysis (10 hours)
4.1 NumPy
4.2 Pandas
4.3 PyPlot
4.4 Scikit-Learn
5 Technologies for Big Data (4 hours)
5.1 Hadoop and Spark
6 Storage and Data Processing in a company (8 hours)
6.1 Relational databases and NoSQL
6.2 Django, Flask, scaffolding
7 Common processing in Big Data (6 hours)
7.1 Text preprocessing (stemming, bag of words, vectorization)
7.2 Graphs and geolocation (Gephi, QGIS, GMaps etc.)
8 Case studies (6 hours)
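As an illustration of the bag-of-words vectorization listed under item 7.1, here is a minimal sketch in plain Python. It is not taken from the course materials: the function names `tokenize` and `bag_of_words` and the sample sentences are hypothetical, and stemming is left out because it would normally rely on an external library.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(documents):
    """Return a sorted vocabulary and one word-count vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    # The vocabulary is the set of all distinct tokens, sorted for a stable column order.
    vocabulary = sorted({token for doc in tokenized for token in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        # A Counter returns 0 for words that do not occur in the document.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Hypothetical sample documents, used only for illustration.
docs = ["Big data is big", "Data analytics"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)
print(vectors)
```

Each document becomes a vector of word counts over a shared vocabulary, which is the form most machine learning algorithms expect as input.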
Bibliography
A. Rezzani (2017). Big Data Analytics. Il manuale del data scientist. Maggioli Editore (Apogeo Education).
S. Ozdemir. Data Science: guida ai principi e alle tecniche base della scienza dei dati. Apogeo.
Teaching methods
Lectures and laboratory exercises.
Lectures will cover the theoretical aspects of the course subjects.
Practical exercises on real problems will be carried out in the laboratory.
Assessment methods and criteria
There are no mid-term tests.
The exam consists of two parts:
i) a written exam consisting of four open questions on the theoretical topics covered in class, aimed at evaluating the knowledge gained on these topics.
ii) a written report (and its oral presentation) on a project that explores one of the topics covered in class or in the labs, so as to assess the ability to apply the knowledge gained during the course. The evaluation will also depend on the quality of the developed system and of the attached documentation.
The exam is passed if the student achieves at least a passing mark in each of the two parts.
The final mark is the weighted average of the score obtained in the written test (40%) and the score obtained in the project work (60%).
Honors are awarded if the student achieves the highest score in both parts.
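The following sketch shows how the weighted average described above could be computed. The marking scale and the passing threshold are not stated in this syllabus, so the default `pass_mark` of 18 (on a 30-point scale) is an assumption, as are the function name `final_mark` and the sample scores.

```python
def final_mark(written, project, pass_mark=18.0):
    """Return the weighted final mark, or None if either part is failed.

    The 40% / 60% weights come from the assessment rules; the default
    pass_mark of 18 is an assumption, not a value stated in the syllabus.
    """
    # Both parts must reach the passing mark for the exam to be passed.
    if written < pass_mark or project < pass_mark:
        return None
    return 0.4 * written + 0.6 * project


# Hypothetical scores: 24 in the written test, 28 in the project work.
mark = final_mark(24, 28)
print(round(mark, 1))
```

Because the project work carries the larger weight, a strong project raises the final mark more than an equally strong written test.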
Other information
Course notes and teaching materials will be distributed in electronic form during the course.