BIG DATA AND BUSINESS INTELLIGENCE
cod. 1007077

Academic year 2020/21
3° year of course - First semester
Professor
Academic discipline
Information processing systems (ING-INF/05)
Field
Computer engineering
Type of training activity
Characterising
48 hours
of face-to-face activities
6 credits
hub: PARMA
course unit
in ITALIAN

Learning objectives

Knowledge of useful tools for analysing and processing heterogeneous data and, more generally, for developing a big data analytics process.

Big Data is often described as the new oil. Big Data Analytics is the process of collecting and analysing large volumes of data (big data) to extract hidden information that helps companies, and society in general, outline an effective strategy in their decision-making processes.

Prerequisites

There are no prerequisite courses. However, students should have a working knowledge of programming (especially Python) and databases.

Course unit content

1 Introduction (2 hours)
2 Business Intelligence and descriptive analysis (2 hours)
3 Data science (10 hours)
4 Python and data analysis (10 hours)
5 Technologies for Big Data (4 hours)
6 Data storage and processing in a company (8 hours)
7 Common processing in Big Data (6 hours)
8 Case studies (6 hours)

Full programme

1 Introduction (2 hours)
1.1 Big Data Definitions
2 Business Intelligence and descriptive analysis (2 hours)
3 Data science (10 hours)
3.1 Methodology
3.2 Data exploration (basic statistics)
3.3 Predictive Algorithms (Machine Learning)
3.4 Communication of results
4 Python and data analysis (10 hours)
4.1 NumPy
4.2 Pandas
4.3 PyPlot
4.4 Scikit-Learn
5 Technologies for Big Data (4 hours)
5.1 Hadoop and Spark
6 Data storage and processing in a company (8 hours)
6.1 Relational and NoSQL databases
6.2 Django, Flask, scaffolding
7 Common processing in Big Data (6 hours)
7.1 Text preprocessing (stemming, bag of words, vectorization)
7.2 Graphs and geolocation (Gephi, QGIS, GMaps etc.)
8 Case studies (6 hours)

Bibliography

A. Rezzani (2017). Big Data Analytics. Il manuale del data scientist. Maggioli Editore (Apogeo Education).
S. Ozdemir. Data Science: guida ai principi e alle tecniche base della scienza dei dati. Apogeo

Teaching methods

Lectures and laboratory exercises.
Lectures will cover the theoretical aspects of the course subjects.
Practical exercises on real problems will be carried out in the laboratory.

Assessment methods and criteria

There are no mid-term tests.
The exam consists of two parts:
i) a written exam of four open questions on the theoretical topics covered in class, aimed at evaluating the knowledge gained on these topics.
ii) a written report (with an oral presentation) on a project that explores one of the topics covered in class or in the labs, aimed at assessing the ability to apply the knowledge gained during the course. The mark for this part also depends on the quality of the developed system and of the attached documentation.
The exam is passed if the student achieves at least a passing mark in each of the two parts.
The final mark is the weighted average of the score obtained in the written test (40%) and the score obtained in the project work (60%).
Honours (lode) are awarded to students who achieve the highest score in both parts.
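
As an illustration only, the grading rule above can be sketched in Python. The 40% and 60% weights and the pass and honours conditions come from the text. The 30-point scale and the passing mark of 18 are assumptions based on the usual Italian university convention and are not stated on this page. The names `Outcome` and `final_outcome` are hypothetical, and no rounding rule is applied because none is given.

```python
from typing import NamedTuple, Optional

# Weights stated in the assessment criteria.
WRITTEN_WEIGHT = 0.4
PROJECT_WEIGHT = 0.6


class Outcome(NamedTuple):
    passed: bool
    final_mark: Optional[float]  # None when the exam is not passed
    honours: bool


def final_outcome(written: float, project: float,
                  pass_mark: float = 18.0,   # assumption: not stated in the syllabus
                  max_mark: float = 30.0     # assumption: not stated in the syllabus
                  ) -> Outcome:
    """Combine the two partial marks following the rule described above."""
    # Both parts must reach at least the passing mark.
    if written < pass_mark or project < pass_mark:
        return Outcome(False, None, False)
    # Weighted average of the written test and the project work.
    mark = WRITTEN_WEIGHT * written + PROJECT_WEIGHT * project
    # Honours require the highest score in both parts.
    honours = written == max_mark and project == max_mark
    return Outcome(True, mark, honours)
```

Under these assumptions, a written mark of 24 and a project mark of 28 give a final mark of about 26.4, while a written mark of 17 fails the exam regardless of the project mark.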

Other information

Course notes and teaching materials will be distributed in electronic form during the course.