Planning
Lectures (Aulas Teóricas, AT)
AT1 - Data Science
Data science: context and goals.
Course organization and planning.
AT2 - Data profiling
Data profiling: granularity, distribution, dimensionality and sparsity.
AT3 - Classification and Analogizers
Modelling: mining tasks; supervised, semi-supervised and unsupervised learning.
Evaluation: measures, training strategies and statistical significance.
AT4 - Deloitte presentation
AT5 - Classification: Bayesians
Bayesians: MAP, Naive Bayes and Bayesian networks.
AT6 - Classification: symbolists
Symbolists: decision trees - algorithms, measures and pruning.
Deloitte case study: comparing Naive Bayes, KNN and decision trees. Identifying overfitting and discussing the possibility of applying feature selection (pros and cons); discussion of possible new variables to generate, and of false predictors.
AT7 - Classification: ensembles - bagging
Ensembles: bagging and random forests.
AT8 - Classification: connectionists and boosting
Classification: Neural Networks - Gradient descent. MLP and backpropagation. Brief summary of Deep Learning.
AT9 - Clustering
Introduction to clustering data analysis.
Feature extraction and PCA.
Exercises: data preparation and clustering.
AT10 - Pattern mining and Anomaly detection
Pattern Mining and Sequential Pattern Mining. The Apriori algorithm.
AT11 - Time series
AT12 - Forecasting
Time series forecasting: regression and LSTMs.
Case study.
AT13 - Social Network Analysis
Social network analysis: properties and description.
AT14 - Privacy and ethical concerns
Technical challenges.
Ethical concerns, privacy issues and the GDPR.
Lab Sessions
Team registration (Week 1)
Team registration on Fénix after 1st class at 20:00 - no lab session.
Project support (Week 1)
Project support. Configuration of the development environment. Project goals.
Lab 1 (Week 2)
Data profiling - dimensionality, distribution, granularity, sparsity and correlation.
Lab 2 (Week 2)
Data preparation - imputation of missing values and outliers, and dummification.
Lab 3 (Week 3)
Training strategies.
KNN and scaling.
Lab 4 (Week 3)
Naive Bayes and KNN.
Lab 5 (Week 4)
Decision trees.
Lab 6 (Week 4)
Random forests.
Lab 7 (Week 5)
MLP and Gradient Boosting.
Lab 8 (Week 5)
Clustering and feature extraction.
Lab 9 (Week 6)
Pattern Mining and anomaly detection.
Lab 10 (Week 6)
Time series analysis and Matrix Profile.
Lab 11 (Week 7)
Forecasting.
Project support (Week 7)
Project support.