Summaries
Introduction to Data Warehouse
9 November 2015, 15:30 • Helena Galhardas
- Motivation
- Definition of data warehouse
- New domains and challenges
- The multidimensional model: facts, dimensions, hierarchies, aggregation functions, data cube, SQL extensions ROLLUP and CUBE
- Typical data warehouse architecture.
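To make the difference between ROLLUP and CUBE concrete, here is a minimal Python sketch that emulates both over a tiny fact table with SUM as the aggregation function. The fact rows, the dimension names (`year`, `region`) and the helper functions are hypothetical and only serve as illustration; a real warehouse would express this in SQL with `GROUP BY ROLLUP(...)` or `GROUP BY CUBE(...)`.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, sales_amount)
facts = [
    (2014, "North", 100),
    (2014, "South", 150),
    (2015, "North", 120),
    (2015, "South", 180),
]

def rollup(rows):
    """Emulate GROUP BY ROLLUP(year, region) with SUM.

    None plays the role of SQL's NULL marker for an aggregated-away dimension.
    """
    totals = defaultdict(int)
    for year, region, amount in rows:
        totals[(year, region)] += amount  # finest level: (year, region)
        totals[(year, None)] += amount    # subtotal per year
        totals[(None, None)] += amount    # grand total
    return dict(totals)

def cube(rows):
    """Emulate GROUP BY CUBE(year, region) with SUM: all dimension subsets."""
    totals = defaultdict(int)
    for year, region, amount in rows:
        for y in (year, None):
            for r in (region, None):
                totals[(y, r)] += amount
    return dict(totals)

rollup_result = rollup(facts)
cube_result = cube(facts)
```

ROLLUP follows the order of the listed dimensions, so it yields per-year subtotals but no per-region subtotals; CUBE adds the per-region subtotals as well.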
Lab 7
9 November 2015, 09:30 • Diogo Ribeiro Ferreira
Data profiling with DataCleaner. Analyzing the customers table and the orderdetails table. Integration of DataCleaner with Pentaho. Data profiling in a PDI transformation.
Lab 6
5 November 2015, 16:00 • Diogo Ribeiro Ferreira
Data cleaning. Development of a PDI transformation to clean the data in a database table. Using regular expressions to find patterns in data. Pencil-and-paper exercise about approximate duplicate detection and elimination with CLEENEX.
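As an illustration of using regular expressions to find patterns in data, the sketch below separates values that match an expected format from those that do not. It is plain Python rather than a PDI transformation, and the phone-number format, the sample values and the `split_by_pattern` helper are assumptions made up for this example.

```python
import re

# Hypothetical expected format: three digits, a hyphen, four digits
PHONE_PATTERN = re.compile(r"\d{3}-\d{4}")

# Hypothetical column values, some of them dirty
values = ["555-1234", "5551234", "555-12a4", " 555-9876 "]

def split_by_pattern(values, pattern):
    """Return (matching, suspicious) values, ignoring surrounding whitespace."""
    matching, suspicious = [], []
    for value in values:
        if pattern.fullmatch(value.strip()):
            matching.append(value)
        else:
            suspicious.append(value)
    return matching, suspicious

matching, suspicious = split_by_pattern(values, PHONE_PATTERN)
```

The suspicious values are the candidates for a cleaning step, for example normalising them or routing them to manual review.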
Data profiling
5 November 2015, 14:30 • Helena Galhardas
- Introduction: definition, challenges, use cases, existing technology, typical data profiling procedure.
- Typical data profiling tasks: single column, multiple column, detection of dependencies
- Single column metadata: cardinalities, patterns and data types, value distributions
- Multiple column metadata: correlations, association rules, clustering
- Dependencies: unique column combinations, inclusion dependencies, functional dependencies
- Data profiling tasks: research and commercial
- Visualization challenge: incorporating the user
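A few of the profiling tasks listed above can be sketched in plain Python: single-column cardinality and value distribution, a unique column combination check, and a functional dependency check. The sample rows, column names and function names are hypothetical, and the naive checks shown here are not the discovery algorithms used by research or commercial profiling tools.

```python
from collections import Counter

# Hypothetical customer rows
rows = [
    {"customer_id": 1, "city": "Lisboa", "country": "Portugal"},
    {"customer_id": 2, "city": "Porto", "country": "Portugal"},
    {"customer_id": 3, "city": "Lisboa", "country": "Portugal"},
    {"customer_id": 4, "city": "Madrid", "country": "Spain"},
]

def profile_column(rows, column):
    """Single-column metadata: row count, distinct count, value distribution."""
    values = [row[column] for row in rows]
    return {
        "rows": len(values),
        "distinct": len(set(values)),
        "distribution": Counter(values),
    }

def is_unique(rows, columns):
    """True if the column combination has no duplicate value tuples."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    return len(set(keys)) == len(keys)

def holds_fd(rows, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first rhs value seen for this lhs key
        if seen.setdefault(key, value) != value:
            return False
    return True

city_profile = profile_column(rows, "city")
```

In this sample, `city -> country` holds while `country -> city` does not, because Portugal appears with two different cities.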