Summaries

Introduction to Data Warehouse

9 November 2015, 15:30 Helena Galhardas

  • Motivation
  • Definition of data warehouse
  • New domains and challenges
  • The multidimensional model: facts, dimensions, hierarchies, aggregation functions, data cube, SQL extensions ROLLUP and CUBE
  • Typical data warehouse architecture
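
To make the ROLLUP and CUBE extensions listed above more concrete, the following is a minimal sketch in plain Python rather than SQL. The fact table, the dimension names, and the `ALL` marker are hypothetical and are not taken from the lecture; the sketch only assumes the usual semantics, where ROLLUP aggregates over prefixes of the dimension list and CUBE aggregates over every subset of the dimensions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact table: (region, product, sales amount).
FACTS = [
    ("North", "Book", 10),
    ("North", "Pen", 5),
    ("South", "Book", 7),
    ("South", "Pen", 3),
]
ALL = "ALL"  # stands in for the NULL that SQL shows for a rolled-up dimension


def aggregate(facts, kept):
    """SUM the measure, grouping by the dimension positions in `kept`."""
    totals = defaultdict(int)
    for *dims, amount in facts:
        key = tuple(d if i in kept else ALL for i, d in enumerate(dims))
        totals[key] += amount
    return dict(totals)


def rollup(facts, n_dims):
    """Like GROUP BY ROLLUP: group by each prefix of the dimension list."""
    result = {}
    for k in range(n_dims, -1, -1):
        result.update(aggregate(facts, set(range(k))))
    return result


def cube(facts, n_dims):
    """Like GROUP BY CUBE: group by every subset of the dimensions."""
    result = {}
    for k in range(n_dims, -1, -1):
        for kept in combinations(range(n_dims), k):
            result.update(aggregate(facts, set(kept)))
    return result


if __name__ == "__main__":
    for key, total in cube(FACTS, 2).items():
        print(key, total)
```

With two dimensions, `rollup` produces the detailed cells, the per-region subtotals, and the grand total, while `cube` additionally produces the per-product subtotals.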


Lab 7

9 November 2015, 09:30 Diogo Ribeiro Ferreira

Data profiling with DataCleaner. Analyzing the customers table and the orderdetails table. Integration of DataCleaner with Pentaho. Data profiling in a PDI transformation.


Lab 6

5 November 2015, 16:00 Diogo Ribeiro Ferreira

Data cleaning. Development of a PDI transformation to clean the data in a database table. Using regular expressions to find patterns in data. Pencil-and-paper exercise about approximate duplicate detection and elimination with CLEENEX.
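
The two ideas in this lab, finding patterns with regular expressions and detecting approximate duplicates, can be sketched in plain Python. This is not the PDI transformation or the CLEENEX exercise from the lab: the phone-number pattern, the sample names, and the 0.85 threshold are assumptions chosen for illustration, and the string similarity comes from the standard library's `difflib.SequenceMatcher` rather than from any matching function used in class.

```python
import re
from difflib import SequenceMatcher

# Hypothetical pattern: 9 digits, optionally written in groups of 3.
PHONE = re.compile(r"\d{3}[ .-]?\d{3}[ .-]?\d{3}")


def clean_phone(raw):
    """Return the digits only if the value matches the pattern, else None."""
    value = raw.strip()
    if PHONE.fullmatch(value):
        return re.sub(r"\D", "", value)
    return None


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def approximate_duplicates(names, threshold=0.85):
    """Return the pairs of names whose similarity reaches the threshold."""
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if similarity(a, b) >= threshold:
                pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    print(clean_phone(" 218.417.000 "))
    print(approximate_duplicates(["John Smith", "Jon Smith", "Mary Jones"]))
```

Comparing every pair of records is quadratic in the number of records, so this version is only suitable for small tables.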


Data profiling

5 November 2015, 14:30 Helena Galhardas

  • Introduction: definition, challenges, use cases, existing technology, typical data profiling procedure
  • Typical data profiling tasks: single column, multiple column, detection of dependencies
  • Single column metadata: cardinalities, patterns and data types, value distributions
  • Multiple column metadata: correlations, association rules, clustering
  • Dependencies: unique column combinations, inclusion dependencies, functional dependencies
  • Data profiling tasks: research and commercial
  • Visualization challenge: incorporating the user
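
Some of the profiling tasks listed above can be illustrated with a short Python sketch. The table, the column names, and the pattern alphabet (letters become `A`, digits become `9`) are hypothetical. The functions are naive checks over in-memory rows, not the algorithms or tools covered in the lecture.

```python
import re
from collections import Counter

# Hypothetical table, represented as a list of dictionaries.
ROWS = [
    {"zip": "1000", "city": "Lisboa", "name": "Ana"},
    {"zip": "1000", "city": "Lisboa", "name": "Rui"},
    {"zip": "4000", "city": "Porto", "name": "Ana"},
]


def value_pattern(text):
    """Abstract a value into a pattern: letters -> A, digits -> 9."""
    return re.sub(r"\d", "9", re.sub(r"[A-Za-z]", "A", text))


def column_profile(values):
    """Single-column metadata: row count, nulls, distinct values, patterns."""
    non_null = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "patterns": Counter(value_pattern(str(v)) for v in non_null),
    }


def is_unique(rows, columns):
    """Check whether `columns` form a unique column combination."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    return len(keys) == len(set(keys))


def holds_fd(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in `rows`."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


if __name__ == "__main__":
    print(column_profile([row["zip"] for row in ROWS]))
    print(is_unique(ROWS, ["zip", "name"]))
    print(holds_fd(ROWS, ["zip"], ["city"]))
```

These checks only verify a given candidate. Discovering all unique column combinations or functional dependencies means searching over column subsets, which this sketch does not attempt.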

