Summaries

Data Matching

20 October 2016, 14:30 Helena Galhardas

Record-oriented matching techniques:

  • Rule-based matching
  • Learning-based matching
Record-set oriented matching:
  • Sorted Neighborhood Method (SNM)
  • Variant of SNM: Incremental Merge/Purge
Measures and data sets:
  • Recall, Precision and F1-measure.
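
As a rough illustration of two topics listed above, the sketch below generates candidate pairs with a sliding window over sorted records (the core idea of the Sorted Neighborhood Method) and scores them with precision, recall and F1. The record values, the sorting key, the window size and the function names are hypothetical. For brevity every candidate pair is treated as a predicted match, whereas a real pipeline would first apply a rule-based or learning-based matcher to each candidate.

```python
def sorted_neighborhood_pairs(records, key, window=2):
    """Sort record ids by a key and pair each record with the
    following (window - 1) records in the sorted order."""
    ordered = sorted(records, key=lambda rid: key(records[rid]))
    pairs = set()
    for i in range(len(ordered)):
        for j in range(i + 1, min(i + window, len(ordered))):
            pairs.add(frozenset((ordered[i], ordered[j])))
    return pairs


def precision_recall_f1(predicted, gold):
    """Compare predicted match pairs against the true match pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy data: ids 1/2 and 3/4 refer to the same person.
records = {1: "smith john", 2: "smith jon", 3: "brown anna", 4: "braun anna"}
gold = {frozenset((1, 2)), frozenset((3, 4))}

# Assumption: the whole string is used as the sorting key.
candidates = sorted_neighborhood_pairs(records, key=lambda value: value, window=2)
precision, recall, f1 = precision_recall_f1(candidates, gold)
print(precision, recall, f1)
```

With a window of 2, only adjacent records in the sorted order are compared, so the number of comparisons grows linearly with the number of records instead of quadratically; a larger window trades more comparisons for a better chance of catching matches whose keys sort further apart.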


Lab 5: Working with Databases

20 October 2016, 13:00 Diogo Ribeiro Ferreira

- Using PDI to read and write data from/to a database.

- Pencil & Paper exercise: using Naive Bayes for schema matching.


Lab 5: Working with Databases

18 October 2016, 15:30 Diogo Ribeiro Ferreira

- Using PDI to read and write data from/to a database.

- Pencil & Paper exercise: using Naive Bayes for schema matching.


Schema mappings (cont.) and Introduction to Data Cleaning

17 October 2016, 15:30 Helena Galhardas

Schema mappings:

  • From matchings to mappings: query discovery algorithm (Clio system).
  • Example.
Introduction to Data Cleaning:

  • Motivation
  • Application Contexts
  • Data quality dimensions
  • Taxonomy of the data quality problems
  • Data quality process
  • Main data quality tools
  • Real-world examples.

