Summaries

Approximate duplicate detection

23 October 2020, 14:00 Diogo Ribeiro Ferreira

Duplicate detection with string matching. Pairwise comparisons and setting a threshold. Tradeoff between false positives and false negatives. Compound similarity of multiple fields. Clustering and transitive closure. Merging records by value frequency.
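
The topics in this summary can be tied together in a small sketch. It is only an illustration under stated assumptions, not the method used in the lecture: the field names, weights, threshold and sample records are hypothetical, and `difflib.SequenceMatcher` stands in for whichever string matching measure was actually used. Records are compared pairwise, pairs at or above the threshold are linked, linked records are grouped by transitive closure (union-find), and each group is merged by taking the most frequent value per field.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def field_sim(a, b):
    """Similarity in [0, 1] between two strings (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_sim(r1, r2, weights):
    """Compound similarity: weighted average of per-field similarities."""
    total = sum(weights.values())
    return sum(w * field_sim(r1[f], r2[f]) for f, w in weights.items()) / total


def find_clusters(records, weights, threshold):
    """Link pairs scoring >= threshold, then take the transitive closure."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise comparisons over all record pairs (quadratic in the input size).
    for i, j in combinations(range(len(records)), 2):
        if record_sim(records[i], records[j], weights) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def merge(records, cluster):
    """Merge a cluster by keeping the most frequent value of each field."""
    fields = records[cluster[0]].keys()
    return {
        f: Counter(records[i][f] for i in cluster).most_common(1)[0][0]
        for f in fields
    }


# Hypothetical sample data, for illustration only.
records = [
    {"name": "John Smith", "city": "Lisbon"},
    {"name": "Jon Smith", "city": "Lisbon"},
    {"name": "John Smith", "city": "Lisboa"},
    {"name": "Maria Costa", "city": "Porto"},
]
weights = {"name": 2, "city": 1}  # assumed weights: name counts twice as much

clusters = find_clusters(records, weights, threshold=0.85)
merged = [merge(records, c) for c in clusters]
```

Raising the threshold reduces false positives (distinct records merged) at the cost of more false negatives (duplicates missed); lowering it does the opposite. Transitive closure can also chain together records that are not directly similar to each other, which is worth keeping in mind when choosing the threshold.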


Lab 4

22 October 2020, 11:00 Diogo Ribeiro Ferreira

Calculation of several string matching measures over a sample dataset. Lab guide and exercises.
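
As an illustration of what such measures look like, here is a minimal sketch of two common ones. The summary does not say which measures the lab covers, so the choice of Levenshtein distance and Jaccard similarity over character bigrams is an assumption, as are the function names.

```python
def levenshtein(a, b):
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def levenshtein_similarity(a, b):
    """Edit distance normalised to a similarity in [0, 1]."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def ngrams(s, n=2):
    """Set of character n-grams of s (bigrams by default)."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def jaccard(a, b, n=2):
    """Jaccard similarity between the n-gram sets of two strings."""
    x, y = ngrams(a, n), ngrams(b, n)
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)
```

For example, `levenshtein("kitten", "sitting")` is 3, and `jaccard("night", "nacht")` is 1/7 because the two strings share only the bigram `ht` out of seven distinct bigrams.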


Lab 4

20 October 2020, 15:30 Diogo Ribeiro Ferreira

Calculation of several string matching measures over a sample dataset. Lab guide and exercises.


Lab 4

20 October 2020, 14:00 Diogo Ribeiro Ferreira

Calculation of several string matching measures over a sample dataset. Lab guide and exercises.