Summaries

Clustering based on Gaussian mixture models

14 June 2023, 16:30 João Pedro Castilho Pereira Santos Gomes

Gaussian mixture models (GMM). The minorization-maximization (MM) approach for iterative optimization and its relation to the expectation-maximization (EM) algorithm for statistical inference. The EM algorithm for GMMs. Relation with Lloyd's algorithm for k-means. Model complexity and data requirements. Model order selection: Akaike and Bayesian information criteria. Density estimation. [Murphy 11.2-11.2.1, 11.4-11.4.2.5, 11.4.2.7, 11.5]
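
As an illustration of the EM iteration for GMMs and of order selection with AIC/BIC, here is a minimal sketch for one-dimensional data. It is not taken from the lecture or from Murphy: the function names (`em_gmm_1d`, `aic`, `bic`), the quantile-based initialization and the small variance floor are assumptions made for this example, and the criteria use the "lower is better" convention, which may differ in sign and scale from the textbook's.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, k, n_iter=100):
    """Fit a k-component 1-D GMM with EM (illustrative sketch).

    Returns (weights, means, variances, log_liks), where log_liks[t] is the
    log-likelihood of the parameters at the start of iteration t.
    """
    n = len(data)
    # Assumption: initialize the means at evenly spaced quantiles of the data.
    ordered = sorted(data)
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    log_liks = []
    for _ in range(n_iter):
        # E-step: responsibilities r[i][j] = P(component j | x_i).
        resp = []
        log_lik = 0.0
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([p / total for p in joint])
        log_liks.append(log_lik)
        # M-step: responsibility-weighted maximum-likelihood updates.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # assumption: floor against collapse
    return weights, means, variances, log_liks


def n_free_params(k):
    """Free parameters of a 1-D GMM: (k - 1) weights, k means, k variances."""
    return 3 * k - 1


def aic(log_lik, k):
    """Akaike information criterion (lower is better)."""
    return 2.0 * n_free_params(k) - 2.0 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion (lower is better)."""
    return n_free_params(k) * math.log(n) - 2.0 * log_lik
```

Replacing the soft responsibilities by hard 0/1 assignments and fixing equal weights and variances turns this iteration into Lloyd's algorithm for k-means, which is the relation mentioned in the summary.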


Spectral clustering

14 June 2023, 14:30 João Pedro Castilho Pereira Santos Gomes

Basics of graph theory and the rationale for spectral clustering. Normalized and unnormalized adjacency and Laplacian matrices of a graph. Properties of the eigenvalues and eigenvectors of the Laplacian. The Fiedler eigenvalue/eigenvector. Graph cuts and graph partitioning. Ratio and normalized cuts and their relaxation. Spectral clustering algorithm. [Zaki 16.1, 16.2-16.2.2]
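
To make the role of the Fiedler eigenvector concrete, the following sketch bipartitions a small graph using the unnormalized Laplacian L = D - A and the sign pattern of the eigenvector associated with its second-smallest eigenvalue, which corresponds to the relaxed ratio-cut problem. The function name `fiedler_bipartition` and the thresholding at zero are assumptions for this example; the general k-way algorithm in Zaki uses more eigenvectors followed by k-means.

```python
import numpy as np


def fiedler_bipartition(adjacency):
    """Split a connected undirected graph in two using the Fiedler vector.

    adjacency: symmetric (n, n) array of non-negative edge weights.
    Returns (labels, fiedler_value, fiedler_vector).
    """
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    L = np.diag(degrees) - A  # unnormalized graph Laplacian
    # eigh returns the eigenvalues of a symmetric matrix in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(L)
    fiedler_value = eigenvalues[1]
    fiedler_vector = eigenvectors[:, 1]
    # Assumption: threshold at zero; the overall sign of the vector is arbitrary.
    labels = (fiedler_vector > 0).astype(int)
    return labels, fiedler_value, fiedler_vector


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

labels, fiedler_value, _ = fiedler_bipartition(A)
```

On this graph the cut separates the two triangles, removing only the bridging edge. Which triangle receives label 0 or 1 depends on the sign returned by the eigensolver.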


Representative-based clustering

7 June 2023, 16:30 João Pedro Castilho Pereira Santos Gomes

Clustering validation: internal and external measures. k-means: Lloyd's algorithm and initialization strategies. [Zaki 17.1.1, 17.1.3, 17.2, 13.1]
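
A minimal sketch of Lloyd's algorithm with two initialization strategies (uniformly random data points, and k-means++ style seeding) is given below. The function names, the `init` argument and the choice to leave the centroid of an empty cluster unchanged are assumptions for this example rather than the course's reference implementation.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_init(points, k, rng):
    """k-means++ style seeding: pick new centers with probability ~ D(x)^2."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        d2 = [min(squared_distance(p, c) for c in centroids) for p in points]
        centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids


def lloyd_kmeans(points, k, init="random", n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update steps.

    Returns (centroids, labels, sse), where sse is the sum of squared
    distances of the points to their assigned centroids.
    """
    rng = random.Random(seed)
    if init == "k-means++":
        centroids = kmeans_pp_init(points, k, rng)
    else:
        centroids = rng.sample(points, k)
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    sse = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse
```

The returned `sse` is the objective that Lloyd's algorithm decreases monotonically. Since the result depends on the initialization, a common practice is to run several seeds and keep the solution with the lowest value.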


Hierarchical clustering

7 June 2023, 14:30 João Pedro Castilho Pereira Santos Gomes

The clustering problem. Types of clustering. Measuring distances between data vectors and between clusters. Hierarchical clustering. [Zaki 14]
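
The agglomerative variant of hierarchical clustering, together with two ways of measuring distances between clusters, can be sketched as follows. This is a deliberately naive implementation that recomputes all pairwise cluster distances at every step; the function name `agglomerative` and its `linkage` argument are assumptions for this example.

```python
import math


def agglomerative(points, k, linkage="single"):
    """Merge the two closest clusters repeatedly until k clusters remain.

    points: list of coordinate tuples; linkage: "single" (minimum pairwise
    distance) or "complete" (maximum pairwise distance).
    Returns (clusters, merges): clusters is a list of lists of point indices,
    merges records (cluster_a, cluster_b, distance) for every merge performed.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []

    def cluster_distance(a, b):
        dists = [math.dist(points[i], points[j]) for i in a for j in b]
        return min(dists) if linkage == "single" else max(dists)

    while len(clusters) > k:
        # Find the closest pair of clusters under the chosen linkage.
        d, i, j = min(
            (cluster_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merges
```

The sequence of merge distances in `merges` is what a dendrogram plots on its vertical axis; running with `k=1` records the full hierarchy.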


Probabilistic principal component analysis

31 May 2023, 16:30 João Pedro Castilho Pereira Santos Gomes

Probabilistic principal component analysis (PPCA). Iterative maximum-likelihood estimation using the EM algorithm for latent linear models. [Murphy 12.2, see also Secs. 11.4.1, 11.4.2, 11.4.7]
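
For the PPCA model x = W z + mu + noise, with z ~ N(0, I) and isotropic noise of variance sigma^2, the EM iteration can be sketched as below. The updates follow the standard textbook form of EM for PPCA; the function name `ppca_em`, the random initialization and the fixed number of iterations are assumptions for this example.

```python
import numpy as np


def ppca_em(X, q, n_iter=200, seed=0):
    """Maximum-likelihood PPCA with q latent dimensions via EM (sketch).

    X: (n, d) data matrix. Returns (mu, W, sigma2).
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    mu = X.mean(axis=0)  # the ML estimate of the mean is the sample mean
    Xc = X - mu
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((d, q))  # assumption: random initialization
    sigma2 = 1.0
    for _ in range(n_iter):
        # E-step: posterior moments of the latent variables.
        M = W.T @ W + sigma2 * np.eye(q)
        M_inv = np.linalg.inv(M)
        Ez = Xc @ W @ M_inv                        # row i holds E[z_i]
        sum_Ezz = n * sigma2 * M_inv + Ez.T @ Ez   # sum over i of E[z_i z_i^T]
        # M-step: update the loading matrix, then the noise variance.
        W = (Xc.T @ Ez) @ np.linalg.inv(sum_Ezz)
        sigma2 = (
            np.sum(Xc ** 2)
            - 2.0 * np.sum(Ez * (Xc @ W))
            + np.trace(sum_Ezz @ W.T @ W)
        ) / (n * d)
    return mu, W, sigma2
```

The columns of the returned `W` are in general neither orthogonal nor ordered, but their span should approach the subspace spanned by the leading q eigenvectors of the sample covariance matrix. The scale of `W` and the value of `sigma2` can converge considerably more slowly than the subspace when the noise is small relative to the signal.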