Exercise 2 (correction): the dendrogram distances are inaccurate. For instance, the dendrogram under single link is {{x4,x8}[1.4], x1}[2.2], {x3,x5,x6}[1.4]}[3.6] {x2,x7}[3.2]}[4.1], and under maximum link it is {{x4,x8}[1.4], x1}[3.6], {{x3,x5}[1.4], x6}[2] }[3.6] {x2,x7}[3.2]}[7.3] }[8.5].
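As a reminder of why the merge distances differ between the two linkages, here is a minimal sketch of naive agglomerative clustering. The function name `merge_heights` and the 1-D points are hypothetical and are not the exercise data; the sketch only illustrates that single link uses the minimum pairwise distance between clusters while maximum (complete) link uses the maximum.

```python
from itertools import combinations
from math import dist


def merge_heights(points, link):
    """Return the merge distances produced by naive agglomerative clustering.

    `link` aggregates the pairwise distances between two clusters:
    pass `min` for single link, `max` for maximum (complete) link.
    """
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest linkage distance.
        (i, j), d = min(
            (
                ((i, j), link(dist(a, b) for a in clusters[i] for b in clusters[j]))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda pair: pair[1],
        )
        heights.append(d)
        # Merge cluster j into cluster i (i < j, so deleting j is safe).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return heights


# Hypothetical 1-D observations, chosen only for illustration.
points = [(0.0,), (1.0,), (3.0,), (7.0,)]
single = merge_heights(points, min)
complete = merge_heights(points, max)
print("single link merge distances:  ", single)
print("maximum link merge distances: ", complete)
```

On these toy points the first merge happens at the same distance under both linkages, and later merges occur at larger distances under maximum link.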
Practical exercises (4v)
- Prediction: Decision Trees, kNN, Evaluation (Confusion Matrices and Residue-based Errors)
- Description: Agglomerative Clustering, Evaluation (silhouette and purity)
- Bivariate Exploration: Correlation and Information Gain
True or False (1v)
FAQ
Exercise 6-7 (homework): Which variables should be considered for clustering the observations? And which for assessing purity? Answer: Use the input variables (y1 and y2) to learn the clustering solution, and use the output variable z to provide the reference groups (ground truth) for assessing the purity of the clustering solution.
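The role of z can be sketched in a few lines. This is an illustrative example only: the cluster assignments and z values below are hypothetical, not the homework data, and the `purity` helper is an assumed name. The cluster labels are taken as given, standing for the output of a clustering learned from y1 and y2 alone; z enters only at evaluation time.

```python
from collections import Counter


def purity(cluster_labels, reference_labels):
    """Fraction of observations that belong to the majority reference
    group of their own cluster."""
    groups = {}
    for c, z in zip(cluster_labels, reference_labels):
        groups.setdefault(c, []).append(z)
    # For each cluster, count the observations in its most frequent reference group.
    majority_total = sum(Counter(zs).most_common(1)[0][1] for zs in groups.values())
    return majority_total / len(cluster_labels)


# Hypothetical cluster assignments (as if learned from y1 and y2 only).
clusters = [0, 0, 0, 1, 1, 2, 2, 2]
# Hypothetical values of the output variable z, used only as ground truth.
z = ["A", "A", "B", "B", "B", "A", "C", "C"]

score = purity(clusters, z)
print(score)
```

Here each of the three clusters has two observations in its majority group, so 6 of the 8 observations count toward purity.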