FAQ
Exam
- Can the allowed notes for the exam be printed notes?
Answer: Yes. As long as it fits on a single A4 page, you can write or print any content.
Homework 4
- Important notes
- on question I-3: please replace "the larger cluster" by "both clusters", i.e. compute the silhouette for both clusters
- a new version of the homework (v2) was released Saturday 21/10/2023 with the correct covariance matrices
- (new) show your reasoning/calculations, including in your answer to I-4. In questions I-1 and I-3, you can detail the calculations for one observation and summarize them for the others.
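As a minimal sketch of the per-cluster silhouette mentioned in the I-3 note above: the code below assumes Euclidean distance, the common definition s = (b - a) / max(a, b), a silhouette of 0 for singleton clusters, and a cluster silhouette taken as the mean over its observations. The course material may use a slightly different convention, and the points and labels are made-up placeholders, not the homework data.

```python
from math import dist
from statistics import mean


def observation_silhouettes(points, labels):
    """Silhouette of each observation (Euclidean distance, >= 2 clusters)."""
    scores = []
    for i, (p, own_label) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the same cluster
        same = [dist(p, q) for j, (q, l) in enumerate(zip(points, labels))
                if l == own_label and j != i]
        if not same:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        a = mean(same)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(p, q) for q, l in zip(points, labels) if l == other)
            for other in set(labels) - {own_label}
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def cluster_silhouettes(points, labels):
    """Mean observation silhouette within each cluster."""
    scores = observation_silhouettes(points, labels)
    return {c: mean(s for s, l in zip(scores, labels) if l == c)
            for c in sorted(set(labels))}


# Hypothetical toy data: two well-separated clusters of two points each.
toy_points = [(0, 0), (0, 1), (5, 0), (5, 1)]
toy_labels = [0, 0, 1, 1]
print(cluster_silhouettes(toy_points, toy_labels))
```

Reporting one value per cluster, as the note requests, makes it easy to see whether one cluster is noticeably less cohesive than the other.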
- In part I, can you clarify what is the index of the given probability functions?
Answer: each probability function pk and Nk is defined in the context of a specific cluster ck. At the clustering solution level, you can alternatively specify them as conditionals, e.g. pk=p(y1=θ | ck).
- Are the questions I-2 and I-3 to be applied after the EM update?
Answer: Yes. Please compute the membership of x_new (I-2) and the assignment of training observations (I-3) for the adjusted clusters after the EM update (undertaken in I-1).
- Should we consider x_new to answer question I-3?
Answer: No. Please only consider the four training observations.
- Is the reference clustering solution to use in I-4 the one obtained in I-3?
Answer: Yes.
- In question II-2i, should we consider the ratio of explained variability?
Answer: Yes, please provide the percentage of explained variability.
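A small sketch of how a percentage of explained variability can be computed, assuming the variability in II-2i is measured through the eigenvalues of a covariance matrix (as in PCA), which this FAQ does not state explicitly. The function name and the eigenvalues are hypothetical placeholders.

```python
def explained_variability(eigenvalues, k):
    """Percentage of total variability captured by the k largest eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    return 100.0 * sum(ordered[:k]) / sum(ordered)


# Hypothetical eigenvalues, not taken from the homework.
example_eigenvalues = [4.0, 3.0, 2.0, 1.0]
print(explained_variability(example_eigenvalues, k=2))
```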
Homework 3
- In the MLP practicals (Neural Networks P7-9), do the given outputs (e.g., z=[0 1 0]^T in exercise 2) correspond to the encodings of a single output categorical variable (e.g., target B in exercise 2)?
Answer: No. The provided outputs can be drawn from alternative tasks (e.g., a multiple-output predictive task in exercise 2). In particular, in exercise 2, the given output (z=[0 1 0]^T) cannot correspond to an encoding of a single categorical target (class) due to the studied properties of tanh activation (codomain in [-1,1]).
- (new) In the context of the previous entry, please note that although the use of a one-hot vector (one output with 1 and the remaining outputs with zeros) will lead to non-optimal learning for exercise I-2, we will not apply discounts.
- In part II, the learning of the MLPRegressor returns convergence warnings. Do we need to address them?
Answer: No. The convergence warnings signal that the MLP reached the maximum number of mini-batch updates (200 by default) without early stopping (a behavior to be covered this week). Although the number of updates/iterations could be increased (via the max_iter parameter), we suggest preserving the default parameterization.
- In question II-1, should the residues produced from the different runs contribute to a unique histogram?
Answer: Yes. (new) Please further note that there is no need for averaging results on this question (II-1).
- In question II-2, should we be aware of any specific bounds for the graded wine quality?
Answer: You can assume that quality grading is in {1,...,10} (domain knowledge), although we will not discount alternative assumptions based on the distribution of the observed values for the output variable.
- In question II-3, for simplicity, instead of "one iteration corresponds to a batch", one should consider "one iteration corresponds to a default mini-batch".
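To illustrate the {1,...,10} grading bounds from the II-2 answer above, here is a minimal sketch that maps a real-valued prediction to a valid grade. It assumes that predictions are rounded (half up) and then bounded, which may not match exactly what the homework statement asks; the function name and default bounds are illustrative.

```python
import math


def to_quality_grade(prediction, low=1, high=10):
    """Round a real-valued prediction half up, then bound it to [low, high]."""
    rounded = math.floor(prediction + 0.5)  # avoids Python's round-half-to-even
    return min(high, max(low, rounded))


# Hypothetical regressor outputs, not taken from the homework data.
print([to_quality_grade(p) for p in (5.4, 6.5, 11.2, 0.3)])
```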
Homework 2
- Should we preserve the (in)dependence assumptions in I-1.a for the subsequent questions 1.b and 1.c?
Answer: Yes. Answer questions 1.b and 1.c with the distributions computed in 1.a, i.e. without the need to revisit the train observations.
- Important notes:
- on question I-2, please consider the domain of y2 to be [0,1]
- on question I-1, instead of "𝑦1 ⫫ 𝑦2", one should read "𝑦1 ⫫̸ 𝑦2" (i.e. 𝑦1 and 𝑦2 are dependent). The statement was updated with this change.
- on question I-1, please consider input variables y3, y4 and y5 to be categorical variables.
- On question I-3, can you clarify how ML assumption is linked with the illustrated posterior probabilities?
Answer: For simplicity's sake, please consider p(x|h) as a rough proxy for posteriors and ensure ∑h∈y_out p(h|x) = 1.
- On question I-1, do I need to show the detailed calculations or can I directly use and present the results obtained using numpy or scipy?
Answer: For question 1a, please show the detailed calculations for one of the classes; and for question 1b, please show the detailed calculations for one posterior.
- Where can we find illustrative python programming facilities to work with univariate and multivariate Gaussians?
Answer: Please find some illustrative examples on how to fit Gaussians and access pdf values in this notebook (and accompanying data).
- In question I-2b, should we compute the given MAE for the training or testing observations?
Answer: Testing observations. Commonly, we assume error and efficacy metrics to be computed over testing data unless they are explicitly preceded by 'training'.
- In question II-2, what is a cumulative confusion matrix?
Answer: One that results from the sum of fold-specific confusion matrices.
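The cumulative confusion matrix described above can be sketched as follows. This is a plain-Python illustration under the assumption that rows index the true class and columns the predicted class; the helper names and the two folds are hypothetical.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Confusion matrix with rows = true class, columns = predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def cumulative_confusion_matrix(fold_matrices):
    """Element-wise sum of the confusion matrices obtained on each fold."""
    n = len(fold_matrices[0])
    total = [[0] * n for _ in range(n)]
    for matrix in fold_matrices:
        for i in range(n):
            for j in range(n):
                total[i][j] += matrix[i][j]
    return total


# Hypothetical test-fold outcomes from a 2-fold cross-validation.
fold_1 = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], classes=[0, 1])
fold_2 = confusion_matrix([0, 1, 1, 0], [0, 1, 0, 0], classes=[0, 1])
print(cumulative_confusion_matrix([fold_1, fold_2]))
```

Passing the same `classes` list for every fold keeps the matrices aligned even when a class is absent from a given test fold.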
Homework 1
- In which language should we answer the homeworks?
Answer: Reports in either Portuguese or English are perfectly acceptable.
- Can pen-and-paper exercises be delivered using prints from digital boards or scans of handwritten pages?
Answer: Yes, both these options are valid, as well as others (e.g., word/latex formula writing facilities).
- Do we need to bring the code and outputs from the notebook to the report?
Answer: Yes. The report is the primary way of assessing the homework, so make sure all relevant information is included in the report.
- Are there other constraints in question I-1 that we should be aware of?
Answer: No, just the ones specified in the statement (e.g., there is not a fixed depth limit).
- In the context of I-1, should we consider the possibility of reusing numeric variables (in this case y2) multiple times along the tree path?
Answer: For simplicity, discard this possibility for this exercise, i.e. assume that input variables can only appear a single time within a specific tree path.
- In the context of I-1, should the splits of categorical variables be binary?
Answer: No, please assess discriminative power and draw the splits using the original cardinality of categorical variables (in other words, no need to perform aggregation).
- Is the I-5 challenge subject to evaluation?
Answer: Yes, it is an inherent part of question I-5 and will be graded. It is termed a challenge because of the critical thinking necessary to understand how class-conditional distributions can offer a solid alternative criterion to learn decision trees.
- In this question, can you elaborate on what the root split is?
Answer: Following the previous clarification, you should reflect on how class-conditional distributions can be used to produce alternative splitting criteria in decision trees. In this context, assuming that y1 is selected as the root of a new decision tree, you can attempt to identify the branches that directly depart from this root node.
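For the earlier I-1 answer on splitting categorical variables at their original cardinality, the sketch below scores a multiway split. It assumes information gain (entropy reduction) as the measure of discriminative power, which this FAQ does not name explicitly, and the variable values and class labels are made up.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction of a multiway split: one branch per distinct value."""
    n = len(labels)
    branches = {}
    for v, y in zip(values, labels):
        branches.setdefault(v, []).append(y)
    remainder = sum(len(b) / n * entropy(b) for b in branches.values())
    return entropy(labels) - remainder


# Hypothetical categorical variable with three values and a binary class.
toy_values = ["a", "a", "b", "b", "c", "c"]
toy_classes = ["P", "P", "N", "N", "P", "N"]
print(information_gain(toy_values, toy_classes))
```

Each distinct value gets its own branch, so no aggregation of categories is performed before scoring the split.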
- Where can I submit the homework?
Answer: Submission is already possible (since 26/9) at Fenix.
Contents
- In some of the first lectures, some of the attendees in the last rows experienced hearing difficulties. Can you help us map the lectured contents to the slides?
Answer: Certainly, I recorded this short guiding video to help with the organization of the introductory contents of the course. As we master these foundations, we become ready to start our machine learning journey.
Update: a playlist was created with the content from subsequent chapters.
- How do I map the starting lecture/slide contents to the bibliography?
Answer: Consider the following synergistic options as complementary reading (even though the match is not very good):
- T1 (introduction): Zaki chapters 1-3 or reference book for the Statistics course
- T2 (associative learning): Zaki chapter 19, Wichert chapter 2, Mitchell chapter 3
- T3/T6 (evaluation): Zaki chapter 22, Mitchell chapter 5
- T4 (Bayesian learning): Zaki chapter 18 and 20, Wichert chapter 2, Mitchell chapter 6
- T5 (local/lazy learning): Zaki chapter 18, Mitchell chapter 8
- I was unable to register in my preferred practical shift as it is already full. Can I still attend it?
Answer: Although we recommend attending the practical shift in which you are registered, it is possible to attend other practical shifts as long as there are free seats (i.e. priority is given to students registered in that shift if overcrowding occurs).
There are no evaluations during practical classes and no requirement for the members of a homework group to be enrolled in the same practical shift.
- I was unable to register in my preferred theoretical shift as it is already full. Can I still attend it?
Answer: Yes, you can attend the alternative theoretical shift. Still, due to the high number of student registrations this year, we suggest attending your selected theoretical shift in Fenix to guarantee a more uniform distribution of individuals per lecture.
Students unable to register in any theoretical shift should attend their preferred theoretical shift (10-20 students will be in this situation due to capacity limits set in Fenix).
- How do I register my homework group? Any restrictions to have in mind?
Answer: Starting September 11th (until September 24th) you can register your group of two in Fenix. This is a mandatory step for your group to be able to deliver the homeworks via Fenix. Registrations are atomic - incomplete groups cannot be registered. Please note that there are no restrictions for group formation (e.g., no need to attend the same practical shifts). Working students should communicate their decision on the homework component by September 24th.
Others
- Can the professor write a recommendation letter to apply for fellowships such as la Caixa?
Answer: Although we would love to support you in this goal, several dozen requests arrive every year, making it impossible to write customized recommendation letters for all :(. In this context, we kindly ask you to first assess whether there is an established collaboration or significant interaction with your faculty host and, only if so, to consider requesting a recommendation.