===========================

FAQs about MP2

===========================

Q: Can we use X library in the project?

A: You can. Just keep in mind that I might need to install it on my Mac and run it. Anyway, no stress.

Q: Can I use a neural network with a small number of layers?

A: No, you can't. I don't want us to spend time discussing how many layers will make it deep, so no neural networks. Anyway, don't be sad, please. In the last lab we will show you how to solve your project with deep learning and pretrained models.

Q: Can I use a pretrained model?

A: No, you can't. By pretrained model I mean BERT, GPT2, or any other language model already available, as well as any kind of neural word embeddings. Notice, however, that you can use a PoS tagger or a NER tagger, no matter how they were trained.

Q: Can you clarify the final scores for the dev and test sets?

A: Yes. Sorry. My original idea is difficult to explain, so let us make a small change and assume the following:

A) 5 points for the dev evaluation (instead of 4): 2.5 points if you beat my weak baseline; 5 points if you beat my best baseline

B) 5 points for the test evaluation (instead of 6): 2.5 points if you beat my weak baseline; 5 points if you beat my best baseline

I will add this information to the document about the project.
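As a minimal sketch of the scheme above: the function name, the baseline values, and the scores below are all hypothetical, and it assumes that "beat" means strictly higher than the baseline and that a score below the weak baseline earns no points in that evaluation.

```python
def evaluation_points(score, weak_baseline, best_baseline):
    """Points for one evaluation (dev or test) under the revised scheme."""
    # Assumption: "beat" means strictly higher than the baseline score.
    if score > best_baseline:
        return 5.0
    if score > weak_baseline:
        return 2.5
    # Assumption: no points when the weak baseline is not beaten.
    return 0.0


# Hypothetical baseline scores, for illustration only.
WEAK, BEST = 0.60, 0.80

dev_points = evaluation_points(0.85, WEAK, BEST)   # beats both baselines
test_points = evaluation_points(0.70, WEAK, BEST)  # beats only the weak one
total = dev_points + test_points
```

In practice the baselines will likely score differently on the dev and test sets, so each call would receive its own pair of baseline values.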

Q: Sometimes there are two labels per entry in the given sets. What should I do?

A: For the train set, you can do whatever you want. For the dev set, you cannot remove those lines. The remaining question is: which label is the correct one? So here is the dev set again (dev_clean.txt), with the categories that should be considered correct. I guarantee that the test set will not have these problems. Anyway, the changes are the following:

Line 218: LITERATURE Leo Tolstoy's story about Hadji Murat, "who slew the Russian swine", opens in this present-day Russian republic Chechnya (GEOGRAPHY REMOVED)

Line 306: MUSIC It's the nursery rhyme that inspired the title of a famous musical based on a 1913 G.B. Shaw work London Bridge (HISTORY REMOVED)

Line 309: MUSIC It's where Fats Domino found his thrill in 1956 Blueberry Hill (GEOGRAPHY REMOVED)

Q: What do you mean when you say "We will randomly select a set of projects and we will run them in the dev and test sets. If any difference in results is found, the group will have a 0 in the project."?

A: The idea is to be sure that the results you describe in the report and the ones I obtain are the same (in case you are selected), and that no manual correction was made to the results or to the models you report.

Q: What can be my baseline?

A: You need to present two models, so your baseline can be one of these models. You can even have more than one baseline. The idea is to guarantee that you compare your best/most interesting model with at least one other model.

Q: Which model should I use to run the dev and test set? My most interesting model is not the one that gets the best results. :-(

A: Describe your most interesting model in extra detail in the report. Run the one that will bring you a higher score. Don't forget to mention the latter in your report.

Q: Will efficiency be evaluated?

A: No.

Q: What is the size of the test set?

A: It has around 3300 entries.

Q: Do I need to upload all the models I've created?

A: No, just the best one.

Q: Is it normal to have 98% accuracy?

A: Hmm... no. If you are using fit_transform, be sure that you are not applying it to the test set.
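To illustrate the point, here is a minimal sketch using only the Python standard library. TinyCountVectorizer is a hypothetical stand-in written for this example, not a real library class, and the texts are made up. The same pattern applies to scikit-learn vectorizers: call fit_transform on the train set and only transform on the dev and test sets.

```python
from collections import Counter


class TinyCountVectorizer:
    """Hypothetical stand-in for a bag-of-words vectorizer with fit/transform."""

    def fit(self, texts):
        # Learn the vocabulary from the given texts only.
        words = sorted({w for t in texts for w in t.lower().split()})
        self.vocabulary_ = {w: i for i, w in enumerate(words)}
        return self

    def transform(self, texts):
        # Encode texts with the vocabulary learned in fit; unseen words are dropped.
        rows = []
        for t in texts:
            counts = Counter(t.lower().split())
            row = [0] * len(self.vocabulary_)
            for w, c in counts.items():
                if w in self.vocabulary_:
                    row[self.vocabulary_[w]] = c
            rows.append(row)
        return rows

    def fit_transform(self, texts):
        # Convenience method: fit, then transform the same texts.
        return self.fit(texts).transform(texts)


# Made-up example texts, for illustration only.
train_texts = ["the russian republic", "the nursery rhyme"]
test_texts = ["the blueberry hill"]

vectorizer = TinyCountVectorizer()
X_train = vectorizer.fit_transform(train_texts)  # fit on the train set only
X_test = vectorizer.transform(test_texts)        # reuse the fitted vocabulary
```

Because the vocabulary is learned from the train set alone, the test words "blueberry" and "hill" never enter it, and the test vectors keep the same dimensions as the train vectors.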

=========================

Notes about the course

=========================

- the members of each group don't need to attend the same lab/theoretical class

- there will be no evaluation during labs/theoretical classes

- as long as the room capacity is respected, you can attend a different lab/theoretical class

- all the learning materials and labs' handouts will be made available before class

- to log in to Moodle, students must use the "IST login" button. The fields "nome de utilizador" and "senha" should be left empty.

Língua Natural