Project#1 Grades

22 June 2021, 14:09 Carlos Augusto Santos Silva

Dear all, 

Please find the grades for Project#1 in the Grades tab. Remember that this project counts for only 30% of the final grade.

I apologize for taking so long to review them.

In the PDF files you can see the evaluation criteria that were used and your score in each group. If you have any questions, you are welcome to send an email.

You are welcome to submit an improved version of the project to raise your grade.

The main feedback is the following:

In general, you all did very thorough and complete projects. We actually have new records for the Civil building and the North Tower, congratulations!

In the first part (Data Cleaning and Processing), most of you deleted the missing weather data, which resulted in a loss of electricity consumption data. However, in most cases there was no discussion of the impact on the regression (which is very small), and that was penalized by 1 point. It is OK to drop the data as long as the decision is justified. Still, many of you chose to correct this data somehow.
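
As an illustration of the kind of justification that was expected, a minimal sketch along the following lines could be enough. The numbers are made up and the names (`records`, `drop_missing_weather`, `mean_power`) are hypothetical, not taken from any submission; the idea is simply to report how many rows are lost and whether the remaining consumption data still looks like the original.

```python
# Hypothetical hourly records: (power_kW, temperature_C).
# None marks a missing weather reading. All values are invented for illustration.
records = [
    (120.0, 18.5),
    (135.0, None),
    (128.0, 19.0),
    (140.0, 21.5),
    (150.0, None),
    (110.0, 17.0),
    (115.0, 17.5),
    (145.0, 22.0),
]


def drop_missing_weather(rows):
    """Keep only the rows whose temperature reading is present."""
    return [row for row in rows if row[1] is not None]


def mean_power(rows):
    """Average electricity consumption over the given rows."""
    return sum(power for power, _ in rows) / len(rows)


kept = drop_missing_weather(records)

# Share of consumption rows lost because the weather value was missing.
lost_fraction = 1 - len(kept) / len(records)

# How much the average consumption changes after dropping those rows.
mean_shift = mean_power(kept) - mean_power(records)

print(f"rows lost: {lost_fraction:.1%}")
print(f"shift in mean consumption: {mean_shift:+.2f} kW")
```

Reporting these two numbers, and ideally the regression error with and without the dropped rows, is one way to show that the impact was actually checked.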

In the second part (Exploratory Data Analysis), most of you plotted the data, computed some basic statistics and cleaned the outliers, but in a sort of automatic mode, with no critical analysis of the data.
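
One possible way to avoid the automatic mode is to flag candidate outliers first and look at them before deleting anything. The sketch below assumes the common 1.5 x IQR rule, which is only one of several reasonable choices; the readings and the names (`readings`, `iqr_bounds`) are hypothetical.

```python
import statistics

# Hypothetical consumption readings in kW, keyed by hour of the day.
readings = {0: 100, 1: 102, 2: 98, 3: 101, 4: 99,
            5: 103, 6: 97, 7: 250, 8: 100, 9: 5}


def iqr_bounds(values, k=1.5):
    """Return (low, high) limits using the k * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


low, high = iqr_bounds(list(readings.values()))

# Flag the points outside the limits instead of deleting them right away.
flagged = {hour: value for hour, value in readings.items()
           if value < low or value > high}

for hour, value in flagged.items():
    # Each flagged point deserves a question: sensor fault, holiday, real peak?
    print(f"hour {hour}: {value} kW is outside [{low:.2f}, {high:.2f}]")
```

Only after deciding why each flagged point is unusual does it make sense to remove it, keep it, or correct it.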

In the third part (Clustering), many of you did the analysis but without really extracting any insights. The main insights are that the data can be clustered into 2 to 3 subsets and that the cluster assignment could be used successfully as a feature for the regression.
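
To make the second insight concrete, here is a minimal sketch of turning a cluster assignment into a regression feature. It uses a tiny hand-written 1-D k-means on invented daily consumption values; the data and the names (`kmeans_1d`, `daily_mean_kw`, `temperatures`) are assumptions for illustration, and in practice a library implementation would normally be used instead.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Tiny 1-D k-means (k >= 2); centroids start evenly spread between min and max."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                  for v in values]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids


# Hypothetical daily mean consumption (kW): low days versus high days.
daily_mean_kw = [52, 55, 50, 120, 125, 118, 130, 54, 122, 51]
# Hypothetical daily mean temperature (C) for the same days.
temperatures = [14, 15, 13, 16, 18, 17, 19, 14, 18, 13]

labels, centroids = kmeans_1d(daily_mean_kw, k=2)

# The cluster label becomes one extra column next to the weather feature.
features = [[temp, label] for temp, label in zip(temperatures, labels)]

print("centroids:", centroids)
print("first rows of the feature matrix:", features[:4])
```

The insight worth writing down is what the clusters mean (for example, working days versus weekends) and whether adding the label actually reduces the regression error.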

In the fourth part (Feature Selection), most of you did all the analysis and chose the best features. Still, after doing the feature selection, many of you didn't analyze the outcome or ended up using all the features for the regression (with worse results). On the other hand, many of you engineered some interesting new features.

Finally, in the fifth part (Regression), most of you applied either all the methods or just one and tried to improve the results. In general, Random Forest and Extreme Gradient Boosting outperformed the other methods. The results are very good, with new best results for Civil and North Tower.

The notebooks and scripts are in general clear, well commented and easy to follow.

Congratulations.

Carlos Santos Silva