{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "-jlGXqM_t452" }, "source": [ "# P8 - Convolutional Neural Networks (CNNs)\n", "We have now learned about the perceptron, linear and logistic regression, the multi-layer perceptron and backpropagation, and auto-encoders. \n", "\n", "In this practical session on Convolutional Neural Networks (CNNs), we will use the MNIST dataset.\n", "\n", "First, we will obtain baselines using a Logistic Regression and a Feed-forward Neural Network." ] }, { "cell_type": "markdown", "metadata": { "id": "ITJR4snhxdT0" }, "source": [ "## 0.0 - Imports\n", "We first import the libraries used in this session: a plotting library ([matplotlib](https://matplotlib.org/)), a neural network package ([torch](https://pytorch.org/)), and helper packages for data handling ([sklearn](https://scikit-learn.org/), [numpy](https://numpy.org/))." ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "id": "MWGjU3tDw4bD" }, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from sklearn.base import BaseEstimator\n", "from sklearn.datasets import load_digits\n", "from sklearn.linear_model import LogisticRegression\n", "from sklearn.preprocessing import StandardScaler\n", "from sklearn.utils import check_random_state\n", "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "import torch.optim as optim\n", "from torchvision import datasets, transforms\n", "from torch.autograd import Variable\n", "from torch.utils.data import Dataset, DataLoader\n", "from torch.utils.data.sampler import SubsetRandomSampler\n", "import time\n", "import copy" ] }, { "cell_type": "markdown", "metadata": { "id": "W-od7M6WMN0N" }, "source": [ "Next, a few other variables need to be defined. 
This includes selecting the device used for computation (the GPU if one is available, otherwise the CPU):" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ECqewHJ0MM62", "outputId": "e5377940-a224-4e98-b427-bad0a9579863" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "cpu\n" ] } ], "source": [ "# Configure Device\n", "device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')\n", "print(device)" ] }, { "cell_type": "markdown", "metadata": { "id": "odY0Ng9yycgr" }, "source": [ "### 0.1 - Create Dataloaders\n", "#### MNIST dataset \n", "Using torchvision, we can easily download the MNIST dataset and use it to create our train and validation dataloaders." ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "id": "snFv-Hu-zRnW" }, "outputs": [], "source": [ "# Define transform - Convert data to tensor and normalize using dataset mean and std\n", "# mean and std are computed offline using the training dataset\n", "# transforms.Normalize expects one mean value and one std value per image channel\n", "mnist_transform = transforms.Compose(\n", "    [transforms.ToTensor(),\n", "     transforms.Normalize((0.1307,), (0.3081,))])\n", "\n", "# These random translations will be added at the end of this notebook; for now we skip them. 
\n", "#mnist_transform_test = transforms.Compose(\n", "#    [transforms.ToTensor(),\n", "#     transforms.RandomAffine(0, translate=[0.1, 0]),\n", "#     transforms.Normalize((0.1307,), (0.3081,))])\n", "\n", "# Download and create MNIST train and validation dataloaders\n", "mnist_train_dataset = datasets.MNIST('../data', download=True, train=True, transform=mnist_transform)\n", "mnist_val_dataset = datasets.MNIST('../data', download=True, train=False, transform=mnist_transform)\n", "#mnist_val_dataset = datasets.MNIST('../data', download=True, train=False, transform=mnist_transform_test)\n", "mnist_train_dataloader = DataLoader(mnist_train_dataset, batch_size=64, shuffle=True)\n", "mnist_val_dataloader = DataLoader(mnist_val_dataset, batch_size=64, shuffle=True)\n", "\n", "# MNIST Dataloaders to get data into numpy for Logistic Regression\n", "mnist_train_dataloader_numpy = DataLoader(mnist_train_dataset, batch_size=len(mnist_train_dataset))\n", "mnist_val_dataloader_numpy = DataLoader(mnist_val_dataset, batch_size=len(mnist_val_dataset))\n", "X_y_train = next(iter(mnist_train_dataloader_numpy))\n", "X_y_val = next(iter(mnist_val_dataloader_numpy))\n", "X_train = X_y_train[0].numpy()\n", "y_train = X_y_train[1].numpy()\n", "X_val = X_y_val[0].numpy()\n", "y_val = X_y_val[1].numpy()\n", "\n", "dataloaders = dict(train=mnist_train_dataloader, val=mnist_val_dataloader)\n" ] }, { "cell_type": "markdown", "metadata": { "id": "vJANJSJ0hsqd" }, "source": [ "We can check properties of the MNIST dataset such as:\n", "\n", "- shape of the train and validation datasets - \\[number of samples, height, width\\]\n", "- number of input features in the flattened/reshaped input for Logistic Regression or MLP\n", "- shape of the train and validation batches - \\[batch size, number of channels, height, width\\]" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "KuluBqnnCbn5", "outputId": "fa915c87-a262-4a70-ea87-d55cf5985317" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Datasets shapes: {'train': torch.Size([60000, 28, 28]), 'val': torch.Size([10000, 28, 28])}\n", "N input features: 784 Output classes: 10\n", "Train batch: torch.Size([64, 1, 28, 28]) torch.Size([64])\n", "Val batch: torch.Size([64, 1, 28, 28]) torch.Size([64])\n" ] } ], "source": [ "# get batch to extract properties and plot example images\n", "# next(enumerate(dataloader)) -> creates an iterator over the dataloader and gets the next batch\n", "batch_idx, (example_imgs, example_targets) = next(enumerate(mnist_train_dataloader))\n", "# info about the dataset\n", "D_in = np.prod(example_imgs.shape[1:])\n", "D_out = len(mnist_train_dataloader.dataset.targets.unique())\n", "print(\"Datasets shapes:\", {x: dataloaders[x].dataset.data.shape for x in ['train', 'val']})\n", "print(\"N input features:\", D_in, \"Output classes:\", D_out)\n", "print(\"Train batch:\", example_imgs.shape, example_targets.shape)\n", "batch_idx, (example_imgs, example_targets) = next(enumerate(mnist_val_dataloader))\n", "print(\"Val batch:\", example_imgs.shape, example_targets.shape)" ] }, { "cell_type": "markdown", "metadata": { "id": "JnFAmoinjY1T" }, "source": [ "We can plot some examples with their corresponding labels using the following function. The function can optionally receive predicted labels as well." 
] }, { "cell_type": "code", "execution_count": 34, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 284 }, "id": "5ZWvjQOvC2ep", "outputId": "c77ced2a-931a-4fb1-db71-5354316f0e6d" }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZoAAAELCAYAAADgPECFAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/Il7ecAAAACXBIWXMAAAsTAAALEwEAmpwYAAAmJ0lEQVR4nO3dedxV0/4H8M+3QfNcNNwGDboNqG5Kl5QhKrqUMqVEKD8zt7olicrQjQz5xS8y/IiSWa+QMaEfFSIq0UBz6qJZWr8/9n6WtZbnnGefYZ2hPu/X63m1vmftYT279Zx19lrrrC1KKRAREflSLNsFICKiAxsbGiIi8ooNDRERecWGhoiIvGJDQ0REXrGhISIir9jQJEhEHheRsdkuB+UfEXlPRC7Ndjko/+R73Ynb0IjIduNnv4jsMuK+mSqkiJwuIvNE5D8iskFEHhGRCnG2X2WUdWPYOJTPVHmNcgwQESUiQ53XfxSRzpkuTyblSt0Jy3KBiKwWkR0i8pKIVI2zrQq32y4ia0XkHhEpnsnyhuUYHZblHOO1EuFrDTJdnkzKlbojIp3D85vluSjO9qw7McRtaJRS5Qt+AKwB0MN47emC7USkhOdyVgIwFkBtAM0A1AHw7yL26RGWuw2AtgBGuhtkoNwAsBXA0HgN44EoV+qOiLQA8DCAfgAOA7ATwH8XsdvRYblPBnABgMsKOW6m6s6t2XizyqZcqTuhdWZ5lFJPFLE9604hkuo6C1v6H0VkmIhsAPBY+Ol9nrOdEpHGYbqUiEwQkTXhXcZDIlImyvmUUtOUUq8rpXYqpbYBmALguIj7rgUwG0BLo0xXisi3AL4NXztDRD4P75g+EpGjjN+htYgsEpFfRWQ6gNJRzmv4BsDHAG4oLDO8LveKyLrw514RKRXmFVznG0Vkk4isF5GLnX2TuqbZkum6A6AvgFeVUnOVUtsB3AygV5SGXym1FMAHAFqKSIOwTANFZA2Ad8KyXSIi34jINhF5Q0TqG79DFxFZKiI/i8gkABKxzAVeB7AXwIWFZYpIJRF5UkQ2S3DHNlJEioV5AyToBZgQlm2liHRz9n00rFNrRWRsrrwpxZKFupM01h1bKmM0NQFUBVAfwOURtr8TwBEAWgFojOCuZFRBZvgmf3zEc58AYEmUDUWkLoDuAD4zXj4LQHsAzUWkNYCpAAYBqIbg0+8rYQU9BMBLAP4Xwe/6HICzneNHKffNAK6TwrtsbgJwLILrcjSAdrDvvmoiuKOrA2AggAdFpEqYF/ea5rBM1p0WAL4oCJRS3yH4AzyiqJOKSHMAHWHXnU4I7qpPE5EzAYwA0AtADQRvLM+E+1YH8AKC/8vqAL6D8eFIROqF5a4XpwgKQd25RURKFpL/AIK60TAsV38AFxv57QEsC88/HsCjIlLwhvU4gH0IrmdrAKcCyIcxgEy/7xwaNlArRWSiiJSLUkjWHbc0SkX6AbAKwClhujOCP9bSRv4AAPOcfVRYGAGwA0AjI68DgJVRz2/s1wXANgBHFFHW7QD+A2A1gq6SMkaZTjK2nQxgjLP/svDinwBgHQAx8j4CMDZiWfU1ATADwF1h+kcAncP0dwC6G/ucBmCVcZ13AShh5G9C0DCl7Zr6/slm3QHwNoDBzmtrC65/IdsrAL+Edew7BF22xQA0CPMaGtvOBjDQiIsh6Jqrj+APd76RJ+H/+6URyz0awFNh+v8AXAGgRFiGBgCKh9exubHPIADvGdd0hZFXNty3JoIux
D0FfxNh/vkA3s12XcmxulMTQPPw//VwAHMBPBxne9adGD+p9BVuVkrtjrhtjbCwC/9oFCHhLxyZiBwLYBqA3kqp5UVsfpZS6q0YeT8Y6foALhKRq43XDkEwHqQArFXh1QytTqTMhlEAPhGRe5zXazvHXB2+VuAnpdQ+I94JoDzSdE2zJJN1ZzuAis5rFQH8GmefNkqpFeYLxrndunOfiNxtborgU3Ntc1ullBIRc99EjATwGII76wLVAZTEn+tOHSPeYJx/Z/g7lEdwR1ASwHrj9yoG+3fLVRmrO0qpDfjjGq6UYFLPawjelGNh3SlEKg2Nu+zzDgT/qQAAEalp5G1B8Mm8hQrGTBIWdnG9AuASpdTbyRzDYJb9BwDjlFLjCjlnJwB1RESMxqYegk8riZ1QqaUi8gKCrjLTOgSVrqArsF74WlFSvqZZlMm6swRBl2TBsRsCKAWgqA8qsRRWd552NxKRJgDqGrGYcUInVGqOiKwA8F/Gy1sA/Iag7nwdvlYPwd1aUX5A8Km0uvMhJh9k9H2nkHOnMtxw0NaddH6P5gsALUSklYiURnD7BgBQSu1HMIA/UUQOBQARqSMip0U5sIi0RDC4dbVS6tU0lhlhuQaLSHsJlJNgOnUFBIP4+wBcIyIlRaQXgjGUZN2KoB+0svHaMwBGikiNsG92FICnijpQqtc0x3irOwCeBtBDRDqG/eu3AXhBKRXvjiaqhwAMl2BmW8EgaZ8wb1b4O/WSYJbRNQi6HpJ1EwA9TV4p9TuC7thxIlJBgoHkGxCt7qwH8CaAu0WkoogUE5FG4QerfOPzfedEEakfvi/URTDe83Kayn1Q1Z20NTRhV9ZtAN5CMJtrnrPJMAArAMwXkV/C7ZoWZEow97xjjMPfiOA2+FH5Yz57pMkAEcq9AMEUxEkI+lZXIOijhFJqL4LBugEIpguei2CQTiui3O65ViK4hTUHFMcCWABgMYAvASwKX4si7jXNFz7rjlJqCYDBCBqcTQAqwP50l0q5XwRwF4Bnw3J9BaBbmLcFQB8Eb04/AWgC4EOjzPXCcscb0DXP9SGAT5yXr0bwif57BNdsGoKJLVH0R9BF/DWCej8TQK2I++YMz+87rRGMye4I//0SwZt+Osp9UNUdsYcfiIiI0otL0BARkVdsaIiIyCs2NERE5BUbGiIi8ooNDREReZXwFzZFhNPUcpBSKtFF9zKK9SZnbVFK1ch2IeJh3clZkesO72iIDm7JLqlEFLnusKEhIiKv2NAQEZFXbGiIiMgrNjREROQVGxoiIvKKDQ0REXnFhoaIiLxiQ0NERF6xoSEiIq/Y0BARkVdsaIiIyCs2NERE5FXCqzfngrZt2+p0t27dIu/317/+1YovuOCCmNvedtttVnzLLbdEPg8d2CpUqGDF5cqVi7zv5s2bdfr3339PW5mIchnvaIiIyCs2NERE5JUoldgzhbLxEKIZM2ZYcc+ePXW6ePHiXs7pXpfVq+1HL4wZM0anH3vsMS9lSAQffJaahg0bWnH79u11umvXrlZemzZtrLhZs2aRz1O7dm2d3rRpUyJF9GWhUqpt0ZtlT67XnYNY5LrDOxoiIvKKDQ0REXnFhoaIiLzKyTEad0ymd+/evk+ZsLVr1+r0iSeeaOWtWLEi08XhGE0EpUqV0uk77rjDyrvwwgutuGrVqjGPI2Jf6kT+hu6++26dnjx5spW3atWqyMdJI47RZFHp0qV1evfu3VZe2bJlY8bHHHOMlWeOIbv1s27dula8YMGCyOXbsmVLvGyO0RARUW5gQ0NERF7lZNfZnj17rLhkyZK+T5mSc845x4pnzpyZ8TKw66xoV111lU7fe++9SR8nla4z08aNG634vvvu0+nx48cndcwksOssg8477zwrHjJkiE7/8MMPVl6jRo2suHnz5pHOka76CRT59RF2nRERUW5gQ0NERF6xoSEiIq9yZvXm66+/XqdTGZNxp0Zv375dpydNmmTluX3k5jIk7nFq1
aqVdJkoO6ZOnWrFF110kU6n0m89YcKEmHnuVPz69evH3LZmzZpWbK4Y7q7sbE6LpuwyV48H/rwE1a233qrTvXr1svL69OljxcWK/fFZv3Xr1lZevDr67bffWrFZXxIZo3GPM2vWrJjbpoJ3NERE5BUbGiIi8ooNDREReZUzYzQrV67U6f3791t5Zj8mACxfvlyn3T7xpUuXWvG+ffsil2H9+vU6vWzZMiuPYzS5yVxW5uGHH7by+vXrZ8VmPXLr2N69e6140KBBOv3kk09GLs+wYcOsuEWLFlZ888036/Tpp59u5ZlLjLjfo5kzZ44VL168OHKZKHWVK1fW6YkTJ1p57v9xnTp1dLpatWpW3sKFC634yy+/1OkPP/zQyjPf51yffvqpFf/2228xt80FvKMhIiKv2NAQEZFXOdN19tJLL+n0F198YeW50/7eeecdnf7qq6+8lotyS5kyZaz4/vvv12l3BWZ3WqfZXbZz504r75prrrHiRLrL4lmyZIkVm0uQuEsXTZs2LeZxzCe6AkD//v11+ueff06liBSB2UV/3HHHWXlul5e5lJCZPpjxjoaIiLxiQ0NERF6xoSEiIq9yZozG1LVrVyt+8MEHrfiee+7xct4aNWrodPXq1eNu+8orr+i0O/WU/OnSpYsVX3zxxZH3Nacwu2Mw7jIivjRo0ECn77zzzsj7uVOhmzZtqtOffPJJyuWi+Bo3bqzT7thfiRL226i5tL67lNDBinc0RETkFRsaIiLyig0NERF5lZNjNJs3b7Zi9/sGvphjQS1btrTy3O8qmP3r/B6DXyeccIJOu0v/J8J8fPPw4cNTKVLSmjVrptM7duxI+jjmUjY9evRIqUxUNPcxy6b27dtb8bhx43T6X//6l7cy5RPe0RARkVdsaIiIyCtJ9EmDIpL8owlzzLHHHmvFb7zxhk5XqFDByluwYIEVt2vXzl/BkqCUkqK3yp5U6s27776r0x07doy83+TJk63Y7MZIpdsqXRJZgiYed3ptghYqpdoWvVn25MJ7TtWqVXX6rbfesvJatWplxeaUZvfJqDfddFPMbfNQ5LrDOxoiIvKKDQ0REXnFhoaIiLzKyenNmXL00UdbsTsuY1qzZo3v4lAM5viYOdW5KGPHjrXiXBiXMb333ntWvGfPHp0uXbp03H0nTJjgo0gUw9atW3W6W7duVt77779vxU2aNNHpIUOGWHnueM706dN1OlPLIGUD72iIiMgrNjREROQVGxoiIvLqoBqjMR/HCgB33XVXzG23b99uxewTzx7zu17xvvflPlI315cG2rRpkxXPnDlTp/v27Zvp4lBEGzdutOJOnTpZ8ciRI3X6yiuvtPJOPfXUmLH7KIh+/fpZ8a5duxIvbI7gHQ0REXnFhoaIiLw64LvOKleurNO33367lVexYsWY+w0dOtSK58+fn9ZyUfqtXLnSinfv3p2lktDBxO1Ku+6663R66dKlVt6IESOsuGbNmjrdq1cvK+/rr7+2YnO6vvm02HzAOxoiIvKKDQ0REXnFhoaIiLw64MZoevbsacXmWEvjxo0jH2fZsmVpKxNlRsOGDa24TJkyVpxr00Pd5XTc6a2Un8yl/82n9gLAE088YcUff/yxTrdo0cLKcx8pYE6HnzRpUsrlzCTe0RARkVdsaIiIyCs2NERE5FVejtHUr19fp83H8wLAhRdeaMXlypWLfNwlS5botDuHPV2KFy9uxXXr1tXpVatWeTnnweLvf/+7FZ900klWPGvWrEwWp0i33nqrFZvf+XK59dF9RDDlB3dpq+eff16n3TEal1mfOUZDRERkYENDRERe5UXXWZcuXaz4lVde0elSpUql7TxVq1bV6cGDB1t5jzzySOTjVKlSRaevv/56K899cqK5orS7WuuMGTMin/NAZq6c7a6GG+9JlGY9AYC5c+fq9OzZs628KVOmWPG2bdsil69SpUo67U6pbtasWcwylS9f3srbv39/zPOfd955Vuyu/Ez5y
Xx67IGMdzREROQVGxoiIvKKDQ0REXkl8Z5YWOgOIontkAbmstsAcM8992S6CBnhLv0db/zBpZSSdJcnndJVb9yxiqeeeiqRMui0W+/Xr19vxe+8806k4wBAhw4ddPrwww9PqjwAsGbNGp12p2q75UujhUqptr4Ong6+3nM2bNig0+746Jw5c5I+7l/+8hed7tixo5Xnfh3jyCOP1Gm3PrjvByeffLJOz5s3L+nypVHkusM7GiIi8ooNDREReZUX05uJCrjf7jenoZ999tlWXqdOnaw43lT4WrVqWXHfvn1jbut2cSTa/Vxg+fLlVjx+/Hid9thVRqHVq1frtFuvVqxYEXO/b7/91ord+tCuXTudrlGjRtwymHXnu+++s/LOPfdcK160aFHcY+Uy3tEQEZFXbGiIiMgrNjRERORVXozRmE+sA+xpf4cccoiVZy7j4cYlSuTer/vrr7/q9OjRo7NXkDxhXi/AXhrIXSaoefPmVlyxYkWddpcYSkS8MRp3Be6ZM2fGPM7GjRutePPmzUmXiRLXrVs3nXa/MmHWFXfbpk2bWnmJjNmZK8QDwP3336/Tb775ppVnTnfPd7yjISIir9jQEBGRV2xoiIjIq7xYgiaea6+91ordPnJzznvPnj2tvCZNmljx8ccfr9Nuv2siS4uYS7y7fb87d+604okTJ0Y+bjwHyxI0lHYH7RI0iTjttNN0+qijjrLyPvjgg5j7ud+5cf/+d+3alYbSZQ2XoCEiotzAhoaIiLzK+64zCrDrjJLErjNKFrvOiIgoN7ChISIir9jQEBGRV2xoiIjIKzY0RETkFRsaIiLyig0NERF5xYaGiIi8YkNDREResaEhIiKv2NAQEZFXbGiIiMgrNjREROQVGxoiIvKqRBL7bAGwOt0FoZTUz3YBImC9yU2sO5SsyHUn4efREBERJYJdZ0RE5BUbGiIi8ooNDRERecWGhoiIvGJDQ0REXrGhISIir9jQEBGRV2xoiIjIKzY0RETkFRsaIiLyig0NERF5xYaGiIi8YkNDREResaFJkIg8LiJjs10Oyj+sO5QsEXlPRC7NdjmSFbehEZHtxs9+EdllxH0zVciwLDVEZJqI/Cwi20Tk6TjbrjLKujH8Ay+fyfKG5RggIkpEhjqv/yginTNdnkxi3UkN607O1J0LRGS1iOwQkZdEpGqcbVW43XYRWSsi94hI8UyWNyzH6LAs5xivlQhfa5Dp8gBFNDRKqfIFPwDWAOhhvKb/WEUkmQeoJeoFABsA1ANwKIAJRWzfIyx3GwBtAYx0N8hQubcCGCoiFTJwrpzBupMWrDtZrDsi0gLAwwD6ATgMwE4A/13EbkeH5T4ZwAUALivkuJmqO7dmo6ErTFJdZyLSOfxkNUxENgB4LPwENs/ZTolI4zBdSkQmiMia8JPiQyJSJuL5TgVQF8AQpdTPSqnflFKfRdlXKbUWwGwALY0yXSki3wL4NnztDBH5XET+IyIfichRxrlbi8giEflVRKYDKB3lvIZvAHwM4IYYv1spEblXRNaFP/eKSKkwr+A63ygim0RkvYhc7Oyb1DXNFtadhLDuGDJddwD0BfCqUmquUmo7gJsB9IrS8CullgL4AEBLEWkQlmmgiKwB8E5YtktE5BsJ7rLfEBH9xEoR6SIiSyW4C58EQCKWucDrAPYCuLCwTBGpJCJPishmCe7YRopIsTBvgIjMC6/bNhFZKSLdnH0fDevUWhEZK0U0aKmM0dQEUBXB4zwvj7D9nQCOANAKQGMAdQCMKsgM/1CPj7HvsQCWAXhCRH4SkU9FpFOUQopIXQDdAZhvLmcBaA+guYi0BjAVwCAA1RB8gnklrKCHAHgJwP+Gv+tzAM52jh+v3AVuBnCdFH7bfVP4+7UCcDSAdrA/QdcEUAnB9RoI4EERqRLmxb2mOYx1p+hyF2DdsWWy7rQA8EVBoJT6DsGb9xFFnVREmgPoCLvudALQDMBpInImgBEAegGogaBReibctzqCu
/CRAKoD+A7Accax64XlrhenCApB3blFREoWkv8AgrrRMCxXfwAXG/ntEfzdVAcwHsCjIlLQ2D0OYB+C69kawKkA4o8fKaUi/QBYBeCUMN0ZwQUvbeQPADDP2UeFhREAOwA0MvI6AFgZ8dz/Ex5rIICSAM4D8B8A1eOUdXu4zWoEt7tljDKdZGw7GcAYZ/9l4cU/AcA6hI+8DvM+AjA2Yrn1NQEwA8BdYfpHAJ3D9HcAuhv7nAZglXGddwEoYeRvQvDmktI1zeQP6w7rTp7WnbcBDHZeW1tw/QvZXgH4BcC28P9mLIIP8w3CvIbGtrMBDDTiYgi65uojeNOfb+RJ+P9+acRyjwbwVJj+PwBXACgRlqEBgOLhdWxu7DMIwHvGNV1h5JUN962JoAtxT8HfRJh/PoB345Uplb7CzUqp3RG3rREWduEfjSIk/IWj2IXgD+jRMH5WRG5C0Mq/HGOfs5RSb8XI+8FI1wdwkYhcbbx2CIDaCC7uWhVezdDqiGV2jQLwiYjc47xe2znm6vC1Aj8ppfYZ8U4A5ZH6Nc0m1p3EsO78IZN1ZzuAis5rFQH8GmefNkqpFeYLxrndunOfiNxtborgjqu2ua1SSomIuW8iRgJ4DMGddYHqCD50uXWnjhFvMM6/M/wdyiO4mywJYL3xexWD/bv9SSpdZ8qJdyD4TwUAiEhNI28Lgj/4FkqpyuFPJRUMmkWxuJDzuXEizH1/ADDOKFdlpVRZpdQzANYDqGPcMgLBgHLiJwz6bF9A0N1hWoeg0pnHXxfhkKle02xi3UnkhKw7pkzWnSUIuiQLjt0QQCkAy5Mq+Z/rziCn7pRRSn2EoO7UNc4rZpzQCZWaA2AFgP8yXt4C4Df8ue6sjXDIHxDc0VQ3yl1RKdUi3k7p/B7NFwBaiEgrESmN4PYNAKCU2g9gCoCJInIoAIhIHRE5LeKxXwRQRUQuEpHiItIbwF8AfJiGck8BMFhE2kugnIicHg74fYygL/IaESkpIr0Q9IMn61YE/aCVjdeeATBSgim41RF8en2qqAOl4ZrmEtadorHuFM5n3XkaQA8R6Sgi5QDcBuAFpVS8O5qoHgIwXIKZbQUD7H3CvFnh79RLghlq1yDotkrWTQD0NHml1O8IumPHiUgFCSYh3IBodWc9gDcB3C0iFUWkmIg0KmrcM20NjVJqOYL/iLcQzMiZ52wyDEHLOl9Efgm3a1qQKcHc844xjr0VwD8A/BPAzwD+BeBMpdSWNJR7AYIpiJMQ9K2uQNBHCaXUXgSDdQMQTBc8F8EnSy1euQs510oEt7DljJfHAliA4JP3lwAWha9FEfea5gvWnUjnYt0phOe6swTAYAQNziYAFWDfGaRS7hcB3IWgK/cXAF8B6BbmbQHQB8FEhp8ANIHxwSicDLC9iMkA5rk+BPCJ8/LVCO4Gv0dwzaYhmNgSRX8EXcRfI6j3MwHUireD2F3IRERE6cUlaIiIyCs2NERE5BUbGiIi8ooNDRERecWGhoiIvEp4ZQAR4TS1HKSUSnTRvYxivclZW5RSNbJdiHhYd3JW5LrDOxqig1uyy+IQRa47bGiIiMgrNjREROQVGxoiIvKKDQ0REXnFhoaIiLxiQ0NERF6xoSEiIq/Y0BARkVcJrwyQb1q1aqXTY8aMsfK6d+9uxTt37tTpTp3sB8YtWrQo/YWjnHXIIYfo9DXXXGPljRw50ornzp2r0+eff76Vt2PHDg+lI8ovvKMhIiKv2NAQEZFXbGiIiMgrUSqxhVHzbSXV119/XadPOeWUuNtu2bJFp+fMmWPl9evXL70FSzOu3pya4sWLW/HFF1+s0w8//HDcfUX+uPRnnnmmlffqq6+moXReLVRKtc12IeLJ9brjqlatmk6XLFnSyvv111+tOM/H8CLXHd7REBGRV2xoiIjIqwNuevOJJ55oxW3atIm57YQJE6x46tSpOl21atX0FoxyWtu2dg9AvO6yJUuWWPH48eN1+v33309vwSjnlClTx
oq7dOlixeb7SOXKla28Dz/80Iovu+wynV6+fHmaSph7eEdDREResaEhIiKv2NAQEZFXeT+92ZxKCADLli2zYrOP9LXXXrPyevfubcX79u1Lb+EyiNObE3Psscda8ezZs624YsWKMfdt2rSpFa9YsSJ9Bcs8Tm+OoFSpUjo9adIkK8+cCl8Ucyo8YE93HjJkiJX37LPPxtw2R3B6MxER5QY2NERE5FXeT2/u0KGDFbvTCU133nmnFedzVxmlZuLEiVYcr6vs66+/tmJzlW86MPXp08eKR40apdPNmjWz8t5++20rNleDuOOOO6y8smXLWnH58uV1evLkyVZe165drfj222/X6YULF8Ysey7iHQ0REXnFhoaIiLxiQ0NERF7l/RiN+yRMd/rgSy+9pNPz58/PRJEoR5199tk63aJFi7jbfv/99zp93HHHWXm//PJLegtGWeeO395www1WbK7u/fjjj1t5V1xxhRXv3btXp92lrMqVK2fFNWrU0On+/ftbee5K4OZSN8OGDbPy3PGdXMM7GiIi8ooNDRERecWGhoiIvMrLJWgOPfRQnXbnsLtz3Lt166bT7lMzDyRcgqbQc1rx9OnTddocrwH+/N0Ys97MmzfPQ+lyxkG7BM2YMWN0esSIEXG33bRpk06ffPLJVp77PatkNWjQwIpvvvlmKzbHcEqUsIfXzeW1brzxRivP4+MHuAQNERHlBjY0RETkVV5ObzZvIZs3b27luSuc/vTTTxkpE+Ues4sV+HN3menee++14gO8u+ygVK9ePSs+6aSTdNodQti8ebMVm91l6eoqc61atcqKBw4caMVm993QoUOtPLOrd/369Vbetddea8W7du1KpZhJ4R0NERF5xYaGiIi8YkNDRERe5eUYjTuF2WQuHQIAixYt8l0cIspR5pMxH3nkESuvffv2Ou2Oa/To0cOKfY3LJGL48OE6/dlnn1l506ZN0+lLLrnEynN/t1tuucVD6eLjHQ0REXnFhoaIiLxiQ0NERF7l5RiNOWfc9dBDD2WwJESUy6pUqaLT5vdmXO5SVp9//rmvIqXFiy++aMX//ve/dXrIkCFW3pVXXmnF999/v05n6nuGvKMhIiKv2NAQEZFXedl1Zq7KW6yY3Va60xIbN26s0+606O7du1uxeaz9+/dbeatXr7Zic+XXJ5980sr7/fffY5adssddzTldWrVqpdPuirs9e/aMud+GDRusuGvXrjq9ePHi9BTuIPfoo4/GzDP/ps2/53zw22+/WfGoUaN02py2DQAnnHCCFZtPC80U3tEQEZFXbGiIiMgrNjRERORVXo7RmEt6u2Mp7tTneFOh3aXBv/rqK512x3PcJcanTJmi09WrV7fyzKmGlDsSeZps+fLlddrt8x4wYIAV9+3bN+Y54p3zsMMOs+Jhw4YVekxKnjk+4Y7RmbGv8btMMcds3PEb93fr3LmzTs+YMcNruQrwjoaIiLxiQ0NERF6xoSEiIq8kkX5rABCRxHbwYN26dTrt9nO7j3L++OOPddr9vsuWLVuseO7cuTrtzj2//PLLrTje9yPOP/98nX7uuedibpdOSqmc7mTORr1x64ZZb1xuXfj55591ulGjRnHPY/aBr1271sqbPXu2FV900UU6XaKEPURqntP8/hcAbN26NW4ZUrBQKdXW18HTIZW6c+KJJ+r0nDlzYm63cuVKKzaXaQGABx54INkiZNzrr79uxaeccooVf/PNNzp95JFHpnKqyHWHdzREROQVGxoiIvIqL6c3P/HEEzo9dOhQK2/69OlWPGjQoKTO4d5mz58/34pbtmyp002aNLHy6tevn9Q5Kb02bdpkxTNnztTp3r17W3nuFHU3Nrkr3prdqAsXLrTydu/ebcWvvfaaTj/77LNWXqVKlXTa7Vaj5JirMLt/03/72990+vDDD7fyJk6caMVmF9z48eOtPPe9Idvc5bJc8Z5Q7AvvaIiIyCs2NERE5BUbGiIi8iovO4LjPRXumGOO8
XJOd9r0vHnzdNodo6Hc4E7dd6ceRzVu3Dgrdvvot2/fHvlY5nTnDz74wMpr3bq1TrvLiFBytm3bptPuclTm+K471lu5cmUrPvPMM3W6S5cuVp77dM7nn39ep81xQQDYtWtXhFKnxjw/AAwcOND7OYvCOxoiIvKKDQ0REXnFhoaIiLzKyzGaHTt26LT7KOeSJUtacalSpXR6z549SZ/TfFwvAPzjH//Q6XxfYvxgYT7u1u1nb968ecz9zLGTVJlL0LhLg5jfqzHHFsgPc6zNfeTz1VdfbcXmI7rLlClj5Z1xxhkxY/eRIe62CxYsSKDE+Yt3NERE5BUbGiIi8iovu84mT56s0+3atbPy+vXrZ8XmqqvXXnutlRdvqqH7RM0HH3zQiqtVq6bT7jTazZs3xzwuZY85DdldjsRdlsPsDu3evbuV9+KLL1rxe++9p9PuE1/NpYoA4LzzzotZPk5pzh73KxOjR4+OGbvTpM866ywrvvTSS3W6Ro0aVt6nn35qxRs3btRpcwo1ACxatMiK49UPc8hg+PDhVp7btb9z586Yx/GFdzREROQVGxoiIvKKDQ0REXmVl0/YNFWpUsWKFy9ebMW1atXS6alTp1p57vIQ5cqV02n3CXvmcQBg/fr1Ov3YY49ZeeY02kzhEzZT8/TTT1txvLEUlznF3h2jiWfKlClWbNYb9xEHHh3QT9jMBPe94aqrrtLp/v37W3m1a9e24njvv7NmzbLit956S6fdsZ4RI0botDum6DLHcNzp1wniEzaJiCg3sKEhIiKv8r7rzNWmTRsrfvnll3XavcV1mdMA3evirtBq3n660xCzgV1nqWnQoIEVm6uA33jjjTHzgPj1Zs2aNVZsPvH1iy++sPLMqa4ZxK4zj9x6dfTRR1vxP//5T53u0KFD5OO6U5bNeudO1X7qqaes2Hzv2rt3b+RzFoJdZ0RElBvY0BARkVdsaIiIyKsDbozGZa66PGbMGCvPXUpi7ty5Om0+CREA7rvvPitOsW8z7ThG40/ZsmWt+JlnnrHiSpUq6fRnn31m5T300ENWvGzZsjSXLmUco8kicyXoRo0aWXnmSt8AMHjwYJ126+TWrVt1uk+fPlaeuURSmnGMhoiIcgMbGiIi8ooNDREReXXAj9EcLDhGQ0niGE2eMB854D5Z2HyEgDle4xnHaIiIKDewoSEiIq/y8gmbREQHm3x+ci/vaIiIyCs2NERE5BUbGiIi8ooNDRERecWGhoiIvGJDQ0REXrGhISIir9jQEBGRV2xoiIjIKzY0RETkFRsaIiLyig0NERF5xYaGiIi8YkNDREReJfOYgC0AVqe7IJSS+tkuQASsN7mJdYeSFbnuJPwoZyIiokSw64yIiLxiQ0NERF6xoSEiIq/Y0BARkVdsaIiIyCs2NERE5BUbGiIi8ooNDRERecWGhoiIvPp/OxYtLcz5y6wAAAAASUVORK5CYII=", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "def plot_img_label_prediction(imgs, y_true, y_pred=None, shape=(2, 3)):\n", " y_pred = [None] * len(y_true) if y_pred is None else y_pred\n", " fig = plt.figure()\n", " for i in range(np.prod(shape)):\n", " plt.subplot(*shape, i+1)\n", " plt.tight_layout()\n", " plt.imshow(imgs[i][0], cmap='gray', interpolation='none')\n", " plt.title(\"True: {} Pred: {}\".format(y_true[i], y_pred[i]))\n", " plt.xticks([])\n", " plt.yticks([])\n", "\n", "plot_img_label_prediction(imgs=example_imgs, y_true=example_targets, y_pred=None, shape=(2, 3))\n" ] }, { "cell_type": "markdown", "metadata": { "id": "Mj3utDDuzDCj" }, "source": [ "### 1.1 Logistic Regression\n", "\n", "We can use a very simple Logistic Regression that receives our input images as a vector and predicts the digit. This will be our first baseline to compare with the CNNs." ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "TniyY4bQzBMS", "outputId": "54a7e07e-3078-4a71-95f6-5670a051b4b2" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test score with penalty: 0.9002\n" ] } ], "source": [ "scaler = StandardScaler()\n", "X_train = scaler.fit_transform(np.reshape(X_train, (X_train.shape[0], -1)))\n", "X_val = scaler.transform(np.reshape(X_val, (X_val.shape[0], -1)))\n", "\n", "clf = LogisticRegression(C=50., multi_class='multinomial', solver='sag', tol=0.1)\n", "clf.fit(X_train, y_train)\n", "score = clf.score(X_val, y_val)\n", "\n", "print(\"Test score with penalty: %.4f\" % score)" ] }, { "cell_type": "markdown", "metadata": { "id": "A8rylkCnrwIy" }, "source": [ "We can select the coefficients for each class and reshape them into the image shape to plot them. This allows us to visualize what are the pixels that are contributing more to the classification for each of the digits. \n", "\n", "But what happens if the digits are not centered? 
Will we still get such good performance? Let's test that out later!" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 339 }, "id": "2pucfjpaDF9_", "outputId": "3d370f1b-27ce-4a5a-a05a-62e25f560876" }, "outputs": [ { "data": { "image/png": "", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "coef = clf.coef_.copy()\n", "plt.figure(figsize=(10, 5))\n", "scale = np.abs(coef).max()\n", "for i in range(10):\n", "    l1_plot = plt.subplot(2, 5, i + 1)\n", "    l1_plot.imshow(coef[i].reshape(28, 28), interpolation='nearest',\n", "                   cmap=plt.cm.RdBu, vmin=-scale, vmax=scale)\n", "    l1_plot.set_xticks(())\n", "    l1_plot.set_yticks(())\n", "    l1_plot.set_xlabel('Class %i' % i)\n", "plt.suptitle('Classification coefficient vectors for...')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "kK8v34AK6xXJ" }, "source": [ "### 1.2 Feed-Forward Neural Network\n", "\n", "The first step is to create the functions that will allow us to implement a feed-forward neural network and manage the training and validation process.\n", "\n", "The MLP class will define the architecture of a feed-forward neural network: a set of hidden layers (fully connected layers, [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)) with an activation function in between them ([relu](https://pytorch.org/docs/stable/generated/torch.nn.functional.relu.html#torch.nn.functional.relu)), and a [log-softmax](https://pytorch.org/docs/stable/generated/torch.nn.functional.log_softmax.html#torch.nn.functional.log_softmax) in the last layer. Since the dataset poses a multiclass classification problem, the last layer should have a number of neurons equal to the number of classes." 
] }, { "cell_type": "code", "execution_count": 37, "metadata": { "id": "In9r_o8vvNaz" }, "outputs": [], "source": [ "class MLP(nn.Module):\n", "    def __init__(self, dim_layers):\n", "        super(MLP, self).__init__()\n", "        self.dim_layers = dim_layers\n", "        layer_list = [nn.Linear(dim_layers[l], dim_layers[l+1]) for l in range(len(dim_layers) - 1)]\n", "        self.lin_layers = nn.ModuleList(layer_list)\n", "\n", "    def forward(self, X):\n", "        X = X.view(-1, self.dim_layers[0])\n", "        # apply ReLU after each hidden layer\n", "        for layer in self.lin_layers[:-1]:\n", "            X = F.relu(layer(X))\n", "        # use log-softmax for the output layer\n", "        return F.log_softmax(self.lin_layers[-1](X), dim=1)" ] }, { "cell_type": "markdown", "metadata": { "id": "h6OVD_1xUwWH" }, "source": [ "##### Training and validation function for the MLP and CNN" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "id": "B1eUu01N8wIR" }, "outputs": [], "source": [ "def train_val_model(model, criterion, optimizer, dataloaders, num_epochs=25,\n", "                    scheduler=None, log_interval=None):\n", "    since = time.time()\n", "\n", "    best_model_wts = copy.deepcopy(model.state_dict())\n", "    best_acc = 0.0\n", "\n", "    # init dictionaries to save losses and accuracies of training and validation\n", "    losses, accuracies = dict(train=[], val=[]), dict(train=[], val=[])\n", "\n", "    for epoch in range(num_epochs):\n", "        if log_interval is not None and epoch % log_interval == 0:\n", "            print('Epoch {}/{}'.format(epoch, num_epochs - 1))\n", "            print('-' * 10)\n", "\n", "        # execute a training and validation phase for each epoch\n", "        for phase in ['train', 'val']:\n", "            if phase == 'train':\n", "                model.train()  # set model to train mode\n", "            else:\n", "                model.eval()  # set model to eval mode\n", "\n", "            running_loss = 0.0\n", "            running_corrects = 0\n", "\n", "            # iterate over the data\n", "            nsamples = 0\n", "            for inputs, labels in dataloaders[phase]:\n", "                inputs = inputs.to(device)\n", "                labels = labels.to(device)\n", "                nsamples += inputs.shape[0]\n", "\n", "                # set the parameter gradients to zero\n", "                optimizer.zero_grad()\n", "\n", "                with torch.set_grad_enabled(phase == 'train'):\n", "                    outputs = model(inputs)\n", "                    _, preds = torch.max(outputs, 1)\n", "                    loss = criterion(outputs, labels)\n", "\n", "                    # if in training phase, perform backward prop and optimize\n", "                    if phase == 'train':\n", "                        loss.backward()\n", "                        optimizer.step()\n", "\n", "                # increment loss and correct counts\n", "                running_loss += loss.item() * inputs.size(0)\n", "                running_corrects += torch.sum(preds == labels.data)\n", "\n", "            if scheduler is not None and phase == 'train':\n", "                scheduler.step()\n", "\n", "            epoch_loss = running_loss / nsamples\n", "            epoch_acc = running_corrects.double() / nsamples\n", "\n", "            losses[phase].append(epoch_loss)\n", "            accuracies[phase].append(epoch_acc)\n", "            if log_interval is not None and epoch % log_interval == 0:\n", "                print('{} Loss: {:.4f} Acc: {:.2f}%'.format(\n", "                    phase, epoch_loss, 100 * epoch_acc))\n", "\n", "            # deep copy the best model\n", "            if phase == 'val' and epoch_acc > best_acc:\n", "                best_acc = epoch_acc\n", "                best_model_wts = copy.deepcopy(model.state_dict())\n", "        if log_interval is not None and epoch % log_interval == 0:\n", "            print()\n", "\n", "    time_elapsed = time.time() - since\n", "    print('Training complete in {:.0f}m {:.0f}s'.format(\n", "        time_elapsed // 60, time_elapsed % 60))\n", "    print('Best val Acc: {:.2f}%'.format(100 * best_acc))\n", "\n", "    # load best model weights to return\n", "    model.load_state_dict(best_model_wts)\n", "\n", "    return model, losses, accuracies" ] }, { "cell_type": "markdown", "metadata": { "id": "0CBE5tRMZEfr" }, "source": [ "We will start by creating a simple network with a few hidden layers. In addition to the input, it will have 3 fully connected hidden layers followed by the output layer; in this implementation, the layer sizes are passed as a list to the MLP class. 
We will use the Stochastic Gradient Descent optimizer ([optim.SGD](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html)) with a learning rate of 0.01 and a momentum of 0.5. The loss function to be minimized will be the negative log-likelihood ([nn.NLLLoss](https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html)). Training and validation will be managed by the previously defined function `train_val_model`." ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 981 }, "id": "200WI3xND6_M", "outputId": "79913a00-abf0-4e48-8177-64b49bbd6fac" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 0/14\n", "----------\n", "train Loss: 0.8353 Acc: 75.38%\n", "val Loss: 0.3007 Acc: 91.05%\n", "\n", "Epoch 2/14\n", "----------\n", "train Loss: 0.1805 Acc: 94.77%\n", "val Loss: 0.1527 Acc: 95.51%\n", "\n", "Epoch 4/14\n", "----------\n", "train Loss: 0.1114 Acc: 96.74%\n", "val Loss: 0.1116 Acc: 96.67%\n", "\n", "Epoch 6/14\n", "----------\n", "train Loss: 0.0776 Acc: 97.72%\n", "val Loss: 0.0882 Acc: 97.21%\n", "\n", "Epoch 8/14\n", "----------\n", "train Loss: 0.0565 Acc: 98.35%\n", "val Loss: 0.0810 Acc: 97.32%\n", "\n", "Epoch 10/14\n", "----------\n", "train Loss: 0.0415 Acc: 98.81%\n", "val Loss: 0.0764 Acc: 97.46%\n", "\n", "Epoch 12/14\n", "----------\n", "train Loss: 0.0307 Acc: 99.20%\n", "val Loss: 0.0722 Acc: 97.79%\n", "\n", "Epoch 14/14\n", "----------\n", "train Loss: 0.0226 Acc: 99.44%\n", "val Loss: 0.0748 Acc: 97.62%\n", "\n", "Training complete in 2m 35s\n", "Best val Acc: 97.79%\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "model_mlp = MLP([D_in, 256, 128, 64, D_out]).to(device) # [D_in, 512, 256, 128, 64, D_out]\n", "\n", "optimizer = optim.SGD(model_mlp.parameters(), lr=0.01, momentum=0.5)\n", "criterion = nn.NLLLoss()\n", "\n", "model_mlp, losses, accuracies = train_val_model(model_mlp, criterion, optimizer, dataloaders,\n", "                                                num_epochs=15, log_interval=2)\n", "\n", "_ = plt.plot(losses['train'], '-b', losses['val'], '--r')" ] }, { "cell_type": "markdown", "metadata": { "id": "HXhHtX1TkUba" }, "source": [ "### 1.3 Convolutional Neural Network\n", "\n", "Convolutional layers capture patterns corresponding to relevant features independently of where they occur in the input. To do so, they slide a window over the input and apply the convolution operation with a set of kernels or filters that represent the features. Although it is not their only field of application, convolutional neural networks are mainly praised for their performance on image processing tasks.\n", "\n", "Training and validation for the CNN will be managed in the same way as for the feed-forward network; however, we still have to define the network's architecture.\n", "\n", "For that, we will implement a CNN class that defines how many layers the network comprises and how those layers are connected.\n", "\n", "The initialization (`__init__`) function will define the architecture and the `forward` function will implement how the different layers are connected. This architecture will be a sequence of 2 convolutional layers ([nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html)) (1st: output channels 10, kernel size 5; 2nd: output channels 20, kernel size 5), then 2 fully connected layers ([nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)) (1st: output features 50; 2nd: output features 10 (the number of classes)). 
Once again, the final layer will be a [log-softmax](https://pytorch.org/docs/stable/generated/torch.nn.functional.log_softmax.html#torch.nn.functional.log_softmax) function, which outputs a log-probability for each of the 10 classes; the most probable class is taken as the prediction.\n", "\n", "Between the second convolutional layer and the first fully connected layer, we will place a dropout layer ([nn.Dropout2d](https://pytorch.org/docs/stable/generated/torch.nn.Dropout2d.html)). The idea behind dropout is to disable a randomly selected fraction of the neurons at each step of the training phase, in order to reduce overfitting." ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "id": "PZ0mCl24EoaM" }, "outputs": [], "source": [ "class CNN(nn.Module):\n", "    \"\"\"Basic PyTorch CNN for MNIST-like data.\"\"\"\n", "\n", "    def __init__(self):\n", "        super(CNN, self).__init__()\n", "        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)\n", "        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)\n", "        self.conv2_drop = nn.Dropout2d()\n", "        self.fc1 = nn.Linear(320, 50)\n", "        self.fc2 = nn.Linear(50, 10)\n", "\n", "    def forward(self, x, T=1.0):\n", "        # Batch size = 64, images 28x28 =>\n", "        # x.shape = [64, 1, 28, 28]\n", "        x = F.relu(F.max_pool2d(self.conv1(x), 2))\n", "        # Convolution with 5x5 filter without padding and 10 channels =>\n", "        # x.shape = [64, 10, 24, 24] since 24 = 28 - 5 + 1\n", "        # Max pooling with stride of 2 =>\n", "        # x.shape = [64, 10, 12, 12]\n", "        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))\n", "        # Convolution with 5x5 filter without padding and 20 channels =>\n", "        # x.shape = [64, 20, 8, 8] since 8 = 12 - 5 + 1\n", "        # Max pooling with stride of 2 =>\n", "        # x.shape = [64, 20, 4, 4]\n", "        x = x.view(-1, 320)\n", "        # Reshape =>\n", "        # x.shape = [64, 320]\n", "        x = F.relu(self.fc1(x))\n", "        x = F.dropout(x, training=self.training)\n", "        x = self.fc2(x)\n", "        x = F.log_softmax(x, dim=1)\n", "        return x" ] }, { "cell_type": "markdown", "metadata": { "id": "mv9vdZZ7OlSh" }, "source": [ "As 
previously, let's define the model to be trained. We will use the Adam optimizer ([optim.Adam](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html#torch.optim.Adam)), with learning rate 0.001, and the same negative log-likelihood loss ([nn.NLLLoss](https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html))." ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "ImTlr5JeEsb6", "outputId": "a8d8e0a6-e3cc-4b37-d022-adc0241e8b88" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 0/24\n", "----------\n", "train Loss: 0.5470 Acc: 83.14%\n", "val Loss: 0.1097 Acc: 96.72%\n", "\n", "Epoch 2/24\n", "----------\n", "train Loss: 0.2096 Acc: 93.83%\n", "val Loss: 0.0579 Acc: 98.28%\n", "\n", "Epoch 4/24\n", "----------\n", "train Loss: 0.1699 Acc: 94.98%\n", "val Loss: 0.0485 Acc: 98.46%\n", "\n", "Epoch 6/24\n", "----------\n", "train Loss: 0.1550 Acc: 95.46%\n", "val Loss: 0.0444 Acc: 98.61%\n", "\n", "Epoch 8/24\n", "----------\n", "train Loss: 0.1385 Acc: 95.94%\n", "val Loss: 0.0395 Acc: 98.74%\n", "\n", "Epoch 10/24\n", "----------\n", "train Loss: 0.1330 Acc: 96.06%\n", "val Loss: 0.0360 Acc: 98.79%\n", "\n", "Epoch 12/24\n", "----------\n", "train Loss: 0.1220 Acc: 96.41%\n", "val Loss: 0.0352 Acc: 98.85%\n", "\n", "Epoch 14/24\n", "----------\n", "train Loss: 0.1208 Acc: 96.44%\n", "val Loss: 0.0326 Acc: 98.94%\n", "\n", "Epoch 16/24\n", "----------\n", "train Loss: 0.1156 Acc: 96.51%\n", "val Loss: 0.0318 Acc: 98.97%\n", "\n", "Epoch 18/24\n", "----------\n", "train Loss: 0.1127 Acc: 96.69%\n", "val Loss: 0.0312 Acc: 99.02%\n", "\n", "Epoch 20/24\n", "----------\n", "train Loss: 0.1092 Acc: 96.65%\n", "val Loss: 0.0338 Acc: 98.86%\n", "\n", "Epoch 22/24\n", "----------\n", "train Loss: 0.1078 Acc: 96.85%\n", "val Loss: 0.0307 Acc: 99.00%\n", "\n", "Epoch 24/24\n", "----------\n", "train Loss: 0.1077 Acc: 96.74%\n", "val Loss: 0.0287 Acc: 
99.04%\n", "\n", "Training complete in 7m 6s\n", "Best val Acc: 99.04%\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "model = CNN().to(device)\n", "optimizer = torch.optim.Adam(model.parameters(), lr=0.001)\n", "criterion = nn.NLLLoss()\n", "\n", "model, losses, accuracies = train_val_model(model, criterion, optimizer, dataloaders,\n", " num_epochs=25, log_interval=2)\n", "\n", "_ = plt.plot(losses['train'], '-b', losses['val'], '--r')" ] }, { "cell_type": "markdown", "metadata": { "id": "ULZ91b0cPhy5" }, "source": [ "We have now completed training and validation with 3 different models: Logistic Regression, Feed-Forward Network, and Convolutional Neural Network. \n", "\n", "We have seen that with the CNN, the performance of the model in the validation set, outperforms the other models (~99% accuracy against ~90% and ~98%). " ] }, { "cell_type": "markdown", "metadata": { "id": "PHyGUuZbTvhr" }, "source": [ "The difference in performance between CNNs and MLP is small but how many learnable parameters are we using in the MLP and in CNN models?\n", "\n", "We can find it out using the following lines of code:" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "acy0l3-YQjT2", "outputId": "ceb6251b-4ea1-4168-f23a-0c42feea37ce" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of parameters in the MLP model: 242762\n", "Number of parameters in the CNN model: 21840\n" ] } ], "source": [ "#model_mlp = MLP([D_in, 256, 128, 64, D_out]).to(device)\n", "model_parameters_mlp = filter(lambda p: p.requires_grad, model_mlp.parameters())\n", "params_mlp = sum([np.prod(p.size()) for p in model_parameters_mlp])\n", "print('Number of parameters in the MLP model: {}'.format(params_mlp))\n", "\n", "model_parameters_cnn = filter(lambda p: p.requires_grad, model.parameters())\n", "params_cnn = sum([np.prod(p.size()) for p in model_parameters_cnn])\n", "print('Number of parameters in the CNN model: 
{}'.format(params_cnn))" ] }, { "cell_type": "markdown", "metadata": { "id": "Sj28CWvrMbOw" }, "source": [ "You can see that the MLP uses ~11x more learnable parameters than the CNN (242762 vs. 21840) to achieve almost the same performance.\n", "\n", "We can experiment with the number of layers and their sizes to build an MLP whose parameter count is comparable to the CNN's." ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "2RmgJhIPMECw", "outputId": "3c677343-5c57-4a1d-a047-9bca2d8fe52f" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of parameters in the MLP model: 25450\n" ] } ], "source": [ "model_mlp_test = MLP([D_in, 32, D_out]).to(device)\n", "model_parameters_mlp_test = filter(lambda p: p.requires_grad, model_mlp_test.parameters())\n", "params_mlp_test = sum([np.prod(p.size()) for p in model_parameters_mlp_test])\n", "print('Number of parameters in the MLP model: {}'.format(params_mlp_test))" ] }, { "cell_type": "markdown", "metadata": { "id": "B_oq9682QWCF" }, "source": [ "And how does that model perform? We are about to find out." ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 979 }, "id": "w6oa0TeBQU9E", "outputId": "e8c478bf-27c6-4ade-f0bd-59302eaf0e49" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 0/14\n", "----------\n", "train Loss: 0.4684 Acc: 87.31%\n", "val Loss: 0.2689 Acc: 92.38%\n", "\n", "Epoch 5/14\n", "----------\n", "train Loss: 0.1535 Acc: 95.53%\n", "val Loss: 0.1545 Acc: 95.36%\n", "\n", "Epoch 10/14\n", "----------\n", "train Loss: 0.1099 Acc: 96.78%\n", "val Loss: 0.1270 Acc: 96.04%\n", "\n", "Training complete in 2m 14s\n", "Best val Acc: 96.54%\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "optimizer = optim.SGD(model_mlp_test.parameters(), lr=0.01, momentum=0.5)\n", "criterion = nn.NLLLoss()\n", "\n", "model_mlp_test, losses, accuracies = train_val_model(model_mlp_test, criterion, \n", " optimizer, dataloaders,\n", " num_epochs=15, \n", " log_interval=5)\n", "\n", "_ = plt.plot(losses['train'], '-b', losses['val'], '--r')" ] }, { "cell_type": "markdown", "metadata": { "id": "rpmgachOUCnX" }, "source": [ "We can see a drop in performance compared with the previous MLP model. So we can understand that although we have less learnable parameters, due to properties of CNNs (e.g., invariance and parameter sharing), which allow them to have fewer weights as some parameters are shared.\n", "\n", "CNNs are expected to be invariant to the location where important features occur in the input. In fact, it's not unusual that there is a dataset shift where the data acquisition process suffers some modification. 
We will simulate such a shift by applying random horizontal translations to our validation dataset and see how robust each model is to them.\n", "\n", "We can do this by going back to the **0.1 - Create Dataloaders -\n", "MNIST dataset** cell to define the test transform using the following code \n", "\n", "```\n", "mnist_transform_test = transforms.Compose(\n", " [transforms.ToTensor(),\n", " transforms.RandomAffine(0, translate=[0.1, 0]),\n", " transforms.Normalize((0.1307,), (0.3081,))])\n", "```\n", "\n", "and replace\n", "\n", "`mnist_val_dataset = datasets.MNIST('../data', download=True, train=False, transform=mnist_transform)`\n", "\n", "with\n", "\n", "`mnist_val_dataset = datasets.MNIST('../data', download=True, train=False, transform=mnist_transform_test)`" ] }, { "cell_type": "markdown", "metadata": { "id": "-5gcf_gMlcqI" }, "source": [ "After rerunning the different models we can see that the accuracy of the Logistic Regression drops from ~90% to ~72%, the MLP drops from ~98% to ~87%, and the CNN drops from ~99% to ~97%. This shows that the features learned by the CNN are more robust to changes in location, as expected." ] }, { "cell_type": "markdown", "metadata": { "id": "nU3NwQ7Nuvhv" }, "source": [ "# Bonus Case - Attention with small images and CNNs. (And how to create a dataset that takes numpy arrays)\n", "\n", "In this case we will use Scikit-Learn's digits dataset.\n", "\n", "## Scikit-Learn Digits\n", "\n", "This dataset is provided by scikit-learn and the digit images are returned as numpy ndarrays. We will use PIL (Python Imaging Library) to convert each numpy ndarray to an image, transform it to a tensor and normalize it.\n", "\n", "In this case we don't have a predefined Digits Dataset provided by torchvision so we will need to write a custom Dataset class and implement three functions: \n", "\n", "`__init__`, `__len__`, and `__getitem__`.\n", "\n", "Scikit-Learn returns the digit images and labels as ndarrays. 
Each digit image is an 8x8 array.\n", "\n", "To use the previous CNN, we will use a transform to resize the images to the MNIST image size (28x28)." ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "id": "A4v-XFzcv9If" }, "outputs": [], "source": [ "SKLEARN_DIGITS_TRAIN_SIZE = 1247\n", "SKLEARN_DIGITS_VAL_SIZE = 550\n", "\n", "class NumpyDataset(Dataset):\n", "\n", "    def __init__(self, data, targets, transform=None):\n", "        self.data = torch.from_numpy(data).float()\n", "        self.targets = torch.from_numpy(targets).long()\n", "        self.transform = transform\n", "\n", "    def __getitem__(self, index):\n", "        x = np.expand_dims(self.data[index], axis=2)\n", "        y = self.targets[index]\n", "        if self.transform:\n", "            x = self.transform(x)\n", "        return x, y\n", "\n", "    def __len__(self):\n", "        return len(self.data)\n", "\n", "digits_transform = transforms.Compose([\n", "    transforms.ToPILImage(),\n", "    transforms.Resize(28),\n", "    transforms.ToTensor(),\n", "    ])\n", "\n", "# Get sklearn digits dataset\n", "X, y = load_digits(return_X_y=True)\n", "X = X.reshape((len(X), 8, 8))\n", "y_train = y[:-SKLEARN_DIGITS_VAL_SIZE]\n", "y_val = y[-SKLEARN_DIGITS_VAL_SIZE:]\n", "X_train = X[:-SKLEARN_DIGITS_VAL_SIZE]\n", "X_val = X[-SKLEARN_DIGITS_VAL_SIZE:]\n", "\n", "digits_train_dataset = NumpyDataset(X_train, y_train, transform=digits_transform)\n", "digits_val_dataset = NumpyDataset(X_val, y_val, transform=digits_transform)\n", "digits_train_dataloader = torch.utils.data.DataLoader(digits_train_dataset, batch_size=64, shuffle=True)\n", "digits_val_dataloader = torch.utils.data.DataLoader(digits_val_dataset, batch_size=64, shuffle=True)\n", "\n", "dataloaders = dict(train=digits_train_dataloader, val=digits_val_dataloader)" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "dhQU3v7Zv9Ih", "outputId": "030f1fa0-62a0-4dc0-d61f-013e09e2457d" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ 
"Datasets shapes (before transformations): {'train': torch.Size([1247, 8, 8]), 'val': torch.Size([550, 8, 8])}\n", "N input features: 784 Output classes: 10\n", "Train batch: torch.Size([64, 1, 28, 28]) torch.Size([64])\n", "Val batch: torch.Size([64, 1, 28, 28]) torch.Size([64])\n" ] } ], "source": [ "# Get some examples of images and targets\n", "_, (example_train_imgs, example_train_targets) = next(enumerate(digits_train_dataloader))\n", "_, (example_val_imgs, example_val_targets) = next(enumerate(digits_val_dataloader))\n", "\n", "# Info about the dataset\n", "D_in = np.prod(example_imgs.shape[1:])\n", "D_out = len(digits_train_dataloader.dataset.targets.unique())\n", "\n", "# Output information\n", "print(\"Datasets shapes (before transformations):\", {x: dataloaders[x].dataset.data.shape for x in ['train', 'val']})\n", "print(\"N input features:\", D_in, \"Output classes:\", D_out)\n", "print(\"Train batch:\", example_train_imgs.shape, example_train_targets.shape)\n", "print(\"Val batch:\", example_val_imgs.shape, example_val_targets.shape)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 284 }, "id": "Vx78pb7Ov9Ih", "outputId": "207e26b2-c2f7-41e7-ae13-9421b398027e" }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_img_label_prediction(imgs=example_train_imgs, y_true=example_train_targets, y_pred=None, shape=(2, 3))\n" ] }, { "cell_type": "markdown", "metadata": { "id": "xbBAH9OTv9Ii" }, "source": [ "### Logistic Regression" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "W46ofUE6v9Ii", "outputId": "5e698e3a-c465-45c3-9f72-07f566ef8055" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1247, 8, 8)\n", "Test score with penalty: 0.8909\n" ] } ], "source": [ "scaler = StandardScaler()\n", "print(X_train.squeeze().shape)\n", "X_train = scaler.fit_transform(np.reshape(X_train, (X_train.shape[0], -1)))\n", "X_val = scaler.transform(np.reshape(X_val, (X_val.shape[0], -1)))\n", "\n", "# Turn up tolerance for faster convergence\n", "clf = LogisticRegression(C=50., multi_class='multinomial', solver='sag', tol=0.1)\n", "clf.fit(X_train, y_train)\n", "#sparsity = np.mean(clf.coef_ == 0) * 100\n", "score = clf.score(X_val, y_val)\n", "\n", "print(\"Test score with penalty: %.4f\" % score)" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 339 }, "id": "E_G2Rt0Gv9Ij", "outputId": "59791f8a-e8ac-41f9-81e9-8cdc92cadd42" }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "coef = clf.coef_.copy()\n", "plt.figure(figsize=(10, 5))\n", "scale = np.abs(coef).max()\n", "for i in range(10):\n", " l1_plot = plt.subplot(2, 5, i + 1)\n", " l1_plot.imshow(coef[i].reshape(8, 8), interpolation='nearest',\n", " cmap=plt.cm.RdBu, vmin=-scale, vmax=scale)\n", " l1_plot.set_xticks(())\n", " l1_plot.set_yticks(())\n", " l1_plot.set_xlabel('Class %i' % i)\n", "plt.suptitle('Classification coefficient vectors for...')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "id": "PfYdXpde4bg0" }, "source": [ "### Feed-forward using digits dataset" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "UDKn4WS636Bg", "outputId": "754a52a5-d33e-4a7f-9431-df84fb38b3d8" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 0/19\n", "----------\n", "train Loss: 2.2229 Acc: 20.93%\n", "val Loss: 2.1097 Acc: 35.64%\n", "\n", "Epoch 2/19\n", "----------\n", "train Loss: 1.1401 Acc: 74.66%\n", "val Loss: 0.8908 Acc: 74.55%\n", "\n", "Epoch 4/19\n", "----------\n", "train Loss: 0.3788 Acc: 88.69%\n", "val Loss: 0.4712 Acc: 84.91%\n", "\n", "Epoch 6/19\n", "----------\n", "train Loss: 0.1971 Acc: 94.39%\n", "val Loss: 0.5091 Acc: 83.27%\n", "\n", "Epoch 8/19\n", "----------\n", "train Loss: 0.1401 Acc: 96.07%\n", "val Loss: 0.4645 Acc: 87.64%\n", "\n", "Epoch 10/19\n", "----------\n", "train Loss: 0.1273 Acc: 95.99%\n", "val Loss: 0.3447 Acc: 90.18%\n", "\n", "Epoch 12/19\n", "----------\n", "train Loss: 0.0914 Acc: 97.67%\n", "val Loss: 0.3651 Acc: 89.27%\n", "\n", "Epoch 14/19\n", "----------\n", "train Loss: 0.0922 Acc: 97.11%\n", "val Loss: 0.3925 Acc: 87.64%\n", "\n", "Epoch 16/19\n", "----------\n", "train Loss: 0.0628 Acc: 98.08%\n", "val Loss: 0.3407 Acc: 90.18%\n", "\n", "Epoch 18/19\n", "----------\n", "train Loss: 0.0547 Acc: 98.64%\n", "val Loss: 0.3234 Acc: 
90.36%\n", "\n", "Training complete in 0m 6s\n", "Best val Acc: 90.36%\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "model = MLP([D_in, 512, 256, 128, 64, D_out]).to(device)\n", "\n", "optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)\n", "criterion = nn.NLLLoss()\n", "\n", "model, losses, accuracies = train_val_model(model, criterion, optimizer, dataloaders,\n", " num_epochs=20, log_interval=2)\n", "\n", "_ = plt.plot(losses['train'], '-b', losses['val'], '--r')" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 724 }, "id": "R9ImEBeo4MW6", "outputId": "c402c8da-dd25-4212-86f6-d3768e1619ed" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 0/49\n", "----------\n", "train Loss: 2.4328 Acc: 16.28%\n", "val Loss: 2.0682 Acc: 44.91%\n", "\n", "Epoch 10/49\n", "----------\n", "train Loss: 0.4297 Acc: 85.89%\n", "val Loss: 0.3002 Acc: 89.82%\n", "\n", "Epoch 20/49\n", "----------\n", "train Loss: 0.2546 Acc: 91.98%\n", "val Loss: 0.2488 Acc: 92.73%\n", "\n", "Epoch 30/49\n", "----------\n", "train Loss: 0.1890 Acc: 93.42%\n", "val Loss: 0.2325 Acc: 93.64%\n", "\n", "Epoch 40/49\n", "----------\n", "train Loss: 0.1606 Acc: 93.99%\n", "val Loss: 0.2394 Acc: 93.64%\n", "\n", "Training complete in 0m 20s\n", "Best val Acc: 94.55%\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "model = CNN().to(device)\n", "optimizer = torch.optim.Adam(model.parameters(), lr=0.001)\n", "criterion = nn.NLLLoss()\n", "\n", "model, losses, accuracies = train_val_model(model, criterion, optimizer, dataloaders,\n", " num_epochs=50, log_interval=10)\n", "\n", "_ = plt.plot(losses['train'], '-b', losses['val'], '--r')" ] }, { "cell_type": "markdown", "metadata": { "id": "uIhei09Ruvf-" }, "source": [ "# Bonus Information - Visualizing CNN filters\n", "\n", "Some work have been done to demonstrate the type of features learned by different filters in different layers. \n", "\n", "For instance, considering a known CNN called **VGG16** which has the following architecture\n", "\n", "![image](https://media.geeksforgeeks.org/wp-content/uploads/20200219152327/conv-layers-vgg16.jpg)\\[taken from: https://www.geeksforgeeks.org/vgg-16-cnn-model/ \\]\n", "\n", "these would be some of the filters from some of the layers: \n", "\n", "\t \n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\n", "
Filter visualizations (images omitted) for: Layer 2 (Conv 1-2), Layer 10 (Conv 2-1), Layer 17 (Conv 3-1), Layer 24 (Conv 4-1)
\n", "\n", "or obtain the class activations:\n", "\n", "\t \n", " \t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\n", "\t\t\t\n", "\t\t\t\n", "\t\t\t\n", "\t\t\n", "\t\n", "
Columns (images omitted): Input Image, Layer Vis. (Filter=0), Filter Vis. (Layer=29)
\n", "\n", "\\[examples taken from: http://www.github.com/utkuozbulak/pytorch-cnn-visualizations \\]\n" ] }, { "cell_type": "markdown", "metadata": { "id": "VU21L86BvAXK" }, "source": [ "# Bonus Information - Predefined architectures, pre-trained models and transfer learning\n", "\n", "Packages like [torchvision](https://pytorch.org/vision/stable/index.html) and [timm](https://rwightman.github.io/pytorch-image-models/) offer predefined architectures and pre-trained models, which can be fine-tuned for the same task or used for transfer learning.\n", "\n", "Besides datasets, transforms and others, **Torchvision** has a large number of predefined architectures with the possibility of loading the pre-trained weights." ] }, { "cell_type": "markdown", "metadata": { "id": "w_1YpmkV-PbU" }, "source": [ "#### Torchvision classification models examples\n", "\n" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "id": "l5yOWA9l4lrF" }, "outputs": [], "source": [ "import torchvision.models as models\n", "\n", "# construct a model with random weights to be trained\n", "resnet18 = models.resnet18()\n", "\n", "# load a pre-trained model\n", "resnet18 = models.resnet18(pretrained=True)" ] }, { "cell_type": "markdown", "metadata": { "id": "nAsfqUIyDBD7" }, "source": [ "For examples of different models and how to use pre-trained weights please visit https://pytorch.org/vision/stable/models.html#\n", "\n", "\n", "\n", "Another possibility is **timm**, which contains models for classification only.\n", "In **timm** you are not restricted to inputs with 1 or 3 channels, which allows you to use architectures or pre-trained models with images that have 2 or more than 3 channels." 
] }, { "cell_type": "markdown", "metadata": { "id": "4zS8Ykbo-ZUg" }, "source": [ "#### timm classification models examples" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "w0FQYnKQ-lr_", "outputId": "0d3ef180-ef41-4653-99dd-6721c41c2475" }, "outputs": [ { "ename": "ModuleNotFoundError", "evalue": "No module named 'timm'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;34m'google.colab'\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mstr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mget_ipython\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mget_ipython\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msystem\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'pip install -q timm'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0;32mimport\u001b[0m \u001b[0mtimm\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;31m# list all models\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'timm'" ] } ], "source": [ "if 'google.colab' in str(get_ipython()):\n", " !pip install -q timm\n", "import timm\n", "\n", "# list all models\n", "print(timm.list_models())\n", "\n", "# list pre-trained models\n", "print(timm.list_models(pretrained=True))\n", "\n", "# list models architectures by wildcards\n", 
"print(timm.list_models('*resne*t*'))\n", "\n", "# construct a model with random weights to be trained\n", "model = timm.create_model('resnet18')\n", "\n", "# load a pre-trained model\n", "model = timm.create_model('resnet18', pretrained=True)" ] }, { "cell_type": "markdown", "metadata": { "id": "OSEwqB8ADIao" }, "source": [ "For more details on how to use this package visit https://rwightman.github.io/pytorch-image-models/" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "kaank-9kDI72" }, "outputs": [], "source": [] } ], "metadata": { "accelerator": "GPU", "colab": { "collapsed_sections": [], "name": "P8-CNNs.ipynb", "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.2" } }, "nbformat": 4, "nbformat_minor": 1 }