{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Multilayer Perceptron and Backpropagation" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 1 \n", "\n", "Consider the neural network considered in the first question of the theoretical component of the practical class, with number of units: 4,4,3,3.\n", "\n", "![](https://drive.google.com/uc?id=1SHUgdosKp6AX8rRAACCZ5nb4kUXreI3g)\n", "\n", "Assume all units, except the ones in the output layer, use the hyperbolic tangent activation function. \n", "\n", "Consider the following training example:\n", "\n", "$\\mathbf{x} =\\begin{bmatrix} 1, 0, 1, 0 \\end{bmatrix}^\\intercal $, $\\mathbf{y} =\\begin{bmatrix} 0\\\\ 1\\\\ 0 \\end{bmatrix}$\n", "\n", "❓ Using the squared error loss do a stochastic gradient descent update, initializing all connection weights and biases to 0.1 and a learning rate η = 0.1:\n", "\n", "1. Perform the forward pass\n", "2. Compute the loss\n", "3. Compute gradients with backpropagation\n", "4. Update weights" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "inputs = np.array([[1, 0, 1, 0]])\n", "labels = np.array([[0, 1, 0]])\n", "\n", "# First is input size, last is output size.\n", "units = [4, 4, 3, 3]\n", "\n", "# Initialize weights with correct shapes \n", "\n", "W1 = .1 * np.ones((units[1], units[0]))\n", "b1 = .1 * np.ones(units[1])\n", "W2 = .1 * np.ones((units[2], units[1]))\n", "b2 = .1 * np.ones(units[2])\n", "W3 = .1 * np.ones((units[3], units[2]))\n", "b3 = .1 * np.ones(units[3])" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.16396106 0.16396106 0.16396106]\n" ] } ], "source": [ "# Forward Pass\n", "\n", "h0 = inputs[0]\n", "\n", "z1 = W1.dot(h0) + b1\n", "h1 = np.tanh(z1)\n", "\n", "z2 = W2.dot(h1) + b2\n", "h2 = np.tanh(z2)\n", "\n", "z3 = W3.dot(h2) + b3\n", "\n", "y = labels[0]\n", "\n", "print(z3)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.37636378397755565\n" ] } ], "source": [ "# Loss\n", "\n", "loss = .5*(z3 - y).dot(z3 - y)\n", "\n", "print(loss)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Backpropagation\n", "\n", "grad_z3 = z3 - y # Grad of loss wrt z3.\n", "\n", "# Gradient of hidden parameters.\n", "grad_W3 = grad_z3[:, None].dot(h2[:, None].T)\n", "grad_b3 = grad_z3\n", "\n", "# Gradient of hidden layer below.\n", "grad_h2 = W3.T.dot(grad_z3)\n", "\n", "# Gradient of hidden layer below before activation.\n", "grad_z2 = grad_h2 * (1-h2**2) # Grad of loss wrt z3.\n", "\n", "# Gradient of hidden parameters.\n", "grad_W2 = grad_z2[:, None].dot(h1[:, None].T)\n", "grad_b2 = grad_z2\n", "\n", "# Gradient of hidden layer below.\n", "grad_h1 = W2.T.dot(grad_z2)\n", "\n", "# Gradient of hidden layer below before activation.\n", "grad_z1 = grad_h1 * (1-h1**2) # Grad of loss wrt z3.\n", "\n", "# Gradient of hidden parameters.\n", "grad_W1 = grad_z1[:, None].dot(h0[:, None].T)\n", "grad_b1 = grad_z1" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Update Gradients\n", "\n", "# Gradient updates.\n", "eta = 0.1\n", "W1 -= eta*grad_W1\n", "b1 -= eta*grad_b1\n", "W2 -= eta*grad_W2\n", "b2 -= eta*grad_b2\n", "W3 -= eta*grad_W3\n", "b3 -= eta*grad_b3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "❓ Let's say we were using the same training example but with the following changes:\n", "- The output units have a softmax activation function\n", "- The error function is cross-entropy\n", "\n", "Keeping the same initializations and learning rate, adjust your computations to the new changes." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Solution:** We need only to change: \n", "- the output, *i.e.*, $\\hat{y} = softmax(z_3)$ instead of $\\hat{y} = z_3$\n", "- the loss computation to $L = -y.log(\\hat{y})$\n", "- the gradient of the loss with respect to $z_3$: $\\frac{dL}{dz_3}$\n", "\n", "All other steps remain unchanged." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Your code here\n", "\n", "W1 = .1 * np.ones((units[1], units[0]))\n", "b1 = .1 * np.ones(units[1])\n", "W2 = .1 * np.ones((units[2], units[1]))\n", "b2 = .1 * np.ones(units[2])\n", "W3 = .1 * np.ones((units[3], units[2]))\n", "b3 = .1 * np.ones(units[3])\n", "\n", "# Forward Pass\n", "\n", "h0 = inputs[0]\n", "\n", "z1 = W1.dot(h0) + b1\n", "h1 = np.tanh(z1)\n", "\n", "z2 = W2.dot(h1) + b2\n", "h2 = np.tanh(z2)\n", "\n", "z3 = W3.dot(h2) + b3\n", "\n", "p = np.exp(z3) / sum(np.exp(z3))\n", "y = labels[0]\n", "\n", "# Loss\n", "\n", "loss = -y.dot(np.log(p))\n", "\n", "# Backpropagation\n", "\n", "grad_z3 = p - y # Grad of loss wrt p\n", "\n", "# Gradient of hidden parameters.\n", "grad_W3 = grad_z3[:, None].dot(h2[:, None].T)\n", "grad_b3 = grad_z3\n", "\n", "# Gradient of hidden layer below.\n", "grad_h2 = W3.T.dot(grad_z3)\n", "\n", "# Gradient of hidden layer below before activation.\n", "grad_z2 = grad_h2 * (1-h2**2) # Grad of loss wrt z3.\n", "\n", "# Gradient of hidden parameters.\n", "grad_W2 = grad_z2[:, None].dot(h1[:, None].T)\n", "grad_b2 = grad_z2\n", "\n", "# Gradient of hidden layer below.\n", "grad_h1 = W2.T.dot(grad_z2)\n", "\n", "# Gradient of hidden layer below before activation.\n", "grad_z1 = grad_h1 * (1-h1**2) # Grad of loss wrt z3.\n", "\n", "# Gradient of hidden parameters.\n", "grad_W1 = grad_z1[:, None].dot(h0[:, None].T)\n", "grad_b1 = grad_z1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "❓ Complete functions `forward`, `compute_loss`, `backpropagation` and `update_weights` generalized to perform the same computations as before, but for any MLP architecture." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "'''\n", "x: single observation of shape (n,)\n", "weights: list of weight matrices [W1, W2, ...]\n", "biases: list of biases matrices [b1, b2, ...]\n", "\n", "y: final output\n", "hiddens: list of computed hidden layers [h1, h2, ...]\n", "'''\n", "\n", "def forward(x, weights, biases):\n", " num_layers = len(weights)\n", " g = np.tanh\n", " hiddens = []\n", " # compute hidden layers\n", " for i in range(num_layers):\n", " h = x if i == 0 else hiddens[i-1]\n", " z = weights[i].dot(h) + biases[i]\n", " if i < num_layers-1: # Assuming the output layer has no activation.\n", " hiddens.append(g(z))\n", " #compute output\n", " output = z\n", " \n", " return output, hiddens\n", "\n", "def compute_loss(output, y):\n", " # compute loss\n", " probs = np.exp(output) / np.sum(np.exp(output))\n", " loss = -y.dot(np.log(probs))\n", " \n", " return loss \n", "\n", "def backward(x, y, output, hiddens, weights):\n", " num_layers = len(weights)\n", " g = np.tanh\n", " z = output\n", "\n", " probs = np.exp(output) / np.sum(np.exp(output))\n", " grad_z = probs - y \n", " \n", " grad_weights = []\n", " grad_biases = []\n", " \n", " # Backpropagate gradient computations \n", " for i in range(num_layers-1, -1, -1):\n", " \n", " # Gradient of hidden parameters.\n", " h = x if i == 0 else hiddens[i-1]\n", " grad_weights.append(grad_z[:, None].dot(h[:, None].T))\n", " grad_biases.append(grad_z)\n", " \n", " # Gradient of hidden layer below.\n", " grad_h = weights[i].T.dot(grad_z)\n", "\n", " # Gradient of hidden layer below before activation.\n", " grad_z = grad_h * (1-h**2) # Grad of loss wrt z3.\n", "\n", " # Making gradient vectors have the correct order\n", " grad_weights.reverse()\n", " grad_biases.reverse()\n", " return grad_weights, grad_biases" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 2\n", "\n", "Now we will use the MLP on real data to classify handwritten digits.\n", "\n", "Data is loaded, split into train and test sets and target is one-hot encoded below:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "from sklearn.datasets import load_digits\n", "from sklearn.model_selection import train_test_split\n", "\n", "data = load_digits()\n", "\n", "inputs = data.data \n", "labels = data.target \n", "n, p = np.shape(inputs)\n", "n_classes = len(np.unique(labels))\n", "\n", "\n", "X_train, X_test, y_train, y_test = train_test_split(inputs, labels, test_size=0.2, random_state=42)\n", "\n", "# Encode labels as one-hot vectors.\n", "one_hot = np.zeros((np.size(y_train, 0), n_classes))\n", "for i in range(np.size(y_train, 0)):\n", " one_hot[i, y_train[i]] = 1\n", "y_train_ohe = one_hot\n", "one_hot = np.zeros((np.size(y_test, 0), n_classes))\n", "for i in range(np.size(y_test, 0)):\n", " one_hot[i, y_test[i]] = 1\n", "y_test_ohe = one_hot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "❓ Complete function `MLP_train_epoch` using your previously defined functions to compute one epoch of training using SGD:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "'''\n", "Outputs:\n", " - weights: list of updated weights\n", " - biases: list of updated \n", " - loss: scalar of total loss (sum for all observations)\n", "\n", "'''\n", "\n", "def MLP_train_epoch(inputs, labels, weights, biases):\n", " num_layers = len(weights)\n", " total_loss = 0\n", " # For each observation and target\n", " for x, y in zip(inputs, labels):\n", " # Comoute forward pass\n", " output, hiddens = forward(x, weights, biases)\n", " \n", " # Compute Loss and Update total loss\n", " loss = compute_loss(output, y)\n", " total_loss+=loss\n", " # Compute backpropagation\n", " grad_weights, grad_biases = backward(x, y, output, hiddens, weights)\n", " \n", " # Update weights\n", " \n", " for i in range(num_layers):\n", " weights[i] -= eta*grad_weights[i]\n", " biases[i] -= eta*grad_biases[i]\n", " \n", " return weights, biases, total_loss" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's use a MLP with a single hidden layer of 50 units and a learning rate of $0.001$. \n", "\n", "❓ Run 100 epochs of your MLP. Save the loss at each epoch in a list and plot the loss evolution after training." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAD4CAYAAAAAczaOAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/OQEPoAAAACXBIWXMAAAsTAAALEwEAmpwYAAAZ6UlEQVR4nO3da3Ad533f8e9/zw03kiAIEAIvEikbtkTJo4sRiY1VV45kiVIvktM2sSaNWY9T2q08sVt3UqV9oTYZp26b+CIn1YxssZba1K6lODHrcCzRtKZqOpElUBdKoi6kKImX8AISvAEkgINz/n2xzwEOQIAEQQAH3P19Zs7s7rN7dp/lcn67ePY5u+buiIhIOkS1roCIiMwdhb6ISIoo9EVEUkShLyKSIgp9EZEUyda6AufS2trqq1atqnU1REQuKdu2bTvi7m0TzZvXob9q1Sq6u7trXQ0RkUuKmb0/2Tw174iIpIhCX0QkRRT6IiIpotAXEUkRhb6ISIoo9EVEUkShLyKSIokM/b7BYb6+5W1e3nu81lUREZlXEhn6Q8NlHtq6k5f3HKt1VURE5pVEhn4hG+/WUKlc45qIiMwviQz9fAj9waJCX0SkWiJDPxsZkcHgsEJfRKRaIkPfzChkMwwOl2pdFRGReSWRoQ9QyEUM6UpfRGSMxIZ+PhOpeUdEZJzEhn4hp9AXERkvuaGfzah5R0RknASHfqQbuSIi4yQ29PNZNe+IiIx33tA3s5Vm9oyZ7TCz183sS6G8xcy2mNnOMFwcys3MHjKzXWa23cxurFrX+rD8TjNbP3u7Fa709eMsEZExpnKlPwx8xd3XAGuB+81sDfAAsNXdO4GtYRrgLqAzfDYAD0N8kgAeBG4GbgIerJwoZkMhm2FQj2EQERnjvKHv7gfc/cUwfgp4A1gO3AM8FhZ7DLg3jN8DPO6x54BmM+sA7gS2uHuvux8DtgDrZnJnqsVX+mrTFxGpdkFt+ma2CrgB+AXQ7u4HwqyDQHsYXw7srfravlA2Wfn4bWwws24z6+7p6bmQ6o2Rz+rHWSIi40059M2sCfgz4MvufrJ6nrs74DNRIXd/xN273L2rra1t2uuJH8Og0BcRqTal0DezHHHg/6m7/ygUHwrNNoTh4VC+H1hZ9fUVoWyy8lmhH2eJiJxtKr13DHgUeMPdv141axNQ6YGzHvhxVflnQi+etcCJ0Az0FHCHmS0ON3DvCGWzIn4Mg9r0RUSqZaewzMeA3wReNbOXQ9m/Bb4G/NDMPge8D/xamLcZuBvYBZwGPgvg7r1m9vvAC2G533P33pnYiYnoSl9E5GznDX13/yvAJpl92wTLO3D/JOvaCGy8kApOV+UxDO5O/MeKiIgk9he5emWiiMjZEh/6auIRERmV/NDXoxhEREYkOPQzAOrBIyJSJbmhnwtt+mreEREZkdjQz2fUpi8iMl5iQ79ypa/QFxEZldzQr7Tp60mbIiIjEhz66qcvIjJeYkM/ry6bIiJnSWzoj3bZVOiLiFQkOPQrN3LVpi8iUpHc0Fc/fRGRsyQ29NVPX0TkbIkN/UJOj2EQERkvuaGfVfOOiMh4iQ39bGSYqXlHRKRaYkPfzChk9cpEEZFqiQ19iPvq6zEMIiKjEh76kR7DICJSJdGhn89GegyDiEiVRIe+2vRFRMZKeOhn1E9fRKRKskM/pyt9EZFqiQ79fEahLyJSLdGhX8hlFPoiIlWSHfrZSP30RUSqJD701U9fRGRUokNf/fRFRMZKdOjHXTYV+iIiFQkP/Uj99EVEqiQ79HORnqcvIlIl2aEf+um7e62rIiIyLyQ79MMrE9WDR0QkluzQz+rl6CIi1RId+nm9J1dEZIxEh76u9EVExjpv6JvZRjM7bGavVZX9ezPbb2Yvh8/dVfN+18x2mdlbZnZnVfm6ULbLzB6Y+V05WyEbt+nrUQwiIrGpXOl/D1g3Qfk33P368NkMYGZrgE8D14Tv/Fczy5hZBvgT4C5gDXBfWHZWVa70dSNXRCSWPd8C7v6sma2a4vruAX7g7oPAu2a2C7gpzNvl7rsBzOwHYdkdF17lqau06etRDCIisYtp0/+imW0PzT+LQ9lyYG/VMvtC2WTlZzGzDWbWbWbdPT09F1G9quYdtemLiADTD/2HgQ8A1wMHgD+aqQq5+yPu3uXuXW1tbRe1rkKuciNXbfoiIjCF5p2JuPuhyriZfQf4SZjcD6ysWnRFKOMc5bOmoC6bIiJjTOtK38w6qiY/BVR69mwCPm1mBTNbDXQCzwMvAJ1mttrM8sQ3ezdNv9pTk1eXTRGRMc57pW9m3wduBVrNbB/wIHCrmV0POPAe8HkAd3/dzH5IfIN2GLjf3UthPV8EngIywEZ3f32md2a80TZ9Ne+IiMDUeu/cN0Hxo+dY/qvAVyco3wxsvqDaXaSCeu+IiIyRil/kqp++iEgs0aGvfvoiImMlOvTVpi8iMlaiQz+XMczUe0dEpCLRoW9mFLJ6ZaKISEWiQx8gH16ZKCIiKQj9Qi6jNn0RkSD5oZ+N1HtHRCRIfOjnsxGD6qcvIgKkIPQL2Yyu9EVEghSEfqQ2fRGRICWhryt9ERFIQejn1U9fRGRE4kO/kM3oSl9EJEh+6OfUpi8iUpH80FfzjojIiFSEvpp3RERiKQj9DINFNe+IiEAqQl9X+iIiFakI/aFSGXevdVVERGou8aGfz0a4Q7Gk0BcRSXzo65WJIiKjkh/6ufBydLXri4gkP/TzmXgX1VdfRCQFoa8rfRGRUckPfbXpi4iMSEHohyt9vUhFRCT5oZ8PoT+kVyaKiCQ/9Eead3SlLyKShtCv3MhVm76ISPJDX713RERGJD701U9fRGRU4kO/kFOXTRGRiuSHflbNOyIiFekJffXeERFJfujn1XtHRGRE8kM/E5HPRpwaGK51VUREau68oW9mG83ssJm9VlXWYmZbzGxnGC4O5WZmD5nZLjPbbmY3Vn1nfVh+p5mtn53dmbD+tDbm6ekbnKtNiojMW1O50v8esG5c2QPAVnfvBLaGaYC7gM7w2QA8DPFJAngQuBm4CXiwcqKYC60LChztG5qrzYmIzFvnDX13fxboHVd8D/BYGH8MuLeq/HGPPQc0m1kHcCewxd173f0YsIWzTySzZkljniO60hcRmXabfru7HwjjB4H2ML4c2Fu13L5QNln5Wcxsg5l1m1l3T0/PNKs3VmuTrvRFRGAGbuS6uwMz9tZxd3/E3bvcvautrW1G1rmkqcDR/kHiqoqIpNd0Q/9QaLYhDA+H8v3AyqrlVoSyycrnRGtTnmLJOXlGPXhEJN2mG/qbgEoPnPXAj6vKPxN68awFToRmoKeAO8xscbiBe0comxOtTQUA9eARkdTLnm8BM/s+cCvQamb7iHvhfA34oZl9Dngf+LWw+GbgbmAXcBr4LIC795rZ7wMvhOV+z93H3xyeNZXQP9o3yAeXNs3VZkVE5p3zhr673zfJrNsmWNaB+ydZz0Zg4wXVboYsacoDcEQ3c0Uk5RL/i1youtLvV/OOiKRbKkJ/cUMOMzhySqEvIumWitDPZiJaGvIc6VfzjoikWypCH+J2/aPqvSMiKZea0G9tKuhGroikXmpCf0lTQVf6IpJ6qQn91qa8rvRFJPVSFPoF+gaHGSjqDVoikl6pCf0ljZUfaKmJR0TSKzWhP/ooBjXxiEh6pSb0Rx/FoCt9EUmv1IS+rvRFRFIY+nq8soikWWpCvz6foTGf0ZW+iKRaakIf4h9oqU1fRNIsVaHf2pTX45VFJNVSFfpLmgocOaXmHRFJr1SFfmtTQVf6IpJqKQv9PL39Q5TKXuuqiIjURMpCv0DZ4dhpNfGISDqlKvQrv8pVt00RSatUhX7lB1rqtikiaZWy0Nfzd0Qk3VIW+pUrfTXviEg6pSr0F9blyEamK30RSa1UhX4UGcua63n/aH+tqyIiUhOpCn2ANR0L2fE3J2tdDRGRmkhd6F/dsZD3e0/TPzhc66qIiMy5FIb+AtzhzYOnal0VEZE5l7rQX7NsIQA7DqiJR0TSJ3Whv7y5noV1Wd5Q6ItICqUu9M2MqzoWKvRFJJVSF/oQ9+B588ApPW1TRFIntaF/plhSf30RSZ1Uhv7VHfHN3DcOqAePiKRLKkO/s72JTGTsOHCi1lUREZlTqQz9ulyGD7Q16kpfRFInlaEPcROPevCISNpcVOib2Xtm9qqZvWxm3aGsxcy2mNnOMFwcys3MHjKzXWa23cxunIkdmK41HQs5cGKAY/16zLKIpMdMXOl/wt2vd/euMP0AsNXdO4GtYRrgLqAzfDYAD8/Atqdt9GaurvZFJD1mo3nnHuCxMP4YcG9V+eMeew5oNrOOWdj+lFRCX49jEJE0udjQd+BpM9tmZhtCWbu7HwjjB4H2ML4c2Fv13X2hbAwz22Bm3WbW3dPTc5HVm1zbggJLFxTYvk89eEQkPbIX+f1b3H2/mS0FtpjZm9Uz3d3N7IJ+9urujwCPAHR1dc3qT2Zv6WzlZzsOUSyVyWVSe09bRFLkopLO3feH4WHgz4GbgEOVZpswPBwW3w+srPr6ilBWM3decxknB4Z5/t3eWlZDRGTOTDv0zazRzBZUxoE7gNeATcD6sNh64MdhfBPwmdCLZy1woqoZqCY+3tlGXS7iqdcP1rIaIiJz5mKu9NuBvzKzV4Dngb90958CXwM+aWY7gdvDNMBmYDewC/gO8C8uYtszoj6f4eOdbTz9+iHc9fA1EUm+abfpu/tu4LoJyo8Ct01Q7sD9093ebLnzmst4eschtu87wXUrm2tdHRGRWZX6u5e3Xb2UTGRq4hGRVEh96Dc35Ll5dQtP7zhU66qIiMy61Ic+xE08uw738U5PX62rIiIyqxT6wCfXxL8fe/p1Xe2LSLIp9IFlzfVct7KZJ7ftpaxXKIpIgin0g9+6ZTXv9PTzU93QFZEEU+gHd3+kgytbG/njn+9Sn30RSSyFfpCJjH9+6wfYceAkz7x1+PxfEBG5BCn0q9x7w3KWN9fzbV3ti0hCKfSr5DIRX7j1A7y05zh//c7RWldHRGTGKfTH+ccfXcHSBQX+aMvb6skjIomj0B+nLpfhd9Zdxbb3j7Hx/71b6+qIiMwohf4E/uGNy7n96nb+81NvsfPQqVpXR0Rkxij0J2Bm/MGvXktjPsO/+uErFEvlWldJRGRGKPQnsXRBHV/91Ed4df8J/vjnu2pdHRGRGaHQP4e7P9LBp25Yzre27uQvXqrpmx1FRGbExb4YPfH+469+hL85foZ//cQrLKrP8Ymrlta6SiIi06Yr/fOoy2X47vourupYwBf+xzZeeE8vUReRS5dCfwoW1OX43mdvYnlzPf904/NsfUOPYBaRS5NCf4pamwr8z3+2ltVtjfzW49088uw7elSDiFxyFPoX4LJFdTzx+V/m7ms7+IPNb/KVJ16hb3C41tUSEZkyhf4Fqs9n+PZ9N/Dl2zv585f2c+c3nuX/7uypdbVERKZEoT8NUWR8+fYP8eQXfplCLuI3H32ef/PkdnpODda6aiIi56TQvwgfvWIxm3/7b/P5v3MlT764j1v/yzN882dvq8lHROYtm883I7u6ury7u7vW1ZiS3T19/OHTb7H51YMsaczz2Y+t4p+svYLmhnytqyYiKWNm29y9a8J5Cv2Z9dKeY3zzZzv5P2/3UJ/L8Ou/tJLfuPlyOtsX1LpqIpISCv0aePPgSb7z7LtsemU/xZLz0SsW8+u/tJK7rr2MBXW5WldPRBJMoV9DR/oG+dGL+/jBC3vZ3dNPPhvxiQ+38fevW8atH15KU0FPwhCRmaXQnwfcnRf3HON/v3KAv3z1AD2nBslljJtWt/CJDy/lls5WPrR0AVFkta6qiFziFPrzTKnsdL/Xy8/fPMwzbx3m7UN9ALQ05rl5dQtdq1q44fJmrlm2kEI2U+PaisilRqE/z+07dprndvfy1+8c5bndR9l//AwA+UzEVR0LuGbZIq5ZtpCrOxbyofYm3RMQkXNS6F9iDp0c4KU9x3hpz3Fe3X+C1/af4OTAaN//ZYvq+GD7Aq5sbWR1ayOrWhu5oqWBZc315LP66YVI2p0r9HUXcR5qX1jHums7WHdtBxDfD9h37AxvHTzFW4dOsfPQKd7p6eeJ93rpHyqNfC8y6FhUz/LF9axormdZcz0dzXV0LKrjsoX1tC8ssLghr/sGIimm0L8EmBkrWxpY2dLA7WvaR8rdncOnBnnvSD97ek+zt/c0e3pPs//4GZ7bfZSDJwcoj/tDLpcx2poKtC2si4cLCrQ25VnSmKelqcCSxjyLG/IsacrT3JDTPQWRhFHoX8LMjPaFdbQvrOPmK5ecNX+4VKanb5ADJwY4eGKAQycHOHRykMMnB+jpG2TfsdO8vPcYvf1DZ50cKhryGRY35FlUn2NRfY7mhni4sD7HgkI2HtZlWVAXD5sK2ZFhYyFLIRthpr8sROYLhX6CZTMRHYvq6VhUf87lymXn+JkiR/oG6e0f4lj/EEf7hzh+eohjp4scOz3EyTNFjp8usvNwHyfPFDk5UGSgWD5/HSKjsRCfBBryGRoKWRrzGRryYTqfoT4MG/JZ6nLxeF0uoj6XoS58Rscj6nIZCtl4mM9Eaq4SuQAKfSGKjJbGPC2NF/acoMHhEqcGhsOnSN/AMKcG4+n+wWH6BuNhPF7i9FBcdmaoxPHTZzg9NMzpoRJnhkqcLpYoTfbnxnnksxGFbEQhmwnDiHzlk5lgPAxzmfDJGvlM1XTGzhrPjpQZ2Wh0OhtVT8fjmWj0O9nIyI4sZ2Qi018+UlNzHvpmtg74FpABvuvuX5vrOsjMKGQzFJoytDYVZmR9Q8NlzhTjk8BAsRSfEIolBovxcKBYZqBYYmA4Hh8cLjFYLDMwXGJouMzgcDx/aLgcf0rlkfG+weExZcVSmWLJR8qKpTJz1ZEtGxlR1Ukgl4lPFNnIiMzIZmxkOhNFZCLI2Oh3KstEFi9XmZcxI5MJw7BcZMTj0bnLoyiUjYzH01H1vHDCGlnODKusJ4xHNvpdG9kek86PjJH1W/U8Rpcf/d7od6yqfhCvwxhdjqptVcotYnS86rs2blkL60+qOQ19M8sAfwJ8EtgHvGBmm9x9x1zWQ+anyhX5ovra/A6hVI5PAsVymeJwfFIolkZPEMVSmeGyMxymh8tlhks+WhaG8bRTKlctV3ZKJadYjstHl4k/w+VyGI6WlcpOseSUPR4ve/iOO4PFeB2VeaVx4yV3ymXGfLcyzx1K48qn+UdWopmNPUlY9QmkckICqCwXVU4a4cQTTlwjJxImPtmMLjN2fWuWLeLb990w4/s111f6NwG73H03gJn9ALgHUOhLzWUioz6foZ509lgqV04WHk4M4aRQ9nhe2UdPJs7Y8rLHJ5hy2XFGTzbuVePEPc5K5XhY+c7IMu5QWY+PLuM+dp1Q2RYj5T6ynjANMGa7o9saqXuYrux7ZV3l8IXK96qXx8M2q5avrGOkvlS2Q5g3+m9WqaMzuq+j2xndVxwubzn3vbjpmuvQXw7srZreB9xcvYCZbQA2AFx++eVzVzORlIsiIyK5zRoSm3c/33T3R9y9y9272traal0dEZFEmevQ3w+srJpeEcpERGQOzHXovwB0mtlqM8sDnwY2zXEdRERSa07b9N192My+CDxF3GVzo7u/Ppd1EBFJsznvp+/um4HNc71dERGZhzdyRURk9ij0RURSRKEvIpIi8/rNWWbWA7x/EatoBY7MUHUuFWncZ0jnfqdxnyGd+32h+3yFu0/4Q6d5HfoXy8y6J3tlWFKlcZ8hnfudxn2GdO73TO6zmndERFJEoS8ikiJJD/1Hal2BGkjjPkM69zuN+wzp3O8Z2+dEt+mLiMhYSb/SFxGRKgp9EZEUSWTom9k6M3vLzHaZ2QO1rs9sMbOVZvaMme0ws9fN7EuhvMXMtpjZzjBcXOu6zjQzy5jZS2b2kzC92sx+EY75/wpPcU0UM2s2syfN7E0ze8PM/lbSj7WZ/cvwf/s1M/u+mdUl8Vib2UYzO2xmr1WVTXhsLfZQ2P/tZnbjhWwrcaFf9R7eu4A1wH1mtqa2tZo1w8BX3H0NsBa4P+zrA8BWd+8EtobppPkS8EbV9H8CvuHuHwSOAZ+rSa1m17eAn7r7VcB1xPuf2GNtZsuB3wa63P1a4ifzfppkHuvvAevGlU12bO8COsNnA/DwhWwocaFP1Xt43X0IqLyHN3Hc/YC7vxjGTxGHwHLi/X0sLPYYcG9NKjhLzGwF8HeB74ZpA34FeDIsksR9XgR8HHgUwN2H3P04CT/WxE8CrjezLNAAHCCBx9rdnwV6xxVPdmzvAR732HNAs5l1THVbSQz9id7Du7xGdZkzZrYKuAH4BdDu7gfCrINAe63qNUu+CfwOUA7TS4Dj7j4cppN4zFcDPcB/C81a3zWzRhJ8rN19P/CHwB7isD8BbCP5x7pismN7URmXxNBPHTNrAv4M+LK7n6ye53Gf3MT0yzWzvwccdvdtta7LHMsCNwIPu/sNQD/jmnISeKwXE1/VrgaWAY2c3QSSCjN5bJMY+ql6D6+Z5YgD/0/d/Ueh+FDlz70wPFyr+s2CjwH/wMzeI266+xXitu7m0AQAyTzm+4B97v6LMP0k8Ukgycf6duBdd+9x9yLwI+Ljn/RjXTHZsb2ojEti6KfmPbyhLftR4A13/3rVrE3A+jC+HvjxXNdttrj777r7CndfRXxsf+7uvwE8A/yjsFii9hnA3Q8Ce83sw6HoNmAHCT7WxM06a82sIfxfr+xzoo91lcmO7SbgM6EXz1rgRFUz0Pm5e+I+wN3A28A7wL+rdX1mcT9vIf6TbzvwcvjcTdzGvRXYCfwMaKl1XWdp/28FfhLGrwSeB3YBTwCFWtdvFvb3eqA7HO+/ABYn/VgD/wF4E3gN+O9AIYnHGvg+8X2LIvFfdZ+b7NgCRtxD8R3gVeLeTVPelh7DICKSIkls3hERkUko9EVEUkShLyKSIgp9EZEUUeiLiKSIQl9EJEUU+iIiKfL/Acs5ppXDhGGRAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Initialize weights\n", "\n", "# Single hidden unit MLP with 50 hidden units.\n", "# First is input size, last is output size.\n", "units = [64, 50, 10]\n", "\n", "# Initialize all weights and biases randomly.\n", "W1 = .1 * np.random.randn(units[1], units[0])\n", "b1 = .1 * np.random.randn(units[1])\n", "W2 = .1 * np.random.randn(units[2], units[1])\n", "b2 = .1 * np.random.randn(units[2])\n", "\n", "weights = [W1, W2]\n", "biases = [b1, b2]\n", "\n", "# Learning rate.\n", "eta = 0.001 \n", " \n", "# Run epochs\n", "\n", "losses = []\n", "\n", "for epoch in range(100):\n", " weights, biases, loss = MLP_train_epoch(X_train, y_train_ohe, weights, biases)\n", " losses.append(loss)\n", " \n", "plt.plot(losses)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "❓ Complete function `MLP_predict` to get array of predictions from your trained MLP:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "def MLP_predict(inputs, weights, biases):\n", " predicted_labels = []\n", " for x in inputs:\n", " # Compute forward pass and get the class with the highest probability\n", " output, _ = forward(x, weights, biases)\n", " y_hat = np.argmax(output)\n", " predicted_labels.append(y_hat)\n", " predicted_labels = np.array(predicted_labels)\n", " return predicted_labels" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "❓ Compute the accuracy on the train and test sets." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Train accuracy: 1.0\n", "Test accuracy: 0.9638888888888889\n" ] } ], "source": [ "y_train_pred = MLP_predict(X_train, weights, biases)\n", "y_test_pred = MLP_predict(X_test, weights, biases)\n", "\n", "print(f'Train accuracy: {(y_train_pred==y_train).mean()}')\n", "print(f'Test accuracy: {(y_test_pred==y_test).mean()}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can compare our results with Sklearn's implementation of the MLP. Compare their accuracies:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.9993041057759221\n", "0.9722222222222222\n" ] } ], "source": [ "from sklearn.neural_network import MLPClassifier\n", "clf = MLPClassifier(hidden_layer_sizes=(50),\n", " activation='tanh',\n", " solver='sgd',\n", " learning_rate='constant',\n", " learning_rate_init=0.001,\n", " nesterovs_momentum=False,\n", " random_state=1,\n", " max_iter=1000)\n", "clf.fit(X_train, y_train)\n", "print(clf.score(X_train, y_train))\n", "print(clf.score(X_test, y_test))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" } }, "nbformat": 4, "nbformat_minor": 4 }