{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "Copy of Assignment3.ipynb", "version": "0.3.2", "views": {}, "default_view": {}, "provenance": [ { "file_id": "0B-TLIIJqcZpNV3pjWEthUXhPMnc", "timestamp": 1523060994854 } ], "collapsed_sections": [] }, "kernelspec": { "name": "python2", "display_name": "Python 2" }, "accelerator": "GPU" }, "cells": [ { "metadata": { "id": "kB3ViOebJ1Dl", "colab_type": "text" }, "cell_type": "markdown", "source": [ "Assignment 3: Image Classification using CNNs\n", "===" ] }, { "metadata": { "id": "Yz0UcUjYdIkR", "colab_type": "text" }, "cell_type": "markdown", "source": [ "In this assignment you will learn about\n", "1. The fundamental computations in neural networks for vision, including backpropagation\n", "2. The basics of fitting a model for generalization\n", "3. Nearest neighbor classifiers\n", "\n", "**Note:** When you first load this colab webpage, it will be in read-only viewing mode. To edit and run code, you can either (a) download the Jupyter notebook (\"File\" -> \"Download .ipynb\") to run on your local computer or (b) copy to your Google Drive (\"File\" -> \"Save a copy in Drive...\") to work in the browser and run on a Google Cloud GPU. If you run locally, you will need to install Tensorflow and it is recommended that you use a GPU for problem 3.2. If you do not want to use Colab and do not have a local GPU, please let us know.\n" ] }, { "metadata": { "id": "IScx1TAgmBgV", "colab_type": "text" }, "cell_type": "markdown", "source": [ "# 3.0 Nearest neighbor classification (20 points)\n", "\n", "## 3.0.1 (20 points) \n", "Given the following training set of labeled two-dimensional points for binary classification, draw a Voronoi diagram of the output of a 1-nearest neighbor classifier. Feel free to render the diagram using Python below (do not use scikit-learn or any machine learning libraries to do this) or submit a PDF along with your assignment.\n", "\n", ">```\n", "Point (x,y) | Label\n", "-------------|-------\n", "(1,3) | +\n", "(-4,-2) | +\n", "(-3,-1.5) | -\n", "(3,3) | -\n", "(0,-2) | +\n", "(-2,0) | +\n", "(-2,4) | -\n", "```\n", "\n" ] }, { "metadata": { "id": "Q3U4NGyNqbKG", "colab_type": "code", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } } }, "cell_type": "code", "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "## Can render diagram using Python here, if you would like.\n", "\n" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": "SLFlhLnW6FJH", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 3.0.2 (5 points, extra) \n", "\n", "Render for 3-NN" ] }, { "metadata": { "id": "MB-Q7MZmRb_w", "colab_type": "code", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } } }, "cell_type": "code", "source": [ "" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": "Pqcz4ghVMich", "colab_type": "text" }, "cell_type": "markdown", "source": [ "# 3.1 Neural network operations (40 points)" ] }, { "metadata": { "id": "tOxsGv9H4h4Q", "colab_type": "text" }, "cell_type": "markdown", "source": [ "In this section we provide a working example of a convolutional neural network written using basic numpy operations. Each neural network operation is represented by a Python class with methods *forward()* and *backward()*, which compute activations and gradients, respectively. Your task is to complete certain methods that are left blank.\n", "\n", "1. 2D Convolution\n", "> * Forward\n", "> * **Backward (10 points)**\n", "2. 
ReLU\n", "> * **Forward (5 points)**\n", "> * Backward\n", "3. Average pooling\n", "> * Forward\n", "> * **Backward (5 points)**\n", "4. Softmax cross-entropy\n", "> * **Forward (10 points)**\n", "> * Backward\n", "\n", "When you complete an operation, you can check your work by executing its cell. We compare the outputs of your method to that of Tensorflow.\n", "\n", "Finally, when you have all of the operations completed, you can run a small network for a few iterations of stochastic gradient descent and plot the loss.\n" ] }, { "metadata": { "id": "E7LdJleoh6px", "colab_type": "code", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } } }, "cell_type": "code", "source": [ "#@title (Hidden utility code: RUN ME FIRST) { display-mode: \"form\" }\n", "import tensorflow as tf\n", "import numpy as np\n", "\n", "class Variable:\n", " \"\"\"Placeholder for labels and input images\"\"\"\n", " value = 0\n", "\n", "def cmp_ops(your_op, tf_op, tf_inputs, tf_weights=None):\n", " your_op.forward()\n", " your_op_f_out = your_op.value\n", "\n", " with tf.Session().as_default():\n", " tf_op_f_out = tf_op.eval()[0] # Remove the batch dimension\n", "\n", " print(\"Forward pass:\")\n", " cmp_tensors(your_op_f_out, tf_op_f_out, verbose=False)\n", "\n", " your_op.inputs.dloss_dvalue = np.zeros(your_op.inputs.value.shape)\n", " your_op.dloss_dvalue = np.ones(your_op.value.shape)\n", " your_op.backward()\n", " your_op_g_inputs = your_op.inputs.dloss_dvalue\n", "\n", " if tf_weights is not None:\n", " your_op_g_weights = your_op.dloss_dweights\n", " g_inputs, g_weights = tf.gradients(tf.reduce_sum(tf_op), [tf_inputs, tf_weights])\n", " \n", " with tf.Session() as sess:\n", " tf_g_inputs_out, tf_g_weights_out = sess.run([g_inputs, g_weights])\n", " tf_g_weights_out = np.transpose(tf_g_weights_out, [3,0,1,2])\n", " \n", " print(\"Gradient wrt inputs:\")\n", " cmp_tensors(your_op_g_inputs, tf_g_inputs_out[0])\n", " print(\"Gradient wrt weights:\")\n", " cmp_tensors(your_op_g_weights, tf_g_weights_out)\n", " \n", " else:\n", " g_inputs = tf.gradients(tf.reduce_sum(tf_op), [tf_inputs])\n", "\n", " with tf.Session() as sess:\n", " tf_g_inputs_out = sess.run(g_inputs)\n", "\n", " print(\"Gradient wrt inputs:\")\n", " cmp_tensors(your_op_g_inputs, tf_g_inputs_out[0], verbose=False)\n", "\n", "def cmp_tensors(yours, tfs, verbose=False):\n", " print(\" Your Op shape: \" + str(yours.shape))\n", " print(\" TensorFlow Op shape: \" + str(tfs.shape))\n", " print(\" Values equal: \" + str(np.allclose(tfs, yours, atol=1e-6)))\n", " if verbose:\n", " print(tfs)\n", " print(yours)\n", " \n", "inputs = Variable()\n", "inputs.value = np.random.normal(size=(10, 10, 3)) # Input image is 10x10x3\n", "tf_inputs = tf.constant(inputs.value[np.newaxis, ...], dtype=tf.float32)" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": "lms52ZYzdQnA", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 3.1.1 2D Convolution (10 pts)" ] }, { "metadata": { "id": "5w_5lVExdNKK", "colab_type": "code", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } }, "cellView": "code" }, "cell_type": "code", "source": [ "import numpy as np\n", "\n", "\"\"\"rows x cols x filters\"\"\"\n", "\n", "class OpConv2D:\n", " \"\"\"Two-dimensional convolutional layer\"\"\"\n", " \n", " def __init__(self, filters, kernel_size, inputs):\n", " # Shape of the input feature map\n", " input_height = inputs.value.shape[0]\n", " input_width = inputs.value.shape[1]\n", " input_filters = inputs.value.shape[2]\n", " \n", " # Shape of 
 " self.height = input_height - kernel_size + 1\n", " self.width = input_width - kernel_size + 1\n", " self.filters = filters\n", " \n", " self.inputs = inputs\n", " self.kernel_size = kernel_size\n", " self.weights = np.random.normal(size=(filters, kernel_size, kernel_size, input_filters), scale=0.1)\n", " self.reset_values()\n", " \n", " def reset_values(self):\n", " self.value = np.zeros((self.height, self.width, self.filters))\n", " self.dloss_dvalue = np.zeros(self.value.shape)\n", " self.dloss_dweights = np.zeros(self.weights.shape)\n", " \n", " def forward(self):\n", " # Reset value and gradient at start of forward pass\n", " self.reset_values()\n", " \n", " for y in range(self.height):\n", " for x in range(self.width):\n", " for f in range(self.filters):\n", " z = 0.0\n", " \n", " for ky in range(self.kernel_size):\n", " for kx in range(self.kernel_size):\n", " for kf in range(self.weights.shape[3]):\n", " z += self.inputs.value[y+ky, x+kx, kf] * self.weights[f, ky, kx, kf]\n", " \n", " self.value[y, x, f] = z\n", " \n", " def backward(self):\n", " ## Complete this method, which sets:\n", " ## 1. Partial derivative of the loss with respect to the values of the inputs\n", " ## self.inputs.dloss_dvalue, which is an `input_height x input_width x input_filters` tensor\n", " ## 2. Partial derivative of the loss with respect to the weights\n", " ## self.dloss_dweights, which is a `filters x kernel_size x kernel_size x input_filters` tensor\n", " ##\n", " ## This will utilize tensors:\n", " ## 1. The partial with respect to the value of this layer\n", " ## self.dloss_dvalue, a `height x width x filters` tensor\n", " ## 2. The weights of this layer\n", " ## self.weights, a `filters x kernel_size x kernel_size x input_filters` tensor\n",
 " ## 3. The value of the input layer\n", " ## self.inputs.value, an `input_height x input_width x input_filters` tensor\n", " pass\n", " \n", " def gradient_step(self, step_size):\n", " self.weights -= step_size * self.dloss_dweights\n", " \n", "# Double check that op matches tensorflow\n", "print(\"Testing Conv2D...\")\n", "op1 = OpConv2D(4, 3, inputs)\n", "\n", "tf_weights = tf.constant(np.transpose(op1.weights, [1,2,3,0]), dtype=tf.float32)\n", "tf_op1 = tf.nn.conv2d(tf_inputs,\n", " tf_weights,\n", " [1,1,1,1],\n", " 'VALID')\n", "cmp_ops(op1, tf_op1, tf_inputs, tf_weights)" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "-MdfHCgzdYnx", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 3.1.2 ReLU (5 pts)" ] },
{ "metadata": { "id": "6cwXENGCC9tm", "colab_type": "code", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } } }, "cell_type": "code", "source": [ "class OpRelu:\n", " \"\"\"Elementwise ReLU operator\"\"\"\n", " \n", " def __init__(self, inputs):\n", " # Shape of the input feature map\n", " self.input_shape = inputs.value.shape\n", " self.inputs = inputs\n", " self.reset_values()\n", " \n", " def reset_values(self):\n", " self.value = np.zeros(self.inputs.value.shape)\n", " self.dloss_dvalue = np.zeros(self.inputs.value.shape)\n", " \n", " def forward(self):\n", " # Reset value and gradient at start of forward pass\n", " self.reset_values()\n", " ## Complete this code by setting self.value using self.inputs.value\n", " \n", " def backward(self):\n", " self.inputs.dloss_dvalue = self.dloss_dvalue * np.greater(self.value, 0.0)\n", " \n", " def gradient_step(self, step_size):\n", " pass \n", " \n", "# Double check that each op matches tensorflow\n", "print(\"\\nTesting ReLU...\")\n", "op2 = OpRelu(inputs)\n", "tf_op2 = tf.nn.relu(tf_inputs)\n", "cmp_ops(op2, tf_op2, tf_inputs)" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "bz3FdokPdxVx", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 3.1.3 Average Pooling (5 pts)" ] },
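{ "metadata": { "id": "avgpool_note_md", "colab_type": "text" }, "cell_type": "markdown", "source": [ "As a reminder of the math behind this layer: with non-overlapping cells of size $c$, the forward pass computes $z_{y,x,f} = \\frac{1}{c^2} \\sum_{k_y=0}^{c-1} \\sum_{k_x=0}^{c-1} x_{cy+k_y,\\, cx+k_x,\\, f}$, so every input position inside a cell receives an equal share of that cell's gradient: $\\frac{\\partial L}{\\partial x_{i,j,f}} = \\frac{1}{c^2} \\frac{\\partial L}{\\partial z_{\\lfloor i/c \\rfloor,\\, \\lfloor j/c \\rfloor,\\, f}}$. (Partial edge cells are still averaged over $c^2$, matching the forward code below.)\n" ] },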
{ "metadata": { "id": "U1-lSilxdh1x", "colab_type": "code", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } } }, "cell_type": "code", "source": [ "class OpAvgPool:\n", " \"\"\"Average pooling layer. Non-overlapping cells.\"\"\"\n", " \n", " def __init__(self, cell_size, inputs):\n", " # Shape of the input feature map\n", " self.input_height = inputs.value.shape[0]\n", " self.input_width = inputs.value.shape[1]\n", " self.input_filters = inputs.value.shape[2]\n", " \n", " # Shape of this layer's feature map\n", " self.height = (self.input_height + cell_size - 1) / cell_size\n", " self.width = (self.input_width + cell_size - 1) / cell_size\n", " self.filters = self.input_filters\n", " \n", " self.inputs = inputs\n", " self.cell_size = cell_size\n", " self.reset_values()\n", " \n", " def reset_values(self):\n", " self.value = np.zeros((self.height, self.width, self.filters))\n", " self.dloss_dvalue = np.zeros(self.value.shape)\n", " \n", " def forward(self):\n", " # Reset value and gradient at start of forward pass\n", " self.reset_values()\n", " \n", " for y in range(self.height):\n", " for x in range(self.width):\n", " for f in range(self.filters):\n", " z = 0.0\n", " \n", " for ky in range(min(self.cell_size, self.input_height - y*self.cell_size)):\n", " for kx in range(min(self.cell_size, self.input_width - x*self.cell_size)):\n", " z += self.inputs.value[self.cell_size*y+ky, self.cell_size*x+kx, f]\n", " \n", " self.value[y, x, f] = z / (self.cell_size * self.cell_size)\n", " \n", " def backward(self):\n", " ## Complete this method by setting the partial with respect to the values of the inputs\n", " ## self.inputs.dloss_dvalue, an `input_height x input_width x filters` tensor\n", " ## This will use the partial with respect to the value of this layer\n", " ## self.dloss_dvalue, a `height x width x filters` tensor\n", " pass\n", " \n", " def gradient_step(self, step_size):\n", " pass\n", " \n", "# Double check that each op matches tensorflow\n", "print(\"\\nTesting AvgPool...\")\n", "op3 = OpAvgPool(2, inputs)\n", "tf_op3 = tf.nn.avg_pool(tf_inputs, [1, 2, 2, 1], [1,2,2,1], \"VALID\")\n", "cmp_ops(op3, tf_op3, tf_inputs)\n", "\n", "\n" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "YTOQFqDmd1LU", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 3.1.4 Softmax Cross-entropy Loss (10 pts)" ] },
{ "metadata": { "id": "gfCoGsUDdhpo", "colab_type": "code", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } } }, "cell_type": "code", "source": [ "class OpSoftmaxCrossEntropyLoss:\n", " \"\"\"Cross-entropy loss.\"\"\"\n", " \n", " def __init__(self, logits, true_label):\n", " \"\"\"\n", " inputs:\n", " logits: shape [1,1,num_classes]\n", " true_label: scalar in range [0, num_classes-1]\n", " \"\"\"\n", " \n", " # Shape of the input feature map\n", " self.num_classes = logits.value.shape[2]\n", " self.inputs = logits\n", " self.true_label = true_label\n", " \n", " def reset_values(self):\n", " self.max_label = 0\n", " self.value = np.zeros((1,))\n", " self.softmax_prob = np.zeros((self.num_classes,))\n", " \n", " def forward(self):\n", " # Reset value and gradient at start of forward pass\n", " self.reset_values()\n", " ## Complete this method by:\n", " ## (1) setting self.value to the scalar value of the\n", " ## negative log probability of the true class under a Softmax distribution.\n", " ## Loss = -ln(exp(y_true) / sum_j (exp(y_j))), where y_j is the logits\n", " ## value for class j.\n", " ## (2) setting self.softmax_prob to the vector representing the probability\n", " ## of each class according to the Softmax distribution\n", " ## softmax_prob[k] = exp(y_k) / sum_j (exp(y_j)), where y_j is the logits\n", " ## value for class j.\n",
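 " ## Hint: the softmax is unchanged if a constant is subtracted from every logit,\n", " ## so subtracting max(logits) before exponentiating avoids overflow in exp().\n",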
will use\n", " ## self.inputs.value, a `1 x 1 x num_classes` tensor containing the logits\n", " \n", " \n", " def backward(self):\n", " # Loss = -ln(exp(y_true) / sum_j (exp(y_j)))\n", " # dLoss/dYk = exp(y_k) / sum_j (exp(y_j))\n", " # dLoss/dYtrue = exp(y_true) / sum_j (exp(y_j)) - 1\n", " self.inputs.dloss_dvalue[0, 0, :] += self.softmax_prob\n", " self.inputs.dloss_dvalue[0, 0, self.true_label.value] += -1\n", " \n", " def gradient_step(self, step_size):\n", " pass\n", " \n", "# Double check that each op matches tensorflow\n", "print(\"\\nTesting Cross Entropy Loss...\")\n", "pooled = OpAvgPool(10, inputs)\n", "pooled.forward()\n", "tf_pooled = tf.nn.avg_pool(tf_inputs, [1, 10, 10, 1], [1,10,10,1], \"VALID\")\n", "\n", "true_label = Variable()\n", "op4 = OpSoftmaxCrossEntropyLoss(pooled, true_label)\n", "tf_op4 = tf.nn.softmax_cross_entropy_with_logits_v2(logits=tf_pooled, \n", " labels=tf.one_hot(tf.constant(0), 3))\n", "cmp_ops(op4, tf_op4, tf_pooled)" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": "kup9p9q8eSXn", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 3.1.5 Run for a few iterations (10 pts)\n", "\n", "Here we assemble all of our operations into a full convolutional neural network. We then run stochastic gradient descent on a small collection of ten images to ensure that the loss is decreasing.\n", "\n", "Run this cell to plot 100 iterations of training. **(5 pts)**\n", "\n", "Why is this plot jagged? What is it about our architecture or training procedure that causes this, and how might adjusting these factors change the shape of this curve? **(5 pts)**" ] }, { "metadata": { "id": "QE1qrTr1coAr", "colab_type": "code", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } } }, "cell_type": "code", "source": [ "from tensorflow.examples.tutorials.mnist import input_data\n", "\n", "# Construct a mini network for MNIST\n", "inputs = Variable()\n", "true_label = Variable()\n", "inputs.value = np.random.normal(size=(28, 28, 1))\n", "inputs.dloss_dvalue = np.random.normal(size=(28, 28, 1))\n", "\n", "op1 = OpConv2D(16, 5, inputs) # Output is 28-5+1=24\n", "op2 = OpAvgPool(2, op1) # Output is 24/2=12\n", "op3 = OpRelu(op2)\n", "\n", "op4 = OpConv2D(16, 5, op3) # Output is 12-5+1=8\n", "op5 = OpAvgPool(2, op4) # Output is 8/2=4\n", "op6 = OpRelu(op5)\n", "\n", "op7 = OpConv2D(10, 3, op6) # Output is 4-3+1=2\n", "op8 = OpAvgPool(2, op7) # Output is 2/2=1\n", "\n", "op9 = OpSoftmaxCrossEntropyLoss(op8, true_label)\n", "ops_list = [op1,op2,op3,op4,op5,op6,op7,op8,op9]\n", "\n", "# Run for a few iterations, make sure loss is going down\n", "learning_rate = 0.2\n", "inputs.value = np.random.normal(size=(28, 28, 1))\n", "\n", "mnist = input_data.read_data_sets('MNIST_data', one_hot=False)\n", "\n", "num_its = 20\n", "batch_size = 10\n", "batch_x, batch_y = mnist.train.next_batch(batch_size)\n", "\n", "loss_list = []\n", "\n", "for it in range(num_its):\n", " loss_of_batch = 0.0\n", " \n", " for im in range(batch_size):\n", " inputs.value = np.reshape(batch_x[im], (28,28,1))\n", " true_label.value = batch_y[im]\n", " \n", " for op in ops_list:\n", " op.forward()\n", "\n", " loss_of_batch += ops_list[-1].value\n", " \n", " for op in reversed(ops_list):\n", " op.backward()\n", " op.gradient_step(learning_rate)\n", " \n", " loss_list.append(loss_of_batch)\n", " \n", " print(\"Iteration \" + str(it) + \" Loss: \"+str(loss_of_batch))\n", " \n", " \n", "plt.plot(range(num_its), loss_list)" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": 
"KWyWnrg7nLFC", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 3.1.6 Extra credit (5 points)\n", "\n", "Extend the functionality of one of these operations (e.g. add stride, dilation, or padding to the 2D Convolution) or implement a new one (e.g. fully-connected layer).\n", "\n" ] }, { "metadata": { "id": "DQBxDYsJidXP", "colab_type": "text" }, "cell_type": "markdown", "source": [ "# 3.2 Training an image classifier (40 points)" ] }, { "metadata": { "id": "_opHX46kic3G", "colab_type": "code", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } } }, "cell_type": "code", "source": [ "#@title (Hidden utility code: RUN ME FIRST) { display-mode: \"form\" }\n", "!git clone https://github.com/tensorflow/models.git 2>/dev/null\n", "import sys\n", "import math\n", "sys.path.append('/content/models/tutorials/image/cifar10/')\n", "from datetime import datetime\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "plt.rcParams['axes.facecolor'] = 'white'\n", "\n", "import tensorflow as tf\n", "tf.reset_default_graph()\n", "try:\n", " tf.app.flags.FLAGS.f\n", "except Exception:\n", " tf.app.flags.DEFINE_string('f', '', \"\"\"Placeholder.\"\"\")\n", "import cifar10\n", "tf.app.flags.FLAGS.batch_size = 100\n", "# from tensorflow.examples.models.tutorials.image.cifar10 import cifar10\n", "\n", "def plot_filters(filters, xlabel=None, ylabel=None):\n", " print(filters.shape)\n", " # filters: height x width x channels x num_filters\n", " num_filters = filters.shape[3]\n", " filter_height = filters.shape[0]\n", " filter_width = filters.shape[1]\n", " filter_channels = filters.shape[2]\n", " spacing = 1\n", " rows = int(math.ceil(math.sqrt(num_filters)))\n", " cols = int(math.ceil(math.sqrt(num_filters)))\n", " plot = np.zeros((rows*(filter_height+spacing), cols*(filter_width+spacing), min(filter_channels, 3) ))\n", " \n", " min_value = np.min(filters)\n", " max_value = np.max(filters)\n", " filters = (filters - min_value) / (max_value - min_value)\n", " \n", " for f in range(num_filters):\n", " r = int(f/cols)\n", " c = f - r*cols\n", " plot[r*(filter_height+spacing):r*(filter_height+spacing)+filter_height,\n", " c*(filter_width+spacing):c*(filter_width+spacing)+filter_width,:] = filters[:,:,0:min(filter_channels, 3),f]\n", " \n", " plt.grid(False)\n", " plt.imshow(np.squeeze(plot))\n", " if xlabel is not None:\n", " plt.xlabel(xlabel)\n", " if ylabel is not None:\n", " plt.ylabel(ylabel)\n", " plt.show()\n", "\n", "cifar10.maybe_download_and_extract()\n", "images, labels = cifar10.inputs(False)\n", "test_images, test_labels = cifar10.inputs(True)" ], "execution_count": 0, "outputs": [] }, { "metadata": { "id": "QRA3pZgr3Z2W", "colab_type": "text" }, "cell_type": "markdown", "source": [ "## 3.2.1 Early stopping (15 points)\n", "\n", "We have specified a very simple convolutional neural network to classify images from the Cifar-10 dataset. We then provide a training loop to optimize the weights of the network. Your task is to add Early Stopping (ES) to this training loop. Validation accuracy should be measured periodically, and training should stop if the validation accuracy does not reach a new absolute maximum after some number of measurements (this is called the \"patience\"). After training, we then measure the test accuracy. Before implementing ES, run the following cell to see a plot of the training loss and validation accuracy. 
Then report the test accuracy you obtain with ES.\n", "\n", "## 3.2.2 Tuning hyperparameters (25 points)\n", "\n", "The hyperparameters we have chosen are not necessarily optimal. Pick two factors to search over (e.g. number of layers, filters per layer, learning rate, convolutional kernel size, etc.). Then write a procedure that uses grid search to find the combination of these hyperparameters that yields the highest validation accuracy. Finally, report the test accuracy achieved by this model." ] },
{ "metadata": { "id": "sW5cHzOB7gsX", "colab_type": "code", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } } }, "cell_type": "code", "source": [ "sess = tf.Session()\n", "with sess.as_default():\n", " tf.train.start_queue_runners()\n", " im_width = 24\n", "\n", " # Define placeholders for image and label\n", " y_ = tf.placeholder(tf.float32, [None, 10])\n", " x = tf.placeholder(tf.float32, [None, im_width, im_width, 3])\n", "\n", " # Define a convolutional neural network (CNN)\n", " cnnL1 = tf.layers.conv2d(x, 16, 5, strides=(2,2), activation=tf.nn.relu)\n", " cnnL2 = tf.layers.conv2d(cnnL1, 16, 5, activation=tf.nn.relu)\n", " cnnL3 = tf.layers.conv2d(cnnL2, 32, 5, activation=tf.nn.relu)\n", " cnn = tf.reduce_sum(tf.reduce_sum(cnnL3, axis=1), axis=1)\n", " cnn = tf.contrib.layers.flatten(cnn)\n", " y_cnn = tf.layers.dense(cnn, 10)\n", "\n", " cross_entropy_cnn = tf.reduce_mean(\n", " tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=y_cnn))\n", " train_step_cnn = tf.train.GradientDescentOptimizer(0.05).minimize(cross_entropy_cnn)\n", "\n", " correct_prediction_cnn = tf.equal(tf.argmax(y_cnn, 1), tf.argmax(y_, 1))\n", " accuracy_cnn = tf.reduce_mean(tf.cast(correct_prediction_cnn, tf.float32))\n", "\n", " # Build the one-hot ops once, outside the loop, so the graph does not grow each iteration\n", " labels_onehot = tf.one_hot(labels, 10)\n", " test_labels_onehot = tf.one_hot(test_labels, 10)\n", "\n", " tf.global_variables_initializer().run(session=sess)\n", "\n", " # Train\n", " print('Training... '+str(datetime.now()))\n", " valid_batch_xs, valid_batch_ys = sess.run([test_images, test_labels_onehot])\n", " train_losses = []\n", " test_accuracies = []\n", " valid_its = []\n", " valid_accuracies = []\n", " num_its = 1000\n", " for it in range(num_its):\n", " if (it+1) % 50 == 0:\n", " print('Iteration %d/%d ...' % (it, num_its))\n", "\n", " # Validation accuracy\n", " valid_acc_cnn = sess.run(accuracy_cnn, feed_dict={x: valid_batch_xs, y_: valid_batch_ys})\n", " valid_accuracies.append(valid_acc_cnn)\n", " valid_its.append(it)\n", "\n", " batch_xs, batch_ys = sess.run([images, labels_onehot])\n", " loss_cnn_out, _ = sess.run([cross_entropy_cnn, train_step_cnn], feed_dict={x: batch_xs, y_: batch_ys})\n", "\n", " train_losses.append(loss_cnn_out)\n", "\n",
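 " # A possible early-stopping check for 3.2.1 (a sketch; move it into the\n", " # validation block above and define a `patience` count of measurements):\n", " # if len(valid_accuracies) - 1 - np.argmax(valid_accuracies) >= patience:\n", " # print('Stopping early at iteration %d' % it)\n", " # break\n", "\n",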
 " print('Testing... '+str(datetime.now()))\n", " # Test trained model\n", " test_batch_xs, test_batch_ys = sess.run([test_images, test_labels_onehot])\n", "\n", " true_label = tf.argmax(y_, 1)\n", " cnn_label = tf.argmax(y_cnn, 1)\n", " acc_cnn_out, true_label_out, cnn_label_out = sess.run([accuracy_cnn, true_label, cnn_label], feed_dict={x: test_batch_xs,\n", " y_: test_batch_ys})\n", " \n", "# Plot train loss and validation accuracy\n", "plt.plot(range(len(train_losses)), train_losses)\n", "plt.ylabel('Training loss')\n", "plt.xlabel('Iteration')\n", "plt.show()\n", "plt.plot(valid_its, valid_accuracies)\n", "plt.ylabel('Validation accuracy')\n", "plt.xlabel('Iteration')\n", "plt.show()\n", "\n", "print('Test accuracy: ' + str(acc_cnn_out*100) + '%')\n" ], "execution_count": 0, "outputs": [] },
{ "metadata": { "id": "PoGc-6q5bS9M", "colab_type": "text" }, "cell_type": "markdown", "source": [ "If you are curious what the weights, activations, or misclassified images look like, we visualize them below. Feel free to modify this code to inspect other aspects of your trained model." ] },
{ "metadata": { "id": "9z2FGsc9CSdX", "colab_type": "code", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } } }, "cell_type": "code", "source": [ "with sess.as_default():\n", " # Show weights from the first layer\n", " print('Weights from the first layer')\n", " with tf.variable_scope(\"conv2d_1\", reuse=True):\n", " weights = tf.get_variable('kernel')\n", " plot_filters(weights.eval())\n", "\n", " # Show activations from the first feature map\n", " print('Activations from the first feature map.')\n", " fmap = cnnL1.eval(feed_dict={x: test_batch_xs, y_: test_batch_ys})\n", " plot_filters(np.transpose(fmap[0:1,...], (1,2,0,3)))\n", "\n", " # Show images in a confusion matrix\n", " confusion = np.zeros((24,24,3,100))\n", " for b in range(true_label_out.shape[0]):\n", " confusion[:,:,:,true_label_out[b]*10 + cnn_label_out[b]] = test_batch_xs[b]\n", "\n", " plot_filters(confusion, ylabel='True label', xlabel='Guessed label')\n", "\n" ], "execution_count": 0, "outputs": [] },
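{ "metadata": { "id": "gridsearch_md", "colab_type": "text" }, "cell_type": "markdown", "source": [ "For 3.2.2, the skeleton below sketches one way to organize the grid search. It assumes you refactor the training cell above into a hypothetical helper `build_and_train(learning_rate, num_filters)` that rebuilds the CNN with those hyperparameters, trains it (with early stopping), and returns the best validation accuracy. The two factors and their grids are only example choices.\n" ] },
{ "metadata": { "id": "gridsearch_code", "colab_type": "code", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } } }, "cell_type": "code", "source": [ "import itertools\n", "\n", "def build_and_train(learning_rate, num_filters):\n", "    # Hypothetical helper: rebuild the graph above with these hyperparameters,\n", "    # train with early stopping, and return the best validation accuracy.\n", "    raise NotImplementedError\n", "\n", "learning_rates = [0.01, 0.05, 0.1]  # example grid for factor 1\n", "filter_counts = [8, 16, 32]  # example grid for factor 2\n", "\n", "best_params, best_acc = None, -1.0\n", "for lr, nf in itertools.product(learning_rates, filter_counts):\n", "    acc = build_and_train(lr, nf)\n", "    print('lr=%g, filters=%d -> validation accuracy %.3f' % (lr, nf, acc))\n", "    if acc > best_acc:\n", "        best_params, best_acc = (lr, nf), acc\n", "\n", "print('Best: lr=%g, filters=%d (validation accuracy %.3f)' % (best_params + (best_acc,)))" ], "execution_count": 0, "outputs": [] } ] }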