Project 3: Image Classification with CIFAR10 (Part 2)

This is Project 3 (Part 2) for UW CSE P576 Computer Vision.

Part 2 of the project uses the Keras Sequential API in Tensorflow to perform image classification using CIFAR10. You should complete Part 1 first. Part 1 is worth 60% and Part 2 worth 40% of the overall mark.

Getting Started: The source files for this project (Parts 1 and 2) are here. To run locally, you will need IPython/Jupyter as well as Tensorflow installed. Launch Jupyter and open Project3_2.ipynb. You can also import this notebook directly into Colaboratory without installing anything.

This project: You'll start by replicating the linear classifier from Part 1 using the Keras Sequential API. This will give a performance baseline. You'll then work on improved model designs using convolutional layers and tune parameters to get good classification performance.

What to turn in: Hand in a zipfile containing your completed .ipynb notebook and any .py files you created. Be sure to describe clearly the results of your investigations.

version 070120

Licensed under the Apache License, Version 2.0 (the "License"). This is not an official Google product.

In [ ]:
#@title 
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
In [ ]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline
# edit this line to change the figure size
plt.rcParams['figure.figsize'] = (16.0, 10.0)
plt.rcParams['font.size'] = 16
# may be needed to avoid multiply defined OpenMP libraries
import os
os.environ['KMP_DUPLICATE_LIB_OK']='TRUE'

Download CIFAR10 dataset

We use keras.datasets to download the CIFAR10 data, splitting off 1000 images from the training set for validation. The dataset will be cached at ~/.keras/datasets by default.

In [ ]:
# Load CIFAR10 dataset
(train_images0,train_labels0),(test_images,test_labels) = keras.datasets.cifar10.load_data()

# Normalise images
train_images0=train_images0.astype('float')/255.0
test_images=test_images.astype('float')/255.0

# Create a validation set
num_valid=1000
valid_images=train_images0[0:num_valid]
valid_labels=train_labels0[0:num_valid]
train_images=train_images0[num_valid:]
train_labels=train_labels0[num_valid:]

cifar10_names = ['airplane', 'automobile', 'bird', 'cat', 'deer','dog', 'frog', 'horse', 'ship', 'truck']
num_classes=10
num_train=train_labels.size
num_valid=valid_labels.size
num_test=test_labels.size

# Make one-hot targets
train_one_hot=tf.one_hot(train_labels[:,0],num_classes)
valid_one_hot=tf.one_hot(valid_labels[:,0],num_classes)
test_one_hot=tf.one_hot(test_labels[:,0],num_classes)

# Show a random image and label
rnd=np.random.randint(num_train)
plt.rcParams['figure.figsize'] = (4.0, 4.0)
plt.imshow(train_images[rnd])
print(cifar10_names[train_labels[rnd][0]])

Define a Linear model using Keras [10%]

We'll start by replicating the linear model from Part 1. Use the Keras Sequential API to define a linear model over the input pixels. Complete the code below. Hint: try layers.Flatten and layers.Dense. Check that your layer outputs have the right shape. How many parameters does your model have?

In [ ]:
# Initialize a Keras sequential model
model=keras.models.Sequential()

#FORNOW: placeholder model, replace this with your own model
model.add(layers.Conv2D(filters=10,kernel_size=1,input_shape=(32,32,3)))
model.add(layers.GlobalAveragePooling2D())

"""
*************************************************************
*** TODO: implement a linear model using Keras Sequential API
*************************************************************

The model should compute a single linear function of the input pixels
"""        


"""
*************************************************************
"""

# output a summary of the model
model.summary()
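
For reference, here is a minimal sketch of one way such a linear model could look using layers.Flatten and layers.Dense (the layer choices shown are one possibility, not the required solution):

# Sketch only: flatten the 32x32x3 image and apply a single dense layer (one output per class)
model = keras.models.Sequential()
model.add(layers.Flatten(input_shape=(32, 32, 3)))  # 32*32*3 = 3072 inputs
model.add(layers.Dense(num_classes))                # 3072*10 weights + 10 biases = 30,730 parameters
model.summary()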

Train the model

Use the code below to train the model with squared-error loss between logits and targets. Try a few different optimizers and loss functions, and try running for additional epochs. Which combination works best? Can you modify the model to implement softmax regression? How could you add regularization (an L2 penalty) on the weights?

In [ ]:
model.compile(optimizer='sgd', loss='mean_squared_error',metrics=['accuracy'])
history = model.fit(train_images,train_one_hot,epochs=1,validation_data=(valid_images, valid_one_hot))

"""
******************************************************************
*** TODO: test different training parameters for linear regression
******************************************************************
"""


"""
******************************************************************
"""

Run the model

The following code demonstrates the use of model.predict. Run it to see the model's prediction on a single test image.

In [ ]:
rnd=np.random.randint(num_test)
test_image_rnd=test_images[rnd]
test_label_rnd=test_labels[rnd][0]
logits=model.predict(np.expand_dims(test_image_rnd,0))
label_pred=np.argmax(logits)
plt.imshow(test_image_rnd)
print('predicted =',cifar10_names[label_pred])
print('actual =',cifar10_names[test_label_rnd])

Design a new model [5%]

Design your own convolutional net choosing your own set of layers. In the next block, you'll train, tune and visualize outputs from this model.

In [ ]:
"""
****************************************************
*** TODO: design a convolutional network for CIFAR10
****************************************************

Design a model to perform CIFAR10 classification
"""     


"""
****************************************************
"""

Train, Tune and Visualize the Model [25%]

Train your model and plot training and validation accuracy as a function of time (steps or epochs). Choose some model parameters (e.g., number of layers, filters per layer, kernel size) and study the effect of their settings on performance. Show your findings with plots or tables of validation accuracy as a function of the parameters in your study. Visualize some aspect of your model, e.g., first-layer weights or activation distributions. How do they evolve over time?

In [ ]:
# FORNOW: train model with sgd for 5 epochs
model.compile(optimizer='sgd',loss='mean_squared_error',metrics=['accuracy'])
history = model.fit(train_images, train_one_hot, epochs=5, validation_data=(valid_images, valid_one_hot))

"""
*********************************************
*** TODO: train, tune and visualize CNN model
*********************************************
"""    


"""
*********************************************
"""

# Example of plotting training and validation accuracy vs epoch
plt.rcParams['figure.figsize'] = (8.0, 6.0)
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.0, 1])
plt.legend(loc='upper right')
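
As one example of a visualization, the sketch below displays the first-layer convolution kernels as small RGB images (it assumes your first layer is a Conv2D over the 3-channel input):

# Sketch only: plot first-layer conv filters, rescaled to [0, 1] for display
w = model.layers[0].get_weights()[0]   # kernel tensor, shape (kh, kw, 3, num_filters)
w_min, w_max = w.min(), w.max()
num_filters = w.shape[-1]
plt.rcParams['figure.figsize'] = (12.0, 6.0)
for i in range(num_filters):
    plt.subplot(4, int(np.ceil(num_filters / 4)), i + 1)
    plt.imshow((w[:, :, :, i] - w_min) / (w_max - w_min))
    plt.axis('off')

Re-running a plot like this after different numbers of training epochs is one way to see how the filters evolve over time.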

Evaluate the model

Use the code below to compute the final accuracy of your model on the test set.

In [ ]:
# Compute accuracy on the test set
test_loss, test_acc = model.evaluate(test_images,test_one_hot,verbose=2)
