This assignment is due on Friday, May 19, 2023 at 11:59pm Pacific Time.
Starter code containing Colab notebooks can be downloaded here.
Please familiarize yourself with the recommended workflow before starting the assignment. You should also watch the Colab walkthrough tutorial below.
Note. Ensure you are periodically saving your notebook (`File -> Save`) so that you don't lose your progress if you step away from the assignment and the Colab VM disconnects.
Once you have completed all Colab notebooks except `collect_submission.ipynb`, proceed to the submission instructions.
In this assignment, you will implement language networks and apply them to image captioning on the COCO dataset. Then you will train a Generative Adversarial Network (GAN) to generate images that resemble a training dataset. Finally, you will be introduced to self-supervised learning, which automatically learns visual representations from an unlabeled dataset.
The goals of this assignment are as follows:

- Understand and implement recurrent neural network (RNN) and Transformer language models, and apply them to image captioning on COCO.
- Understand how to train a Generative Adversarial Network (GAN) to generate images that resemble samples from a training dataset.
- Understand how to leverage self-supervised learning to learn visual representations from an unlabeled dataset.

You will use PyTorch for the majority of this homework.
The notebook `RNN_Captioning.ipynb` will walk you through the implementation of vanilla recurrent neural networks and apply them to image captioning on COCO.
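At the core of the vanilla RNN is a single timestep update that mixes the current input with the previous hidden state through a tanh nonlinearity. Here is a minimal sketch of that step; the function and weight names (`rnn_step_forward`, `Wx`, `Wh`, `b`) are illustrative and may not match the starter code exactly:

```python
import torch

def rnn_step_forward(x, prev_h, Wx, Wh, b):
    """One vanilla RNN timestep: h_t = tanh(x @ Wx + h_{t-1} @ Wh + b).

    x:      (N, D) input at this timestep
    prev_h: (N, H) previous hidden state
    Wx:     (D, H) input-to-hidden weights
    Wh:     (H, H) hidden-to-hidden weights
    b:      (H,)   bias
    """
    return torch.tanh(x @ Wx + prev_h @ Wh + b)

# Tiny smoke test with random data
N, D, H = 2, 4, 3
h = rnn_step_forward(torch.randn(N, D), torch.zeros(N, H),
                     torch.randn(D, H), torch.randn(H, H), torch.zeros(H))
print(h.shape)  # torch.Size([2, 3])
```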
The notebook `Transformer_Captioning.ipynb` will walk you through the implementation of a Transformer model and apply it to image captioning on COCO.
For the `MultiHeadAttention` class in the `Transformer_Captioning.ipynb` notebook, you are expected to apply dropout to the attention weights.
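Concretely, the dropout goes on the softmax-normalized attention weights, before they are multiplied with the values. Below is a minimal single-head sketch of the pattern, assuming the standard scaled dot-product formulation; the class and argument names are illustrative, so follow the signatures given in the notebook:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaledDotProductAttention(nn.Module):
    """Single-head attention with dropout on the attention weights."""

    def __init__(self, dropout=0.1):
        super().__init__()
        self.attn_drop = nn.Dropout(dropout)

    def forward(self, q, k, v, mask=None):
        # q, k, v: (N, T, E); mask (if given): 1 where attention is allowed
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # (N, T, T)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float('-inf'))
        weights = F.softmax(scores, dim=-1)
        weights = self.attn_drop(weights)  # dropout on the attention weights
        return weights @ v                 # (N, T, E)
```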
In the notebook `Generative_Adversarial_Networks.ipynb` you will learn how to generate images that match a training dataset and use these models to improve classifier performance when training on a large amount of unlabeled data and a small amount of labeled data. When first opening the notebook, go to `Runtime > Change runtime type` and set `Hardware accelerator` to `GPU`.
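As background, a GAN alternates between two updates: the discriminator learns to label real images as real and generated images as fake, while the generator learns to fool the discriminator. Here is a minimal sketch of one training step using the standard binary cross-entropy objectives; the models, optimizers, and output shapes are placeholder assumptions, not the notebook's exact setup:

```python
import torch
import torch.nn as nn

def gan_step(D, G, real, noise_dim, d_opt, g_opt):
    """One discriminator + one generator update (illustrative).

    Assumes D maps a batch of images to (N, 1) logits and G maps
    (N, noise_dim) noise to images with the same shape as `real`.
    """
    bce = nn.BCEWithLogitsLoss()
    N = real.size(0)

    # Discriminator: push real images toward label 1, fakes toward 0.
    fake = G(torch.randn(N, noise_dim)).detach()  # detach: no grad into G here
    d_loss = bce(D(real), torch.ones(N, 1)) + bce(D(fake), torch.zeros(N, 1))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator: make the discriminator label fresh fakes as real.
    g_loss = bce(D(G(torch.randn(N, noise_dim))), torch.ones(N, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

    return d_loss.item(), g_loss.item()
```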
In the notebook `Self_Supervised_Learning.ipynb`, you will learn how to leverage self-supervised pretraining to obtain better performance on image classification tasks. When first opening the notebook, go to `Runtime > Change runtime type` and set `Hardware accelerator` to `GPU`.
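One popular family of self-supervised methods (e.g. SimCLR) pretrains an encoder contrastively: two augmented views of the same image should embed close together, while views of different images should embed far apart. Below is a simplified sketch of such a loss, offered only as an illustration of the idea rather than the notebook's exact objective:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, temperature=0.5):
    """SimCLR-style loss over two augmented views (simplified illustration).

    z1, z2: (N, D) embeddings of two views of the same N images.
    Matching rows of z1/z2 are positives; every other row is a negative.
    """
    N = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2N, D), unit norm
    sim = z @ z.t() / temperature                       # cosine similarities
    mask = torch.eye(2 * N, dtype=torch.bool)
    sim = sim.masked_fill(mask, float('-inf'))          # drop self-similarity
    # Row i's positive sits at row i + N (and vice versa).
    targets = torch.cat([torch.arange(N) + N, torch.arange(N)])
    return F.cross_entropy(sim, targets)
```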
The notebook `LSTM_Captioning.ipynb` will walk you through the implementation of Long Short-Term Memory (LSTM) RNNs and apply them to image captioning on COCO.
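Unlike the vanilla RNN, an LSTM carries both a hidden state and a cell state through time, gated by input, forget, and output gates. A minimal sketch of a single timestep follows, assuming the four gates are packed into one weight matrix in the order input/forget/output/candidate (the starter code may use a different layout):

```python
import torch

def lstm_step_forward(x, prev_h, prev_c, Wx, Wh, b):
    """One LSTM timestep (illustrative gate packing: i, f, o, g).

    x: (N, D), prev_h and prev_c: (N, H), Wx: (D, 4H), Wh: (H, 4H), b: (4H,)
    """
    H = prev_h.size(1)
    a = x @ Wx + prev_h @ Wh + b          # (N, 4H) gate pre-activations
    i = torch.sigmoid(a[:, :H])           # input gate
    f = torch.sigmoid(a[:, H:2 * H])      # forget gate
    o = torch.sigmoid(a[:, 2 * H:3 * H])  # output gate
    g = torch.tanh(a[:, 3 * H:])          # candidate cell values
    next_c = f * prev_c + i * g           # gated cell update
    next_h = o * torch.tanh(next_c)       # gated hidden output
    return next_h, next_c
```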
Important. Please make sure that the submitted notebooks have been run and the cell outputs are visible.
Once you have completed all notebooks and filled out the necessary code, there are two steps you must follow to submit your assignment:
Even if you have completed your notebooks locally, please execute the following PDF generation on Colab. This will save you the headache of installing `xelatex` locally, especially on Windows or macOS.
1. Open `collect_submission.ipynb` in Colab and execute the notebook cells.
This notebook/script will:

- Generate a zip file of your code (`.py` and `.ipynb` files) called `a3_code_submission.zip`.
- Convert all notebooks into a single PDF file called `a3_inline_submission.pdf`.

If your submission for this step was successful, you should see the following display message:
### Done! Please submit a3_code_submission.zip and the a3_inline_submission.pdf to Gradescope. ###
2. Submit the PDF and the zip file to Gradescope.
Remember to download `a3_code_submission.zip` and `a3_inline_submission.pdf` locally before submitting to Gradescope.