This assignment is due on Tue, Nov 26, 2024 at 11:59pm PST.
Starter code containing Colab notebooks can be downloaded here.
Note. Ensure you are periodically saving your notebook (File -> Save) so that you don’t lose your progress if you step away from the assignment and the Colab VM disconnects.
Once you have completed all Colab notebooks except collect_submission.ipynb, proceed to the submission instructions.
In this assignment, you will implement language networks and apply them to image captioning on the COCO dataset. Then, you will be introduced to self-supervised learning to automatically learn the visual representations of an unlabeled dataset.
The goals of this assignment are as follows:

- Understand and implement the Transformer architecture, including multi-head self-attention.
- Apply a Transformer model to image captioning on the COCO dataset.
- Understand how self-supervised pretraining can be used to learn visual representations from unlabeled data.
You will use PyTorch for the majority of this homework.
The notebook Transformer_Captioning.ipynb will walk you through the implementation of a Transformer model and apply it to image captioning on COCO.
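One standard building block of a Transformer implementation is the sinusoidal positional encoding. The notebook itself uses PyTorch; the sketch below is a minimal NumPy version for illustration only, and the function name and shapes are assumptions, not the notebook's actual API.

```python
import numpy as np

def positional_encoding(max_len, d_model):
    """Sinusoidal positional encoding from "Attention Is All You Need".

    Returns an array of shape (max_len, d_model) where even columns hold
    sin and odd columns hold cos at geometrically spaced frequencies.
    """
    pos = np.arange(max_len)[:, None]              # (max_len, 1)
    i = np.arange(0, d_model, 2)[None, :]          # (1, d_model / 2)
    angles = pos / np.power(10000.0, i / d_model)  # (max_len, d_model / 2)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)                   # even dimensions
    pe[:, 1::2] = np.cos(angles)                   # odd dimensions
    return pe
```

Because each dimension oscillates at a different frequency, every position gets a distinct encoding that the model can learn to use for relative ordering.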
For the MultiHeadAttention class in the Transformer_Captioning.ipynb notebook, you are expected to apply dropout to the attention weights.
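Concretely, the dropout goes on the softmaxed attention weights, before they multiply the values, not on the inputs or the output. A hedged single-head NumPy sketch (the notebook's MultiHeadAttention is in PyTorch, where `torch.nn.Dropout` would play this role; names here are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, p_drop=0.1, rng=None, train=True):
    """Scaled dot-product attention with dropout on the attention weights.

    q, k, v: (seq_len, d) arrays. Inverted dropout is applied to the
    softmax output, matching the requirement stated above.
    """
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d))        # (seq_len, seq_len)
    if train and p_drop > 0:
        rng = rng or np.random.default_rng(0)
        mask = rng.random(weights.shape) >= p_drop
        weights = weights * mask / (1.0 - p_drop)  # inverted dropout scaling
    return weights @ v
```

With `train=False` (or `p_drop=0`), this reduces to plain scaled dot-product attention.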
In the notebook Self_Supervised_Learning.ipynb, you will learn how to leverage self-supervised pretraining to obtain better performance on image classification tasks. When first opening the notebook, go to Runtime > Change runtime type and set Hardware accelerator to GPU.
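Self-supervised pretraining of this kind commonly learns by pulling together embeddings of two augmented views of the same image while pushing apart views of different images. As a sketch only, assuming a SimCLR-style contrastive objective (the notebook's actual method and API may differ), here is the NT-Xent loss in NumPy:

```python
import numpy as np

def nt_xent_loss(z1, z2, tau=0.5):
    """SimCLR-style normalized temperature-scaled cross-entropy loss.

    z1, z2: (N, d) embeddings of two augmented views of the same N images.
    Each embedding's positive is the other view of the same image; the
    remaining 2N - 2 embeddings in the batch serve as negatives.
    """
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / tau                               # (2N, 2N) logits
    np.fill_diagonal(sim, -np.inf)                    # exclude self-pairs
    n = z1.shape[0]
    # index of each embedding's positive: row i pairs with row i +/- n
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    loss = -(sim[np.arange(2 * n), pos] - logsumexp)  # cross-entropy per row
    return loss.mean()
```

Minimizing this loss over many unlabeled images yields features that transfer to downstream classification, which is the effect the notebook measures.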
Important. Please make sure that the submitted notebooks have been run and the cell outputs are visible.
1. Open collect_submission.ipynb in Colab and execute the notebook cells.
This notebook/script will:

- Generate a zip file of your code (.py and .ipynb) called a5_code_submission.zip.
- Convert the notebooks into a single PDF called a5_inline_submission.pdf.

If your submission for this step was successful, you should see the following display message:
### Done! Please submit a5_code_submission.zip and the a5_inline_submission.pdf to Gradescope. ###
2. Submit the PDF and the zip file to Gradescope.
Remember to download a5_code_submission.zip and a5_inline_submission.pdf locally before submitting to Gradescope.