This assignment is due on Monday, June 9, 2025 at 11:59pm PST.
Starter code containing Colab notebooks can be downloaded here.
Note. Ensure you are periodically saving your notebook (File -> Save) so that you don't lose your progress if you step away from the assignment and the Colab VM disconnects.
Once you have completed all Colab notebooks except collect_submission.ipynb, proceed to the submission instructions.
In this assignment, you will implement language networks and apply them to image captioning on the COCO dataset. Then, you will be introduced to self-supervised learning to automatically learn the visual representations of an unlabeled dataset. Lastly, if you so choose, you can improve the image captioning models using reinforcement learning from human feedback (RLHF).
The goals of this assignment are as follows:
You will use PyTorch for the majority of this homework.
The notebook Transformer_Captioning.ipynb will walk you through the implementation of a Transformer model and apply it to image captioning on COCO.
For the MultiHeadAttention class in the Transformer_Captioning.ipynb notebook, you are expected to apply dropout to the attention weights.
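As an illustration of where dropout on the attention weights sits, here is a minimal single-head sketch. It is not the assignment's MultiHeadAttention class: the class name AttentionWithDropout, the tensor shapes, and the mask convention (entries equal to 0 are blocked) are all assumptions made for this example.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionWithDropout(nn.Module):
    """Single-head scaled dot-product attention (hypothetical helper)."""

    def __init__(self, dropout: float = 0.1):
        super().__init__()
        self.attn_drop = nn.Dropout(dropout)

    def forward(self, query, key, value, attn_mask=None):
        # Assumed shapes: query (N, S, D); key and value (N, T, D).
        d = query.shape[-1]
        scores = query @ key.transpose(-2, -1) / math.sqrt(d)  # (N, S, T)
        if attn_mask is not None:
            # Assumed convention: positions where the mask is 0 are blocked.
            scores = scores.masked_fill(attn_mask == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)
        # Dropout is applied to the attention weights, i.e. after the softmax
        # and before the weights are multiplied with the values.
        weights = self.attn_drop(weights)
        return weights @ value  # (N, S, D)
```

In eval mode nn.Dropout is the identity, so the module reduces to plain scaled dot-product attention; in train mode some attention weights are zeroed and the rest are rescaled.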
In the notebook Self_Supervised_Learning.ipynb, you will learn how to leverage self-supervised pretraining to obtain better performance on image classification tasks. When first opening the notebook, go to Runtime > Change runtime type and set Hardware accelerator to GPU.
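After changing the runtime type, a quick check like the following can confirm that PyTorch actually sees the GPU. The helper name pick_device is hypothetical and is not part of the starter code.

```python
import torch


def pick_device() -> str:
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

If this prints cpu on Colab, the hardware accelerator setting most likely did not take effect and the runtime may need to be changed again.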
The notebook RLHF_Image_Captioning.ipynb will walk you through the implementation of REINFORCE, DPO, and KL Divergence and their application to image captioning on COCO.
We did not have time to cover these topics in lecture, so this is purely optional. We recommend you only attempt this notebook if you have finished the rest of A5 and your course project.
This notebook does NOT have an autograder on Gradescope. Instead, we will manually grade code submissions for RLHF.
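For orientation, the three objectives named above can be sketched in a few lines of PyTorch. These are generic textbook forms, not the notebook's interfaces: the function names, argument shapes, and the default beta value are assumptions made for this example.

```python
import torch
import torch.nn.functional as F


def reinforce_loss(seq_logprobs, rewards, baseline=0.0):
    """REINFORCE surrogate loss.

    seq_logprobs: (N,) summed token log-probabilities of sampled captions.
    rewards:      (N,) scalar reward per sampled caption.
    """
    # The advantage is treated as a constant; gradients flow only through
    # the log-probabilities.
    advantage = (rewards - baseline).detach()
    return -(advantage * seq_logprobs).mean()


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss from (N,) sequence log-probs under the policy and a frozen reference."""
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    return -F.logsigmoid(beta * margin).mean()


def categorical_kl(policy_logits, ref_logits):
    """Mean KL(policy || reference) over the last (vocabulary) dimension."""
    logp = F.log_softmax(policy_logits, dim=-1)
    logq = F.log_softmax(ref_logits, dim=-1)
    return (logp.exp() * (logp - logq)).sum(dim=-1).mean()
```

When the policy equals the reference, the DPO margin is zero and the loss is log 2, while the KL term is zero. Both are convenient sanity checks for an implementation.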
Important. Please make sure that the submitted notebooks have been run and the cell outputs are visible.
1. Open collect_submission.ipynb in Colab and execute the notebook cells.
This notebook/script will generate a zip file of your code (.py and .ipynb) called a5_code_submission.zip.
If your submission for this step was successful, you should see the following display message:
### Done! Please submit a5_code_submission.zip to Gradescope. ###
2. Submit the zip file to Gradescope.
Remember to download a5_code_submission.zip locally before submitting to Gradescope.
3. Ensure that you have answered, on Gradescope, the inline questions scattered throughout the notebooks.
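Before uploading, it can help to confirm that the downloaded zip really contains both .py and .ipynb files. The sketch below uses only the Python standard library; the helper name summarize_zip is hypothetical and is not part of collect_submission.ipynb.

```python
import os
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_zip(zip_path):
    """Count the files in a zip archive by extension (directories are skipped)."""
    with zipfile.ZipFile(zip_path) as zf:
        names = [n for n in zf.namelist() if not n.endswith("/")]
    return Counter(PurePosixPath(n).suffix for n in names)


# Assumes the archive was downloaded to the current working directory.
if os.path.exists("a5_code_submission.zip"):
    for suffix, count in sorted(summarize_zip("a5_code_submission.zip").items()):
        print(f"{suffix or '(no extension)'}: {count}")
```

A count of zero for either .py or .ipynb suggests that collect_submission.ipynb should be rerun.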