as4: Teach Machines to Recognise Sounds
Last revised: April 26, 2021
Assigned:
- April 26, 2021
Due:
- May 5, 2021
Learning goals
- Design a visualization of non-speech audio, and provide a rationale for its purpose and value in your code or sketch
- Understand the difference between text that captions speech and representations of non-speech audio, by captioning a video that includes both speech and non-speech audio
- Explore, through the readings, how context may impact representation
Activities
In this homework, you will do four things:
- Build a visualization of non-speech audio.
- Caption a video.
- Read related papers.
- Answer the reflection questions.
All of these are described in more depth in the Jupyter notebook.
Build a visualization of non-speech audio
We will provide a Jupyter notebook (linked on Canvas) via Google Colaboratory that is pre-loaded with code to train a machine learning model to recognize sounds. Everything you need is in the notebook.
To access it: open the notebook in your browser, and you will be prompted to make a copy. Log in to your UW CSE-provided Google account and open your copy in Google Colab. Here is a getting-started tutorial on Colab.
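For orientation, here is a minimal sketch of the kind of spectrogram visualization of non-speech audio you might build. It assumes the `librosa` and `matplotlib` libraries and a hypothetical clip `dog_bark.wav`; the actual notebook supplies its own data and helper code, so treat this only as an illustration, not as the assignment's required approach.

```python
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

# Load a short non-speech clip (hypothetical file; the notebook provides its own audio).
y, sr = librosa.load("dog_bark.wav", sr=22050)

# Compute a mel-scaled spectrogram, a common input representation for sound classifiers.
S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
S_db = librosa.power_to_db(S, ref=np.max)

# Plot it: time on the x-axis, mel frequency on the y-axis, energy as color.
fig, ax = plt.subplots(figsize=(8, 3))
img = librosa.display.specshow(S_db, sr=sr, x_axis="time", y_axis="mel", ax=ax)
fig.colorbar(img, ax=ax, format="%+2.0f dB")
ax.set_title("Mel spectrogram of a non-speech sound")
plt.tight_layout()
plt.show()
```

A spectrogram is only one possible design; whatever visualization you choose, your rationale should explain what its visual elements communicate about the sound.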
Remainder of homework
All of the information for captioning, the papers to read, and the reflection questions can be found in the Jupyter notebook.