![Title image: The Ancient Secrets of Computer Vision](images/title.jpg)

## Course Information ##

This class is a general introduction to computer vision. It covers standard techniques in image processing like filtering, edge detection, stereo, flow, etc. (old-school vision), as well as newer, machine-learning based computer vision.

The course will be offered in a variety of modalities:

- In-person: [CSE2 G01](https://www.washington.edu/classroom/CSE2+G01)
- Remotely: [Zoom](https://washington.zoom.us/j/98505228096)
- Asynchronously: See below for lecture recordings

Participate in whatever way best suits your needs this quarter! **Please do not come to in-person class if you are sick or have reason to suspect you may be sick.**

### Instructor ###

Joseph Redmon

- Email: pjreddie@cs.washington.edu

### TAs ###

- Roy Or-El - royorel@cs.washington.edu
- Zixuan Liu - zucksliu@cs.washington.edu
- Ivan Montero - ivamon@cs.washington.edu
- Mino Nakura - nakuram@cs.washington.edu
- Tobias Rohde - tobiasr@cs.washington.edu

### Office Hours ###

- Monday
    - 11:30-12:30pm with Ivan: https://washington.zoom.us/my/ivamon
- Tuesday
    - 11:30-12:30pm with Roy: https://washington.zoom.us/my/royorel
- Wednesday
    - 11:00am-12:00pm with Tobias: https://washington.zoom.us/my/tobiasr.nlp
- Thursday
    - 1:30-2:30pm with Zucks: https://washington.zoom.us/j/7826771790
    - 3:00-4:00pm with Joe: https://washington.zoom.us/j/3362756951
- Friday
    - 9:30-10:30am with Mino: https://washington.zoom.us/my/minonakura

### Resources ###

- Ed Discussion Board: https://edstem.org/us/courses/21400/discussion/
- Canvas: https://canvas.uw.edu/courses/1547005/
- Zoom: https://washington.zoom.us/j/98505228096

Slides are a mishmash of lots of other people's work. Special thanks to: Rob Fergus, Linda Shapiro, Harvey Rhody, Rick Szeliski, Ali Farhadi, Robert Collins. Lectures 8 and 9 on flow, 3D, and stereo are by [Connor Schenck](https://homes.cs.washington.edu/~schenckc/).
All of the slides, videos, and homeworks are free to use, modify, and redistribute as you like without permission. Just make your own copy of the slides on Google Docs; don't ask to modify mine!

## Homeworks ##

The class has 6 homeworks where you will build out a computer vision library in C. We cover basic image manipulations, filtering, features, stitching, optical flow, machine learning, and convolutional neural networks.

Most of the homeworks will use [this repository](https://github.com/pjreddie/uwimg/). The individual homeworks can be found in the `src/` folder.

- [Homework 0: Fun with Color!](https://github.com/pjreddie/uwimg/tree/main/src/hw0), Due April 7
- [Homework 1: Resizing](https://github.com/pjreddie/uwimg/tree/main/src/hw1), Due April 14
- [Homework 2: Filtering and Convolutions](https://github.com/pjreddie/uwimg/tree/main/src/hw2), Due April 21
- [Homework 3: Panoramas!](https://github.com/pjreddie/uwimg/tree/main/src/hw3), Due April 28
- [Homework 4: Optical Flow](https://github.com/pjreddie/uwimg/tree/main/src/hw4), Due May
- [Homework 5: Neural Networks and Machine Learning](https://github.com/pjreddie/uwimg/tree/main/src/hw5), Due May

**Note:** due dates are subject to change if we haven't covered the relevant material in time for the assignment.

You have 8 late days to use throughout the quarter. Each day late counts as one late day. Any number can be used on any assignment. After you have used your late days, late assignments will be penalized up to 10% per day late.

**COVID Policy:** If you get COVID, don't worry about doing your homework: rest, recover, and do what you need to do to get better. If you feel like doing computer vision while sick, go for it, but also know you can take some time off. Once you are well, please reach out to the course staff and we can figure out how to get you back on track with assignments and any missed classes.
You will not be penalized for turning in assignments late due to COVID (or if you're having trouble getting caught back up afterward).

## Final Project ##

There is a final project worth 20% of the final grade. Pick any area of computer vision that interests you and pursue some independent work in that area. Each project should have a significant technical component, software implementation, or large-scale study. Projects can focus on developing new techniques or tools in computer vision or applying existing tools to a new domain. If you don't have an idea, you can train a classifier on birds and compete in the Kaggle competition posted on the Ed discussion board.

## Lectures ##

### Week 1: Image Basics ###

This week we cover the basics of computer vision. There's an introduction to the three levels of vision: **low-level** vision mostly concerns the pixels or groups of nearby pixels, **mid-level** vision starts to connect images to each other and the real world, and **high-level** vision connects images to semantics and meaning. There's background information on the human visual system, color, light, what an image actually is, and how it's stored in a computer. All fun stuff!

Once you've learned the basics you should be ready for [Homework 0](https://github.com/pjreddie/uwimg/tree/main/src/hw0), which is mostly an introduction to the codebase we'll be using for the assignments.
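To make "how it's stored in a computer" a bit more concrete, here is a minimal sketch in C of one common way to hold an image in memory: a flat array of floats, stored channel by channel. The type `image_t` and every function name below are hypothetical and only for illustration; the layout is an assumption, so check the homework repository for the structures it actually uses.

```c
#include <stdlib.h>

// Hypothetical image type (an assumption for illustration):
// w x h pixels, c channels, floats stored channel by channel (CHW),
// with each channel laid out row by row.
typedef struct {
    int w, h, c;
    float *data;
} image_t;

// Allocate a zero-filled image. data is NULL if allocation fails.
image_t make_image_t(int w, int h, int c)
{
    image_t im = {w, h, c, NULL};
    im.data = calloc((size_t)w * h * c, sizeof(float));
    return im;
}

// Position in the flat array of channel ch at column x, row y.
size_t pixel_index(image_t im, int x, int y, int ch)
{
    return (size_t)ch * im.w * im.h + (size_t)y * im.w + x;
}

// Read one channel value. No bounds checking in this sketch.
float get_value(image_t im, int x, int y, int ch)
{
    return im.data[pixel_index(im, x, y, ch)];
}

// Write one channel value. No bounds checking in this sketch.
void set_value(image_t im, int x, int y, int ch, float v)
{
    im.data[pixel_index(im, x, y, ch)] = v;
}

void free_image_t(image_t im)
{
    free(im.data);
}
```

With this layout, all the red values come first, then all the green, then all the blue, so a 4x3 RGB image is just 36 floats in a row. Other layouts (for example interleaved RGBRGB... bytes) are equally valid; only the index arithmetic changes.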
#### Lecture 1: Introduction

- [Slides](https://docs.google.com/presentation/d/1vZuncM3rrJZakza94UU8FvpEd9Ph-9E9I3Hj4Xkr91U/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/1FEF3jkDLCvAELYiPl2Pw0SCoEWpPIDL9Q1WEu2k7V4ovCMm2Ihq-Onjb1sjRQY_W1A3EOl9do5xlLo.RYm2eI7Uweq1oco_)

#### Lecture 2: Human Vision, Color Spaces, Transforms ####

- [Slides](https://docs.google.com/presentation/d/1kTvnMCG7qZ8eoA4NF79Q-Qyirt-ZMq6RNv4xAkQh2bY/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/LhA8g7WBfaQ1BBGYFppdeq22qzfBOVZUz3JjsLxJwgNU6TDGe37NudFE-_P-PSScRL_fOpQJZJq-s8-Q.DVfPmM_5B1b6zN29)

--------

### Week 2: Image Transformations ###

In week 2 we start to dive into low-level vision and image processing. You learn how to manipulate images and perform operations like resizing, sharpening, smoothing, and more. You'll apply this knowledge as you get started on [Homework 1](https://github.com/pjreddie/uwimg/tree/main/src/hw1).

#### Lecture 3: Image Coordinates, Resizing

- [Slides](https://docs.google.com/presentation/d/1imYQe7kDCahP2YO69L7FBiwqcEc4LgVEDXW7pDMFhOs/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/xxU4T3uzHibVQip8fsQ2ajQEFa45ife3szjdZ-j7RfrYx7PHzfJqCFrCRHlP-4IcQ16ysD0uT3-CpGAI.9D32iR_Ev9bc_KTl)

#### Lecture 4: Resizing, Filters, Convolutions

- [Slides](https://docs.google.com/presentation/d/1owZEUBbFp-Iz5kLNo-KvppjtzJGF91wEzfviUphLWqs/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/JccIH7-cW4NuNRig5lxeo10XQgJSBRPhqqXP5_E0wJM7jtgXIF3tHIFT090TiE5AI3WHPwTBz-NtUAPD.DdthnolKyKRngAu_)

--------

### Week 3: Edges and Features

Time to put those convolutions to use! For week 3 we delve into what makes images interesting, what makes them unique, how to find correspondences between images, and how to fit models with a large number of outliers in the data.
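Fitting a model when many data points are outliers is the job of RANSAC, which appears in this week's lecture titles. As a rough illustration only, here is a minimal sketch in C that fits a 2D line `y = slope * x + intercept`. The types `point2` and `line2`, the function names, and the choice of a line as the model are all assumptions made for this example; the homework fits a different model and this is not its code.

```c
#include <stdlib.h>

// Hypothetical types for this sketch.
typedef struct { float x, y; } point2;
typedef struct { float slope, intercept; } line2;  // y = slope * x + intercept

// Count points whose vertical distance to the line is below thresh.
int count_inliers(line2 l, const point2 *pts, int n, float thresh)
{
    int count = 0;
    for (int i = 0; i < n; ++i) {
        float err = pts[i].y - (l.slope * pts[i].x + l.intercept);
        if (err < 0) err = -err;
        if (err < thresh) ++count;
    }
    return count;
}

// RANSAC: repeatedly fit a line through 2 randomly chosen points and
// keep the candidate that the largest number of points agree with.
line2 ransac_line(const point2 *pts, int n, int iters, float thresh)
{
    line2 best = {0, 0};
    int best_count = -1;
    for (int k = 0; k < iters; ++k) {
        int i = rand() % n;
        int j = rand() % n;
        // Skip degenerate samples: the same point twice or a vertical pair.
        if (pts[i].x == pts[j].x) continue;
        line2 l;
        l.slope = (pts[j].y - pts[i].y) / (pts[j].x - pts[i].x);
        l.intercept = pts[i].y - l.slope * pts[i].x;
        int count = count_inliers(l, pts, n, thresh);
        if (count > best_count) {
            best_count = count;
            best = l;
        }
    }
    return best;
}
```

Because each candidate comes from only two points, a single outlier can ruin a candidate but never the final answer, as long as some sample happens to contain only inliers. A fuller version would usually refit the model to all inliers of the best candidate at the end.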
#### Lecture 5: Edges and Features

- [Slides](https://docs.google.com/presentation/d/1WqcYCxTnYzbPmhAsi5N7heb1GACzYd5Rt441Z36cp28/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/WyprYimFQeKuU1zLjqDNpWX24FhAfARLY4eO7fY1rM5lrChVmaZvY1WVPFH7EQwmneXN31qXrq1WOkWv.bZA3K1h1XkfD23AK)

#### Lecture 6: Harris, Matching, RANSAC

- [Slides](https://docs.google.com/presentation/d/1_HFh3SdmdyZ_j-sFS4Tw17DmhKfjmYZqvSp7TmfdD_M/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/vHHN5q5AdXlPEaInCvRzJF2hndByMRtX8WFh2VRu7g6Qe0_OBKmgweLoMSA6v0FthTEO9zqFpbiWYQXc.RNe57-7CKDgAksB_)

#### Tutorial: Debugging With GDB and Valgrind

- [Video](https://washington.zoom.us/rec/share/iXrb-wti5dKLaJZ6KrbHeALAQT48hsl2VoAFX7Vqo8Oc1Y68bVuS5sCRpS3DcMMl.Yk2jmBI1gven24b2)

-------

### Week 4: Features and Flow

#### Lecture 7: Matching, RANSAC, HOG, and SIFT

- [Slides](https://docs.google.com/presentation/d/1Ti2o4HPX8xEIYpfgEm6W3EvGc2bWjiz0ZuyAZfTAuD4/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/hmK6g88L9jhRHnyPYIDcKnKbbzmUruZzRagNvbjoDWnlrgMGdOrUKq7DH_9IO20Bfw3oinyw4Y2BYCu9.OprlmmDX7Z3379_E)

#### Lecture 8: Optical Flow

- [Slides](https://docs.google.com/presentation/d/1Z6xyKI5SWJ3Qs4aWcbP_bOlj34VW40WIHfhf3b4EN10/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/6MUi7F2f_Z41JK5286KyZMQe3xvOOg5RXMHeyUya7DpGadrRS_MCDxtTYuGN6jH4RhpURRhQlCaR_SsK.vZVPFzLhtfatKQVV)

---------

### Week 5: Depth and ML Review

#### Lecture 9: 3D, Depth, and Stereo

- [Slides](https://docs.google.com/presentation/d/1e8RSHMYzrRtA3s4wfb7xXoZ1CTH8lKXFlNhPFQVD9KI/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/h45KurlAsaWgjgFhyegNMyED0UcBox8jcpvx5xEnVJLXhqUvi-g1JGfDI0G86dJ0Eby0IrtgsZkZ1Tw4.A62fKxe15IYoe-pM)

#### Lecture 10: Machine Learning for Computer Vision

- [Slides](https://docs.google.com/presentation/d/1vYhq30b9BEDzzl3rwP8uY6VF1x4jmtzJaxQGK0P2XAI/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/ztBKyOUXDZqVaWOF86G6fwCCIGifTep7-_nqBuFcJ7VCK75ALJoBGxEBSIIVmo0dtnVsewq6PAZ5f9TA.pJH9sv53gbnHgKmi)

--------

### Week 6: Machine Learning and Neural Networks

#### Lecture 11: More Machine Learning for Computer Vision

- [Slides](https://docs.google.com/presentation/d/1c5qjBEQVhcwejWY6I88BwHGOdZiPvkDD6su1bbqqd2c/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/k9maYqsjOnz_I8Uos6XGdMB6bq5DlF2A5K4ClnKEx0h-j927WMo6xcWEbVKA5J-HTIy1zbmp6EkXIXm7.OLrX8S5-94-g8cTx)

#### Lecture 12: Neural Networks

- [Slides](https://docs.google.com/presentation/d/1HQOET7oM3fzQba2KAcw20yEtqe_3ZLE9VTtfoPBgb8A/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/g9gZqi2B-7ASWY-eYGqNYghtixDjXP-YKAAb6qVSx3ThH_JdfvSQbvGsC7EKzaMZYGd0bBTMV51FmQ6v.Xa2BxrJ1M0v3F2Hk)

--------

### Week 7: Convolutional Neural Networks

#### Lecture 13: Convolutional Neural Networks

- [Slides](https://docs.google.com/presentation/d/1szC_xsXx4kBtVaM-6yVjdJRPc2SGgOq3rcpzKM06-UM/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/zGnAHAVAYrl1Mwdsp1rNDWJ0nTyjriPFH6N0ht3Bf6Tkvdk08gtpQlWae89zkfe8BKiQfyQwgTUatZQj.p_9KK1PuiWp-WeJ8)

#### Lecture 14: Network Architectures

- [Slides](https://docs.google.com/presentation/d/1IIJ6uxpwT3NdOzO2L-KDVsSNOfQM0uBcf4F0WFy-jVc)
- [Video](https://washington.zoom.us/rec/play/A6BwZdGz_S8JwdVV0x3EGEtRo_Cu7dkg9F3stWGfXG7XEuWxVn6zLZJx-7RTjBSa1G30RE4CqDmN76jW.HVOaHUbOqeGfnBOz)

#### Tutorial: Introduction to PyTorch

- [iPynb](https://colab.research.google.com/drive/1cwLRtE1uspBehTouNkEzZ_1dRa2uJvY9?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/smtsFsktioAsEwKaR-ontx8mKSwMtcGd6HD6vbXg9Q5KxbHpUMTjrk3gL6KWNI-Skua6Rwl64wMtJCyu.ffpIkXxtqoM6yrxj)

---------

### Week 8

#### Lecture 15: Semantic Segmentation

- [Slides](https://docs.google.com/presentation/d/1XFr96QGJFHoRr0HyydnrLntulH7vNlIQd_SHm88yLNk/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/EamCaey8hw9rDUb0JNsZ25Er7xIF9b3qAdbWxtD3lrN8eyTxYPIcCxamjqQRQYAcr4ZSSeLvsGJB8gQ._2oqBSi-60CQPy9C)

#### Lecture 16: Object Detection

- [Slides](https://docs.google.com/presentation/d/1CLVryBPNddEn8TwM2ngy1bHnlJC4t33OhERzVixFGj4/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/uszuGxMH0CXnuzuzMnxHmKYbSqZbcA7S5W6SNryTgcYsL74BK7O9AYpfjVytj5maJ_U0J0fPa7tG2bXJ.lq8M0bERdyZOM18M)

#### Tutorial: CNNs in PyTorch

- [iPynb](https://colab.research.google.com/github/pjreddie/uwimg/blob/main/tutorial2_cnns_in_pytorch.ipynb)
- [Video](https://washington.zoom.us/rec/play/QlaWvpMNFP5Q9JjA8Y0YnBDpPVoEWeJ5CPNjv529j_XJPFpqNOpo8ux4UIXA5NCZNDv4IO164zY8rnN5.9DEsNquuOjlQ_fjx)

----------

### Week 9

#### Lecture 17: Instance Segmentation

- [Slides](https://docs.google.com/presentation/d/1GsphfNerVfEoolZFxQIbAf0VdZ6ftGEzM5AUAwUewOo/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/HJh2qF-RtYPiWtWSF8kfXTe-6ST0ItlhILmoa2m5S_fe8XwlAmVCLqPkOeHx6fTjQYfZpTYCMaoIZyAw.aZfUlgpxgo3xWGeX)

#### Lecture 18: Vision and Language

- [Slides](https://docs.google.com/presentation/d/1UJOl-C3qDUMw7ftgKJh7GEELxuNh0KRAtQpzhg3PWrc/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/Ghsx7EEXyliWcIEqRrpVxSqkUd7Vjkk4sfy_ib5XrIUy-B_i2os3qNmuUqN_qPneFWd0VOUikhoK_ftD.63pemuy_18y_i6RN)

#### Tutorial: Transfer Learning in PyTorch

- [iPynb - ImageNet and Transfer Learning](https://colab.research.google.com/drive/1EBz4feoaUvz-o_yeMI27LEQBkvrXNc_4?usp=sharing)
- [iPynb - Transfer Learning to Birds](https://colab.research.google.com/drive/1kHo8VT-onDxbtS3FM77VImG35h_K_Lav?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/pO_me9HFDn5wE_M6GwZC0xbT195wP4P1JWPCD10q-I1-NW9pvX-6cSg68cW-JFhN-5ce7DKvKFhuyusg.eBYjVRLFlZd3qZyz)

-----------

### Week 10

#### Lecture 19: Generative Adversarial Networks

- [Slides](https://docs.google.com/presentation/d/1NgicaHQhluKQ0r39U4IWo8AtOBJUAQ2aBGCDq_DpgkA/edit?usp=sharing)

#### Lecture 20: AlphaGo

- [Slides](https://docs.google.com/presentation/d/1NTYdLhcRmJLwPaUYaBZd-m8X1EIMw-8SX1nwJImkN24/edit?usp=sharing)