![Title image: The Ancient Secrets of Computer Vision](images/title.jpg)

## Course Information ##

This class is a general introduction to computer vision. It covers standard techniques in image processing like filtering, edge detection, stereo, flow, etc. (old-school vision), as well as newer, machine-learning based computer vision.

### Instructor ###

Joseph Redmon
- Email: pjreddie@cs.washington.edu
- Office Hours: Tu/Thur 10:00-11:20 am

### TAs ###

Bindita Chaudhuri
- Email: bindita@cs

Josie Lee
- Email: jlee98@cs

Zhichao Lei
- Email: zl68@cs

Lirui Leroy Wang
- Email: liruiw@cs

Paul Yoo
- Email: yoosehy@cs

### Office Hours ###

- Monday
    - 3-4pm with Bindita
        - https://washington.zoom.us/j/98831167276
- Tuesday
    - 10-11am with Joe
        - https://washington.zoom.us/j/93606871917
    - 1-2pm with Paul
        - https://washington.zoom.us/j/92759521140
- Wednesday
    - 3-4pm with Zhichao
        - https://washington.zoom.us/j/99106036413
- Thursday
    - 10-11am with Joe
        - https://washington.zoom.us/j/93606871917
    - 3-4pm with Lirui
        - https://washington.zoom.us/j/94568560697
- Friday

### Resources ###

- Ed Discussion Board: https://edstem.org/us/courses/3128
- Canvas: https://canvas.uw.edu/courses/1431923
- Zoom: https://washington.zoom.us/j/93606871917

Slides are a mishmash of lots of other people's work. Special thanks to: Rob Fergus, Linda Shapiro, Harvey Rhody, Rick Szeliski, Ali Farhadi, Robert Collins.

Lectures 8 and 9 on Flow, 3d, and stereo are given by [Connor Schenck](https://homes.cs.washington.edu/~schenckc/).

All of the slides, videos, and homeworks are free to use, modify, redistribute as you like without permission. Just make your own copy of the slides on Google Docs, don't ask to modify mine!

## Homeworks ##

The class has 6 homeworks where you will build out a computer vision library in C. We cover basic image manipulations, filtering, features, stitching, optical flow, machine learning, and convolutional neural networks.
Most of the homeworks will use [this repository](https://github.com/pjreddie/uwimg/). The individual homeworks can be found in the `src/` folder.

- [Homework 0: Fun with Color!](https://github.com/pjreddie/uwimg/tree/main/src/hw0), Due January 14th
- [Homework 1: Resizing](https://github.com/pjreddie/uwimg/tree/main/src/hw1), Due January 21st
- [Homework 2: Filtering and Convolutions](https://github.com/pjreddie/uwimg/tree/main/src/hw2), Due January 28th
- [Homework 3: Panoramas!](https://github.com/pjreddie/uwimg/tree/main/src/hw3), Due February 4th
- [Homework 4: Optical Flow](https://github.com/pjreddie/uwimg/tree/main/src/hw4), Due February 11th
- [Homework 5: Neural Networks and Machine Learning](https://github.com/pjreddie/uwimg/tree/main/src/hw5), Due February 18th
- Homework 6: PyTorch

## Final Project: ##

There is a final project worth 20% of the final grade. Pick any area of computer vision that interests you and pursue some independent work in that area. Each project should have a significant technical component, software implementation, or large-scale study. Projects can focus on developing new techniques or tools in computer vision or applying existing tools to a new domain.

If you don't have an idea, you can train a classifier on birds and compete in the Kaggle competition posted on the Ed discussion board.

## Lectures ##

### Week 1: Image Basics ###

This week we cover the basics of computer vision. There's an introduction to the three levels of vision: **low-level** vision mostly concerns pixels or groups of nearby pixels, **mid-level** vision starts to connect images to each other and the real world, and **high-level** vision connects images to semantics and meaning. There's background information on the human visual system, color, light, what an image actually is, and how it's stored in a computer. All fun stuff!

Once you've learned the basics you should be ready for [Homework 0](https://github.com/pjreddie/uwimg/tree/main/src/hw0), which is mostly an introduction to the codebase we'll be using for the assignments.

#### Lecture 1: Introduction ####

- [Slides](https://docs.google.com/presentation/d/1VqTJEVC0gwxfY5TbINgBKyWqAmdxLYxgj_yIiphJj34/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/t7OkrWuyjvvLja480npOtp440frXbjF3S3EVMyk2JjgKxTIob-Nl87bhRRQi27fmqha6_ni0F-VAGF8.7sZvWqMZIcZ2VkHv)

#### Lecture 2: Human Vision, Color Spaces, Transforms ####

- [Slides](https://docs.google.com/presentation/d/1YBK7QkBW9t4kuZ8bJdwXLspyyeSpEJJldHvAhVcSsiM/edit?usp=sharing)
- [Video](https://www.youtube.com/watch?v=-nt80JUNwlw)

#### Supplementary Content

- [Summary of lecture slides w/ questions](https://washington.zoom.us/rec/play/JrNKoDoltN7l-a62S96rR82k3VQsU2KLG3PeD43WcnelULLvkkzqbVZvIpsVwXYfD2prahpwo8GtRkly.RboeGz9eJoWw8Kld)
- [Homework 0 introduction](https://washington.zoom.us/rec/play/BDqug9OxZMniL7Zb-d09PVRjmPREgFdLddCz6KP88ZCOhAmODzSpkwaK5F-PISV4y-9Opxa9TNEu8hTb.HXAC-Um23ITBEQWv)

--------

### Week 2: Image Transformations ###

In week 2 we start to dive into low-level vision and image processing. You'll learn how to manipulate images and perform operations like resizing, sharpening, smoothing, and more. You'll apply this knowledge as you get started on [Homework 1](https://github.com/pjreddie/uwimg/tree/main/src/hw1).
#### Lecture 3: Image Coordinates, Resizing

- [Slides](https://docs.google.com/presentation/d/1ZkGgPPUzlGOdoGK6YNgiVLNouJW1ZaWzKEupDKmYM_4/edit?usp=sharing)
- [Video](https://youtu.be/hpqrDUuk7HY?t=129)

#### Lecture 4: Resizing, Filters, Convolutions

- [Slides](https://docs.google.com/presentation/d/18f0cWwS40jwbP5u37LKvM9ZaHEtRQFUKQ7MCtMdGLmA/edit?usp=sharing)
- [Video](https://youtu.be/5xdbJ7z4Nrc?t=119)

#### Supplementary Content

- [Summary of Lecture 3](https://washington.zoom.us/rec/play/UECHEJ4fGFUTIWHNo7vW-qYw2THndjxsbIC7cF9xf263s8e1jhdo5mZpX0_XxyvDbRK7A-rsdI-euI-z.iCMar9gGaV6RYx0c)
- [Summary of Lecture 4](https://washington.zoom.us/rec/play/cL4nOZCCLuF-BxUaej6rek2BIAHd1YmESbv61grJ3ejK4an_A2Ls_ylqI1YISgZUGFSUSZtdliMWHy9y.xZvDrBHIsiZXepd3)

-----------

### Week 3: Edges and Features

#### Lecture 5: Edges and Features

- [Video](https://www.youtube.com/watch?v=z5WSV6CXsxs)
- [Slides](https://docs.google.com/presentation/d/1_ZOtT17Ih2P-MRbWtZ8CTRQaJBz9V6-5_VA00QebiQQ/edit?usp=sharing)

#### Lecture 6: Harris, Matching, RANSAC

- [Video](https://www.youtube.com/watch?v=bn4KHa_zWuQ)
- [Slides](https://docs.google.com/presentation/d/1GLPcw-hQB1D94mOzTZKdMwAa8NKuqgih-bWl1vJS0tE/edit?usp=sharing)

#### Supplementary Content

- [Summary of Lecture 5](https://washington.zoom.us/rec/play/fD-KKDQMIfCylT0YUnv2uu0-Y3RMD1BxeIPo9T5xMsUTg8qYMVEjbQOuyaJQZ95qhnCexnB7IKEJgHRQ.-MfYITP9nakhNwDY)
- [Summary of Lecture 6](https://washington.zoom.us/rec/play/Gwqyyr5EOjZYVbSqQv4MGfknWKbdSx1oIqVdIsg9mE9iXIXvOVPb_-sqdnhRQBr3wL9GMwjasABvNQU6.FIh_mTpktmwSnyUq)
- [Debugging Tips for C](https://washington.zoom.us/rec/play/Hj7lOqs5FnkxjoWrZ4JmLQSAl5tnIhzXnZagi8n8O4fDvdoINOAws7kcW2ZICn7zIfWdizHrRPRB2dgD.Vzjx0KVhQHbwY4v0)

---------

### Week 4: Features and Flow

#### Lecture 7: Matching, RANSAC, HOG, and SIFT

- [Video](https://www.youtube.com/watch?v=taty6lPVcmA)
- [Slides](https://docs.google.com/presentation/d/1h2Az_a28qjKvLpbkwXoW0eut9HTwmjTkjCtk876PYN8/edit?usp=sharing)

#### Lecture 8: Optical Flow

- [Video](https://www.youtube.com/watch?v=a-v5_8VGV0A)
- [Slides](https://docs.google.com/presentation/d/1guQ0hGL7tfHibiYID6gULCUO2OJJpNaWu6VYJXScdeY/edit?usp=sharing)

#### Supplementary Content

- [Summary of Lecture 7](https://washington.zoom.us/rec/play/izF2CEQCDfTuAmWm9jemcOzbDqu3n-U3FnDJ_HFNXUEoofmmSE7EGKbeSiBqLNflTHjXv_nCxwG3LLQP.xRsuFLZicMNIy9No)
- [Summary of Lecture 8](https://washington.zoom.us/rec/play/Y9LzvQcLYv53-0o29Pxxw3GakZsA-7Utv_HoXr5JS1kLRf2Vg1gjiF0u2ZhMVdIS0Rzq6pPB0kmPQBgH.AWgBMaoXdtLb3kXD)

---------

### Week 5: Depth and ML Review

#### Lecture 9: 3D, Depth, and Stereo

- [Video](https://www.youtube.com/watch?v=AA8FEwutsVk)
- [Slides](https://docs.google.com/presentation/d/1ZaFvVx8U7hJpGqaqk4Fxjj5QU-EHt8B8zmYbK6uaEaI/edit?usp=sharing)

#### Lecture 10: Machine Learning for Computer Vision

- [Video](https://www.youtube.com/watch?v=AIL5PuvRAPI)
- [Slides](https://docs.google.com/presentation/d/1QgvrxpjVJLcYPWPVm9gXvqLjQth4um1nJpc0R00SNpg/edit?usp=sharing)

#### Supplementary Content

- [Summary of Lecture 9](https://washington.zoom.us/rec/play/SfhFavTzF_xsJx7IoO3PJnO5sx-jdLpy2qkWpxnSDz8f_EMbYT_OyEXbleWRtBmwtwvS8zr-BRp0JaYd.T8XibtscYCsahR4D)
- [Summary of Lecture 10](https://washington.zoom.us/rec/play/ss3H37-JeoKg7HSDNm2SAuG6KoGE_gPI9LsT8mn7QyIh8KJ-SBLYveQByq_ZHs62dJbPZpYv8-7qpG8y.2KokiC5ZMclFRWqK)

-------

### Week 6: Machine Learning and Neural Networks

#### Lecture 11: More Machine Learning for Computer Vision

- [Video](https://www.youtube.com/watch?v=3fCrfabOm8U)
- [Slides](https://docs.google.com/presentation/d/1sU-rMMkWXMuQYhjJkhPFAD7VifZnxXMBxOfevdwOOKU/edit?usp=sharing)

#### Lecture 12: Neural Networks

- [Video](https://www.youtube.com/watch?v=fXuIpJ-2MR4)
- [Slides](https://docs.google.com/presentation/d/1NLdRUsxH30tSNe46OOd3rPa-xoKFkDR-fYH8rnh0POo/edit?usp=sharing)

#### Supplementary Content

- [Summary of Lecture 11](https://washington.zoom.us/rec/play/e5gNU0IEyHda3i8SORVJFDzkBNSWsVQeDvsWuKdNsK26l3AnGou_OQolMK2_hsgcYt3zec-C5fn7yIOF.2h4V2I9O48ZmtyPq)
- [Summary of Lecture 12](https://washington.zoom.us/rec/play/B67mBt2mPhmQPlM-52hPyWyG8OyhmaJMpc2_iyWoS_tjYKhxYXxUPAadpIzAg81HBVV6VJuMBFbT1et_.mKiCYvbSyF0Uw9UZ)

--------

### Week 7: Convolutional Neural Networks

#### Lecture 13: Convolutional Neural Networks

- [Video](https://www.youtube.com/watch?v=RnD0OFbZGbA)
- [Slides](https://docs.google.com/presentation/d/1LwTvykcPzDoAzQyAZB4cbP5Lh_czqfBAMGnBddVHbxs/edit?usp=sharing)

#### Lecture 14: Network Architectures

- [Video](https://www.youtube.com/watch?v=-XK_uMVD2CY)
- [Slides](https://docs.google.com/presentation/d/1Y5q24TUmhKTZXYVOqrcqCkHP1CMQGG1CveS8KNNztms/edit?usp=sharing)

#### Supplementary Content

- [Summary of Lecture 13](https://washington.zoom.us/rec/play/bh1e4NIw2m8K7l4Atd9K-t3yYKOCu4p_OrJ5wExqjK5qL82H7KZ6mD9xI5l2XdI7Rtsc6HxTkwVvzXOE.l_xJCHF9T9lhq6D2)
- [Summary of Lecture 14](https://washington.zoom.us/rec/play/1gZH8RrAu4-CDk5ys-K9aAIhskuhdIlYjZSbWlktpWEZXKuGiwDKR117SYQzSnsukOPv6vsjUmfFpblo.J-eVLIFwnkzWyttu)

---------

#### Lecture 15: Semantic Segmentation

- [Video](https://www.youtube.com/watch?v=4vd8zQQb7bk&feature=youtu.be)
- [Slides](https://docs.google.com/presentation/d/1XFr96QGJFHoRr0HyydnrLntulH7vNlIQd_SHm88yLNk/edit#slide=id.g3600a767f6_1_21)

#### Lecture 16: Object Detection

- [Video](https://www.youtube.com/watch?v=7e0umTYMv_Y&feature=youtu.be)
- [Slides](https://docs.google.com/presentation/d/1CLVryBPNddEn8TwM2ngy1bHnlJC4t33OhERzVixFGj4/edit#slide=id.g3600a767f6_1_21)

#### Supplementary Content

- Tutorial 1 - PyTorch Introduction
    - [Notebook](https://colab.research.google.com/drive/1CYD8uaxc_J5xmkJWT3cKnaF4cnUASWJP?usp=sharing)
    - [Notebook w/ Experiments](https://colab.research.google.com/drive/1CYD8uaxc_J5xmkJWT3cKnaF4cnUASWJP?usp=sharing)
    - [Video](https://washington.zoom.us/rec/play/mf3Kio9anNlagXso_3Q123MJt06T2cSkAmIhee2tqsQ7NxKdoG8u1R6feXwEhjPrqRbviDga4B4n9nvP.A1Qmj21_6kqMmLcc)
    - [Video Continued (experiments)](https://washington.zoom.us/rec/play/mdjJNWIsIRRTmPOcbRsDbjmN0EUWK2FyDefZwGW1PJdLeqdTLFY2FrxQfkZhRSQjHkGhu1diY4MbyejG.3qJk4UySm_R-NV3w)
- Tutorial 2 - CNNs in PyTorch
    - [Notebook](https://colab.research.google.com/drive/1aedcC_6-2j2Jz0BySbJgTSyWfrpmtarI)
    - [Video](https://washington.zoom.us/rec/play/5EpWX4p4erjzqYHPWNjl4idDUOVOYeqriD3bJkVAeZkaxE1OrJwH-3Yrzy_20d6CNFtLjgStlfeZy_n0.lPGv4gm5snC2qCVB)
    - [Video Continued (experiments)](https://washington.zoom.us/rec/play/G-HH2p5UNRmTOPsAjAZDO5Xero2hZgTgEZyxVy-_7YRk36guB0Q_qX8AZs_3lsieXClbKZH4VNUGmJIo.QqaneTlpS3R1cMPv)

----------

#### Lecture 17: Instance Segmentation

- [Video](https://www.youtube.com/watch?v=oW9qwD62Ljs)
- [Slides](https://docs.google.com/presentation/d/1mqzWsN1Zt4_e3W5NXwDDnQZvjqC99NVJndOZvI1fPck/edit#slide=id.g37ded8bff5_0_88)

#### Lecture 18: Vision and Language

- [Video](https://www.youtube.com/watch?v=6CYsaaCY_u0)
- [Slides](https://docs.google.com/presentation/d/1GsphfNerVfEoolZFxQIbAf0VdZ6ftGEzM5AUAwUewOo/edit#slide=id.g7a8786fbee_0_174)

#### Supplementary Content

- Tutorial 3 - ImageNet and Transfer Learning
    - [Notebook](https://colab.research.google.com/drive/1EBz4feoaUvz-o_yeMI27LEQBkvrXNc_4)
    - [Video](https://washington.zoom.us/rec/play/GinXcDkCGBG8ko8sPDb8GnO9D5SiBez2_k2Nvq5QTM34kgi9gVSQd_d4WiyyHTX7Elw4sgyfhTlPKA5T.37qWR5-jNEmybJY7)

---------

#### Lecture 19: Generative Adversarial Networks

- [Video](https://www.youtube.com/watch?v=um7jjHyFItE)
- [Slides](https://docs.google.com/presentation/d/1NgicaHQhluKQ0r39U4IWo8AtOBJUAQ2aBGCDq_DpgkA/edit?usp=sharing)

#### Lecture 20: Transformers and Vision