![Title image: The Ancient Secrets of Computer Vision](images/title.jpg)

## Course Information ##

This class is a general introduction to computer vision. It covers standard techniques in image processing like filtering, edge detection, stereo, flow, etc. (old-school vision), as well as newer, machine-learning based computer vision.

The course will be offered in a variety of modalities:

- In-person: [CSE2 G20](https://www.google.com/maps?q=47.652915,-122.304806+(CSE2)&z=18) **temporarily on hold due to high case counts**
- Remotely: [Zoom](https://washington.zoom.us/j/92412405037)
- Asynchronously: See below for lecture recordings

Participate in whatever way best suits your needs this quarter! **Please do not come to in-person class if you are sick or have reason to suspect you may be sick.**

### Instructor ###

Joseph Redmon

- Email: pjreddie@cs.washington.edu

### TAs ###

- Rehaan Minh Bhimani - bhimar@cs.washington.edu
- Greg Guo - grgrggtr@cs.washington.edu
- Zucks Zixuan Liu - zucksliu@cs.washington.edu
- Mino Nakura - nakuram@cs.washington.edu
- Mark Theeranantachai - stheera@cs.washington.edu

### Office Hours ###

- Monday
  - 12:00-1:00pm with Mino: https://washington.zoom.us/my/minonakura
- Tuesday
  - 1:30-2:30pm with Rehaan: https://washington.zoom.us/j/94507887953
- Wednesday
  - 8:30-9:30pm with Mark: https://washington.zoom.us/j/96962845976
- Thursday
  - 11:00am-12:00pm with Zucks: https://washington.zoom.us/j/99378370744
  - 1:00-2:00pm with Joe: https://washington.zoom.us/j/3362756951
- Friday
  - 2:30-3:30pm with Greg: https://washington.zoom.us/j/99306313827

### Resources ###

- Ed Discussion Board: https://edstem.org/us/courses/17033/discussion/
- Canvas: https://canvas.uw.edu/courses/1515312
- Zoom: https://washington.zoom.us/j/92412405037

Slides are a mishmash of lots of other people's work. Special thanks to: Rob Fergus, Linda Shapiro, Harvey Rhody, Rick Szeliski, Ali Farhadi, Robert Collins.
Lectures 8 and 9 on flow, 3D, and stereo are by [Connor Schenck](https://homes.cs.washington.edu/~schenckc/).

All of the slides, videos, and homeworks are free to use, modify, and redistribute as you like without permission. Just make your own copy of the slides on Google Docs, don't ask to modify mine!

## Homeworks ##

The class has 6 homeworks where you will build out a computer vision library in C. We cover basic image manipulations, filtering, features, stitching, optical flow, machine learning, and convolutional neural networks.

Most of the homeworks will use [this repository](https://github.com/pjreddie/uwimg/). The individual homeworks can be found in the `src/` folder.

- [Homework 0: Fun with Color!](https://github.com/pjreddie/uwimg/tree/main/src/hw0), Due January 18th
- [Homework 1: Resizing](https://github.com/pjreddie/uwimg/tree/main/src/hw1), Due January 20th
- [Homework 2: Filtering and Convolutions](https://github.com/pjreddie/uwimg/tree/main/src/hw2), Due January 27th
- [Homework 3: Panoramas!](https://github.com/pjreddie/uwimg/tree/main/src/hw3), Due February 3rd
- [Homework 4: Optical Flow](https://github.com/pjreddie/uwimg/tree/main/src/hw4), Due February 10th
- [Homework 5: Neural Networks and Machine Learning](https://github.com/pjreddie/uwimg/tree/main/src/hw5), Due February 17th

**Note:** due dates are subject to change if we haven't covered the relevant material in time for the assignment.

You have 8 late days to use throughout the quarter. Each day late counts as one late day. Any number can be used on any assignment. After you have used your late days, late assignments will be penalized up to 10% per day late.

**COVID Policy:** If you get COVID, don't worry about doing your homework: rest, recover, and do what you need to do to get better. If you feel like doing computer vision while sick, go for it, but also know you can take some time off.
Once you are well, please reach out to the course staff and we can figure out how to get you back on track with assignments and any missed classes. You will not be penalized for turning in assignments late due to COVID (or if you're having trouble getting caught back up afterward).

## Final Project ##

There is a final project worth 20% of the final grade. Pick any area of computer vision that interests you and pursue some independent work in that area. Each project should have a significant technical component, software implementation, or large-scale study. Projects can focus on developing new techniques or tools in computer vision or applying existing tools to a new domain. If you don't have an idea, you can train a classifier on birds and compete in the Kaggle competition posted on the Ed discussion board.

## Lectures ##

### Week 1: Image Basics ###

This week we cover the basics of computer vision. There's an introduction to the three levels of vision: **low-level** vision mostly concerns the pixels or groups of nearby pixels, **mid-level** vision starts to connect images to each other and the real world, and **high-level** vision connects images to semantics and meaning. There's background information on the human visual system, color, light, what an image actually is, and how it's stored in a computer. All fun stuff!

Once you've learned the basics you should be ready for [Homework 0](https://github.com/pjreddie/uwimg/tree/main/src/hw0), which is mostly an introduction to the codebase we'll be using for the assignments.
#### Lecture 1: Introduction

- [Slides](https://docs.google.com/presentation/d/1VqTJEVC0gwxfY5TbINgBKyWqAmdxLYxgj_yIiphJj34/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/tYNUctCKVo1LVfWMPPg4mDtdXW-4MGsbW9lTM1UzWVKZdxTmf-T7pGT68WHQBfAMOuBEXs1k5YA6BHi2.Nk__SZIJB-JVkbZ9)

#### Lecture 2: Human Vision, Color Spaces, Transforms ####

- [Slides](https://docs.google.com/presentation/d/1kTvnMCG7qZ8eoA4NF79Q-Qyirt-ZMq6RNv4xAkQh2bY/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/EAK1XZFflNoK5leN9eW3Lg41XCOyF1Es9liatTaR3yGVvnAcaZgg7UzKGhGV9q8QgNM5WzX-SNYSqZ1Z.sZRvu_ELCbXVOspx)

--------

### Week 2: Image Transformations ###

In week 2 we start to dive into low-level vision and image processing. You learn how to manipulate images and perform operations like resizing, sharpening, smoothing, and more. You'll apply this knowledge as you get started on [Homework 1](https://github.com/pjreddie/uwimg/tree/main/src/hw1).

#### Lecture 3: Image Coordinates, Resizing

- [Slides](https://docs.google.com/presentation/d/1imYQe7kDCahP2YO69L7FBiwqcEc4LgVEDXW7pDMFhOs/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/AOxtNY1Tbbtdw0REhg6DqDroR_cPFTgYWyJuqY2ENbxcNUHj1DH8EaNpmqfgHoWfm8r4bvybvK1qBB_3.p9MAoLNCTt12kdVE)

#### Lecture 4: Resizing, Filters, Convolutions

- [Slides](https://docs.google.com/presentation/d/1owZEUBbFp-Iz5kLNo-KvppjtzJGF91wEzfviUphLWqs/edit?usp=sharing)
- [Video](https://washington.zoom.us/rec/play/KnElM8uyNyL4ykuB_A6o5uGlDuNtAHl45ze2X2uVwS_qXsOVtcQL0sSoOw4vVQ3u9t5qiOpj6jy0pgSo.HhK-UhXh3VmTAhpe)

--------

### Week 3: Edges and Features

Time to put those convolutions to use! For week 3 we delve into what makes images interesting, what makes them unique, how to find correspondences between images, and how to fit models with a large number of outliers in the data.
#### Lecture 5: Edges and Features

- [Slides](https://docs.google.com/presentation/d/1WqcYCxTnYzbPmhAsi5N7heb1GACzYd5Rt441Z36cp28/edit?usp=sharing)

#### Lecture 6: Harris, Matching, RANSAC

-------

### Week 4: Features and Flow

#### Lecture 7: Matching, RANSAC, HOG, and SIFT

#### Lecture 8: Optical Flow

---------

### Week 5: Depth and ML Review

#### Lecture 9: 3D, Depth, and Stereo

#### Lecture 10: Machine Learning for Computer Vision

--------

### Week 6: Machine Learning and Neural Networks

#### Lecture 11: More Machine Learning for Computer Vision

#### Lecture 12: Neural Networks

--------

### Week 7: Convolutional Neural Networks

#### Lecture 13: Convolutional Neural Networks

#### Lecture 14: Network Architectures

---------

### Week 8

#### Lecture 15: Semantic Segmentation

#### Lecture 16: Object Detection

----------

### Week 9

#### Lecture 17: Instance Segmentation

#### Lecture 18: Vision and Language

-----------

### Week 10

#### Lecture 19: Generative Adversarial Networks

#### Lecture 20: Transformers and Vision