Assigned: Thursday, Feb 1
Proposal due: Thursday, Feb 8 (11:59 P.M.)
Final presentations: Monday, Mar 11 (10:30 A.M. - 12:30 P.M.)
Final reports due: Tuesday, Mar 12 (11:59 P.M.)
Late due date: 2 days after the original due date
For the final project, we anticipate that people will work in teams of three or four. There are two options for this project, which will take roughly four weeks:
You can devise your own project from scratch, or use one of the ideas suggested below. In either case, the purpose is to learn more about and get a feel for doing research in computer vision.
All projects must have a machine learning component. Start by searching through recent computer vision conference proceedings or journal articles, and choosing a paper that interests you. The premier vision conferences are ICCV, CVPR, and ECCV. The premier journals are TPAMI and IJCV. Sometimes computer vision papers also appear in graphics/machine learning conferences such as NIPS, ICML, SIGGRAPH, and SIGGRAPH Asia. We recommend starting with the most recent years, i.e., CVPR 2023, ICCV 2023, ECCV 2022, etc. Most of the papers and project websites are linked online from here. You should select a paper that is appropriate for a four-week project, i.e., it should be more involved than one of the class projects. Our expectation is that you will implement the method yourself rather than using any code that is available online.
How ambitious/difficult should your project be? Each team member should count on committing substantially more effort than on the previous class projects.
Here are several ideas that are related to object detection/recognition and would be appropriate for final research projects. Feel free to choose variations of these or to devise your own research problems that are not on this list. You can either leverage deep learning or not, depending on your skill set and target. We're happy to meet with you to discuss any of these (or other) project ideas in more detail.
Face/Pedestrian/Vehicle recognition. These are perhaps the most widely used object recognition scenarios. You will build your own detection framework and try it on a standard benchmark dataset. Example research problems include how to balance speed and accuracy, and how object scale, density, or lighting conditions affect the detection results.
Tracking by detection. One classic approach to tracking a moving object is to remember it and then find it. The object model and tracking status are updated continuously across the video frames, so that appearance changes, occlusion, etc. can be handled. You will build an online object tracking framework and perhaps share a link to a demo in your report so we can play with it.
Object re-identification. It is common for an object to appear in different cameras with different viewpoints, zoom levels, or lighting conditions. Object re-identification finds such matches and enables many surveillance applications, such as finding criminals. You will design your object re-id pipeline and compare it with several state-of-the-art approaches.
Product description. Given a product image, can you generate a paragraph describing the object? You can also develop projects that connect computer vision and natural language processing.
License plate recognition (LPR) is a critical part of many surveillance applications. A typical LPR routine includes license plate localization, character segmentation, and character recognition. However, building a robust and fast LPR system is still very challenging. You will implement your own LPR approach and may try something different. You can (briefly) demo your LPR system during the presentation session.
Human pose/hand gesture recognition is popular in many medical/HCI applications. The basic idea of such articulated object recognition, however, is similar: finding object parts and their best combination.
Instance recognition. Object instance recognition has attracted increasing interest recently; it tries to recognize different object instances that belong to the same class, for example, different kinds of dogs, birds, or cars. Practically, this is useful for warehouse management, robotics, traffic surveillance, etc. You will read state-of-the-art papers, work on a standard instance recognition dataset, and try your approach.
Cancer biopsy diagnosis. The task is to classify a given region of interest (ROI) from a whole slide biopsy into one of four diagnostic categories: benign, atypia, DCIS, and invasive. There are 428 ROIs marked and diagnosed by expert pathologists. ROIs have different sizes and shapes, but each has only one diagnostic label. You can use different approaches to overcome size differences: sliding windows, resizing, etc.
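As one illustration of the sliding-window option mentioned for the biopsy task, the sketch below tiles an ROI of arbitrary size with fixed-size windows and combines per-window predictions into a single ROI label. Everything here is an assumption made for illustration: the function names, the 256-pixel window, the 128-pixel stride, and the majority-vote rule are hypothetical choices, and `classify_patch` stands in for whatever patch classifier your team trains. Other aggregation rules (for example, reporting the most severe category found) may suit the data better.

```python
from collections import Counter
from typing import Callable, List, Tuple

# (top, left, bottom, right) in pixels; bottom and right are exclusive.
Box = Tuple[int, int, int, int]


def _starts(length: int, window: int, stride: int) -> List[int]:
    """Window start offsets along one axis.

    If the stride does not land exactly on the border, one extra window is
    shifted back so that the far edge of the ROI is still covered.
    """
    if length <= window:
        return [0]  # ROI is smaller than the window: use a single window
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] != length - window:
        starts.append(length - window)
    return starts


def sliding_window_boxes(height: int, width: int,
                         window: int, stride: int) -> List[Box]:
    """Fixed-size windows (clipped to the ROI) that cover a height x width ROI."""
    return [
        (top, left, min(top + window, height), min(left + window, width))
        for top in _starts(height, window, stride)
        for left in _starts(width, window, stride)
    ]


def classify_roi(height: int, width: int,
                 classify_patch: Callable[[Box], str],
                 window: int = 256, stride: int = 128) -> str:
    """Label an ROI by majority vote over its windows (an assumed rule)."""
    boxes = sliding_window_boxes(height, width, window, stride)
    votes = Counter(classify_patch(box) for box in boxes)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical stand-in for a trained model: label by horizontal position.
    def dummy(box: Box) -> str:
        return "DCIS" if box[1] >= 128 else "benign"

    print(len(sliding_window_boxes(300, 600, 256, 128)))  # 8 windows
    print(classify_roi(300, 600, dummy))
```

Because each ROI carries only one diagnostic label, a window-based approach has to decide how window-level predictions map back to that label; the vote above is only the simplest option.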
You can find more ideas on Kaggle Competitions.
Each team will turn in a one-page proposal in PDF form on Canvas describing their project. It should specify:
Each team must submit one proposal document. Add your name to one of the project groups created under People -> Groups and then you will be able to see your submissions as a group.
We will have a presentation session at the end of the quarter, where each group will present their project to the class (in PPT form over Zoom).
Each team will turn in one report of about 10 pages in PDF form on Canvas describing their problem and approach. Additionally, be sure to include the contributions of each group member, specifying what they worked on. This can go at the end of the introduction of your report.
You must write it in NIPS format. You can either use the .docx file or, if you know LaTeX, the .tex file with the LaTeX2e .sty file. Overleaf is a great website for shared LaTeX writing and is very easy to learn with the tutorial.
Your report should include the following: