Machine learning is the study and construction of algorithms that learn from historical data and make inferences about future outcomes. The field is a marriage of algorithms, computation, and statistics, so this class will have healthy doses of each. The goal of this course is to provide a thorough grounding in the fundamental methodologies and algorithms of machine learning.
Prerequisites: Students entering the class should be comfortable with programming and should have a working knowledge of linear algebra (MATH 308), vector calculus (MATH 324), probability and statistics (MATH 394/STAT 390), and algorithms. For a brief refresher, I recommend you consult the linear algebra and statistics/probability reference materials below.
IMPORTANT: This class uses Mattermost (a secure Slack clone). An invite link will be available on the Canvas Discussion board. If you are not registered for the course, please request an invite link by sending an email to email@example.com. All class announcements will be broadcast on Mattermost, and you are responsible for keeping up to date with them (I suggest you turn on push notifications). The same applies to questions about homeworks, projects, and lectures. Mattermost lowers the barrier to asking for help and encourages more interaction. It is also a place where students who are not registered can interact with the rest of the class (unlike Canvas). Please ask all course-related questions in a public channel on Mattermost, as other students will often have the same question or know the answer. If you have a question about a personal matter, please email the instructors list: firstname.lastname@example.org.
Your grade will be based on 5 homework assignments (65%) and a final project (35%).
Your homework score will be the smaller of 100 points and the cumulative number of points you receive on the assignments. The first homework is worth 10 points, and the final four are worth 25 each. This means if you receive grades $(x_0,x_1,x_2,x_3,x_4)$ you will receive a score of $\min(100, x_0+x_1+x_2+x_3+x_4)$. In particular, if you receive grades
Each homework assignment contains both theoretical questions and programming components.
The first homework (10 points) is designed to be a review of the course prerequisites. If this assignment requires significant effort (e.g., several hours) or contains unfamiliar topics, you should strongly consider dropping the course and revisiting the prerequisites. Its secondary purpose is to get you comfortable with Python and LaTeX.
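The capped homework scoring rule described above, $\min(100, x_0+x_1+x_2+x_3+x_4)$, can be sketched in a few lines of Python. This is only an illustration: the function name `homework_score` and the sample grades are hypothetical, and the official score is whatever the course staff computes.

```python
# Cap on the cumulative homework score, as stated in the scoring rule.
CAP = 100

# Number of homework assignments (one worth 10 points, four worth 25 each).
NUM_HOMEWORKS = 5


def homework_score(grades):
    """Return min(100, x_0 + x_1 + x_2 + x_3 + x_4) for five homework grades."""
    if len(grades) != NUM_HOMEWORKS:
        raise ValueError("expected exactly one grade per homework (5 total)")
    return min(CAP, sum(grades))


# Hypothetical grades, chosen only to illustrate the cap.
print(homework_score((10, 25, 25, 25, 25)))  # 110 points earned, capped at 100
print(homework_score((10, 25, 25, 25, 15)))  # exactly 100
print(homework_score((8, 20, 0, 25, 25)))    # 78, below the cap
```

Because the five assignments total 110 points while the score is capped at 100, up to 10 points can be lost across the quarter without lowering the homework score.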
COLLABORATION POLICY: Homework must be done individually: each student must submit their own answers. In addition, each student must write and submit their own code in the programming part of the assignment (we may run your code). It is acceptable, however, for students to collaborate in figuring out answers and helping each other solve the problems. You must also indicate on each homework with whom you collaborated.
RE-GRADING POLICY: All requests for regrading should be submitted directly to Gradescope. Office hours and in-person discussions are limited solely to knowledge-related questions, not grade-related questions. If you feel that we have made an error in grading your homework, please let us know with a written explanation, and we will consider the request. Please note that a regrade may cover the entire assignment, so your grade on the whole homework set may go up or down.
LATE POLICY: Homeworks must be submitted online by the posted due date. With the exception of the poster presentation, all work is to be submitted online. There is no credit for late work. The homework scoring system described above is an attempt to soften the rigidity of this policy. We may make special arrangements for alternative dates for the poster presentation (contact the instructors). If you are unable to meet the deadlines due to travel, conferences, other deadlines, or any other reason, do not enroll in the class.
HONOR CODE: As we sometimes reuse problem set questions from previous years, covered by papers and webpages, we expect the students not to copy, refer to, or look at the solutions in preparing their answers (referring to unauthorized material is considered a violation of the honor code). Similarly, we expect students not to google directly for answers. The homework is to help you think about the material, and we expect you to make an honest effort to solve the problems. If you do happen to use other material, it must be acknowledged clearly with a citation on the submitted solution. For more information, please see the CSE Academic Misconduct policy that this course adheres to.
You will work independently or with a partner on a machine learning project spanning most of the quarter, ending with a poster presentation and a written report. You may use techniques developed in this course but are also encouraged to learn and apply new methods. The project should address a novel question with a non-obvious answer and must have a real-data component. We will provide some seed project ideas. You can pick one of these ideas and explore the data and algorithms within and beyond what we suggest. You can also use your own data or ideas, but in that case you must have the data available at the time of the proposal, along with a clear roadmap, since a quarter is too short to explore a brand-new concept. The components of the project are
Example project ideas can be found here.
| 12/4, 4:30-7:30 PM | Poster presentation |
| 12/7 | Project report due |
| 12/12, 4:30-7:30 PM | Poster presentation |
| 12/12 | Optional Homework 3 revisited |
| 12/14 | Project Reviews due |