There will be two exams in CSE/STAT 416 this quarter:

  • Midterm: A take-home midterm, 07/15-07/17

  • Final: An in-person final exam on 08/14

Final

Logistics

The final exam will cover the machine learning topics we have learned this quarter.

The final exam is on Wednesday, 8/14/2024, from 5:00 pm to 6:40 pm. You will have the full exam period (1 hour and 40 minutes) to complete the exam. You may bring a single A4 notesheet (both sides) to use during the exam. You should have no electronic devices or any of your own scratch paper on your desk; you should have only your student ID, writing utensils, your notesheet, a water bottle, and the exam. Any violation of these rules will result in a 0 on the entire exam.

There will be semi-assigned seating, with students sitting in the room by quiz section. The seating chart will be posted in the room during the exam. If you receive DRS accommodations for extra time, you must contact the instructor well before the exam to schedule a separate final exam session.

Resources

Here are some review materials that have been put together by past and current course staff. As with training a good ML model, you will want to use good practices to make sure you are properly assessing your understanding of the material. We recommend that you save the practice exam until later in your studying so that you can use it as an unbiased estimate of your test accuracy, much like a held-out test set (see the sketch below). When taking the practice exam, try to take it like you would the real exam (i.e., time yourself, and try to do the whole thing without breaks or looking at your notes).
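
If that analogy is useful, here is a minimal sketch of the held-out test set idea. It assumes scikit-learn is installed; the synthetic dataset and linear model are illustrative choices only, not part of the course materials.

```python
# A minimal sketch of the "save the practice exam" analogy, assuming
# scikit-learn is installed. The synthetic data and linear model are
# illustrative choices only, not course-provided code.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Generate a toy regression dataset.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# Hold out a "practice exam": a test set the model never sees while training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)

# Training error is an optimistic estimate of performance; the held-out test
# error is the unbiased one, just like taking the practice exam only after
# you have finished studying.
print("train MSE:", mean_squared_error(y_train, model.predict(X_train)))
print("test  MSE:", mean_squared_error(y_test, model.predict(X_test)))
```

The test score is only trustworthy because the model never touches `X_test` during fitting, which is exactly why we suggest not peeking at the practice exam while you study.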

Study Resources

Note that these aren’t meant to replace your own study materials made from your notes and learning reflections, but they might be helpful references as well!

Practice Exams

Study Strategies

  • Look over slides and do practice problems (from lectures, sections, checkpoints, assignments).

  • Make sure you understand the correct responses in concept questions from assignments. You can view these on Gradescope. Post on the discussion board if any are confusing.

  • You should be able to explain for each technique:

    • What types of problems it can be used for

    • How it works (key ideas)

    • Challenges (overfitting, having to choose hyperparameters, etc.)

  • Also, you should be able to explain general ML concepts such as overfitting, the bias-variance tradeoff, precision/recall, etc. You can practice by pretending you’re being asked about these concepts in an interview (see the short worked sketch after this list).

  • Use your learning reflections to help you review. Try building up your cheat sheet before taking any practice exams. Save the practice exams until the end and use them to assess your performance.
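
For metrics like accuracy, precision, and recall, it can help to work a tiny example by hand. The sketch below uses made-up confusion-matrix counts purely for practice; it is not code from the course.

```python
# Hypothetical confusion-matrix counts for a binary classifier; the numbers
# are made up purely for practicing the metric definitions.
tp, fp, fn, tn = 40, 10, 5, 45

accuracy = (tp + tn) / (tp + fp + fn + tn)  # fraction of all predictions that are correct
precision = tp / (tp + fp)                  # of the predicted positives, how many are truly positive
recall = tp / (tp + fn)                     # of the actual positives, how many were caught

print(f"accuracy={accuracy:.2f}, precision={precision:.2f}, recall={recall:.2f}")
```

Changing the counts (e.g., increasing fp or fn) is a quick way to build intuition for how each metric moves.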

Midterm

Logistics

Exam I will be released on Monday, July 15th at 8:30 am and will be due on Wednesday, July 17th at 11:59 pm (more info here). Although the exam is designed to be completed within 1-2 hours, you will have the full window and can take it whenever works best for you. The exam has a time limit of 180 minutes (3 hours). No submissions will be accepted after the Wednesday deadline – you may not use late days on the exam!

The exam will be open-note and open-internet. The exam is to be completed individually.

While the exam is out, we will have a modified policy for how much support we can provide in office hours and on the message board. You can still attend office hours, but the purpose of the exam is for you to apply the learning you’ve done in this class, and the course staff will only answer clarifying or logistical questions during office hours – we will not help you with specific questions, review course concepts with you, or give you hints on the exam. The course staff will be closely monitoring Ed to give you a fast response in case of a technological issue.

Review Materials

There are a number of materials to refer to while studying for the exam. While the exam is open note, we recommend studying and building up a set of notes ahead of time, as that will help you navigate the exam while it is out.

  • The slides and notes for the course lectures

  • The conceptual portions of the previous HW assignments. You can think of the midterm as a long conceptual portion that will cover the breadth of topics discussed so far.

  • Your learning reflections and notes that you are building up!

The best way to start preparing for the exam is to refer to the recaps listed at the end of each lecture, which provide a roadmap of the skills you can expect to see on the exam. Then, you can practice those skills by engaging with the course resources linked above.

You don’t just have to ask homework questions in office hours – the course staff is always happy to help you review the course concepts! Feel free to talk to us about concepts or review questions in office hours, on Ed or Discord, or in section.

Topics

Exam I will generally cover course content up to and including Wednesday, 7/10. That includes the following course components:

  • Lectures 0 - 6

  • Checkpoints 0-6

  • Sections 0 - 3

  • HWs 0-2

The following is a non-exhaustive list of the topics you might see on Exam I:

  • Regression

    • Regression Model

    • Quality Metrics for Regression

    • Learning Algorithm (Gradient Descent)

    • Feature Extraction

  • Assessing Performance

    • Model Complexity

    • Training vs. Test vs. True Error

    • Overfitting and Underfitting

    • Bias-Variance Tradeoff

    • Error and training set size

  • Regularization: Ridge

    • Interpreting coefficients

    • Coefficient magnitude and overfitting

    • Regularization

    • L2 penalty

    • Ridge Regression

    • Practicalities with regularization

  • Regularization: LASSO

    • Feature Selection

    • All subsets algorithm and limitations

    • Greedy algorithms

    • LASSO Regression

    • Choosing hyperparameters

    • Compare/Contrast Ridge/LASSO

  • Classification

    • Classification terminology

    • Linear classifiers

    • Decision Boundary

    • Classification error/accuracy

    • Class imbalance

    • Confusion Matrix

    • Learning theory

  • Logistic Regression

    • Probability predictions

    • Logistic function

    • Logistic regression

    • Maximizing likelihood

    • Gradient Ascent

    • Effects of learning rate

    • Overfitting and logistic regression

  • Bias and Fairness

    • Calibration

    • Impacts of ML on society

    • Sources of bias

      • Historical bias

      • Representation bias

      • Measurement bias

      • Aggregate bias

      • Evaluation bias

      • Deployment bias

    • Definitions of fairness

      • Fairness through unawareness

      • Statistical parity

      • Equal opportunity

      • Predictive equality

  • More Bias/Fairness

    • Impossibility of achieving all fairness and accuracy constraints

    • Fairness-accuracy tradeoff

    • Pareto frontier

    • Modeling spaces

      • Construct space

      • Observed space

      • Decision space

    • Individual fairness

    • What you see is what you get (WYSIWYG)

    • Structural Bias + We’re all equal (WAE)

    • Conflicting worldviews

  • Decision Trees

    • Generative vs Discriminative models

    • Parametric vs non-parametric methods

    • Decision Trees introduction

    • Feature Selection with Decision Trees

    • Overfitting in Decision Trees