
CSE 447: Natural Language Processing, Autumn 2022

MWF 3:30-4:20pm, CSE2 G01

Instructor: Yulia Tsvetkov

yuliats@cs.washington.edu

OH: Fri 2:30-3:15pm, CSE 566 (preferably by appointment)

Teaching Assistant: Daksh Sinha

daksh97@uw.edu

OH: Tues 3:00-4:00pm on Zoom

Teaching Assistant: Jacob Morrison

jacobm00@cs.washington.edu

OH: Fri 2:00-3:00pm, Allen 220

Teaching Assistant: Leo Liu

zeyuliu2@cs.washington.edu

OH: Wed 2:00-3:00pm, Gates Center 153

Teaching Assistant: Leroy Wang

lryw@uw.edu

OH: Thu 3:00-4:00pm on Zoom

Teaching Assistant: Urmika Kasi

ukasi@uw.edu

OH: Mon 12:00-1:00pm on Zoom

Announcements

Summary

This course will explore foundational statistical techniques for the automatic analysis of natural (human) language text. Towards this end, the course will introduce pragmatic formalisms for representing structure in natural language, and algorithms for annotating raw text with those structures. The dominant modeling paradigm is corpus-driven statistical learning, covering both supervised and unsupervised methods. This is a lab-based course: instead of homework and exams, you will mainly be graded on three hands-on coding projects.

This course assumes a good background in basic probability and a strong ability to program in Python. Experience using numerical libraries such as NumPy and neural network libraries such as PyTorch is a plus. Prior experience with machine learning is important. Prior experience in linguistics or natural languages is helpful, but not required. There will be a lot of statistics, algorithms, and coding in this class.

Calendar

The calendar is tentative and subject to change. More details will be added as the quarter continues.

Week | Date | Topics | Readings | Homeworks
1 | 9/28 | Logistics [slides] [recording] | Course website, syllabus |
1 | 9/30 | Introduction [slides] [recording] | Eis 1 | HW1 out
2 | 10/03 | Introduction [slides] | Eis 1 |
2 | 10/05 | Text classification [slides] [recording] | Eis 2; J&M III 4 |
2 | 10/07 | Text classification [slides] [recording] | Eis 2; J&M III 4; Ng & Jordan, 2001 |
3 | 10/10 | Text classification [slides] [recording] | Eis 2; J&M III 5; Pang et al. 2002 |
3 | 10/12 | Text classification [slides] [recording] | J&M III 5 | In-class quiz 1
3 | 10/14 | Text classification [slides] [recording] | J&M III 5 |
4 | 10/17 | Language modeling [slides] [recording] | J&M III 3; Eis 6.1-6.2, 6.4 |
4 | 10/19 | Language modeling [slides] [recording] | J&M III 3; Eis 6.1-6.2, 6.4 |
4 | 10/21 | Lexical semantics [slides] [recording] | J&M III 6; Eis 14 | HW1 due
5 | 10/24 | Lexical semantics [slides] [recording] | Eis 14; J&M III 6 |
5 | 10/26 | Lexical semantics [slides] [recording] | Eis 14; J&M III 6 | In-class quiz 3
5 | 10/28 | Neural networks [slides] [recording] | Eis 6.3, 6.5; J&M III 7.5; J&M III 9; Goldberg 10; Collobert et al. 2011 |
6 | 10/31 | Neural networks [slides] [recording] [supplementary recording] | Annotated Transformer; Illustrated Transformer | HW2 out
6 | 11/02 | Sequence labeling [slides] [recording] | Eis 7.1-7.4, 8.1; J&M III 8 | In-class quiz 4
6 | 11/04 | Sequence labeling [slides] [recording] | Eis 7.1-7.4, 8.1; Collins notes |
7 | 11/07 | Sequence labeling [slides] [recording] | Eis 7.5, 7.7, 8.3; Sutton & McCallum 2.1 - 2.5 |
7 | 11/09 | Sequence labeling [slides] [recording] | Eis 7.6 | In-class quiz 5
7 | 11/11 | Veterans Day (no class) | |
8 | 11/14 | Neural sequence labeling [slides] [recording] | Eis 7.6 |
8 | 11/16 | Parsing [slides] [recording] | Eis 10.1-10.2; J&M III 13 | In-class quiz 6
8 | 11/18 | Parsing [slides] [recording] | Eis 11.1, 11.3; J&M III 14 |
9 | 11/21 | Cancelled | | HW2 due, In-class quiz 7
9 | 11/23 | Parsing [slides] [recording] | Eis 11.1, 11.3; Chen and Manning 2014 |
9 | 11/25 | Thanksgiving (no class) | |
10 | 11/28 | Advanced topics: Recommender systems and online training [slides] [recording] | Recommender Systems Lectures | HW3 out
10 | 11/30 | Research topics: Summarization [slides] [recording] | Kassas et al. 2021 |
10 | 12/02 | Advanced topics: Computational ethics [slides] [recording] | The Trouble With Bias |
11 | 12/05 | Advanced topics: Natural Language Understanding [slides] [recording] | | In-class quiz 8
11 | 12/07 | Q&A | |
11 | 12/09 | TBD | | HW3 due

Resources

Assignments/Grading

  • Project 1 (sequence classification): 30%
    • We will build a system for automatically classifying song lyrics by era.
    • Specifically, we build machine learning text classifiers, including both generative and discriminative models, and explore techniques to improve the models.
  • Project 2 (sequence labeling): 30%
    • We focus on sequence labeling with Hidden Markov Models and some simple deep learning based models.
    • Our task is part-of-speech tagging on English and Norwegian from the Universal Dependencies dataset.
    • We will cover the Viterbi algorithm.
  • Project 3 (dependency parsing): 30%
    • We will implement a transition-based dependency parser.
    • The algorithm will be new and specific to the dependency parsing problem, but its underlying building blocks are neural network modules covered in P1 and P2.
  • Quizzes: 10%
    • Starting from the 3rd week, we will have quizzes on Wednesdays.
    • There will be 8 quizzes in total.
    • Quizzes will be given during the first 10 minutes of class.
    • Only your 5 best quizzes will count toward the final score. Each counted quiz is worth 2% of the final score.
  • Participation: 10% bonus
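
As a rough illustration of how the weights above combine, here is a minimal Python sketch. The function name `final_score` and its inputs are hypothetical, not part of any course tooling. It assumes every score is given as a fraction between 0 and 1, that the participation bonus is simply added on top of the 100% total, and that no cap is applied; the course staff may compute grades differently.

```python
def final_score(projects, quizzes, participation=0.0):
    """Combine component scores into a percentage (hypothetical helper).

    projects:      three project scores, each a fraction in [0, 1]
    quizzes:       quiz scores (up to 8), each a fraction in [0, 1]
    participation: fraction of the 10% participation bonus earned
    """
    if len(projects) != 3:
        raise ValueError("expected exactly three project scores")

    # Each of the three projects is worth 30%.
    project_part = sum(30.0 * p for p in projects)

    # Only the 5 best quizzes count, at 2% each (10% in total).
    best_five = sorted(quizzes, reverse=True)[:5]
    quiz_part = sum(2.0 * q for q in best_five)

    # Assumption: the bonus is added on top, so the total can exceed 100.
    bonus_part = 10.0 * participation

    return project_part + quiz_part + bonus_part


# Example: full marks on projects, three missed quizzes, no bonus.
example = final_score([1.0, 1.0, 1.0], [1, 1, 1, 1, 1, 0, 0, 0])
```

Because only the 5 best quizzes are kept, missing up to three of the eight quizzes does not lower the quiz component in this sketch.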

Policies

  • Late policy. Each student will be granted 5 late days to use over the duration of the quarter. You can use a maximum of 3 late days on any one project. Weekends and holidays are also counted as late days. Late submissions are automatically considered as using late days. Using late days will not affect your grade. However, projects submitted late after all late days have been used will receive no credit. Be careful!

  • Academic honesty. Homework assignments are to be completed individually. Verbal collaboration on homework assignments is acceptable, as well as re-implementation of relevant algorithms from research papers, but everything you turn in must be your own work, and you must note the names of anyone you collaborated with on each problem and cite resources that you used to learn about the problem. The project proposal is to be completed by a team. Suspected violations of academic integrity rules will be handled in accordance with UW guidelines on academic misconduct.

  • Accommodations. If you have a disability and have an accommodations letter from the Disability Resources office, I encourage you to discuss your accommodations and needs with me as early in the quarter as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the office of Disability Resources for Students, I encourage you to apply here.
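
The late policy above can be read as a small bookkeeping rule. The sketch below is one possible interpretation, not the official grading procedure. The function `apply_late_policy` is hypothetical, the submission dates in the example are made up, lateness is counted in whole calendar days (weekends included), and a submission that exceeds either limit is assumed to receive no credit and to consume no late days.

```python
from datetime import date

TOTAL_LATE_DAYS = 5   # late days granted per student for the quarter
MAX_PER_PROJECT = 3   # maximum late days usable on any one project


def apply_late_policy(submissions):
    """Return {name: (late_days_used, receives_credit)}.

    submissions: list of (name, due_date, submitted_date) tuples,
    processed in the order given (assumed to be due-date order).
    """
    remaining = TOTAL_LATE_DAYS
    results = {}
    for name, due, submitted in submissions:
        # Calendar days late; weekends and holidays count too.
        days_late = max(0, (submitted - due).days)
        within_limits = days_late <= MAX_PER_PROJECT and days_late <= remaining
        # Assumption: an over-limit submission consumes no late days.
        used = days_late if within_limits else 0
        remaining -= used
        results[name] = (used, within_limits)
    return results


# Hypothetical submission dates against the due dates in the calendar.
outcome = apply_late_policy([
    ("HW1", date(2022, 10, 21), date(2022, 10, 23)),  # 2 days late
    ("HW2", date(2022, 11, 21), date(2022, 11, 24)),  # 3 days late
    ("HW3", date(2022, 12, 9), date(2022, 12, 10)),   # 1 day late, none left
])
```

In this example the first two submissions use all 5 late days, so the third one, although only one day late, would receive no credit.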

Note to Students

Take care of yourself! As a student, you may experience a range of challenges that can interfere with learning, such as strained relationships, increased anxiety, substance use, feeling down, difficulty concentrating and/or lack of motivation. All of us benefit from support during times of struggle. There are many helpful resources available on campus and an important part of having a healthy life is learning how to ask for help. Asking for support sooner rather than later is almost always helpful. UW services are available, and treatment does work. You can learn more about confidential mental health services available on campus here. Crisis services are available from the counseling center 24/7 by phone at +1 (866) 743-7732 (more details here).

COVID-19 Safety

In light of the COVID-19 pandemic and recent surge in cases due to the Omicron variant, and in accordance with UW guidelines, we are implementing the following policies to ensure the safety of our students and instructors to the maximum extent possible:

  • Course instruction. The course will be taught in person only, following the UW guidelines.

  • Remote access. If you are sick or have potentially been exposed to COVID-19, stay home! While we encourage everyone to attend class in-person when they are well, there will always be a Zoom meeting for the class and there is no penalty for attending remotely. Office hours are also available both in-person and over Zoom (by appointment).

  • Masking. In accordance with UW’s masking policy, masks are strongly recommended for the first two weeks of the quarter and will be recommended after that, so long as we stay in the CDC’s “low” community level. Given the flexibility in choosing whether to wear a mask or not, please be respectful of others’ choices. Read more about UW’s policy here.

    If you would like a mask, please feel free to stop by the reception desk in the Allen Center, where they can provide you your choice of either a KN95/N95 mask or a cloth mask. Additionally, UW mask distribution will continue at various library locations, the Health Sciences Center, the HUB, and testing sites.

  • Social distancing. Currently, UW does not require social distancing in the classroom or office hours for students who are vaccinated and wearing a mask; social distancing can also make it difficult to navigate and interact in such spaces. We do not mandate social distancing, but if another student asks you to maintain distance from them, please respect their request.

  • What if you get sick? Stay home if you are sick! The COVID-19 Public Health Flowchart indicates what you should do if you test positive, have been exposed to COVID-19, or have symptoms. Also see this FAQ for what to do.

  • What if we get sick? We will reschedule class, hold it remotely, or bring in a substitute lecturer/facilitator if necessary to prevent exposing students. We will try to give as much notice as possible if an in-person event is being moved online, but please check your email beforehand to be sure you don’t miss anything.