
CSE 447: Natural Language Processing, Spring 2022

MWF 1:30-2:20pm, KNE 110

Instructor: Yulia Tsvetkov

yuliats@cs.washington.edu

OH: Thu 3:30-4:30pm, Zoom (and by appointment)

Teaching Assistant: Xiaochuang Han

xhan77@cs.washington.edu

OH: Mon 11am-12pm, Zoom

Teaching Assistant: Ivy Guo

zhifeig@cs.washington.edu

OH: Wed 4:30-5:30pm, Zoom

Teaching Assistant: Kaiser Sun

huikas@cs.washington.edu

OH: Thu 5-6pm, Zoom

Teaching Assistant: Leroy Wang

lryw@uw.edu

OH: Fri 11:00am-12:00pm, Zoom

Teaching Assistant: Thai Hoang

qthai912@cs.washington.edu

OH: Tue 2:00-3:00pm, Zoom

Announcements

Summary

This course will explore foundational statistical techniques for the automatic analysis of natural (human) language text. To this end, the course will introduce pragmatic formalisms for representing structure in natural language, and algorithms for annotating raw text with those structures. The dominant modeling paradigm is corpus-driven statistical learning, covering both supervised and unsupervised methods. This is a lab-based course: instead of homework sets and exams, you will mainly be graded on three hands-on coding projects.

This course assumes a good background in basic probability and a strong ability to program in Python. Experience with numerical libraries such as NumPy and neural network libraries such as PyTorch is a plus. Prior experience with machine learning, linguistics, or natural languages is helpful but not required. There will be a lot of statistics, algorithms, and coding in this class.

Calendar

The calendar is tentative and subject to change. More details will be added as the quarter continues.

Week | Date | Topics | Readings | Homeworks
1 | 3/28 | Introduction [slides] | Eis 1 |
  | 3/30 | Introduction [slides] [recording] | Eis 1 |
  | 4/01 | Introduction [slides] [recording] | Eis 1 |
2 | 4/04 | Text classification [slides] [recording] | Eis 2; J&M III 4 | HW1 out
  | 4/06 | Text classification [slides] [recording] | Eis 2; J&M III 4; Ng & Jordan, 2001 |
  | 4/08 | Text classification [slides] [recording] | Eis 2; J&M III 5; Pang et al. 2002 |
3 | 4/11 | Text classification [slides] [recording] | Eis 2; J&M III 5 |
  | 4/13 | Language modeling [slides] [recording] | J&M III 3; Eis 6.1-6.2, 6.4 | In-class quiz 1
  | 4/15 | Language modeling [slides] [recording] | J&M III 3; Eis 6.1-6.2, 6.4 |
4 | 4/18 | Lexical semantics [slides] | J&M III 6; Eis 14 |
  | 4/20 | Lexical semantics, representation learning [slides] [recording] | Eis 6.3, 6.5; J&M III 7.5; J&M III 9; Goldberg 10 | In-class quiz 2
  | 4/22 | Neural networks [slides] [recording] | Baroni et al. 2014; Bojanowski et al. 2017; Peters et al. 2018 | HW1 due
5 | 4/25 | Neural language models [slides] [recording] | Annotated Transformer; Illustrated Transformer | HW2 out
  | 4/27 | Recommender systems and online training [slides] [recording] | Recommender Systems lectures |
  | 4/29 | Sequence labeling [slides] [recording] | J&M III 8; Eis 7.1-7.4, 8.1 | In-class quiz 3
6 | 5/02 | Sequence labeling [slides] [recording] | Eis 7.1-7.4, 8.1; Collins notes |
  | 5/04 | Sequence labeling [slides] [recording] | Eis 7.1-7.4, 8.1; Collins notes | In-class quiz 4
  | 5/06 | Sequence labeling [slides] [recording] | Eis 7.5, 7.7, 8.3; Sutton & McCallum 2.1-2.5 |
7 | 5/09 | Neural sequence labeling [slides] [recording] | Eis 7.6; Collobert et al. 2011 |
  | 5/11 | Parsing [slides] | J&M III 13; Eis 10.1-10.2 | In-class quiz 5
  | 5/13 | Parsing [slides] [recording] | J&M III 14; Eis 11.1, 11.3 |
8 | 5/16 | Parsing [slides] [recording] | Eis 11.1, 11.3; Chen and Manning 2014 | HW2 due, HW3 out
  | 5/18 | Parsing: research showcase [slides] | | In-class quiz 6
  | 5/20 | Research topics: summarization [slides] [recording] | |
9 | 5/23 | No class | |
  | 5/25 | TA session | | In-class quiz 7
  | 5/27 | Research topics: interpretability [slides] [recording] | |
10 | 5/30 | Memorial Day (no class) | |
  | 6/01 | Research topics: computational ethics [slides] [recording] | |
  | 6/03 | No class | | In-class quiz 8, HW3 due

Resources

Assignments/Grading

  • Project 1 (sequence classification): 30%
    • We will build a system for automatically classifying song lyrics comments by era.
    • Specifically, we build machine learning text classifiers, including both generative and discriminative models, and explore techniques to improve the models.
  • Project 2 (sequence labeling): 30%
    • We focus on sequence labeling with Hidden Markov Models and some simple deep learning based models.
    • Our task is part-of-speech tagging on English and Norwegian from the Universal Dependencies dataset.
    • We will cover the Viterbi algorithm.
  • Project 3 (dependency parsing): 30%
    • We will implement a transition-based dependency parser.
    • The algorithm will be new and specific to the dependency parsing problem, but the underlying building blocks of the method are neural network modules already covered in P1 and P2.
  • Quizzes: 10%
    • Starting from the 3rd week, we will have quizzes on Wednesdays.
    • There will be 8 quizzes in total.
    • Quizzes will be released during the first 10 minutes of class.
    • Your 5 best quizzes will count toward the final score. Each counted quiz is worth 2% of the final score.
  • Participation: 10% bonus
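
The weighting above can be sketched in Python as a rough self-check. This is only an illustration, not the official grade calculation: the function name `final_score` and its arguments are hypothetical, every input is assumed to be a fraction between 0 and 1, and the participation bonus is assumed to be added on top of the other components without a cap.

```python
def final_score(p1, p2, p3, quizzes, participation=0.0):
    """Estimate a final score (in percentage points) from the syllabus weights.

    p1, p2, p3    -- project scores as fractions in [0, 1]
    quizzes       -- list of quiz scores as fractions in [0, 1] (up to 8)
    participation -- fraction of the 10% bonus earned (assumed additive)
    """
    # Three projects at 30% each.
    project_points = 30 * (p1 + p2 + p3)

    # Only the 5 best quizzes count, each worth 2%.
    best_five = sorted(quizzes, reverse=True)[:5]
    quiz_points = 2 * sum(best_five)

    # Participation is a bonus of up to 10 points (assumption: no cap).
    bonus_points = 10 * participation

    return project_points + quiz_points + bonus_points


if __name__ == "__main__":
    # Three missed quizzes do not hurt if the other five are perfect.
    print(final_score(1, 1, 1, [1, 1, 1, 1, 1, 0, 0, 0]))
```

Because only the best five quizzes are kept, up to three low or missed quizzes have no effect on the quiz component in this sketch.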

Policies

  • Late policy. Each student will be granted 5 late days to use over the duration of the quarter. You can use a maximum of 3 late days on any one project. Weekends and holidays are also counted as late days. Late submissions are automatically considered as using late days. Using late days will not affect your grade. However, projects submitted late after all late days have been used will receive no credit. Be careful!

  • Academic honesty. Homework assignments are to be completed individually. Verbal collaboration on homework assignments is acceptable, as well as re-implementation of relevant algorithms from research papers, but everything you turn in must be your own work, and you must note the names of anyone you collaborated with on each problem and cite resources that you used to learn about the problem. The project proposal is to be completed by a team. Suspected violations of academic integrity rules will be handled in accordance with UW guidelines on academic misconduct.

  • Accommodations. If you have a disability and have an accommodations letter from the Disability Resources office, I encourage you to discuss your accommodations and needs with me as early in the quarter as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the office of Disability Resources for Students, I encourage you to apply here.
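
The late-day rules in the late policy above (5 late days in total, at most 3 on any one project) can be sketched as a small check. This is an informal illustration under stated assumptions, not the course's actual bookkeeping: the names `credit_per_project`, `MAX_TOTAL_LATE_DAYS`, and `MAX_LATE_DAYS_PER_PROJECT` are hypothetical, and a submission that exceeds either limit is assumed to receive no credit and to consume no late days.

```python
MAX_TOTAL_LATE_DAYS = 5        # late days granted for the whole quarter
MAX_LATE_DAYS_PER_PROJECT = 3  # most late days usable on one project


def credit_per_project(days_late):
    """Return one boolean per project: True if the submission gets credit.

    days_late -- days each project was submitted late, in submission order.
                 Weekends and holidays count as late days.
    """
    remaining = MAX_TOTAL_LATE_DAYS
    results = []
    for days in days_late:
        within_project_cap = days <= MAX_LATE_DAYS_PER_PROJECT
        within_budget = days <= remaining
        if within_project_cap and within_budget:
            # Late days are used automatically and do not affect the grade.
            remaining -= days
            results.append(True)
        else:
            # Assumption: over-limit submissions get no credit
            # and do not consume any late days.
            results.append(False)
    return results


if __name__ == "__main__":
    print(credit_per_project([2, 3, 1]))
```

In the example call, the first two projects use all 5 late days, so the third project, submitted one day late, would receive no credit under these assumptions.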

Note to Students

Take care of yourself! As a student, you may experience a range of challenges that can interfere with learning, such as strained relationships, increased anxiety, substance use, feeling down, difficulty concentrating and/or lack of motivation. All of us benefit from support during times of struggle. There are many helpful resources available on campus and an important part of having a healthy life is learning how to ask for help. Asking for support sooner rather than later is almost always helpful. UW services are available, and treatment does work. You can learn more about confidential mental health services available on campus here. Crisis services are available from the counseling center 24/7 by phone at +1 (866) 743-7732 (more details here).

COVID-19 Safety

In light of the COVID-19 pandemic and recent surge in cases due to the Omicron variant, and in accordance with UW guidelines, we are implementing the following policies to ensure the safety of our students and instructors to the maximum extent possible:

  • Remote access. If you are sick or have potentially been exposed to COVID-19, stay home! While we encourage everyone to attend class in-person when they are well, there will always be a Zoom meeting for the class and there is no penalty for attending remotely. Office hours are also available both in-person and over Zoom (by appointment).

  • Masking. When in public, indoor spaces occupied by other people, you must wear a mask. This includes class sections and office hours. See more about UW’s masking requirements here. The instructors will abide by the same masking policy.

    For the purposes of this policy, a face covering must fit snugly against the sides of the face and completely cover the nose and mouth. Bandanas and gaiters are not considered face coverings under this policy. Students who do not wear a face mask will be asked to leave the classroom. Repeated failure to wear a face covering may result in being referred to the Student Conduct Office for possible disciplinary action.

    UW has approved a hydration exemption which allows students and instructors to briefly move aside their mask if they are drinking water, even in class. This exemption is meant to be used only for a brief moment to hydrate, and does not allow talking with one’s mask off or having one’s mask removed for a prolonged period of time. This exemption does not allow for eating food in classes.

  • Social distancing. Currently, UW does not require social distancing in the classroom or office hours for students who are vaccinated and wearing a mask; social distancing can also make it difficult to navigate and interact in such spaces. We do not mandate social distancing, but if another student asks you to maintain distance from them, please respect their request.

  • What if you get sick? See this FAQ for what to do.

  • What if we get sick? We will reschedule class, hold it remotely, or bring in a substitute lecturer/facilitator if necessary to prevent exposing students. We will try to give notice as far in advance as possible if an in-person event is moving to be held remotely, but please check your email beforehand to be sure you don’t miss anything.