CSEP 517: Natural Language Processing

University of Washington

Spring 2017

The syllabus is subject to change; always get the latest version from the class website.
Lectures: CSE 305, Mondays 6:30–9:20 pm
Instructor: Noah A. Smith (nasmith@cs.washington.edu)
Instructor office hours: CSE 532, Mondays 5–6 pm or by appointment
Teaching assistant: George Mulcaire (gmulc@cs.washington.edu)
TA office hours: CSE 220, Mondays 5:30–6:30 pm
Final exam: on Canvas, released ~5/29, due ~6/4

Natural language processing (NLP) seeks to endow computers with the ability to intelligently process human language. NLP components are used in conversational agents and other systems that engage in dialogue with humans, automatic translation between human languages, automatic answering of questions using large text collections, the extraction of structured information from text, tools that help human authors, and many, many more. This course will teach you the fundamental ideas used in key NLP components. It is organized into several parts:

Probabilistic language models, which define probability distributions over text passages.
Text classifiers, which infer attributes of a piece of text by “reading” it.
Sequence models, which transduce sequences into other sequences.
Parsing sentences into syntactic representations.
Semantics, which includes a range of representations of meaning.
Machine translation, which maps text in one language to text in another.

1 Course Plan

dates | topic | readings | deadlines
3/27 | introduction & language models | [1, ch. 1], [2], [3] |
4/3 | language models (continued) | [4] §2; if you want more details on neural nets, see [5] | A1 due Sun. 4/9
4/10 | text classifiers | [6, 7, 8] |
4/17 | hidden Markov models and applications | [9, 10, 11, 12] | A2 due Sun. 4/23
4/24 | context-free syntax and parsing | [1, ch. 12–14], [13] |
5/1 | dependency syntax and parsing | [14, ch. 1, 2, 6] | A3 due Sun. 5/7
5/8 | semantics: predicate-argument, compositional | [15]; [1, ch. 18], [16] | A4 due Sun. 5/14
5/15 | distributed semantics; machine translation | [17, 18], [1, ch. 25], [19, 20] |
5/22 | pragmatics; machine translation (continued); summarization; finale | | A5 due Sun. 5/28
~5/29 | final exam (on Canvas) | | due ~6/4

The table above shows the planned lectures, along with readings. The official textbook for the course is Jurafsky and Martin [1], but some chapters of the forthcoming third edition are available online [21], so we link to those where appropriate.

Lectures will be available at this link, usually a day after the lecture.

2 Evaluation

Students will be evaluated as follows:

3 Computing Resources

CSE has reserved the host umnak.cs.washington.edu for you to use for this course.

4 Academic Integrity

Read, sign, and return the academic integrity policy for this course before turning in any work.


[1]    Daniel Jurafsky and James H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall, second edition, 2008.

[2]    Julia Hirschberg and Christopher D. Manning. Advances in natural language processing. Science, 349(6245):261–266, 2015. URL https://www.sciencemag.org/content/349/6245/261.full.

[3]    Noah A. Smith. Probabilistic language models 1.0, 2017. URL http://homes.cs.washington.edu/~nasmith/papers/plm.17.pdf.

[4]    Michael Collins. Log-linear models, MEMMs, and CRFs, 2011. URL http://www.cs.columbia.edu/~mcollins/crf.pdf.

[5]    Yoav Goldberg. A primer on neural network models for natural language processing, 2015. URL http://u.cs.biu.ac.il/~yogo/nnlp.pdf.

[6]    Daniel Jurafsky and James H. Martin. Naive Bayes and sentiment classification (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/6.pdf.

[7]    Daniel Jurafsky and James H. Martin. Logistic regression (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/7.pdf.

[8]    Michael Collins. The naive Bayes model, maximum-likelihood estimation, and the EM algorithm, 2011. URL http://www.cs.columbia.edu/~mcollins/em.pdf.

[9]    Daniel Jurafsky and James H. Martin. Hidden Markov models (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/9.pdf.

[10]    Michael Collins. Tagging with hidden Markov models, 2011. URL http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/hmms.pdf.

[11]    Daniel Jurafsky and James H. Martin. Part-of-speech tagging (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/10.pdf.

[12]    Daniel Jurafsky and James H. Martin. Information extraction (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/21.pdf.

[13]    Michael Collins. Probabilistic context-free grammars, 2011. URL http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/pcfgs.pdf.

[14]    Sandra Kübler, Ryan McDonald, and Joakim Nivre. Dependency Parsing. Synthesis Lectures on Human Language Technologies. Morgan and Claypool, 2009. URL http://www.morganclaypool.com/doi/pdf/10.2200/S00169ED1V01Y200901HLT002.

[15]    Daniel Jurafsky and James H. Martin. Semantic role labeling and argument structure (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/22.pdf.

[16]    Mark Steedman. A very short introduction to CCG, 1996. URL http://www.inf.ed.ac.uk/teaching/courses/nlg/readings/ccgintro.pdf.

[17]    Daniel Jurafsky and James H. Martin. Vector semantics (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/15.pdf.

[18]    Daniel Jurafsky and James H. Martin. Semantics with dense vectors (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/16.pdf.

[19]    Michael Collins. Statistical machine translation: IBM models 1 and 2, 2011. URL http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/ibm12.pdf.

[20]    Michael Collins. Phrase-based translation models, 2013. URL http://www.cs.columbia.edu/~mcollins/pb.pdf.

[21]    Daniel Jurafsky and James H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall, third edition, forthcoming. URL https://web.stanford.edu/~jurafsky/slp3/.