The syllabus is subject to change; always get the latest version from the class website.
Website: http://courses.cs.washington.edu/courses/csep517/17sp
Lectures: CSE 305, Mondays 6:30–9:20 pm
Instructor: Noah A. Smith (nasmith@cs.washington.edu)
Instructor office hours: CSE 532, Mondays 5–6 pm or by appointment
Teaching assistant: George Mulcaire (gmulc@cs.washington.edu)
TA office hours: CSE 220, Mondays 5:30–6:30 pm
Final exam: on Canvas, released ~5/29, due ~6/4
Natural language processing (NLP) seeks to endow computers with the ability to intelligently process human language. NLP components are used in conversational agents and other systems that engage in dialogue with humans, automatic translation between human languages, automatic answering of questions using large text collections, the extraction of structured information from text, tools that help human authors, and many, many more. This course will teach you the fundamental ideas used in key NLP components; it is organized around the topics in the schedule below:
dates | topic | readings | deadlines |
3/27 | introduction & language models | [1, ch. 1], [2], [3] | |
4/3 | language models (continued) | [4] §2; if you want more details on neural nets, see [5] | A1 due Sun. 4/9 |
4/10 | text classifiers | [6, 7, 8] | |
4/17 | hidden Markov models and applications | [9, 10, 11, 12] | A2 due Sun. 4/23 |
4/24 | context-free syntax and parsing | [1, ch. 12–14], [13] | |
5/1 | dependency syntax and parsing | [14, ch. 1, 2, 6] | A3 due Sun. 5/7 |
5/8 | semantics: predicate-argument, compositional | [15]; [1, ch. 18], [16] | A4 due Sun. 5/14 |
5/15 | distributed semantics; machine translation | [17, 18], [1, ch. 25], [19, 20] | |
5/22 | pragmatics; machine translation (continued); summarization; finale | | A5 due Sun. 5/28 |
~5/29 | final exam (on Canvas) | | due ~6/4 |
The table above shows the planned lectures, along with readings. The official textbook for the course is Jurafsky and Martin [1], but some chapters of the forthcoming third edition are available online [21], so we link to those where appropriate.
Lectures will be made available on the course website, usually a day after each lecture.
Students will be evaluated based on the five assignments (A1–A5) and the final exam shown in the schedule above.
CSE has reserved the host umnak.cs.washington.edu for your use in this course.
Read, sign, and return the academic integrity policy for this course before turning in any work.
[1] Daniel Jurafsky and James H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall, second edition, 2008.
[2] Julia Hirschberg and Christopher D. Manning. Advances in natural language processing. Science, 349(6245):261–266, 2015. URL https://www.sciencemag.org/content/349/6245/261.full.
[3] Noah A. Smith. Probabilistic language models 1.0, 2017. URL http://homes.cs.washington.edu/~nasmith/papers/plm.17.pdf.
[4] Michael Collins. Log-linear models, MEMMs, and CRFs, 2011. URL http://www.cs.columbia.edu/~mcollins/crf.pdf.
[5] Yoav Goldberg. A primer on neural network models for natural language processing, 2015. URL http://u.cs.biu.ac.il/~yogo/nnlp.pdf.
[6] Daniel Jurafsky and James H. Martin. Naive Bayes and sentiment classification (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/6.pdf.
[7] Daniel Jurafsky and James H. Martin. Logistic regression (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/7.pdf.
[8] Michael Collins. The naive Bayes model, maximum-likelihood estimation, and the EM algorithm, 2011. URL http://www.cs.columbia.edu/~mcollins/em.pdf.
[9] Daniel Jurafsky and James H. Martin. Hidden Markov models (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/9.pdf.
[10] Michael Collins. Tagging with hidden Markov models, 2011. URL http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/hmms.pdf.
[11] Daniel Jurafsky and James H. Martin. Part-of-speech tagging (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/10.pdf.
[12] Daniel Jurafsky and James H. Martin. Information extraction (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/21.pdf.
[13] Michael Collins. Probabilistic context-free grammars, 2011. URL http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/pcfgs.pdf.
[14] Sandra Kübler, Ryan McDonald, and Joakim Nivre. Dependency Parsing. Synthesis Lectures on Human Language Technologies. Morgan and Claypool, 2009. URL http://www.morganclaypool.com/doi/pdf/10.2200/S00169ED1V01Y200901HLT002.
[15] Daniel Jurafsky and James H. Martin. Semantic role labeling and argument structure (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/22.pdf.
[16] Mark Steedman. A very short introduction to CCG, 1996. URL http://www.inf.ed.ac.uk/teaching/courses/nlg/readings/ccgintro.pdf.
[17] Daniel Jurafsky and James H. Martin. Vector semantics (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/15.pdf.
[18] Daniel Jurafsky and James H. Martin. Semantics with dense vectors (draft chapter), 2016. URL https://web.stanford.edu/~jurafsky/slp3/16.pdf.
[19] Michael Collins. Statistical machine translation: IBM models 1 and 2, 2011. URL http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/ibm12.pdf.
[20] Michael Collins. Phrase-based translation models, 2013. URL http://www.cs.columbia.edu/~mcollins/pb.pdf.
[21] Daniel Jurafsky and James H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall, third edition, forthcoming. URL https://web.stanford.edu/~jurafsky/slp3/.