The syllabus is subject to change; always get the latest version from the class website.
Website: http://courses.cs.washington.edu/courses/cse517/18sp
Lectures: EE 045, Wednesdays and Fridays 1:00–2:20 pm
Instructor: Noah A. Smith (nasmith@cs.washington.edu)
Instructor office hours: CSE 532, Fridays 12:00–1:00 pm or by appointment
Teaching assistants: Dianqi Li (dianqili@uw.edu) and Kelvin Luu (kellu@cs.washington.edu)
TA office hours: CSE 220, Mondays 10:00–11:00 am (Dianqi); CSE 220, Wednesdays 9:00–10:00 am (Kelvin)
Natural language processing (NLP) seeks to endow computers with the ability to intelligently process human language. NLP components are used in conversational agents and other systems that engage in dialogue with humans, automatic translation between human languages, automatic answering of questions using large text collections, the extraction of structured information from text, tools that help human authors, and many more applications. This course will teach you the fundamental ideas used in key NLP components. It is organized into four parts: (i) probabilistic language models, (ii) text classifiers, (iii) linguistic representations and analyzers, and (iv) text generators.
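As a small taste of the first unit, the sketch below shows a toy bigram language model with add-one (Laplace) smoothing. It is only an illustration, not course material: the corpus, function names, and choice of smoothing are assumptions made for the example, and real language models (see the readings for 3/30–4/6) are far more sophisticated.

```python
from collections import Counter

# Toy corpus; <s> and </s> mark sentence boundaries. Made up for this example.
corpus = [
    ["<s>", "the", "cat", "sat", "</s>"],
    ["<s>", "the", "dog", "sat", "</s>"],
]

vocab = {w for sent in corpus for w in sent}
unigram_counts = Counter(w for sent in corpus for w in sent)
bigram_counts = Counter(pair for sent in corpus for pair in zip(sent, sent[1:]))

def p_next(w2, w1):
    """P(w2 | w1) with add-one smoothing over the vocabulary."""
    return (bigram_counts[(w1, w2)] + 1) / (unigram_counts[w1] + len(vocab))

def sentence_prob(sent):
    """Probability of a sentence as a product of bigram probabilities."""
    prob = 1.0
    for w1, w2 in zip(sent, sent[1:]):
        prob *= p_next(w2, w1)
    return prob

print(sentence_prob(["<s>", "the", "cat", "sat", "</s>"]))
```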
Table 1 shows the planned lectures, along with readings. Slide links will start working once the slides are posted for a given lecture (usually shortly after the lecture). The textbook will be Eisenstein [1].
| Date | Unit | Topic | Readings |
|---|---|---|---|
| 3/28 | | introduction | [2]; [1] section A |
| 3/30 | probabilistic language models | generative | [1] sections 5.1–5.2; [3, 4, 5] |
| 4/4 | | featurized | [3]; [6] sections 2, 7.4 |
| 4/6 | | neural (continued) | [1] sections 5.3–5.6; [7] sections 0–4, 10–13 |
| 4/13 | | cotext: topic models | [1] section 13; [8] sections 1–4 |
| 4/18 | | cotext and bitext | [9] |
| 4/20–25 | text classifiers | methods & applications | [1] sections 1–3; [10, 11] |
| 4/25 | linguistic representations and analyzers | methods for sequences | [1] section 6; [12] |
| 4/27 | | parts of speech | [1] section 7.1; [13] |
| 5/2 | | supersenses, entities, chunking | [1] section 7.3; [14] |
| 5/2–4 | | graphical models | [15] |
| 5/9 | | phrase-structure trees | [1] sections 8–9; [16] |
| 5/11 | | syntactic dependencies | [1] section 10 |
| 5/16 | | semantic roles and relations | [1] section 12; [17] |
| 5/18 | | logical forms | [1] section 11; [18] |
| 5/23–30 | text generators | translation, summarization | [1] sections 17–18; [19] |
Students will be evaluated as follows:
Please read, print, sign, and return the academic integrity form.
[1] Jacob Eisenstein. Natural Language Processing. 2018. URL https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf.
[2] Julia Hirschberg and Christopher D. Manning. Advances in natural language processing. Science, 349(6245):261–266, 2015. URL https://www.sciencemag.org/content/349/6245/261.full.
[3] Noah A. Smith. Probabilistic language models 1.0, 2017. URL http://homes.cs.washington.edu/~nasmith/papers/plm.17.pdf.
[4] Michael Collins. Course notes for COMS W4705: Language modeling, 2011. URL http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/lm.pdf.
[5] Daniel Jurafsky and James H. Martin. N-grams (draft chapter), 2015. URL https://web.stanford.edu/~jurafsky/slp3/4.pdf.
[6] Michael Collins. Log-linear models, MEMMs, and CRFs, 2011. URL http://www.cs.columbia.edu/~mcollins/crf.pdf.
[7] Yoav Goldberg. A primer on neural network models for natural language processing, 2015. URL http://u.cs.biu.ac.il/~yogo/nnlp.pdf.
[8] Peter D. Turney and Patrick Pantel. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37(1):141–188, 2010. URL https://www.jair.org/media/2934/live-2934-4846-jair.pdf.
[9] Michael Collins. Statistical machine translation: IBM models 1 and 2, 2011. URL http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/ibm12.pdf.
[10] Daniel Jurafsky and James H. Martin. Classification: Naive Bayes, logistic regression, sentiment (draft chapter), 2015. URL https://web.stanford.edu/~jurafsky/slp3/7.pdf.
[11] Michael Collins. The naive Bayes model, maximum-likelihood estimation, and the EM algorithm, 2011. URL http://www.cs.columbia.edu/~mcollins/em.pdf.
[12] Michael Collins. Tagging with hidden Markov models, 2011. URL http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/hmms.pdf.
[13] Daniel Jurafsky and James H. Martin. Part-of-speech tagging (draft chapter), 2015. URL https://web.stanford.edu/~jurafsky/slp3/9.pdf.
[14] Daniel Jurafsky and James H. Martin. Information extraction (draft chapter), 2015. URL https://web.stanford.edu/~jurafsky/slp3/21.pdf.
[15] Daphne Koller, Nir Friedman, Lise Getoor, and Ben Taskar. Graphical models in a nutshell, 2007. URL http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.146.2935.
[16] Michael Collins. Probabilistic context-free grammars, 2011. URL http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/pcfgs.pdf.
[17] Daniel Jurafsky and James H. Martin. Semantic role labeling (draft chapter), 2015. URL https://web.stanford.edu/~jurafsky/slp3/22.pdf.
[18] Mark Steedman. A very short introduction to CCG, 1996. URL http://www.inf.ed.ac.uk/teaching/courses/nlg/readings/ccgintro.pdf.
[19] Michael Collins. Phrase-based translation models, 2013. URL http://www.cs.columbia.edu/~mcollins/pb.pdf.