CS388: Natural Language Processing (Spring 2021)
NOTE: This page is for an old semester of this class
Instructor: Greg Durrett, gdurrett@cs.utexas.edu
Lecture: Tuesday and Thursday 9:30am - 11:00am, on Zoom (see Canvas for Zoom links)
Instructor Office Hours: Tuesday 1pm-2pm, Wednesday 3:30pm-4:30pm (see Canvas for Zoom links)
TA: Xi Ye, xiye@cs.utexas.edu
TA Office Hours: Monday 11am-12pm, Thursday 3:30pm-4:30pm (see Canvas for Zoom links)
Piazza
Description
This class is a graduate-level introduction to Natural Language Processing (NLP), the study of computing systems that
can process, understand, or communicate in human language. The course covers fundamental approaches, largely machine learning
and deep learning, used across the field of NLP, as well as a comprehensive set of NLP tasks, both historical and contemporary.
Techniques studied include basic classification methods, conditional random fields, tree-structured models, recurrent and convolutional neural
networks, attention mechanisms, and pre-trained neural models such as BERT. Problems range from syntax (part-of-speech tagging, parsing) to semantics (lexical semantics, question
answering, grounding) and include applications such as summarization, machine translation, information extraction,
and dialogue systems. Programming assignments throughout the semester involve building scalable machine learning systems for several of these NLP tasks.
Requirements
- CS 391L (Machine Learning), CS 343 (Artificial Intelligence), or equivalent AI/ML course experience
- Familiarity with Python (for programming assignments)
- Additional prior exposure to discrete math, probability, linear algebra, optimization, linguistics, and NLP is useful but not required
Syllabus
Detailed syllabus with course policies
Assignments: There are four programming assignments that require
implementing models discussed in class. Framework code in Python and datasets
will be provided. In addition, there is an open-ended final project to
be done in teams of 2 (preferred) or individually. This project should constitute
novel exploration beyond directly implementing concepts from lecture and should
result in a report that roughly reads like an NLP/ML conference submission in
terms of presentation and scope.
Mini 1: Classification for Person Name Detection [code and dataset download]
Project 1: CRF tagging for NER [code and dataset download] [very strong project writeup sample]
Mini 2: Neural Networks for Sentiment Analysis [code and dataset download]
Project 2: Encoder-Decoder Models for Question Answering [code and dataset download]
Final Project
Readings: Textbook readings are assigned to complement the material discussed in lecture. You may find it useful
to do these readings before lecture as preparation or after lecture to review, but you are not expected to know everything discussed
in the textbook if it isn't covered in lecture.
Paper readings are intended to supplement the course material if you are interested in diving deeper into particular topics.
Bold readings and videos are most central to the course content; it's recommended that you look at these.
The chief text in this course is Eisenstein: Natural Language Processing,
available as a free PDF online. For deep learning techniques, this text will be supplemented with selections from Goldberg: A Primer on Neural Network Models for Natural Language Processing.
(Another generally useful NLP book is Jurafsky and Martin: Speech and Language Processing (3rd ed. draft), with many draft chapters available for free online; however,
we will not be using it much for this course.)
Date | Topics | Readings | Assignments
--- | --- | --- | ---
Jan 19 | Introduction [4pp] | | Mini 1 out
Jan 21 | Binary Classification [4pp] | Eisenstein 2.0-2.5, 4.2-4.4.1; Perceptron and logistic regression |
Jan 26 | Multiclass Classification [4pp] | Eisenstein 4.2; Multiclass lecture note |
Jan 28 | Sequence Models 1: HMMs [4pp] | Eisenstein 7.0-7.4, 8.1; Manning+11 POS; Viterbi lecture note | Mini 1 due / Proj 1 out
Feb 2 | Sequence Models 2: CRFs [4pp] | Eisenstein 7.5, 8.3; Sutton CRFs 2.3, 2.6.1; Wallach CRFs |
Feb 4 | Neural 1: Feedforward [4pp] | RatinovRoth2009 NER; Eisenstein 3.0-3.3; Botha+17 FFNNs; Iyyer+15 DANs; ffnn_example.py |
Feb 9 | Neural 2: Word Embeddings; Bias [4pp] | Eisenstein 3.3.4, 14.5-14.6; Goldberg 5; Mikolov+13 word2vec; Pennington+14 GloVe; Levy+14 Matrix Factorization; Grave+17 fastText; Bolukbasi+16 Gender; Gonen+19 Debiasing |
Feb 11 | Neural 3: RNNs [4pp] | JM 9.1-9.4; Goldberg 10-11; Karpathy+15 Visualizing | Proj 1 due / Mini 2 out
Feb 16 | NO CLASS | |
Feb 18 | NO CLASS | |
Feb 23 | OPTIONAL: Annotation, Dataset Bias [4pp] | Wang+15 Overnight; Gebru+18 Datasheets; Gururangan+18 Artifacts; Gardner+20 Contrast sets |
Feb 25 | Neural 4: Language Modeling, ELMo [4pp] | Eisenstein 6; JM 9.2.1; Melis+17 LSTM LMs; Merity+16 Pointer; Peters+18 ELMo; Peters+19 Frozen vs fine-tuned |
Mar 2 | Trees 1: Constituency, PCFGs [4pp] | Eisenstein 10.0-10.5; JM 12.1-12.6, 12.8; KleinManning13 Structural; Collins97 Lexicalized | Mini 2 due / FP out
Mar 4 | Trees 2: Dependency, Shift-reduce, State-of-the-art Parsers [4pp] | Eisenstein 11.1-11.3; JM 13.1-13.3, 13.5; Dozat+17 Dependency; JM 13.4; Andor+16 Parsey; KitaevKlein18; KitaevKlein20 Linear-time |
Mar 9 | Semantics / Seq2seq 1 [4pp] | Eisenstein 12; ZettlemoyerCollins05; Berant+13 |
Mar 11 | Seq2seq 2: Attention [4pp] | Sutskever+14 Seq2seq; JiaLiang16 Recomb; Bahdanau+14 Attention; Luong+15 Attention | FP proposal due / Proj 2 out
Mar 16 | NO CLASS | |
Mar 18 | NO CLASS | |
Mar 23 | Seq2seq 3: Copying/Pointers, Degeneration, etc. [4pp] | Vaswani+17 Transformers; Holtzman+19 Degeneration |
Mar 25 | MT 1: Phrase-based [4pp] | Eisenstein 18.0-18.2; Vogel96 HMM; Koehn04 Pharaoh |
Mar 30 | MT 2: Neural, Transformers [4pp] | Eisenstein 18.3-18.4; Alammar Transformer Blog; BostromDurrett20 Tokenization; Wu+16 Google NMT; SennrichZhang19 Low-resource; Aji+20 Transfer |
April 1 | Pre-training 1: BERT, GPT [4pp] | Radford+18 GPT; Devlin+19 BERT; Radford+19 GPT2; Liu+19 RoBERTa; Clark+19 What does BERT look at?; Rogers+20 BERTology | Proj 2 due
April 6 | Pre-training 2: BART/T5, GPT-3, Ethics [4pp] | Raffel+19 T5; Lewis+19 BART; Kaplan+20 Scaling; Brown+20 GPT3; BenderGebru+21 Stochastic Parrots |
April 8 | Generation: Dialogue, Summarization [4pp] | Yu+19 Gunrock; Adiwardana+20 Google Meena; Roller+20 Facebook Blender; See+17 Pointer-Generator; GoyalDurrett20 Factuality |
April 13 | Interpreting NNs [4pp] | Lipton+16 Mythos; Ribeiro+16 LIME; Simonyan+13 Visualizing; Sundararajan+17 Int Grad; Nguyen18 Evaluating Explanations; Interpretation Tutorial |
April 15 | QA 1: Reading comprehension [4pp] | Rajpurkar+16 SQuAD; Seo+16 BiDAF |
April 20 | QA 2: Multi-hop, etc. [4pp] | JiaLiang17 Adversarial; Chen+17 QA on Wikipedia; Lee+19 Latent Retrieval; ChenDurrett19 Multi-hop reasoning; Kwiatkowski+19 Natural Questions |
April 22 | GUEST LECTURE: Jason Baldridge | Sanabria+21 Talk Don't Write; Koh+21 Text-to-image; Zhang+21 Cross-Modal Contrastive Learning; Ku+20 Room-across-room; McClelland+20 Integrated |
April 27 | Multilingual / Cross-lingual models [4pp] | DasPetrov11 Xlingual POS; McDonald+11 Xlingual parsing; Ammar+16 Xlingual embeddings; ArtetxeSchwenk19 Multilingual sent embs; Conneau+20 XLM-R; Pires+19 How multilingual is mBERT?; Clark+20 TyDi |
April 29 | Wrapup + Ethics [4pp] | HovySpruit2016 Social Impact of NLP; Zhao+17 Bias Amplification; Rudinger+18 Gender Bias in Coref; Gebru+18 Datasheets for Datasets; Raji+20 Auditing; BenderGebru+21 Stochastic Parrots |
May 4 | FP presentations 1 [4pp] | |
May 6 | FP presentations 2 [4pp] | | FP due May 12