CS388: Natural Language Processing (Spring 2021)
NOTE: This page is for an old semester of this class
Instructor: Greg Durrett, gdurrett@cs.utexas.edu
Lecture: Tuesday and Thursday 9:30am - 11:00am, on Zoom (see Canvas for Zoom links)
Instructor Office Hours: Tuesday 1pm-2pm, Wednesday 3:30pm-4:30pm (see Canvas for Zoom links)
TA: Xi Ye, xiye@cs.utexas.edu
TA Office Hours: Monday 11am-12pm, Thursday 3:30pm-4:30pm (see Canvas for Zoom links)
Piazza
Description
This class is a graduate-level introduction to Natural Language Processing (NLP), the study of computing systems that
can process, understand, or communicate in human language. The course covers fundamental approaches, largely machine learning
and deep learning, used across the field of NLP, as well as a comprehensive set of NLP tasks, both historical and contemporary.
Techniques studied include basic classification methods, conditional random fields, tree-structured models, recurrent and convolutional neural
networks, attention mechanisms, and pre-trained neural models such as BERT. Problems range from syntax (part-of-speech tagging, parsing) to semantics (lexical semantics, question
answering, grounding) and include applications such as summarization, machine translation, information extraction,
and dialogue systems. Programming assignments throughout the semester involve building scalable machine learning systems for several of these NLP tasks.
Requirements
- CS 391L (Machine Learning), CS 343 (Artificial Intelligence), or equivalent AI/ML course experience
- Familiarity with Python (for programming assignments)
- Additional prior exposure to discrete math, probability, linear algebra, optimization, linguistics, and NLP is useful but not required
Syllabus
Detailed syllabus with course policies
Assignments: There are four programming assignments that require
implementing models discussed in class. Framework code in Python and datasets
will be provided. In addition, there is an open-ended final project to
be done in teams of 2 (preferred) or individually. This project should constitute
novel exploration beyond directly implementing concepts from lecture and should
result in a report that roughly reads like an NLP/ML conference submission in
terms of presentation and scope.
Mini 1: Classification for Person Name Detection [code and dataset download]
Project 1: CRF tagging for NER [code and dataset download] [very strong project writeup sample]
Mini 2: Neural Networks for Sentiment Analysis [code and dataset download]
Project 2: Encoder-Decoder Models for Question Answering [code and dataset download]
Final Project
Readings: Textbook readings are assigned to complement the material discussed in lecture. You may find it useful
to do these readings before lecture as preparation or after lecture to review, but you are not expected to know everything discussed
in the textbook if it isn't covered in lecture.
Paper readings are intended to supplement the course material if you are interested in diving deeper into particular topics.
Bold readings and videos are most central to the course content; it's recommended that you look at these.
The chief text in this course is Eisenstein: Natural Language Processing,
available as a free PDF online. For deep learning techniques, this text will be supplemented with selections from Goldberg: A Primer on Neural Network Models for Natural Language Processing.
(Another generally useful NLP book is Jurafsky and Martin: Speech and Language Processing (3rd ed. draft), with many draft chapters available for free online; however,
we will not be using it much for this course.)
Date | Topics | Readings | Assignments
--- | --- | --- | ---
Jan 19 | Introduction [4pp] | | Mini 1 out
Jan 21 | Binary Classification [4pp] | Eisenstein 2.0-2.5, 4.2-4.4.1; Perceptron and logistic regression |
Jan 26 | Multiclass Classification [4pp] | Eisenstein 4.2; Multiclass lecture note |
Jan 28 | Sequence Models 1: HMMs [4pp] | Eisenstein 7.0-7.4, 8.1; Manning+11 POS; Viterbi lecture note | Mini 1 due / Proj 1 out
Feb 2 | Sequence Models 2: CRFs [4pp] | Eisenstein 7.5, 8.3; Sutton CRFs 2.3, 2.6.1; Wallach CRFs |
Feb 4 | Neural 1: Feedforward [4pp] | RatinovRoth2009 NER; Eisenstein 3.0-3.3; Botha+17 FFNNs; Iyyer+15 DANs; ffnn_example.py |
Feb 9 | Neural 2: Word Embeddings; Bias [4pp] | Eisenstein 3.3.4, 14.5-14.6; Goldberg 5; Mikolov+13 word2vec; Pennington+14 GloVe; Levy+14 Matrix Factorization; Grave+17 fastText; Bolukbasi+16 Gender; Gonen+19 Debiasing |
Feb 11 | Neural 3: RNNs [4pp] | JM 9.1-9.4; Goldberg 10-11; Karpathy+15 Visualizing | Proj 1 due / Mini 2 out
Feb 16 | NO CLASS | |
Feb 18 | NO CLASS | |
Feb 23 | OPTIONAL: Annotation, Dataset Bias [4pp] | Wang+15 Overnight; Gebru+18 Datasheets; Gururangan+18 Artifacts; Gardner+20 Contrast sets |
Feb 25 | Neural 4: Language Modeling, ELMo [4pp] | Eisenstein 6; JM 9.2.1; Melis+17 LSTM LMs; Merity+16 Pointer; Peters+18 ELMo; Peters+19 Frozen vs fine-tuned |
Mar 2 | Trees 1: Constituency, PCFGs [4pp] | Eisenstein 10.0-10.5; JM 12.1-12.6, 12.8; KleinManning13 Structural; Collins97 Lexicalized | Mini 2 due / FP out
Mar 4 | Trees 2: Dependency, Shift-reduce, State-of-the-art Parsers [4pp] | Eisenstein 11.1-11.3; JM 13.1-13.3, 13.5; Dozat+17 Dependency; JM 13.4; Andor+16 Parsey; KitaevKlein18; KitaevKlein20 Linear-time |
Mar 9 | Semantics / Seq2seq 1 [4pp] | Eisenstein 12; ZettlemoyerCollins05; Berant+13 |
Mar 11 | Seq2seq 2: Attention [4pp] | Sutskever+14 Seq2seq; JiaLiang16 Recomb; Bahdanau+14 Attention; Luong+15 Attention | FP proposal due / Proj 2 out
Mar 16 | NO CLASS | |
Mar 18 | NO CLASS | |
Mar 23 | Seq2seq 3: Copying/Pointers, Degeneration, etc. [4pp] | Vaswani+17 Transformers; Holtzman+19 Degeneration |
Mar 25 | MT 1: Phrase-based [4pp] | Eisenstein 18.0-18.2; Vogel96 HMM; Koehn04 Pharaoh |
Mar 30 | MT 2: Neural, Transformers [4pp] | Eisenstein 18.3-18.4; Alammar Transformer Blog; BostromDurrett20 Tokenization; Wu+16 Google NMT; SennrichZhang19 Low-resource; Aji+20 Transfer |
April 1 | Pre-training 1: BERT, GPT [4pp] | Radford+18 GPT; Devlin+19 BERT; Radford+19 GPT2; Liu+19 RoBERTa; Clark+19 What does BERT look at?; Rogers+20 BERTology | Proj 2 due
April 6 | Pre-training 2: BART/T5, GPT-3, Ethics [4pp] | Raffel+19 T5; Lewis+19 BART; Kaplan+20 Scaling; Brown+20 GPT3; BenderGebru+21 Stochastic Parrots |
April 8 | Generation: Dialogue, Summarization [4pp] | Yu+19 Gunrock; Adiwardana+20 Google Meena; Roller+20 Facebook Blender; See+17 Pointer-Generator; GoyalDurrett20 Factuality |
April 13 | Interpreting NNs [4pp] | Lipton+16 Mythos; Ribeiro+16 LIME; Simonyan+13 Visualizing; Sundararajan+17 Int Grad; Nguyen18 Evaluating Explanations; Interpretation Tutorial |
April 15 | QA 1: Reading comprehension [4pp] | Rajpurkar+16 SQuAD; Seo+16 BiDAF |
April 20 | QA 2: Multi-hop, etc. [4pp] | JiaLiang17 Adversarial; Chen+17 QA on Wikipedia; Lee+19 Latent Retrieval; ChenDurrett19 Multi-hop reasoning; Kwiatkowski+19 Natural Questions |
April 22 | GUEST LECTURE: Jason Baldridge | Sanabria+21 Talk Don't Write; Koh+21 Text-to-image; Zhang+21 Cross-Modal Contrastive Learning; Ku+20 Room-across-room; McClelland+20 Integrated |
April 27 | Multilingual / Cross-lingual models [4pp] | DasPetrov11 Xlingual POS; McDonald+11 Xlingual parsing; Ammar+16 Xlingual embeddings; ArtetxeSchwenk19 Multilingual sent embs; Conneau+20 XLM-R; Pires+19 How multilingual is mBERT?; Clark+20 TyDi |
April 29 | Wrapup + Ethics [4pp] | HovySpruit2016 Social Impact of NLP; Zhao+17 Bias Amplification; Rudinger+18 Gender Bias in Coref; Gebru+18 Datasheets for Datasets; Raji+20 Auditing; BenderGebru+21 Stochastic Parrots |
May 4 | FP presentations 1 [4pp] | |
May 6 | FP presentations 2 [4pp] | | FP due May 12