CS388: Natural Language Processing (Spring 2023)
Instructor: Greg Durrett, gdurrett@cs.utexas.edu
Lecture: Tuesday and Thursday 9:30am - 11:00am, UTC 1.146
Instructor Office Hours: Monday 12pm, Thursday 4pm, on Zoom (see Canvas for meetings)
TAs: Kaj Bostrom, Xuefei (Sophie) Zhao
TA Office Hours:
- Monday 3pm (Kaj)
- Wednesday 4:30pm (Sophie)
- Friday 3pm (Kaj)
Discussion forum: Piazza
Description
This class is a graduate-level introduction to Natural Language Processing (NLP), the study of computing systems that
can process, understand, or communicate in human language. The course covers fundamental approaches, particularly deep learning
and language model pre-training, used across the field of NLP, as well as a comprehensive set of NLP tasks both historical and contemporary.
Techniques studied include basic classification techniques, feedforward neural networks, attention mechanisms, pre-trained neural models such as BERT, and structured
models (sequences, trees, etc.). Problems range from syntax (part-of-speech tagging, parsing) to semantics (lexical semantics, question
answering, grounding) and include various applications such as summarization, machine translation, information extraction,
and dialogue systems. Programming assignments throughout the semester involve building scalable machine learning systems for several of these NLP tasks.
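To give a flavor of what these assignments involve, here is a minimal sketch of a linear sentiment classifier in the spirit of Project 1: a perceptron over bag-of-words features. The tiny dataset and feature extractor are illustrative stand-ins, not the actual assignment scaffold.

# Minimal sketch of a perceptron sentiment classifier over bag-of-words
# features. The tiny dataset below is an illustrative stand-in for the
# project's real data, not the actual assignment code.
from collections import Counter

def extract_features(sentence):
    """Map a sentence to sparse bag-of-words counts."""
    return Counter(sentence.lower().split())

def train_perceptron(examples, epochs=10):
    """examples: list of (sentence, label) pairs with label in {+1, -1}."""
    weights = Counter()
    for _ in range(epochs):
        for sentence, label in examples:
            feats = extract_features(sentence)
            score = sum(weights[f] * v for f, v in feats.items())
            pred = 1 if score >= 0 else -1
            if pred != label:  # perceptron: update weights on mistakes only
                for f, v in feats.items():
                    weights[f] += label * v
    return weights

def predict(weights, sentence):
    feats = extract_features(sentence)
    return 1 if sum(weights[f] * v for f, v in feats.items()) >= 0 else -1

if __name__ == "__main__":
    train = [("a great and moving film", 1),
             ("truly wonderful acting", 1),
             ("dull and boring plot", -1),
             ("a terrible waste of time", -1)]
    w = train_perceptron(train)
    print(predict(w, "wonderful film"))       # expected: 1
    print(predict(w, "boring and terrible"))  # expected: -1

The actual projects scale the same idea up: real feature extraction or learned representations, larger datasets, and proper train/dev/test evaluation.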
Requirements
- CS 391L (Machine Learning), CS 343 (Artificial Intelligence), or equivalent AI/ML course experience
- Familiarity with Python for programming assignments
- Additional prior exposure to discrete math, probability, linear algebra, optimization, linguistics, and NLP is useful but not required
Detailed syllabus with course policies
Assignments: See the syllabus for more details on these.
Project 1: Linear and Neural Sentiment Classification [code and dataset download]
Project 2: Transformer Language Modeling [code and dataset download] (see the attention sketch after this list)
Project 3: Dataset Biases [code and dataset download]
Final Project
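Since Project 2 centers on Transformer language modeling, the sketch below shows single-head scaled dot-product attention with a causal mask, the core operation of the Transformer (Vaswani+17). It is a minimal NumPy illustration with made-up shapes and names, not the project's scaffold code.

# Minimal sketch of single-head scaled dot-product attention
# (Vaswani+17). Shapes and names are illustrative only.
import numpy as np

def attention(Q, K, V, causal=True):
    """Q, K, V: [seq_len, d_k] arrays. Returns [seq_len, d_k]."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # [seq_len, seq_len] similarity scores
    if causal:
        # Mask out future positions so each token attends only leftward,
        # as required for language modeling.
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -1e9, scores)
    # Numerically stable row-wise softmax over key positions.
    scores = scores - scores.max(axis=-1, keepdims=True)
    probs = np.exp(scores)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    return probs @ V  # weighted average of value vectors

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((5, 16))  # 5 tokens, d_k = 16 (self-attention)
    print(attention(x, x, x).shape)   # (5, 16)

The causal mask is what makes this usable for language modeling: position i can only attend to positions at or before i, so the model never sees the token it is trying to predict.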
Readings: Textbook readings are assigned to complement the material discussed in lecture. You may find it useful
to do these readings before lecture as preparation or after lecture to review, but you are not expected to know everything discussed
in the textbook if it isn't covered in lecture.
Paper readings are intended to supplement the course material if you are interested in diving deeper on particular topics.
Bold readings and videos are most central to the course content; it's recommended that you look at these.
The chief text in this course is Eisenstein: Natural Language Processing,
available as a free PDF online. For deep learning techniques, this text will be supplemented with selections from Goldberg: A Primer on Neural Network Models for Natural Language Processing.
(Another generally useful NLP book is Jurafsky and Martin: Speech and Language Processing (3rd ed. draft), with many draft chapters available for free online; however,
we will not be using it much for this course.)
Date | Topics | Readings | Assignments
Jan 10 | Introduction [4pp] | | P1 out
Jan 12 | Binary Classification [4pp] | Eisenstein 2.0-2.5, 4.2-4.4.1; Perceptron and logistic regression |
Jan 17 | Multiclass Classification [4pp] | Eisenstein 4.2; Multiclass lecture note |
Jan 19 | Neural 1: Feedforward [4pp] | Eisenstein 3.0-3.3; Botha+17 FFNNs; Iyyer+15 DANs; ffnn_example.py |
Jan 24 | Neural 2: Word Embeddings, Bias in Embeddings [4pp] | Eisenstein 3.3.4, 14.5-14.6; Goldberg 5; Mikolov+13 word2vec; Pennington+14 GloVe; Levy+14 Matrix Factorization; Grave+17 fastText; Burdick+18 Instability; Bolukbasi+16 Gender; Gonen+19 Debiasing |
Jan 26 | Neural 3: Language Modeling, Attention [4pp] | Bengio+03 NPLM; Luong+15 Attention; Vaswani+17 Transformers; Alammar Illustrated Transformer | P1 due / P2 out
Jan 31 | NO CLASS | |
Feb 2 | NO CLASS | |
Feb 7 | Neural 4: Transformers [4pp] | Vaswani+17 Transformers; Alammar Illustrated Transformer; Kaplan+20 Scaling Laws; Beltagy+20 Longformer; Choromanski+21 Performer; Tay+20 Efficient Transformers |
Feb 9 | Pre-training 1: Encoders (BERT), Tokenization [4pp] | Peters+18 ELMo; Devlin+19 BERT; Alammar Illustrated BERT; Liu+19 RoBERTa; Clark+20 ELECTRA; He+21 DeBERTa; BostromDurrett20 Tokenizers |
Feb 14 | Pre-training 2: Decoders (GPT/T5), Decoding Methods [4pp] | Raffel+19 T5; Lewis+19 BART; Radford+19 GPT2; Brown+20 GPT3; Chowdhery+22 PaLM; Holtzman+19 Nucleus Sampling | P2 due / FP proposals out
Feb 16 | Evaluation in NLP, Datasets, Dataset Bias [4pp] | Wang+19 SuperGLUE; BIG-Bench; Gururangan+18 Artifacts; McCoy+19 Right; Gardner+20 Contrast; Swayamdipta+20 Cartography; Utama+20 Debiasing |
Feb 21 | Understanding NNs 1: Interpretability [4pp] | Lipton+16 Mythos; Ribeiro+16 LIME; Simonyan+13 Visualizing; Sundararajan+17 Int Grad; Interpretation Tutorial |
Feb 23 | Understanding NNs 2: Prompting, Interpreting GPT-3 [4pp] | Zhao+21 Calibrate Before Use; Min+22 Rethinking Demonstrations; Gonen+22 Demystifying Prompts; Xie+21 ICL as Implicit Bayesian Inference; Akyurek+22 ICL Regression; Olsson+22 Induction Heads | FP proposals due / P3 out
Feb 28 | Understanding NNs 3: Rationales, Chain-of-thought [4pp] | Camburu+18 e-SNLI; Wei+22 CoT; YeDurrett22 Unreliability; Kojima+22 Step-by-step; Ye+22 Complementary |
Mar 2 | Instruction Tuning, RL in NLP, Dialogue/ChatGPT [4pp] | Sanh+21 T0; Liu+21 Prompting; Chung+22 Flan-PaLM; Ouyang+22 Human Feedback; Ramamurthy+22 RL for NLP; Roller+20 Facebook Blender; Gehman+20 Toxicity; Thoppilan+22 LaMDA |
Mar 7 | Sequence Tagging [4pp] | Eisenstein 7, 8; Manning+11 POS; Sutton CRFs 2.3, 2.6.1; Wallach CRFs |
Mar 9 | Trees 1: Constituency, PCFGs [4pp] | Eisenstein 10.0-10.5; JM 12.1-12.6, 12.8; KleinManning13 Structural; Collins97 Lexicalized | P3 due
Mar 14 | NO CLASS | |
Mar 16 | NO CLASS | |
Mar 21 | Trees 2: Dependency, Shift-reduce, State-of-the-art Parsers [4pp] | Eisenstein 11.1-11.3; JM 13.1-13.3, 13.5; Dozat+17 Dependency; JM 13.4; Andor+16 Parsey; KitaevKlein18; KitaevKlein20 Linear-time |
Mar 23 | Apps 1: Question Answering [4pp] | Eisenstein 12; Chen+17 DrQA; Lee+19 Latent Retrieval; Guu+20 REALM; Kwiatkowski+19 NQ; Min+20 AmbigQA; ZhangChoi21 SituatedQA; Nakano+21 WebGPT |
Mar 28 | Apps 2: Machine Translation [4pp] | Eisenstein 18.1-18.2, 18.4; Michael Collins IBM Models 1+2; JHU slides; History of MT; Wu+16 Google; Chen+18 Google; SennrichZhang19 Low-resource |
Mar 30 | Apps 3: Language and Code [4pp] | ZettlemoyerCollins05; Berant+13; JiaLiang16 Recomb; Wei+20 Type Inference; Wang+21 CodeT5; Chen+21 Codex |
April 4 | Guest Lecture: Sebastian Gehrmann (Bloomberg) [4pp] | |
April 6 | Language Grounding [4pp] | Radford+21 CLIP |
April 11 | Multilinguality [4pp] | Ammar+16 Xlingual Embeddings; Conneau+19 XLM-R; Pires+19 How Multilingual is mBERT? |
April 13 | Wrapup + Ethics [4pp] | HovySpruit16 Social Impact of NLP; Zhao+17 Bias Amplification; Rudinger+18 Gender Bias in Coref; BenderGebru+21 Stochastic Parrots; Gebru+18 Datasheets for Datasets; Raji+20 Auditing |
April 18 | FP presentations 1 [4pp] | |
April 20 | FP presentations 2 [4pp] | | FP due April 28