CS395T: Structured Models for NLP

Instructor: Greg Durrett, gdurrett@cs.utexas.edu
Lecture: Tuesday and Thursday 9:30am - 11:00am, Garrison Hall 0.132 (GAR)
Instructor Office Hours: Wednesday 10:00am - 12:00pm, GDC 3.420 (additional OHs by appointment)
TA: Ye Zhang
TA Office Hours: Tuesday and Thursday 2pm-3pm, GDC 1.302

Piazza

Description

This class covers a range of topics in structured prediction and deep learning, with a focus on applications to NLP. We discuss model structures that commonly arise in NLP, such as sequence models, tree-structured models, more general graphical models, recurrent neural networks, and convolutional neural networks, as well as the connections between these. We study the models themselves, examples of problems they are applied to, inference methods, parameter estimation (both supervised and unsupervised approaches), and optimization. Programming assignments involve building scalable machine learning systems for various NLP tasks, with a focus on understanding design decisions surrounding modeling, inference, and learning, and how these interact.

Differences from CS388: This class is intended to complement CS388; CS388 is not required as a prerequisite, nor will those who have taken CS388 have seen everything in this class. In particular, this class places greater emphasis on the fundamentals of structured machine learning and covers a wider range of deep learning techniques, while CS388 focuses more on covering broadly important NLP problems and studying the underlying linguistic phenomena.

Requirements

Syllabus

Detailed syllabus with course policies

This course is broken into two halves: the first half covers structured prediction techniques with linear models, and the second revisits these techniques and structures in the context of deep neural networks. Throughout the course, methods will be illustrated via a number of NLP tasks including POS tagging, named entity recognition, syntactic parsing, sentiment analysis, machine translation, image captioning, and others. This schedule is tentative! Because this is the first time this course is being offered, lecture topics at the end may shift around.

Assignments: There are three programming assignments that require implementing models discussed in class. Framework code in Python and datasets will be provided. If you prefer to use another language, that is possible as well, but you'll have to implement some basic file I/O and other parts of the framework code yourself. In addition, there is an open-ended final project to be done either individually or in teams of 2. This project should constitute novel exploration beyond directly implementing concepts from lecture and should result in a report that roughly reads like an NLP/ML conference submission in terms of presentation and scope.
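
For a sense of what reimplementing the framework I/O entails, here is a minimal sketch of a dataset reader. The CoNLL-style format (one word/tag pair per line, blank lines between sentences) and the function name are illustrative assumptions, not the actual project framework, whose data formats may differ:

    # Minimal sketch, not the course framework: reads a hypothetical CoNLL-style
    # file with one "word<TAB>tag" pair per line and blank lines between sentences.
    def read_tagged_sentences(path):
        sentences, current = [], []
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if not line:              # blank line marks a sentence boundary
                    if current:
                        sentences.append(current)
                        current = []
                else:
                    word, tag = line.split("\t")
                    current.append((word, tag))
        if current:                       # handle a file with no trailing blank line
            sentences.append(current)
        return sentences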

Samples of successful Project 1 reports: Sample 1 Sample 2

Project 1: CRF for NER [download code]

Project 2: Shift-Reduce Parsing [download code]

Project 3: Neural Networks for Sentiment Analysis [download code and data (20MB)]

Final Project

Readings: These are purely optional and intended to supplement lecture and give you another view of the material. Two main sources will be used:

- Jurafsky and Martin, Speech and Language Processing (3rd edition draft), cited as "JM" in the schedule below
- Goldberg, A Primer on Neural Network Models for Natural Language Processing, cited as "Goldberg" below

Date | Topics | Readings | Assignments
-----|--------|----------|------------
Aug 31 | Introduction [1pp] [4pp] | |
Sept 5 | Binary classification [4pp] | JM 6.1-6.3 |
Sept 7 | Multiclass classification [4pp] | JM 7, Structured SVM secs 1-2 |
Sept 12 | Sequence models I: HMMs [4pp] | JM 9, JM 10.4, Manning POS | P1 out
Sept 14 | Sequence models II: CRFs [4pp] | Sutton CRFs 2.3, 2.6.1, Wallach CRFs tutorial, Illinois NER |
Sept 19 | Sequence models III: Unsupervised [4pp] | JM 9.5, Painless |
Sept 21 | Tree models I: Constituency [4pp] | JM 13.1-13.7, Structural, Lexicalized, State-split |
Sept 26 | Tree models II: Constituency II / Dependency I [4pp]; Tips for Academic Writing [4pp] | |
Sept 28 | Tree models III: Dependency II [4pp] | JM 14.1-14.4, Huang 1-2 | P1 due / P2 out
Oct 3 | Tree models IV: Global Dependency Parsing [4pp] | Parsey, Huang 2 |
Oct 5 | "Loopy" graphical models [4pp] | Skip-chain NER, Joint entity |
Oct 10 | Machine translation [4pp] | HMM alignment, Pharaoh |
Oct 12 | Feedforward neural networks [4pp] | Goldberg 1-4, 6, NLP with FFNNs, DANs |
Oct 17 | NN implementation, word reprs. [4pp] | Goldberg 5, word2vec, GloVe, Dropout | P2 due / P3 out
Oct 19 | RNNs I: Encoders [4pp] | Goldberg 10-11, SNLI, Visualizing |
Oct 24 | RNNs II: Decoders [4pp] | Seq2seq, Attention, Luong Attention |
Oct 26 | CNNs [4pp] | Goldberg 9, Kim, ByteNet |
Oct 31 | Special guest lecture: Ye Zhang | |
Nov 2 | Advanced NNs I: Neural CRFs [4pp] | Collobert and Weston, Neural NER, Neural CRF parsing | P3 due / FP out
Nov 7 | Advanced NNs II: QA/memory networks [4pp] | E2E Memory Networks, CBT, SQuAD, BiDAF |
Nov 9 | Deep generative models/VAE [4pp] | Bowman VAE, Miao VAE | Proposals due
Nov 14 | Summarization [4pp] | MMR, Gillick, Sentence compression, SummaRuNNer, Pointer |
Nov 16 | Special guest lecture: Katrin Erk | |
Nov 21 | Dialogue systems [4pp] | RNN chatbots, Diversity, Goal-oriented, Latent Intention, QA-as-dialogue |
Nov 23 | NO CLASS (Thanksgiving) | |
Nov 28 | Information extraction [4pp] | Distant supervision, RL for slot filling, TextRunner, ReVerb, NELL |
Nov 30 | Wrapup [4pp] | |
Dec 5 | Project presentations I | |
Dec 7 | Project presentations II | |
Dec 15 | | | FP due