Course Specifications for
CS 388 Natural Language Processing
- When & Where: Fall 2008, MW 2:00-3:30PM, PAR 1
- Unique Number: 55848
- Professor: Ray Mooney, CSA 1.102, 471-9558, mooney@cs.utexas.edu
- Office Hours: Tu, Th 10-11 AM or by appointment
- Teaching Assistant (TA): Joseph Reisinger
- TA Office Hours: Mon, Fri 3:30-4:30PM, CSA 1.112
- Class Email Alias: cs388@cs.utexas.edu
- Prerequisites: Basic knowledge of formal-language/automata
theory (e.g., regular and context-free grammars), artificial intelligence
(e.g., search, logic, and knowledge representation), and Java programming.
Knowledge of machine learning will be extremely useful but is not strictly
necessary.
- Textbook: Jurafsky and Martin,
Speech and Language Processing: An Introduction to Natural Language
Processing, Computational Linguistics, and Speech Recognition, Second
Edition, Prentice Hall, 2008.
- Recommended Supplementary Text: Manning and Schütze,
Foundations of Statistical Natural Language Processing,
MIT Press, Cambridge, MA, May 1999.
Course Overview
The intent of the course is to present a fairly broad graduate-level
introduction to Natural Language Processing (NLP, a.k.a. computational
linguistics), the study of computing systems that can process, understand, or
communicate in human language. The primary focus of the course will be on
understanding various NLP tasks as listed on the course
syllabus, algorithms for effectively solving these problems, and methods
for evaluating their performance. There will be a focus on statistical
learning algorithms that train on (annotated) text corpora to automatically
acquire the knowledge needed to perform the task. Class lectures will discuss
general issues as well as present abstract algorithms. Implemented versions of
some of the algorithms will be provided in order to give a feel for how the
systems discussed in class "really work" and allow for extensions and
experimentation as part of the course projects.
Course Requirements and Grading
Chapters from the text and a few other readings will be assigned throughout the
semester, and the reading should be done before the corresponding class.
Copies of the class lecture slides (in PowerPoint) will be available on the
course home page. There will be about four homework
assignments (about one every two to three weeks early on) as well as a final
research project.
The homework assignments will be a mix of problem solving and programming
assignments. All programming assignments will be in Java.
If you do not know Java, you will need to
learn it on your own. You can use your student account on the department
workstations or any other Java platform available to you (however, we will only
provide support for running on departmental Unix machines). If you are not a
CS student and need a temporary department account, apply on the web
here.
The final project can be a more ambitious experiment or enhancement involving
an existing NLP system or a new system implementation. In either case, the
implementation and/or experiments should be accompanied by a short paper (about
6 to 7 single-spaced pages) describing the project. A list of suggested
projects as well as an outline for the project report will be available later
on the course home page. About half-way through the
semester you will be asked to submit a one-page project proposal.
Late Submission and Cheating Policies
Homework assignments should be completed independently by each student
and any program code should always be appropriately commented. Assignments are
due at the beginning of class on the due date. Be sure to hand in assignments
on time; late penalties are deducted as a percentage of the original overall
points for the assignment: 1 day: 15%; 2 days: 40%; 3 days: 75%; more than 3
days: 100%. A day is a 24-hour period starting at the beginning of class and
includes all weekend days and holidays.
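As an illustration only, the late-penalty schedule above can be sketched in Java. The class and method names are hypothetical, and the sketch rests on two assumptions the policy does not state outright: any part of a 24-hour period counts as a whole late day, and the "original overall points" are the assignment's total possible points.

```java
class LatePenalty {
    static final long MINUTES_PER_DAY = 24 * 60;

    // Number of late days, measured from the beginning of class on the due date.
    // Assumption: a partial 24-hour period is rounded up to a full day.
    static long lateDays(long minutesLate) {
        if (minutesLate <= 0) {
            return 0; // handed in on time
        }
        return (minutesLate + MINUTES_PER_DAY - 1) / MINUTES_PER_DAY;
    }

    // Penalty schedule from the policy: 15%, 40%, 75%, then 100%.
    static int penaltyPercent(long lateDays) {
        if (lateDays <= 0) return 0;
        if (lateDays == 1) return 15;
        if (lateDays == 2) return 40;
        if (lateDays == 3) return 75;
        return 100;
    }

    // Points deducted from an assignment worth totalPoints.
    // Assumption: the percentage applies to the assignment's total points.
    static double pointsLost(double totalPoints, long minutesLate) {
        return totalPoints * penaltyPercent(lateDays(minutesLate)) / 100.0;
    }

    public static void main(String[] args) {
        System.out.println(pointsLost(100, 30));                      // 30 minutes late
        System.out.println(pointsLost(100, 3 * MINUTES_PER_DAY + 1)); // just past 3 days
    }
}
```

Under these assumptions, an assignment worth 100 points that is handed in 30 minutes after class begins loses 15 points, and one handed in more than three days late loses all 100.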
The preference is for individual final projects from each student; however,
larger projects done by pairs of students are possible with approval of the
instructor.
Read the department's academic policy page at
http://www.cs.utexas.edu/users/ear/CodeOfConduct.html. Students who
demonstrably violate the Academic Honesty policy will receive a failing grade
in the class.
Final Grade
The final grade will be computed as follows:
60% Homeworks
33% Final Project
7% Class Participation
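
As a rough sketch of how these weights combine, the following Java snippet computes a weighted final grade. The class and method names are hypothetical, and it assumes each component has already been scored on a 0-100 scale; the syllabus does not say how component scores are normalized.

```java
class FinalGrade {
    // Weights from the syllabus: 60% homeworks, 33% final project,
    // 7% class participation. Assumption: each input is on a 0-100 scale.
    static double finalGrade(double homework, double project, double participation) {
        return (60 * homework + 33 * project + 7 * participation) / 100.0;
    }

    public static void main(String[] args) {
        // (60*90 + 33*80 + 7*100) / 100 = 8740 / 100
        System.out.println(finalGrade(90, 80, 100)); // prints 87.4
    }
}
```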