CS 386d Database Systems
This is a graduate-level introduction
to the principles of database systems. We review
and explain fundamental ideas and algorithms that are used in the construction of centralized DBMSs, distributed DBMSs,
and database machines. Topics to be covered include: query processing and optimization,
database machines, object-oriented databases, concurrency control and recovery.
Recent directions in database research are also surveyed.
All students must have taken an undergraduate
database course or its equivalent.
Lecture Notes, Texts, and Class Lectures
Lecture notes for the course will be made available at the
Texas Union Copy Center. The first set of notes should now be
available; additional sets of notes will be announced in class. Class lectures are supplemented by papers (available
below). Students
are responsible for reading these papers. Remember: DO NOT
PRINT THESE PAPERS ON CS PRINTERS!
The first two lectures on concurrency control and recovery (CC&R) are
here.
The last class periods of the semester
are devoted to student group presentations. These presentations
will survey recent results presented in major database conferences.
Group lectures will be critiqued prior to the actual presentation
to enhance quality and content. Topics for presentation can be chosen from
any recent database conference.
Programming Projects and Homework
There will be a series of three programming projects
during this course. The first project will refresh your memory of SQL; the
second will be either to build an inverted file system using Berkeley DB
(a file management system written in Java) or a special project of your
own; the third will be to develop a lock manager.
Details on these projects are forthcoming.
Project submission. To submit your project, execute on a CS
Linux machine:
> turnin -submit <TBD> cs386d <file>
To check your submissions:
> turnin -list <TBD> cs386d
Final grades will be determined approximately by the
following scheme:
- Your cumulative project grade determines the maximum final grade
for the course. For example, if you earn a "B" average across all of
your projects, your final grade will be no higher than a "B".
- The final exam counts 40%, the midterm 30%, and the presentation 30%.
Homework grades and class participation are used to decide borderline
cases for final grades.
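As a rough illustration of the scheme above, the following Python sketch combines the three weighted components and then applies the project cap. It assumes every grade is expressed as grade points on a 4.0 scale (A = 4.0, B = 3.0, and so on). That scale and the function name are illustrative assumptions, not the instructor's actual procedure, and the borderline adjustments for homework and participation are not modeled.

```python
def final_grade_points(final_exam, midterm, presentation, project_average):
    """Estimate the course grade in grade points.

    All inputs are grade points on an assumed 4.0 scale (hypothetical;
    the syllabus does not state how letter grades map to numbers).
    """
    # Weighted combination: final exam 40%, midterm 30%, presentation 30%.
    weighted = 0.40 * final_exam + 0.30 * midterm + 0.30 * presentation
    # The cumulative project grade is the ceiling for the course grade.
    return min(weighted, project_average)


if __name__ == "__main__":
    # Straight "A" work on exams and presentation, but a "B" project average:
    # the project cap limits the result to 3.0 (a "B").
    print(final_grade_points(4.0, 4.0, 4.0, 3.0))
```

Under these assumptions, strong exam scores cannot lift the course grade above the project average, which is the point of the first rule.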
Extenuating Circumstances
If you have difficulty meeting the requirements of this
course, fail to hand in an assignment, or miss an exam because of extenuating
circumstances, please advise the instructor in writing (not
email) at the earliest possible date so that your situation can be
discussed. If you encounter an unexpected medical or family emergency or a
random act of Nature that causes you to miss the due date for homework or miss a
quiz or exam, you must present suitable documentation in writing (not
email) to the instructor before special consideration will be given.
A file of all written correspondence will be kept by the instructor, and
decisions regarding these requests will be made at the end of the semester.
Numbers in [brackets] indicate the number of lectures on a
topic. Copies of lecture notes are available at the Texas Union Copy Center, and
the papers are available via the links below. Papers highlighted in
yellow are subject to questions on exams. Homework assignments are
indicated in orange, and projects in blue.
Please note that not all projects and their due dates have been posted on this
page.
- Basic Concepts in Centralized and Distributed Query Optimization [3]
- Chaudhuri, "An Overview of Query Optimization in Relational Systems"
- Selinger, et al., "Access Path Selection in a Relational Database Management System"
- Mackert and Lohman, "R* Optimizer Validation and Performance Evaluation for Distributed Queries"
- Project #1: SQL Refresher
- Query Processing and Optimization in Database Machines [2]
- Nyberg, Barclay, Cvetanovic, Gray, and Lomet, "AlphaSort: A Cache-Sensitive Parallel External Sort"
- DeWitt, et al., "The Gamma Database Machine Project"
- Object-Oriented Query Optimization
- Hellerstein, "Predicate Migration: Optimizing Queries with Expensive Predicates"
- Problem Set #1
- Project #2: Berkeley DB DDL and Database Loading
- Spatial File Structures and Join Algorithms
- Guttman, "R-Trees: A Dynamic Index Structure for Spatial Searching"
- Hjaltason and Samet, "Incremental Distance Join Algorithms for Spatial Databases"
- Data Warehouses
- Gray, et al., "Data Cube: A Relational Aggregation Operator Generalizing Group-by, Cross-Tab, and Sub-Totals"
- Zhao, et al., "Simultaneous Optimization and Evaluation of Multiple Dimensional Queries"
- Gray Video, "The Revolution in Database Architecture"
- Gray, "The Revolution in Database Architecture"
- Automatic Database Tuning
- Zilio, et al., "DB2 Design Advisor: Integrated Automatic Physical Database Design"
- (Optional) XML databases and XQuery
- Chamberlin, Robie, and Florescu, "Quilt: An XML Query Language for Heterogeneous Data Sources"
- Problem Set #2
- Project #3: Berkeley DB DDL and Database Loading
- Project #4: Berkeley DB DDL and Database Loading
Midterm (in class, Thursday, October 16)
- Basic Concepts of Concurrency Control and Recovery
- Bernstein text, "Serializability Theory"
- Problem Set #3
- Two-Phase Locking, Phantoms, and Multigranularity Locking [2]
- Mohan, "Interactions Between Query Optimization and Concurrency Control"
- Bernstein text, "Two Phase Locking"
- Degrees of Consistency
- Gray, et al., "Granularity of Locks and Degrees of Consistency in a Shared Data Base"
- Batory, "Degrees of Consistency"
- Problem Set #4
- B+ Tree Locking Protocols
- Lehman and Yao, "Efficient Locking for Concurrent Operations on B-Trees"
- Key Range Locking and Distributed Concurrency Control
- Mohan, "ARIES/KVL: A Key-Value Locking Method for Concurrency Control of Multiaction Transactions Operating on B-Tree Indices"
- Non-Locking Protocols (Notes)
- H.T. Kung and J.T. Robinson, "On Optimistic Methods for Concurrency Control"
- Problem Set #5
- Basic Concepts and Operation Logging [2]
- Haerder and Reuter, "Principles of Transaction-Oriented Database Recovery"
- Mohan, et al., "ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging"
- Recovery in Distributed Databases
- Mohan, Lindsay, and Obermarck, "Transaction Management in the R* Distributed Database Management System"
- (Optional) Enterprise Java Beans
Current Trends in Database Research [4]
- Student Group Presentations [4]
Miscellaneous (Guest Lectures) [2]
- Conor Cunningham (Microsoft)
Revised: January 20, 2010.