SQL Mini Data Base
MDB is a miniature database system that supports a
useful subset of SQL. The primary limitations of MDB are that only
string and integer types are supported, there are no aggregations, and selection
predicates must be conjunctive.
In a series of assignments, you will implement MDB.
Each assignment gives you a context in which to work and the goals to achieve. In the end,
there will be a contest to see
which version of MDB runs the fastest on a set of benchmarks.
Big brownie points are awarded to the winner. MDB supports the following DDL commands:
- Create -- create a relation.
- Index -- create an index
- Show -- print the attributes of the given relation.
MDB supports the following DML commands:
- Abort -- rollback all updates since the
last abort or commit.
- Close -- close a database that has no uncommitted
updates. Uncommitted updates must be aborted or committed prior to a
- Commit -- commit all updates since the last abort or
- Delete -- delete the tuples associated with a
single-relation (i.e. non-join) predicate.
- Exit -- exit MDB. This will close the
currently opened database.
- Insert -- insert tuple into database.
- Open -- open a database for update and retrieval.
Only one database can be open at any time.
- Script -- run the script in the designated file.
- Select -- retrieve tuples from one or more relations.
- Update -- update zero or more tuples in a single
The grammar for the above language is in
file in the zip file mdb.zip:
- You will also need to place
Jakarta.jar on your classpath
You will be using Berkeley DB (BDB) Java Edition, a Java-based inverted
file system. Your assignment is to:
- First, download BDB, read its documentation, and figure
out how to store and retrieve tuples of a relation.
Click here for information.
- Second, I have provided a parser for the MDB
project. It is a set of Java files that read a command-line as input,
and parses the command line. If the parse is successful, the Main.main()
method prints out the contents of the parse tree. To invoke it, you
> java mdb.Main
> // type in your mdbsql query here
> . // terminate with a line with a sole dot
or if you want to run a script, type:
> java mdb.Main
> script "path to file";
The code and an extra zip file that MDB needs is given
here, along with installation instructions. Some test scripts are
here, and three relations of data for loading is
here. A mail-order database is in two SQL scripts
- Third, you need to write code that walks the parse tree
produced by MDB, and translate that into actions on Berkeley DB. The
only actions that you need to take for this assignment is (a) to process
relation create statements, (b) process insert statements to populate your
Note: Each production and pattern of the MDB DDL and DML grammar is identified
with a Java class. The actions to execute an instance of that class is
currently encoded in an execute() method of that class. Your goal is to fill
in the actions of the execute methods of specific classes.
Note: you should define an interface that allows you to
store and retrieve tuples from relations. Initially, your implementation
of this interface will be very simple: you map a relation to a BDB file.
There are no indices at all. Retrievals are done either via key
search or scan using BDB cursors.
To walk parse trees, you'll need to read sections on
Abstract Syntax Trees (ASTs) and
AST Cursors to see how to traverse and harvest
- Do you need to check the sanity (data type integrity)
of the field values in an insert command?
Ans: realize that there is an endless number of
integrity checks that you could perform. Add any that you feel comfortable. It
will help you debug your program. However, scripts that will be used for grading
will not have errors.
The goal of the second project is to provide functionality
for all but the Decl_ind DDL statement. There is no query optimization: you are
to process all queries as they are written, say using relation scans and nested
loops. Don't worry about the presence of indices or permutation of joins, just
provide MDB functionality.
My recommendation is (in this order) to:
- process single-relation queries. (You'll have to
segment a SELECT predicate into local predicates and join predicates (just as
we reviewed in class)).
- try the delete statements and update statements, that
leverage single-relation queries.
- process multi-relation queries.
What should be stated more clearly in the assignment is
that commit and abort is something that you can postpone until the 3rd
assignment. The 3rd assignment is about query optimization (and indices), but
really also transaction support. You should plan ahead on transactions, as
you'll be relying on BDB's transaction capability. Commit and abort should
be straightforward. If you handle commit and abort now in project #2, that's
You now need to implement the DDL Index statement of
MDB DDL. This will involve the modification of your code that you wrote
in Assignment #2. BDB has support for indices, and you should do your best
to use their facilities. Further, you should direct BDB to use a
particular index for query optimization. In the case of multi-table joins,
you should consider optimizing queries involving 2 or 3 relations (as you can
enumerate the possibilities easy enough). Another possible extension is
that you add another join algorithm. You must clearly document what
optimizations you have included, and performance evidence that your
Test scripts that you can use are: