CS 327e
Fall 2004
PURPOSE:
The purpose of this lab is to learn about Normalization and good database
design.
IMPLEMENTATION:
For this lab, consider solving ALL of the following:
A)
Consider
a relation that involves STUDENT CLASS GRADE data. Specifically, consider the
relation
STUDENT_CLASS_GRADE (CourseNumber,
CourseDesc, CourseFee, ClassUniqueNumber, ClassTime,
ClassMaxStudentsAllowed, ClassRoom, LetterGrade, StudentEID,
StudentFirstName, StudentLastName, ProfessorFirstName, ProfessorLastName,
ProfessorEID, DepartmentName, DepartmentSchool).
Where PK = (StudentEID, ClassUniqueNumber)
As you might guess, this will result in a highly denormalized
relational data. Your job is to normalize the above relation.
You can assume the following:
1) This is information for 1 semester
(say Fall 2004) in a university environment like The
University of Texas.
2) A Course may have 1 or more
sections (or classes); each section belongs to only 1 course. Each section has
a unique number that is unique across all courses for this semester.
3) A professor may teach 1 or more (up
to 3) sections, but a section is taught by exactly 1 professor.
4) Domain values for LetterGrade are
{A, B, C, D, F, Q, I}
5) A student must be enrolled in a
minimum of 3 different courses, with a maximum of 7.
6) StudentEID uniquely identifies each
student; ProfessorEID uniquely identifies each professor; ClassUniqueNumber
uniquely identifies each class; CourseNumber uniquely identifies each course.
7) Examples of CourseNumber are
{CS301K, PH301, CS302, CS303E, CS327E etc.}
8) Examples of ClassUniqueNumber are
{53635, 53640, 53650, 53655 etc.}
9) A Department can have 1 or more
students and 1 or more professors. Each professor (or student) belongs to
exactly 1 department.
10) Assume that there are at least
50,000 students and 500 professors in the university.
11) Each School (or College) can have 1
or more departments, but a department always belongs to a school (College). As
an example, the department of computer sciences belongs to the college of
natural sciences.
For this relation,
A) Identify all functional
dependencies (FD’s). (HINT: I can see at least 5 Functional Dependencies).
B) Break this relation down into
relation(s) that are in 2nd Normal Form (NF). Clearly state the FD’s
and primary key for each of these relations.
C) Break the above relation(s) into 3rd
NF. Clearly state the FD’s and primary key for each of these relations.
D) For the relations in C), clearly
state the referential integrity constraints (FK-PK relationships).
In all of the above, CLEARLY STATE
ANY ADDITIONAL ASSUMPTIONS THAT YOU MIGHT HAVE MADE.
B)
For
the relations in 3rdNF above, draw the ER diagram. Clearly label all
PK, FK, entity names, relationship names, attribute names, and the referential
integrity actions (e.g. delete cascade etc.).
In all of the above, CLEARLY STATE
ANY ADDITIONAL ASSUMPTIONS THAT YOU MIGHT HAVE MADE.
SOLUTION
A)
STUDENT_CLASS_GRADE
(CourseNumber, CourseDesc, CourseFee, ClassUniqueNumber, ClassTime,
ClassMaxStudentsAllowed, ClassRoom, LetterGrade, StudentEID,
StudentFirstName, StudentLastName, ProfessorFirstName, ProfessorLastName, ProfessorEID,
DepartmentName, DepartmentSchool).
Where PK = (StudentEID, ClassUniqueNumber)
For discussion
purposes, let’s rewrite the above relation as
R1 (A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P).
Where
A = CourseNumber,
B = CourseDesc,
C = CourseFee,
D = ClassUniqueNumber,
E = ClassTime,
F = ClassMaxStudentsAllowed,
G = ClassRoom,
H = LetterGrade,
I = StudentEID,
J = StudentFirstName,
K = StudentLastName,
L = ProfessorFirstName,
M = ProfessorLastName,
N = ProfessorEID,
O = DepartmentName,
P = DepartmentSchool
And PK = (D,I)
From the assumptions and constraints, we
can determine the following FDs:
FD1R1: D,I à A,B,C,E,F,G,H,J,K,L,M,N,O,P
FD2R1: I à J,K,O,P
FD3R1: N à L,M,O,P
FD4R1: A à B,C
FD5R1: D à A,E,F,G,N
FD6R1: O à P
a) FD2R1 and FD5R1
violate 2NF since they are partial dependencies.
Hence, separating them into independent relations gives us:
R1R2 (I,J,K,O,P)
With
FD2R1R2: I à J,K,O,P
FD6R1R2: O à P
R1R3 (D,A,B,C,E,F,G,L,M,N,O,P)
With
FD14R1R3: D à
A,B,C,E,F,G,L,M,N,O,P
FD3R1R3: N à L,M,O,P
FD4R1R3: A à B,C
R1R4 (D,H,I)
With
FD1R1R4: D,I à H
Since there are no
partial dependencies in any of the above 3 relations, these relations are said
to be in 2nd NF.
b)
The following FD’s
are not in 3NF:
FD6R1R2: O à P (in R1R2)
FD3R1R3: N à L,M,O,P (in R1R3)
FD4R1R3: A à B,C (in R1R3)
Hence, breaking the
respective relations gives us:
R1R2R5 (I,J,K,O)
With
FD2R1R2R5: I à J,K,O
R1R2R6 (O,P)
With
FD6R1R2R6: O à P
R1R3R7 (D,A,E,F,G,N)
With
FD14R1R3R7: D à A,E,F,G,N
R1R3R8 (N,L,M,O)
With
FD3R1R3R8: N à L,M,O
R1R3R9 (A,B,C)
With
FD4R1R3R9: A à B,C
R1R4 (D,H,I)
With
FD1R1R4: D,I à H
Notice that each of the above six relations are in 3rdNF.
Rewriting these into appropriate names, we get the following relations with
their respective FDs (also, I have written the appropriate FKs for each
relation):
1) STUDENT (StudentEID, StudentFirstName, StudentLastName,
DepartmentName)
FD1: StudentEID à
StudentFirstName, StudentLastName, DepartmentName
FK1: STUDENT.DepartmentName is FK to
DEPARTMENT.DepartmentName
2)
DEPARTMENT (DepartmentName, DepartmentSchool)
FD2:
DepartmentName à DepartmentSchool
3) CLASS (ClassUniqueNumber, CourseNumber, ClassTime,
ClassMaxStudentsAllowed, ClassRoom, ProfessorEID)
FD3:
ClassUniqueNumber à CourseNumber, ClassTime, ClassMaxStudentsAllowed, ClassRoom,
ProfessorEID
FK2:
CLASS.CourseNumber is FK to COURSE.CourseNumber
FK3:
CLASS.ProfessorEID is FK to PROFESSOR.ProfessorEID
4)
PROFESSOR (ProfessorEID, ProfessorFirstName, ProfessorLastName, DepartmentName)
FD4:
ProfessorEID à ProfessorFirstName, ProfessorLastName, DepartmentName
FK4:
PROFESSOR.DepartmentName is FK to DEPARTMENT.DepartmentName
5)
COURSE (CourseNumber, CourseDesc, CourseFee)
FD5:
CourseNumber à CourseDesc, CourseFee
6)
GRADE (ClassUniqueNumber, LetterGrade, StudentEID)
FD6:
StudentEID, ClassUniqueNumber à LetterGrade
FK5:
GRADE.ClassUniqueNumber is FK to CLASS.ClassUniqueNumber
FK6:
GRADE.StudentEID is FK to STUDENT.StudentEID