CS 329E - Elements of Algorithms in Bio-Informatics
Fall 2006 Syllabus
Unique Number:56293
Time and Room:11:00am-12:15pm, RLM 5.112
Instructor:
Dr. David R. Kincaid, Ph.D.
Email: kincaid@cs.utexas.edu Class Web Page:
http://www.cs.utexas.edu/users/kincaid/CS329E/ Office: PAI 5.48, Office Phone: (512) 471-9783, Fax Phone: (512) 471-8885
Office Hours: immediately after class in RLM 13.158
or via email or by appointment.
Special office hours will be held before each exam.
Teaching Assistant:
Nathan Paczan
Email: npaczan@cs.utexas.edu Office: Elements Lab PAI 5.38
Office Hours: MW 2-3pm
Outline:
An introduction to bioinformatics algorithms and the
computational ideas that have driven them through the last two
decades.
Computational biology has rapidly become a large source of new algorithms.
This has changed the way we teach computational ideas to biologists
as well as how applied algorithms is taught to computer scientists.
The objective of this course is to exposure students to
the broad range of algorithms in various
computational sciences application areas such
as bioinformatics and scientific computing.
It is important to acquire a knowledge of the algorithms
employed in various software systems and to have the ability
to understand and evaluate them rather than just use
them as black-boxes.
The focus is on applications involving many of the new algorithms
from bioinformatics and Web searching
as well as traditional ones such as
eigensystems, dynamical systems,
demographics, and economic models, etc.
Topics covered as time permits:
Introduction:
What a modern biologist ought to know about computer science.
Presenting both the foundation of algorithms and important results in
bioinformatics.
Algorithms and Complexity:
How popular bioinformatics algorithms work and
see what principles drove their design.
Molecular Biology Primer:
A brief introduction to biology that covers most of the computational
concepts discussed in bioinformatics.
Exhaustive Search Algorithms:
Require little effort to design but for many problems of interest
cannot process inputs of any reasonable size within your lifetime!
Greedy Algorithms:
Often return suboptimal results, but takes very little time.
A lucky few greedy algorithms find optimal rather
than suboptimal solutions.
Dynamic Programming Algorithms:
Provides a framework for understanding DNA sequence comparison
algorithms
Divide-and-Conquer Algorithms:
Proceeds in two distinct phases: a divide phase in
the algorithm splits a problem instance into smaller problems
instances and solves them; and a conquer phase in
which it stitches the solutions to the smaller problems into a
solution to the bigger one.
Graph Algorithms:
After introducing some of the mathematics of graph theory,
we show how powerful they can be in such bioinformatics applications as
DNA sequencing and protein identification
Combinatorial Pattern Matching:
Develop a number of ways to make pattern matching in a long sequence
practical: how to organize data into efficient data structures,
often a crucial part of solving a problem.
Clustering and Trees:
A common problem is to partition a set of experimental data into
groups
(clusters) in such a way that the data points within the same cluster
are highly similar while data points in different clusters are very
different.
Hidden Markov Models (HMM):
A popular machine learning approach in
bioinformatics.
Randomized Algorithms:
Make random decisions throughout their operation.
CS Elements Computer Laboratory:
PAI 5.38
Grading:
Item
Percentage
Assignments
30%
Exams
30%
Final Exam (Thu. 12/14 @ 2pm)
20%
Presentations
10%
Participation & Misc. Points
10%
Additional Details:
There are regular written homework (HW) and computer programming (CP)
assignments, which may involve writing brief reports.
Part of the grade is based on student-selected projects
(which may be done in small groups).
Miscellaneous points may be earned from class engagements
(attendng class,
participating in classroom discussion, answering questions,
doing work at the board, etc.) as well as doing extra credit work,
which may be available on some assignments.
Normal grading standards used for final letter grades.
Prerequisites:
Students must have completed CS 303E and CS313E
(or equivalent) with a grade of at least a 'C' in each.
This course may not be counted toward a degree in Computer Sciences.
The instructor should be contacted in case of questions about
prerequisites.
For example, the instructor will allow
alternative course-work and background knowledge to
satisfy the required prerequisites on an individual basis.
For example, students who have not had CS313E
but have the necessary background from another
programming course or from equivalent experience
may take the course after meeting with the instructor.
Other material may be give out in class
or may be available on the World Wide Web.
Exams:
Exams will be given approximately two or three times during the class
on dates announced by the instructor one week before each exam.
Each exam will emphasize material since the previous exam but
may include any previous material.
Exams will consist primarily of problems similar to homework
problems, examples discussed in class, in the lecture notes, or in the
textbook, and so on.
NO makeup exams will be given!
To allow for the normal learning process and because
the material gets more difficult,
the instructor may take into consideration
improvement in performance on the exams
by increasing the importance of each exam as we go along
and adjusting the grade percentages
accordingly.
On exams you should work entirely alone without using
notes or books unless instructed otherwise.
Final Exam:
All students must take the final examination (3 hours)
at the indicated time and date.
On the final exams you should work
entirely alone without using notes or books unless instructed otherwise.
Miscellaneous Points:
Miscellaneous points are available throughout the semester for
class attendance, participation, etc., in-class work such as
pop quizzes, work at the board, etc.,
and additional problems on some assignments.
Grading:
Normal grading standards (90 ~ A, 80 ~ B, etc.)
are used for the final letter grades.
(Final grades may be curved upward slightly, at the instructor's
discretion, but never downward.)
Homework:
Homework is assigned during each class meeting.
Homework assignments are due at the beginning of the next class.
NO late papers will be accepted!
Students may be asked to work
homework problems at the board and/or to write down the solutions
of some of the problems during class.
The should be neatly organized and stapled.
As will business or official documents, the homework assignments
must be done carefully and written legibly on standard size
paper according to the following instructions.
Please do homework on standard college rule 8 1/2'' x 11'' paper
or other standard size good quality paper.
Please write only on the front of each sheet.
Box answers where possible.
Fold in half lengthwise and staple in the top left-hand corner.
On the first page and the outside page write your
Name, Course Number, Assignment Number, and Date.
Also, put your last name on each page.
Put solutions in order and number pages.
Since there is no grader for this course, the instructor may
decide to not grade all homework problems but rather a random sampling of them.
A small percentage of your final average on homework will be dropped to
allow for normal mistakes of one kind or another (yours or mine)
and absences.
(When you apply for a job, write a business or technical report,
file a government document or application, etc.,
they must be done in a standard way according to set instructions.
So let's get used to do things in a nice orderly fashion.)
The key to a good grade in this course is a good grade on the homework
assignments.
Academic Integrity:
You may collaborate on homework in the following way only.
On a particular problem, first, you should try to work the problem
by yourself. If you are unsuccessful after you have identified relevant
definitions and theorems and have considered several possible approaches
to solving the problem, you may talk to your colleagues. Before you
write-up your work: separate, rethink the solution, and write up your
solutions alone. You may compare the answers before turning them in to
see if you got the same answers but do not copy solutions from each other.
What you submit should be your own work.
In working the homework problems, you are encouraged to look at other
books such as Lay's textbook and the Student's Study Guide for it.
They may give you ideas, or contain similar problems.
See
Code of Conduct.
Attendance:
Class attendance is expected.
As much as possible, students should attend all classes,
come on time and not leave early.
A student who misses a class is responsible for finding out
what was covered in class, what the homework assignment was,
and whether or not an exam was announced. This information
should be obtained from another student.
Missing one meeting is equivalent to missing three one hour lectures
or a week of a regular class!
Because students may be asked to work homework problems on the board or may be
given in-class work to do at any time during the class,
it is recommended that you attend each class and stay until the end.
Email/Web Site:
All students are urged to obtain an email address for class
communication. Some course material may be accessible from a Web site.
See, for example, the class Web site
http://www.cs.utexas.edu/users/kincaid/M340L/
Lectures:
As much as possible, the following is recommended:
Before each meeting of the class, students should
looked-over the sections in the book scheduled for that meeting.
Then try to work some of the easy problems pertinent
to the topics in each set such as
some of the first few odd-numbered problems,
which have answers in the back of the book.
In this way, class time can then be devoted to clarification, to
examples, and to further exploration of the material.
Do not expect a new, detailed, exposition of the each section of
the book. Consequently,
there will be instead plenty of time for answering your
questions, for additional discussion, for illustrations of the theory,
as well as going over the previously assigned homework and for examples
problems.
Computers:
Computer accounts may be available to you.
If so, some homework problems may be give involving the use of
mathematical software such as Matlab or Maple. It is to your advantage to learn
how to do vector and matrix calculations using mathematical software.
However in using software, remember that you should know how the problem
must be worked, even if you turn over the drudgery of calculations
(arithmetic) to the computer. Exams usually
do not require a detailed knowledge
of Matlab, Maple, or other mathematical software.
General Policies:
All University policies apply to this course, including (1) the
accommodation of disabilities, (2) allowed absence for religious holy
days (see the current General Information catalog, Chapter 4,
"Attendance"), and (3) scholastic dishonesty. Students who violate
University rules on scholastic dishonesty are subject to disciplinary
penalties, including the possibility of failure in the course and/or
dismissal from the University Extension program or The University.
(For relevant definitions, see the current General Information
catalog, Appendix C, Subchapter 11-800.)
Elements Certificates:
Elements of Computing certificates are not issued
automatically when you complete the requisite courses.
If your completing the Elements program this semester, you need to
apply for it by visiting this webpage to download a request:
Elements of Computing Certificates