Programming Assignment 5
CS 303e
Covered topics:
reading input from the user
loops
boolean conditions
String manipulation and slicing
You may not use any programming constructs or concepts that we have not
covered in class.
A strand of DNA is formed from four nucleotides called adenine
(abbreviated A), thymine (T), cytosine (C), and guanine (G). Genetic
information is determined by the sequence of nucleotides along a
strand.
In this project, you will write a program that finds the longest common
nucleotide sequence in two strands of DNA. Each strand is represented
by a sequence of letters from {A, T, C, G}. For example, in the strands
ATGC and TGAC, the longest common sequence is TG. The two strands are
not required to have the same length, and it is possible for the two
strands not to have any common sequence (a sequence of length 1 does
not count).
Your program will prompt the user to enter two strands of DNA. It will
then print all of the longest common subsequences, one per line.
There may be zero, one, or several longest common subsequences.
Sample Run:
Enter the first strand: ATGGCATAAGCTT
Enter the second strand: TGCAGCTGCATCAGGAT
Common subsequence(s):
GCAT
AGCT
Sample Run:
Enter the first strand: TAGGCAT
Enter the second strand: GAA
No common subsequence was found for TAGGCAT and GAA.
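The output format in the two sample runs can be sketched as follows. This is only an illustration: formatReport is a hypothetical helper name that the assignment does not require, Python 3 is assumed, and the list of matches is supplied by hand rather than computed.

```python
def formatReport(strand1, strand2, matches):
    """Return the report as a list of lines (hypothetical helper).

    matches is a list of the longest common sequences, possibly empty.
    """
    if len(matches) == 0:
        # Wording copied from the second sample run
        return ["No common subsequence was found for "
                + strand1 + " and " + strand2 + "."]
    # Header line followed by one match per line, as in the first sample run
    return ["Common subsequence(s):"] + matches


# Matches are hard-coded here only to show the formatting
for line in formatReport("ATGGCATAAGCTT", "TGCAGCTGCATCAGGAT", ["GCAT", "AGCT"]):
    print(line)
```

Building the lines first and printing them afterwards makes it easy to compare your output against the sample runs.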
Use the coding conventions we have discussed and used in class (e.g.,
conventions for variable names) and include whitespace, comments, and
indentation to make your program more readable.
Write and use the following functions:
1. getStrands(): This function prompts the user for the two strands, and returns a tuple that contains the two DNA strands.
2. longestCommonSubseq(string1, string2): This function takes two DNA
sequences and returns the longest subsequence of string1 and string2.
Think about how you can use the string.find() function.
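As a reminder of how find() behaves on strings, here is a small sketch using the strands ATGC and TGAC from the example above. The helper name occursIn is hypothetical and only for illustration; Python 3 is assumed.

```python
def occursIn(candidate, strand):
    """Return True if candidate appears somewhere inside strand."""
    # find() gives the index of the first match, or -1 if there is none
    return strand.find(candidate) != -1


first = "ATGC"
second = "TGAC"

# Slicing pulls a candidate sequence out of the first strand
candidate = first[1:3]              # the two letters at positions 1 and 2
print(candidate)                    # TG
print(second.find(candidate))       # 0, because TG starts at index 0 of TGAC
print(second.find("GC"))            # -1, because GC does not occur in TGAC
print(occursIn(candidate, second))  # True
```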
Think about the efficiency of your program. If your program does unnecessary work, you will lose some points.
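One way to avoid unnecessary work, offered only as a sketch and not as the required design, is to try the longest candidate sequences first and stop as soon as some length produces a match. The name findLongestCommon and its list return value are assumptions made for this illustration; they are not the required longestCommonSubseq interface. Python 3 is assumed.

```python
def findLongestCommon(strand1, strand2):
    """Return a list of the longest sequences found in both strands.

    Hypothetical helper: returns an empty list when the strands share
    no sequence of length 2 or more.
    """
    longestPossible = min(len(strand1), len(strand2))
    # Count down from the longest possible length to 2
    # (a sequence of length 1 does not count)
    for length in range(longestPossible, 1, -1):
        matches = []
        for start in range(len(strand1) - length + 1):
            candidate = strand1[start:start + length]
            # Keep each common sequence once, in order of appearance
            if strand2.find(candidate) != -1 and candidate not in matches:
                matches.append(candidate)
        if len(matches) > 0:
            # Shorter lengths cannot be longest, so stop here
            return matches
    return []


print(findLongestCommon("ATGC", "TGAC"))
print(findLongestCommon("ATGGCATAAGCTT", "TGCAGCTGCATCAGGAT"))
print(findLongestCommon("TAGGCAT", "GAA"))
```

Because the search starts at the longest length, no shorter candidates are examined once a match is found.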
Your output should look like the sample output above. You will lose
credit if it does not.
Programs that contain syntax errors will not receive any credit. Please plan
ahead and allow plenty of time to get help from the TAs, proctors or
instructor if you are having trouble with the program. Please do not
email the course staff the day before the assignment is due if you need
help - it is unlikely that we will respond quickly enough to assist
you. Plan to come to the lab during office hours if you need help.
Save your program in a file called DNA.py. This program should
include ample comments, and should use whitespace, indentation, and
meaningful variable names to enhance readability. Include a header in
your program as indicated in the description of project 1.
This program must be submitted by 11 pm on the due date in order to be
considered on time. Please note in your program header how many slip
days you used on this project, if any.
The proctors will be grading this project. If you have any questions
about grading, contact them first. If you submit this project late using slip
days, you must email the proctors after your program is submitted and let them
know that your project is ready to be graded.
This project must be done individually. You may talk to your classmates
about solution approaches, but then you must write your own code.
Reminders - Did you remember to:
do this assignment by yourself?
use meaningful variable names?
include comments for readability?
make sure that your program does not produce an error?
make sure that your output matches the sample output above?
call your main() function?
submit your program in file DNA.py using
the turnin program by 11 pm on the due date?
email the proctors after you have submitted your project, if you are using slip days?