My interests in computer science lie mostly in its intersection with biology, in the field of bioinformatics. Currently, I am focusing on developing probabilistic models to describe the uncertainty in complex functionals computed on objects that carry inherent error or uncertainty. A major focus is on generating intelligent (low-discrepancy) samples in high-dimensional spaces. The applications are to protein-protein docking and protein homology modeling.
In the past, I spent a lot of time in (and still often dabble in) the field of next-generation sequencing and the future of massively parallel DNA sequencing. During my time as an undergraduate and now as a graduate student, I have been part of (and remain heavily involved in) several research projects. These include, but are not limited to, those listed below.
Uncertainty Quantification for Protein Functionals
I should really put something in here. If you're reading this, feel free to send me an email and ask me about it. Or check out my proposal slides and/or document above, because what I really should be doing is working on my research.
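To give a rough sense of what "low-discrepancy" sampling means, here is a minimal sketch of a Halton sequence, one standard low-discrepancy construction. It is an illustration only, not the sampler used in this project; the names `radical_inverse`, `halton`, and `PRIMES` are made up for this example, and the ten-dimension limit is an arbitrary choice.

```python
# Illustrative only: a plain Halton sequence, not the project's sampler.

# One prime base per dimension (this sketch arbitrarily stops at 10 dimensions).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]


def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` about the radix point."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points, dim):
    """Return the first `n_points` Halton points in the unit cube [0, 1)^dim."""
    if dim > len(PRIMES):
        raise ValueError("this sketch only supports up to %d dimensions" % len(PRIMES))
    return [
        [radical_inverse(i, PRIMES[d]) for d in range(dim)]
        for i in range(1, n_points + 1)
    ]


if __name__ == "__main__":
    for point in halton(5, 3):
        print(point)
```

Unlike independent uniform draws, successive Halton points deliberately fill the gaps left by earlier ones, which is the property that makes such sequences attractive when each sample is expensive to evaluate. The plain Halton sequence is known to degrade in very high dimensions, so a real application would likely need a scrambled or otherwise modified variant.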
My current research project involves mapping short DNA fragments to a reference genome. The key to this research is converting DNA sequences into points in a metric space, and then constructing an intelligent database to reduce the overall search time. We use a modified MVP (multi-vantage-point) tree, a metric-space index that generalizes the vantage-point tree and plays a role similar to that of the k-d tree. While not as fast as some of the current DNA mappers, we hope that our exact search algorithm will be able to shed light on some datasets that have relatively poor mapping percentages.
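The general idea of exact search in a metric space can be sketched with a plain single-vantage-point tree. This is a simplified stand-in, not the modified MVP tree from the project: it assumes fixed-length k-mers compared by Hamming distance, and the names `hamming`, `VPNode`, `build_vp_tree`, and `nearest` are hypothetical.

```python
import random


def hamming(a, b):
    """Number of mismatching positions; a true metric on equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point      # vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points with distance <= radius
        self.outside = outside  # subtree of points with distance > radius


def build_vp_tree(points, rng):
    """Recursively split the points around a randomly chosen vantage point."""
    points = list(points)
    if not points:
        return None
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0, None, None)
    dists = [hamming(vantage, p) for p in points]
    radius = sorted(dists)[len(dists) // 2]
    inside = [p for p, d in zip(points, dists) if d <= radius]
    outside = [p for p, d in zip(points, dists) if d > radius]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, rng), build_vp_tree(outside, rng))


def nearest(node, query, best=None):
    """Exact nearest neighbour; returns a (distance, point) pair."""
    if node is None:
        return best
    d = hamming(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    # Descend first into the side of the boundary that contains the query.
    if d <= node.radius:
        first, second = node.inside, node.outside
    else:
        first, second = node.outside, node.inside
    best = nearest(first, query, best)
    # Triangle inequality: the other side can only hold a closer point if
    # the current best distance reaches across the splitting radius.
    if abs(d - node.radius) <= best[0]:
        best = nearest(second, query, best)
    return best


if __name__ == "__main__":
    reference = "ACGTTGCAAGCTTAGGCTAACGTAGCTAGGATCCGATTACA"  # toy sequence
    k = 8
    kmers = sorted({reference[i:i + k] for i in range(len(reference) - k + 1)})
    tree = build_vp_tree(kmers, random.Random(0))
    print(nearest(tree, "ACGTTGGA"))
```

The pruning step is what makes the search exact rather than heuristic: a subtree is skipped only when the triangle inequality guarantees it cannot contain a closer point. A real read mapper would also need to handle indels, variable read lengths, and a far larger reference, none of which this sketch attempts.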
I spent some time working with the Gutell Lab on a progressive RNA folding algorithm. It's still in the development stage, but it looks like it will be a significant improvement over current algorithms (which perform quite poorly). I'm hoping to apply some of what I learned to the RNA-Seq problem, which is fast becoming one of the more important problems in next-generation sequencing and drug discovery.
My undergraduate research project was the development of GNUMAP, a program used to find the best alignment for short DNA sequences on a reference genome. This research led to one first-author peer-reviewed journal paper, two conference papers, and several presentations. The webpage for this project (including download instructions, etc.) is here.