Protein amino acid sequences can be found from DNA, but their 3D fold is hard to discover.
Prof. Robertus, a UT protein crystallographer.
One technique: X-ray crystallography.
Diffraction data
The intensities of the dark dots correspond to the squared amplitudes of the Fourier transform of a 3D electron density map.
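A minimal sketch of what the detector records, using NumPy on a synthetic 1D density. The toy density, grid size and all names here are assumptions for illustration, not real diffraction data. Shifting the density changes only the phases of its Fourier transform, so the squared amplitudes stay the same.

```python
import numpy as np

n = 256
x = np.arange(n)

def toy_density(centers, width=3.0):
    """Sum of periodic Gaussian blobs standing in for atoms (hypothetical toy model)."""
    rho = np.zeros(n)
    for c in centers:
        d = np.minimum(np.abs(x - c), n - np.abs(x - c))  # periodic distance
        rho += np.exp(-0.5 * (d / width) ** 2)
    return rho

rho = toy_density([40, 90, 150])
shifted = np.roll(rho, 25)  # same "molecule", translated

F = np.fft.fft(rho)
F_shifted = np.fft.fft(shifted)

# What a diffraction experiment would record: squared amplitudes only.
intensity = np.abs(F) ** 2
intensity_shifted = np.abs(F_shifted) ** 2

same_intensities = np.allclose(intensity, intensity_shifted)
same_phases = np.allclose(np.angle(F), np.angle(F_shifted))
print(same_intensities, same_phases)
```

The intensities agree while the phases do not, which is why the spots alone do not determine the map.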
Iso-surfaces in the density map look like the protein molecules.
Some iso-surfaces from a synthetic density map of a fragment of a protein.
Fitting the model protein molecule to the density map is mostly done manually.
The phase problem: a review of Fourier transforms. A good overview for computer graphics
people can be found on pages 623-642 of Foley, van Dam, Feiner and Hughes (a book you should own).
Kevin Cowtan's Book of Fourier.
Since the experiment gives us amplitudes but not phases, our reconstructed maps can be very bad.
Reconstructions reflect
the model phases more than the measured amplitudes.
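A small numerical experiment illustrates this. It is a sketch under stated assumptions: two random 1D signals stand in for density maps, with `a` playing the role of the measured data (amplitudes) and `b` the role of the model (phases). The hybrid built from the amplitudes of `a` and the phases of `b` is then compared with both.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4096
a = rng.standard_normal(n)  # supplies the amplitudes (stand-in for measured data)
b = rng.standard_normal(n)  # supplies the phases (stand-in for the model)

A = np.fft.rfft(a)
B = np.fft.rfft(b)

# Hybrid transform: amplitudes of a combined with phases of b.
hybrid = np.fft.irfft(np.abs(A) * np.exp(1j * np.angle(B)), n=n)

corr_with_amplitude_source = np.corrcoef(hybrid, a)[0, 1]
corr_with_phase_source = np.corrcoef(hybrid, b)[0, 1]
print(corr_with_amplitude_source, corr_with_phase_source)
```

In this setup the hybrid correlates strongly with the phase source and hardly at all with the amplitude source. Real density maps are far from white noise, so the numbers will differ, but the bias toward the phases is the same effect.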
When the phases are not too bad we can use them to recover more information about the
model.
Pretty good phases.
This leads to an iterative process of model refinement and map refinement. Automatic tools (ARP/wARP) do well on high-resolution maps. Most work is manual! Use lots of graphics, stereo, "poor man's stereo". Skeletonization. Great research opportunity. Use low-end hardware?
Model fitting by moving atoms. The density function f is a weighted sum of the density functions of the individual atoms. Generally there are far too many parameters for the amount of data.
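As a sketch of such a model density, the code below evaluates f on a small 3D grid as a weighted sum of per-atom blobs. The isotropic Gaussian shape, the grid, the positions and the helper names `atom_density` and `model_density` are all assumptions made for illustration; the notes do not specify the per-atom density function.

```python
import numpy as np

def atom_density(grid, center, width):
    """Isotropic Gaussian blob centred on one atom (assumed shape)."""
    sq_dist = np.sum((grid - center) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / width ** 2)

def model_density(grid, centers, weights, width=1.0):
    """f(x) = sum_i w_i * g(x - c_i), evaluated at every grid point."""
    f = np.zeros(grid.shape[:-1])
    for c, w in zip(centers, weights):
        f += w * atom_density(grid, np.asarray(c, dtype=float), width)
    return f

# 3D sampling grid of shape (16, 16, 16, 3) with unit spacing.
axis = np.linspace(0.0, 15.0, 16)
grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)

centers = [(4.0, 4.0, 4.0), (10.0, 8.0, 6.0)]  # hypothetical atom positions
weights = [6.0, 8.0]                            # e.g. electron counts of C and O
f = model_density(grid, centers, weights)

n_params = 4 * len(centers)  # x, y, z and a weight for each atom
print(f.shape, n_params)
```

Even in this reduced form each atom contributes four free parameters, so a molecule with thousands of atoms quickly outruns the available measurements.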
First cut: least-squares fit.
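A minimal least-squares sketch on a 1D toy problem. To keep the problem linear it assumes the atom positions are already known and fits only the weights; the Gaussian atom shape, the noise level and all names are assumptions for illustration. Fitting positions as well would need a nonlinear solver.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 20.0, 200)

centers = np.array([5.0, 9.0, 14.0])      # assumed known atom positions
true_weights = np.array([6.0, 8.0, 7.0])  # what the fit should recover
width = 1.0

# Design matrix: column i is the unit-weight density of atom i.
G = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)

# Synthetic "observed" density: model plus a little Gaussian noise.
observed = G @ true_weights + 0.05 * rng.standard_normal(x.size)

# Least squares: minimise ||G w - observed||^2 over the weights w.
fitted_weights, *_ = np.linalg.lstsq(G, observed, rcond=None)
residual = np.linalg.norm(G @ fitted_weights - observed)
print(fitted_weights, residual)
```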
Better: likelihood. This is a framework; a specific system plugs in good (well-justified, evaluable, and working) estimates
of the probabilities involved.
Came up last week, will come up in super-resolution, probably elsewhere.
Bayes' theorem
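Written out for this setting, with "model" standing for the atomic model and "data" for the measured amplitudes (this labelling is taken from the items that follow):

```latex
p(\text{model} \mid \text{data})
  = \frac{p(\text{data} \mid \text{model})\; p(\text{model})}{p(\text{data})}
```

Since p(data) does not depend on the model, comparing candidate models only requires the two factors in the numerator.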
Evaluating p(data | model).
Getting the right distribution
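As a sketch of what evaluating p(data | model) can look like, the code below scores two candidate models with a log-likelihood. The independent Gaussian error model is an assumption made only for illustration, and the amplitudes and the function name are hypothetical; with this particular choice, maximizing the likelihood reduces to the least-squares fit, which is why choosing a better-justified distribution matters.

```python
import numpy as np

def gaussian_log_likelihood(observed, predicted, sigma):
    """log p(data | model), assuming independent Gaussian errors with std sigma."""
    r = observed - predicted
    return (-0.5 * np.sum((r / sigma) ** 2)
            - r.size * np.log(sigma * np.sqrt(2.0 * np.pi)))

rng = np.random.default_rng(2)
truth = np.array([10.0, 4.0, 7.0, 1.0])  # hypothetical true amplitudes
sigma = 0.5
observed = truth + sigma * rng.standard_normal(truth.size)

good_model = truth.copy()                               # predicts the truth
poor_model = truth + np.array([3.0, -3.0, 3.0, -3.0])   # predictions off by 3

ll_good = gaussian_log_likelihood(observed, good_model, sigma)
ll_poor = gaussian_log_likelihood(observed, poor_model, sigma)
print(ll_good, ll_poor)
```

The model whose predictions sit closer to the observations receives the higher log-likelihood.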
Evaluating p(model): use physical constraints, groupings, sequence data? Lots of research opportunities here. Current project: finding secondary structure.