UT BioDM (Biological Data Mining) Group Meeting

We review local research progress and read and discuss papers in the area of biological data mining in connection with the UT Austin NSF ITR project on biological data mining.

Current Participants (Local and Nonlocal)

If I have missed your name, please let me know.
Raymond Mooney Edward Marcotte Inderjit Dhillon Joydeep Ghosh
Daniel Miranker Vishy Iyer Orly Alter Arun
Gunjan Gupta Hyuk Cho Kris McGary Chase Krumpelman
Patrick Killion Razvan Bunescu Shulin Ni Sugato Basu
Suvrit Sra Cara Stockham Glen Nuckolls
Dongmin Kim Chendi Zhang
Mikhail Bilenko Prem Melville Smriti Ramakrishnan
Akshay Razvan Surdulescu Larsson Omberg Insuk Lee
Christine Vogel

Previous Discussions

2007.05.11 Presentation by Shu Wang from Dr. Miranker's group on her research:

"A Biclustering Approach to Multiple Sequence Alignment", joint work with Robin Gutell and Daniel Miranker
2007.04.27 Presentation by Xochitl Morgan from Dr. Iyer's group, on her research:

"Predicting Combinatorial Binding of Transcription Factors to Regulatory Elements in the Human Genome by Association Rule Mining", joint work with Shulin Ni, Vishy Iyer and Daniel Miranker
2007.04.13 Paper Discussion: An Ensemble Framework for Clustering Protein-Protein Interaction Networks, Sitaram Asur, Srinivasan Parthasarathy and Duygu Ucar, to appear in ISMB 2007.

Related paper:
Cluster ensembles - a knowledge reuse framework for combining partitionings, In Proc. Conference on Artificial Intelligence (AAAI 2002), pages 93-98, July 2002.
2007.03.30 Paper Discussion: Dynamic Spectrum Quality Assessment and Iterative Computational Analysis of Shotgun Proteomic Data. Nesvizhskii AI, Roos FF, Grossmann J, Vogelzang M, Eddes JS, Gruissem W, Baginsky S, Aebersold R. Mol Cell Proteomics. 2006 Apr;5(4):652-70.
2007.03.02 Paper Discussion: Clustering by Passing Messages Between Data Points, Brendan J. Frey and Delbert Dueck, Science, 2007 Feb 16;315(5814):972-6
2007.02.09 Presentation by Smriti Ramakrishnan: 'Mass spectrometry (MS/MS) database search for protein identification: computational challenges & machine learning approaches'
2006.12.01 Presentation of mass spectrography technology by John Prince of the Marcotte Lab. Slides available here.
2006.11.17 Discussion of Protein names peeled off free text by Sven Mika and Burkard Rost (ISMB 04)
2006.11.03 Presentation by Dr. Gunjan Gupta on his dissertation research on density based cluster mining and bioinformatics applications. There will be a special demo of a product developed called Gene DIVER.

Related Papers:

  • Reconstructing the pathways of a cellular system from genome-scale signals by using matrix and tensor computations
    Orly Alter, and Gene H. Golub
    PNAS 12/6/2005 102:49 17559-17564
    Supplementary Materials: Discussion led by Dr. Alter.
  • 2006.10.06 Functional clustering of yeast proteins from the protein-protein interaction network. Taner Z Sen, Andrzej Kloczkowski and Robert L Jernigan. BMC Bioinformatics 2006, 7:355
    2006.09.22 A discussion by Chase Krumpelman, Wan Kyu Kim, and Edward Marcotte on approaches to gene function prediction in the context of the Mousefunc I competition
    2006.05.01 Categorization Approach to Automated Ontological Protein Function Annotation. KM Verspoor, JD Cohn, SM Mniszewski, and CA Joslyn. Protein Science, 2006, in press.
    A related paper: Improving Protein Function Prediction using the Hierarchical Structure of the Gene Ontology. R. Eisner, B. Poulin, D. Szafron, P. Lu and R. Greiner. 2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, November 2005.
    2006.04.17 Inferring protein domain interactions from databases of interacting proteins. Genome Biol. 2005; 6(10): R89.
    Discussion to be led by Akshay Bhinge of the Iyer Lab
    2006.04.03 A review of recent results in graph clustering by Brian Kulis of Inderjit Dhillon's group, entitled Spectral Clustering without Eigenvectors. (Note: Relevant to protein interaction networks.)
    Kernel k-means, Spectral Clustering and Normalized Cuts
    A Fast Kernel-Based Multilevel Algorithm for Graph Clustering
    2006.03.20 Metagenes and molecular pattern discovery using matrix factorization.
    Jean-Philippe Brunet, Pablo Tamayo, Todd R. Golub, and Jill P. Mesirov
    Proc Natl Acad Sci USA March 23, 2004. v. 101 # 12, 4164-4169.
    2006.03.06 Integrating Co-occurrence Statistics with Information Extraction for Robust Retrieval of Protein Interactions from Medline. Bunescu, Mooney, Ramani, Marcotte
    2006.02.20 Discovery of biological networks from diverse functional genomic data C. Myers, D. Robson, A. Wible, C. Theesfeld, K. Dolinski and O. Troyanskaya
    2006.02.06 Discussion of Lise Getoor's work on link-based analysis:
    Link Mining: A New Data Challenge Lise Getoor
    From Instances to Classes in Probabilistic Models Getoor, Koller, Friedman
    Learning Probabilistic Models of Link Structure Getoor, Friedman, Koller, Taskar
    2005.12.16 A data integration methodology for systems biology Hwang, et al.
    A data integration methodology for systems biology: Experimental verification Hwang, et al.
    2005.12.2 Clustering Short Time Series Gene Expression Data. Ernst, Nau, Bar-Joseph
    2005.11.18 Phenotypic diversity, population growth, and information in fluctuating environments. Kussell E, Leibler S. Science. 2005 Sep 23;309(5743):2005-7.
    2005.11.4 Discussion led by Chase:
    Mining Coherent Dense Subgraphs Across Massive Biological Networks for Functional Discovery. Hu, Yan, Huang, Han, Zhou
    2005.10.21 Discussion led by David Reynolds and Larsson Omberg
    1. Robustness of cellular functions J. Stelling, U. Sauer, Z. Szallasi, F. J. Doyle III, and J. Doyle. Cell, October, 2004.
    2. The Robust Yet Fragile Nature of the Internet. Doyle et al. (2005) P Natl Acad Sci USA, vol. 102 no. 41, October 11, 2005
    2005.10.07 Whole-proteome Prediction of Protein Function via Graph-theoretic Analysis of Interaction Maps. Nabieva, Jim, Agarwal, Chazelle, and Singh
    2005.09.23 Causal protein-signaling networks derived from multiparameter single-cell data. Sachs K, Perez O, Pe'er D, Lauffenburger DA, Nolan GP. Science. 2005 Apr 22;308(5721):523-9.
    2005.04.22 Paper discussion (and presentation) led by Christine Vogel:
    Functional annotation and network reconstruction through cross-platform integration of microarray data. Zhou XJ, Kao MC, Huang H, Wong A, Nunez-Iglesias J, Primig M, Aparicio OM, Finch CE, Morgan TE, Wong WH. Nat. Biotechnology 2005 Feb;23(2):238-43. Epub 2005 Jan 16
    2005.03.25 Paper Discussion:
    LAGAN and Multi-LAGAN: Efficient tools for large-scale multiple alignment of genomic DNA. Brudno M, Do C, Cooper GM, Kim MF, Davydov E, Green ED, Sidow A, Batzoglou S. Genome Research 13: 721-731, 2003.
    2005.03.11 Paper Discussion:
    GEST: a gene expression search tool based on a novel Bayesian similarity metric. Lawrence Hunter, Ronald C. Taylor, Sonia M. Leach and Richard Simon. Bioinformatics, Vol. 17, No. 90001, 2001, pp. S115-S122.
    2005.02.25 Paper presentation (led by Arun and Razvan)
    Consolidating the set of known human protein-protein interactions in preparation for large-scale mapping of the human interactome Arun K Ramani, Razvan C Bunescu, Raymond J Mooney and Edward M Marcotte. Genome Biology (conditionally accepted)
    2005.02.11 Paper discussion (led by Dr. Orly Alter):
  • Integrative analysis of genome-scale data by using pseudoinverse projection predicts novel correlation between DNA replication and RNA transcription, Orly Alter and Gene H. Golub. PNAS November 23, 2004 vol. 101(47)
  • 2005.01.28 Paper discussion:
  • "Multi-Relational Learning, Text-Mining, and Semi-Supervised Learning for Functional Genomics" Krogel & Scheffer, Machine Learning, 57, 2004
  • 2004.12.3 Paper discussion:
    2004.11.15 Talk:
    • SPEAKER: Jack Y. Yang/Indiana School of Medicine
      TITLE: Sequential Bifurcation Approaches in Genomic Data
    2004.10.29 Paper discussion:
    2004.10.15 Paper discussion:
    2004.10.01 Paper discussion:
    2004.07.16 Paper discussion:
    2004.06.25 Paper discussion:
    2004.06.11 Paper discussion:
    2004.05.28 Paper discussion:
    2004.04.30 Paper discussion:
    • Decomposing Gene Expression into Cellular Processes
      E. Segal, A. Battle, D. Koller.
      In Proc. 8th Pacific Symposium on Biocomputing (PSB), Kaua'i, January 2003
    • Probabilistic Discovery of Overlapping Cellular Processes and their Regulation Using Gene Expression Data
      A. Battle, E. Segal, D. Koller.
      In Proc. 8th Inter. Conf. on Research in Computational Molecular Biology (RECOMB), San-Diego, CA, April 2004
    Both papers available from: http://robotics.stanford.edu/~erans/
    2004.04.30 Presentation by visiting faculty:
    • Prof. Dan Boley, University of Minnesota
      Automated Data Cleaning
    2004.04.16 Overview of Inderjit Dhillon's research
    2004.03.12 Overview of Joydeep Ghosh's research
    2004.02.27 Overview of Orly Alter's research
    2004.02.13 Overview of research in the Marcotte lab.
    2004.01.30 Overview of Dr. Miranker's biological databases research.
    2003.12.05 Overview of research in the Iyer lab.
    Suggested Reading:
    2003.11.14 Research in Dr. Mooney's group.

