Faculty Recruitment, Sara Mostafavi, "Integrating multiple types of genomics data to disentangle meaningful associations"

Contact Name: Katie Dahm
GDC 2.216 (Auditorium)
Apr 3, 2014 11:00am - 12:00pm

Signup Schedule: http://apps.cs.utexas.edu/talkschedules/cgi/list_events.cgi

Talk Audience: UTCS Faculty, Grads, Undergrads, Other Interested Parties

Host:  Ray Mooney

Talk Abstract: The quantity and variety of genomics datasets have increased tremendously in the last decade, presenting novel opportunities both for deriving cellular pathways and networks, and for identifying genetic and cellular mechanisms that underlie disease.  However, interpreting these data to extract biological insights requires disentangling meaningful (and hence reproducible and consequential) associations from mere correlations (i.e., spurious associations).  In this talk, I will present computational and machine learning approaches for leveraging prior biological knowledge, while integrating heterogeneous data, in order to find robust associations.  In particular, I will first describe a scalable method for the graph-based integration of diverse types of genomics data, in order to accurately infer functional roles for uncharacterized genes based on a small set of known (training) genes.  This approach represents the state of the art in automatically leveraging the continuous production of new genomics data.  Secondly, focusing on the task of finding associations between genetic variation and cellular (expression) traits in a population-based study, I will present methods for using known confounding factors in order to infer and account for hidden confounding factors.  Thirdly, expanding this task to the context of specific diseases, I will describe a project that combines genotype, RNA-sequencing, and environmental data to find genes and pathways that correlate with disease status.  Applying this approach to a large case/control study of major depression, a highly confounded disorder, sheds new light on molecular mechanisms associated with this pathology.

Speaker Bio: Sara Mostafavi is a postdoctoral fellow at Stanford University. Her research interest lies in developing and applying computational and machine learning approaches to study the genetic basis of complex diseases.  As part of her recent work, in collaboration with colleagues from the psychiatry department, she has developed approaches for identifying genetic and biomolecular markers for major depression, while addressing the problem of technical and biological confounding factors. She received her PhD in Computer Science from the University of Toronto. For her PhD research, Sara developed models and algorithms for combining heterogeneous genomic data sources to make predictions about gene and protein functions; these methods now underlie the GeneMANIA webserver and are widely used by biologists and experimentalists.