Dr. Sara Mostafavi
Assistant Professor
Statistics, Medical Genetics
Sara Mostafavi is an Assistant Professor in the Department of Statistics and the Department of Medical Genetics, and an affiliate member of the Department of Computer Science, at the University of British Columbia (UBC). She is a Canada Research Chair (CRC II) in Computational Biology, and a CIFAR fellow in the Child and Brain Development program. Before arriving at UBC, she did her postdoctoral fellowship with Daphne Koller at Stanford University. She received her PhD in Computer Science from the University of Toronto in 2011, working with Quaid Morris. Her PhD thesis was on integrating large-scale genomics and proteomics datasets to predict gene function.
Current Research Focus
The production of diverse types of high-dimensional biological data has increased tremendously in the last decade, presenting novel opportunities to develop and apply computational and machine learning approaches to understand the genetics of human diseases. However, the high dimensionality of these data, whereby up to millions of diverse and heterogeneous “features” are measured in a single experiment, coupled with the prevalence of systematic confounding factors, presents significant challenges in disentangling bona fide associations that are informative of causal molecular events in disease. Dr. Mostafavi’s research interest lies in designing tailored computational models for integrating multiple types of high-dimensional “omics” data, with the ultimate goal of identifying meaningful molecular correlations for common diseases such as psychiatric disorders.
Example Project(s)
“Modelling the Effect of Hidden Environmental Exposures on Genomics Data”
We are building latent variable models, based on matrix factorization, in order to infer and model the effect of hidden environmental exposures on genomics data. In particular, gene expression profiling enables researchers to summarize the joint effect of genetic and environmental factors on a per-gene and per-individual basis. Given genetic data, such models decompose gene expression into its genetic and (unmeasured) environmental components. Application of such models has enabled us to identify unmeasured environments and infer their impact on variation in gene expression and disease outcome.
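The decomposition described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the lab's actual model: it assumes a linear genetic effect fitted by least squares and uses a truncated SVD of the residual as a stand-in for the latent variable model. The function name `decompose_expression`, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np


def decompose_expression(expression, genotypes, n_hidden):
    """Split expression (individuals x genes) into a genetic fit and
    a low-rank component attributed to hidden factors.

    Assumption: genetic effects are linear and additive; the hidden
    component is whatever low-rank structure remains in the residual.
    """
    # Genetic component: least-squares fit on genotypes plus an intercept.
    design = np.column_stack([np.ones(len(genotypes)), genotypes])
    coef, *_ = np.linalg.lstsq(design, expression, rcond=None)
    genetic = design @ coef

    # Hidden component: rank-k factorization of the residual via SVD.
    residual = expression - genetic
    u, s, vt = np.linalg.svd(residual, full_matrices=False)
    hidden_factors = u[:, :n_hidden] * s[:n_hidden]  # individuals x k
    loadings = vt[:n_hidden]                         # k x genes
    return genetic, hidden_factors, loadings


# Simulated data (hypothetical): one unmeasured exposure affects all genes.
rng = np.random.default_rng(0)
n_individuals, n_genes, n_snps = 200, 50, 3
genotypes = rng.integers(0, 3, size=(n_individuals, n_snps)).astype(float)
exposure = rng.normal(size=n_individuals)
snp_effects = rng.normal(size=(n_snps, n_genes))
exposure_effects = rng.normal(size=n_genes)
noise = 0.1 * rng.normal(size=(n_individuals, n_genes))
expression = genotypes @ snp_effects + np.outer(exposure, exposure_effects) + noise

genetic, hidden_factors, loadings = decompose_expression(
    expression, genotypes, n_hidden=1
)
# The inferred factor should track the simulated exposure (up to sign and scale).
correlation = np.corrcoef(hidden_factors[:, 0], exposure)[0, 1]
```

In this toy setting the first inferred factor is compared with the simulated exposure through `correlation`; its sign and scale are arbitrary, as with any matrix factorization.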
“Prediction of Causal Genes in Rare Disease Settings”
We are developing graph-based approaches, based on Gaussian Markov Random Fields (GMRFs), to combine multi-omic datasets in order to predict causal genes in rare disease settings. The models that we are building use GMRFs to combine multiple sources of information, while capturing the complexity of biological networks, in order to prioritize mutations in genes that are likely to be causal.
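As a rough illustration of how a GMRF can combine several gene networks to score candidate genes, consider the sketch below. It is not the lab's model: it assumes a GMRF prior whose precision matrix is a weighted sum of graph Laplacians (one per data source), a Gaussian observation model for per-gene evidence, and fixed network weights. The function names, the toy networks, and the weights are all hypothetical.

```python
import numpy as np


def laplacian(adjacency):
    """Graph Laplacian D - A of a symmetric adjacency matrix."""
    return np.diag(adjacency.sum(axis=1)) - adjacency


def adjacency_from_edges(n_genes, edges):
    """Build a symmetric 0/1 adjacency matrix from a list of gene pairs."""
    adjacency = np.zeros((n_genes, n_genes))
    for i, j in edges:
        adjacency[i, j] = adjacency[j, i] = 1.0
    return adjacency


def gmrf_gene_scores(networks, weights, evidence, noise_var=1.0, ridge=1e-2):
    """Posterior mean of a GMRF over genes given noisy per-gene evidence.

    Assumed prior: scores ~ N(0, Q^-1) with Q = ridge * I + sum_k w_k * L_k.
    Assumed likelihood: evidence = scores + Gaussian noise (variance noise_var).
    """
    n_genes = len(evidence)
    precision = ridge * np.eye(n_genes)
    for weight, adjacency in zip(weights, networks):
        precision += weight * laplacian(adjacency)
    posterior_precision = precision + np.eye(n_genes) / noise_var
    return np.linalg.solve(posterior_precision, evidence / noise_var)


# Toy example (hypothetical): five genes, two data sources.
coexpression = adjacency_from_edges(5, [(0, 1), (1, 2)])
protein_interaction = adjacency_from_edges(5, [(1, 2), (3, 4)])

# Gene 0 carries direct evidence (e.g. a known disease gene); others have none.
evidence = np.array([1.0, 0.0, 0.0, 0.0, 0.0])

scores = gmrf_gene_scores(
    [coexpression, protein_interaction], weights=[1.0, 1.0], evidence=evidence
)
ranking = np.argsort(-scores)  # genes ordered from most to least likely
```

Here genes 1 and 2 receive nonzero scores because they are connected to gene 0 through the networks, while genes 3 and 4 are disconnected from the evidence and stay near zero. In practice the network weights would themselves need to be learned or tuned, which this sketch does not attempt.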
Research Keywords
Computational Biology, Regulatory Networks, Genetics of Complex Traits, Psychiatric Genetics, Machine Learning in Computational Biology, Genomics, Computational Medicine, Data Integration