Saurabh Sinha
Areas of Research
Biomedical Informatics & Systems Modeling
Biography
My research program addresses analytical challenges related to gene expression and its regulation, using diverse methods from statistics and machine learning. I have been a PI since 2005, first at the University of Illinois at Urbana-Champaign (Dept. of Computer Science, 2005-22), and then at Georgia Tech (since 2022), where I have a joint faculty appointment in ISyE and BME. We develop computational tools to identify regulators of gene expression and decipher how their influence is encoded in the genome, by analyzing transcriptomics data along with other omics data. Most of these tools are specialized for single-cell data, e.g., the SERGIO tool for simulating scRNA-seq data and the CIMLA tool for discovering inter-condition changes in gene regulation using robust causal inference approaches. We use our tools to uncover regulatory mechanisms underlying diverse phenotypes, including embryonic development, social behavior, and cancer drug response. Over the last 4-5 years, our lab’s attention has increasingly turned to analytical challenges in “Spatial Transcriptomics” (ST) technology, which is revolutionizing the measurement of gene expression and thus promises exciting new avenues for our research. We are also developing methods for analyzing other modalities of spatial data, such as spatial metabolomics and microscopy data.
I have also served in leadership positions in major federally funded centers. From 2014 to 2019 I served as Co-Director and Research Lead of an NIH-funded Center of Excellence (BD2K) at UIUC. This Center created “KnowEnG”, a cloud-based toolbox implementing novel algorithms for network-guided analysis of multi-omics data, and trained ~50 personnel in bioinformatics. As a co-PI and a Thrust lead of the NSF AI Institute for Chemical Synthesis, I recently led development of machine learning approaches to optimize biosynthesis or chemical synthesis strategies.
I also serve as a Thrust lead on the NSF Engineering Research Center “CMaT: Cell Manufacturing Technologies” led by Georgia Tech.
Research Interests
Focus areas of our future work will include (1) multi-omics, where we develop rigorous analytical approaches to combine multiple types of molecular data, e.g., genomics, transcriptomics, epigenomics, and metabolomics, to infer a coherent picture of the underlying cellular biology, and (2) spatial omics, where we analyze transcriptomics and other omics data at sub-cellular resolution to understand the dynamic processes shaping the spatial distribution of molecules. Research into these topics will aim to understand the changes accompanying a biological process, such as disease progression or behavioral responses, and how DNA encodes the program for such changes. We will use methods of machine learning and deep learning as well as probabilistic models and biophysical models, separately and in combination, to tackle these challenging problems.