From Genotype to Function: Integrating Single-Cell Sequencing and Predictive Data Science to Decode Complex Disease Mechanisms
Understanding how genetic variation contributes to common complex diseases remains a central challenge in human genomics. Although genome-wide association studies (GWAS) have identified thousands of disease-associated loci, most lie in non-coding regions, making it difficult to pinpoint the causal variants, their target genes, and the cell types in which they act. To move beyond association and toward mechanism, we need integrative approaches that combine multiple layers of biological information.
In this talk, I will present RegSCOUT (Regulatory Single-Cell Omics for Unraveling Trait-loci), a computational pipeline developed by our lab that predicts gene regulatory mechanisms by integrating GWAS, single-cell chromatin accessibility (scATAC-seq), expression quantitative trait locus (eQTL), and chromatin conformation (Hi-C) datasets. RegSCOUT provides in silico predictions of (i) disease-relevant cell types, (ii) regulatory elements whose chromatin accessibility and transcription factor binding are likely altered by risk variants, and (iii) downstream target genes and pathways.
The pipeline is flexible and scalable: given suitable input datasets, it can be applied across diverse tissue contexts, developmental stages, and diseases. It enables the identification of cell-type- and context-specific regulatory mechanisms that are otherwise hidden in bulk-level or single-modality analyses. I will discuss the computational framework underlying RegSCOUT, including challenges in integrating high-dimensional data, modeling regulatory element–gene interactions, and prioritizing functional SNPs, transcription factors (TFs), and genes for experimental follow-up.
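To make the integration idea concrete, the following is a minimal sketch of one such step: intersecting candidate risk SNPs with cell-type-specific accessible regions and linking those regions to genes through chromatin contacts. It is not RegSCOUT's actual implementation. All data, identifiers, the function names, and the `min_pip` threshold are hypothetical and chosen only for illustration, and coordinates are assumed to be 0-based, half-open intervals.

```python
from collections import defaultdict

# Hypothetical toy inputs (not real data).
# SNPs: (id, chromosome, position, posterior inclusion probability from fine-mapping)
snps = [
    ("rs_a", "chr1", 1050, 0.62),
    ("rs_b", "chr1", 5200, 0.08),
    ("rs_c", "chr2", 300, 0.30),
]
# scATAC-seq peaks: (chromosome, start, end, cell type in which the peak is accessible)
peaks = [
    ("chr1", 1000, 1500, "T_cell"),
    ("chr1", 5000, 5500, "B_cell"),
    ("chr2", 100, 250, "T_cell"),
]
# Hi-C contacts: (chromosome, start, end of the distal anchor, gene at the other anchor)
loops = [
    ("chr1", 1000, 1500, "GENE_X"),
    ("chr1", 5000, 5500, "GENE_Y"),
]


def snp_in_peak(chrom, pos, peak):
    """True if the SNP position falls inside the half-open peak interval."""
    return peak[0] == chrom and peak[1] <= pos < peak[2]


def link_snps(snps, peaks, loops, min_pip=0.1):
    """Link sufficiently probable SNPs to overlapping peaks and their contacted genes."""
    hits = []
    for snp_id, chrom, pos, pip in snps:
        if pip < min_pip:
            continue  # drop SNPs that are unlikely to be causal
        for peak in peaks:
            if not snp_in_peak(chrom, pos, peak):
                continue
            # Genes whose Hi-C anchor overlaps this peak
            genes = sorted(
                gene
                for (l_chrom, l_start, l_end, gene) in loops
                if l_chrom == peak[0] and l_start < peak[2] and peak[1] < l_end
            )
            hits.append(
                {"snp": snp_id, "cell_type": peak[3], "genes": genes, "pip": pip}
            )
    return hits


def cell_type_scores(hits):
    """Sum the fine-mapping probability landing in each cell type's peaks."""
    scores = defaultdict(float)
    for hit in hits:
        scores[hit["cell_type"]] += hit["pip"]
    return dict(scores)


hits = link_snps(snps, peaks, loops)
scores = cell_type_scores(hits)
```

In this toy example only `rs_a` survives: `rs_b` falls below the probability threshold, and `rs_c` lies outside every peak. A real analysis would additionally need to handle linkage disequilibrium, peak calling uncertainty, and eQTL evidence, which this sketch omits.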
This work highlights how predictive data science, powered by single-cell and multi-omic technologies, can help bridge the gap between genotype and function, illuminating the regulatory architecture of complex disease.