Dr. Sanjarbek Hudaiberdiev on “Using Deep Learning to infer causality of GWAS SNPs”

Dr. Sanjarbek Hudaiberdiev will give a seminar on “Using Deep Learning to infer causality of GWAS SNPs” on November 20 at 4 pm. The abstract of the talk and a short bio are shared below.


Bio:
Sanjar Hudaiberdiev is a research fellow at the National Center for Biotechnology Information (NCBI) of the National Institutes of Health (NIH). His current research focuses on the development and implementation of data-centric computational approaches to understand the mechanisms of non-coding regulatory elements in the human genome. He received his BSc degree in Computer Science from Erciyes University in Kayseri, and his Ph.D. degree from the International Center for Genetic Engineering and Biotechnology (ICGEB) in Trieste, Italy.

Abstract:
Decomposing GWAS signals into causal SNPs and the noise introduced by linkage disequilibrium (LD) remains a key problem in understanding the mechanisms and genetic underpinnings of complex human diseases. The overwhelming majority of disease-associated SNPs fall into non-coding regions of the human genome. We developed a two-step deep learning (DL) framework that identifies active regions within regulatory elements (REs) and quantifies the influence of any arbitrary point mutation on the activity of the host RE. Our framework is tissue-specific, and we show that the mutation-susceptible regions in REs largely correspond to the binding sites of active transcription factors (TFs) and that the predicted mutational impact of these regions matches the binding specificity of the corresponding TFs. We further show that our mutational impact scores strongly correlate with experimental data from a set of assays, including those for chromatin accessibility quantitative trait loci (caQTLs), massively parallel reporter assays (MPRAs), and reporter assay QTLs (raQTLs). Applying our method to resolve the LD ambiguity in Type 2 Diabetes (T2D) genome-wide association studies (GWASs) consisting of 403 strong genetic T2D associations resulted in 706 (5.7%) genetic associations with quantifiable impact on pancreatic islet enhancer activity. We confirmed the directionality and magnitude of disrupted enhancer activity for a panel of experimentally validated T2D single nucleotide polymorphisms (SNPs), and we predicted 46 novel T2D causative enhancer mutations.