1
Biddie SC, Weykopf G, Hird EF, Friman ET, Bickmore WA. DNA-binding factor footprints and enhancer RNAs identify functional non-coding genetic variants. Genome Biol 2024; 25:208. [PMID: 39107801 PMCID: PMC11304670 DOI: 10.1186/s13059-024-03352-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 07/25/2024] [Indexed: 08/10/2024] Open
Abstract
BACKGROUND: Genome-wide association studies (GWAS) have revealed a multitude of candidate genetic variants affecting the risk of developing complex traits and diseases. However, the highlighted regions are typically in the non-coding genome, and uncovering the functional causative single nucleotide variants (SNVs) is challenging. Prioritization of variants is commonly based on genomic annotation with markers of active regulatory elements, but current approaches still poorly predict functional variants. To address this, we systematically analyze six markers of active regulatory elements for their ability to identify functional variants.
RESULTS: We benchmark against molecular quantitative trait loci (molQTL) from assays of regulatory element activity that identify allelic effects on DNA-binding factor occupancy, reporter assay expression, and chromatin accessibility. We identify the combination of DNase footprints and divergent enhancer RNA (eRNA) as markers for functional variants. This signature provides high precision, but with a trade-off of low recall, thus substantially reducing candidate variant sets to prioritize variants for functional validation. We present this as a framework called FINDER (Functional SNV IdeNtification using DNase footprints and eRNA).
CONCLUSIONS: We demonstrate the utility of FINDER for prioritizing variants using a leukocyte count trait, and analyze variants in linkage disequilibrium with a lead variant to predict a functional variant in asthma. Our findings have implications for prioritizing variants from GWAS, for the development of predictive scoring algorithms, and for functionally informed fine-mapping approaches.
Affiliation(s)
- Simon C Biddie
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.
  - NHS Lothian, Edinburgh, UK.
- Giovanna Weykopf
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Elias T Friman
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
- Wendy A Bickmore
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.
2
Iñiguez-Muñoz S, Llinàs-Arias P, Ensenyat-Mendez M, Bedoya-López AF, Orozco JIJ, Cortés J, Roy A, Forsberg-Nilsson K, DiNome ML, Marzese DM. Hidden secrets of the cancer genome: unlocking the impact of non-coding mutations in gene regulatory elements. Cell Mol Life Sci 2024; 81:274. [PMID: 38902506 PMCID: PMC11335195 DOI: 10.1007/s00018-024-05314-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 12/07/2023] [Accepted: 06/06/2024] [Indexed: 06/22/2024]
Abstract
Discoveries in the field of genomics have revealed that non-coding genomic regions are not merely "junk DNA", but rather comprise critical elements involved in gene expression. These gene regulatory elements (GREs) include enhancers, insulators, silencers, and gene promoters. Notably, new evidence shows how mutations within these regions substantially influence gene expression programs, especially in the context of cancer. Advances in high-throughput sequencing technologies have accelerated the identification of somatic and germline single nucleotide mutations in non-coding genomic regions. This review provides an overview of somatic and germline non-coding single nucleotide alterations affecting transcription factor binding sites in GREs, specifically involved in cancer biology. It also summarizes the technologies available for exploring GREs and the challenges associated with studying and characterizing non-coding single nucleotide mutations. Understanding the role of GRE alterations in cancer is essential for improving diagnostic and prognostic capabilities in the precision medicine era, leading to enhanced patient-centered clinical outcomes.
Affiliation(s)
- Sandra Iñiguez-Muñoz
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
- Pere Llinàs-Arias
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
- Miquel Ensenyat-Mendez
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
- Andrés F Bedoya-López
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain
- Javier I J Orozco
  - Saint John's Cancer Institute, Providence Saint John's Health Center, Santa Monica, CA, USA
- Javier Cortés
  - International Breast Cancer Center (IBCC), Pangaea Oncology, Quiron Group, 08017, Barcelona, Spain
  - Medica Scientia Innovation Research SL (MEDSIR), 08018, Barcelona, Spain
  - Faculty of Biomedical and Health Sciences, Department of Medicine, Universidad Europea de Madrid, 28670, Madrid, Spain
- Ananya Roy
  - Department of Immunology, Genetics and Pathology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Karin Forsberg-Nilsson
  - Department of Immunology, Genetics and Pathology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
  - University of Nottingham Biodiscovery Institute, Nottingham, UK
- Maggie L DiNome
  - Department of Surgery, Duke University School of Medicine, Durham, NC, USA
- Diego M Marzese
  - Cancer Epigenetics Laboratory at the Cancer Cell Biology Group, Institut d'Investigació Sanitària Illes Balears (IdISBa), Palma, Spain.
  - Department of Surgery, Duke University School of Medicine, Durham, NC, USA.
3
Chin IM, Gardell ZA, Corces MR. Decoding polygenic diseases: advances in noncoding variant prioritization and validation. Trends Cell Biol 2024; 34:465-483. [PMID: 38719704 DOI: 10.1016/j.tcb.2024.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 03/12/2024] [Accepted: 03/21/2024] [Indexed: 06/09/2024]
Abstract
Genome-wide association studies (GWASs) provide a key foundation for elucidating the genetic underpinnings of common polygenic diseases. However, these studies have limitations in their ability to assign causality to particular genetic variants, especially those residing in the noncoding genome. Over the past decade, technological and methodological advances in both analytical and empirical prioritization of noncoding variants have enabled the identification of causative variants by leveraging orthogonal functional evidence at increasing scale. In this review, we present an overview of these approaches and describe how this workflow provides the groundwork necessary to move beyond associations toward genetically informed studies on the molecular and cellular mechanisms of polygenic disease.
Affiliation(s)
- Iris M Chin
  - Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Zachary A Gardell
  - Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- M Ryan Corces
  - Gladstone Institute of Neurological Disease, Gladstone Institutes, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA.
4
Jiang R, Huang W, Qiu X, Chen J, Luo R, Zeng R, Tong S, Lyu Y, Sun P, Lian Q, Leung FW, Liu Y, Sha W, Chen H. Unveiling promising drug targets for autism spectrum disorder: insights from genetics, transcriptomics, and proteomics. Brief Bioinform 2024; 25:bbae353. [PMID: 39038939 PMCID: PMC11262832 DOI: 10.1093/bib/bbae353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Revised: 05/20/2024] [Accepted: 07/09/2024] [Indexed: 07/24/2024] Open
Abstract
Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder for which current treatments are limited and drug development costs are prohibitive. Identifying drug targets for ASD is crucial for the development of targeted therapies. Summary-level data of expression quantitative trait loci obtained from GTEx, protein quantitative trait loci data from the ROSMAP project, and two ASD genome-wide association study datasets were utilized for discovery and replication. We conducted a combined analysis using Mendelian randomization (MR), transcriptome-wide association studies, Bayesian colocalization, and summary-data-based MR to identify potential therapeutic targets associated with ASD and examine whether there are shared causal variants among them. Furthermore, pathway and drug enrichment analyses were performed to further explore the underlying mechanisms and summarize the current status of pharmacological targets for developing drugs to treat ASD. The protein-protein interaction (PPI) network and mouse knockout models were used to estimate the effect of therapeutic targets. A total of 17 genes showed causal associations with ASD and were identified as potential targets for ASD patients. Cathepsin B (CTSB) [odds ratio (OR) = 2.66, 95% confidence interval (CI): 1.28-5.52, P = 8.84 × 10⁻³], gamma-aminobutyric acid type B receptor subunit 1 (GABBR1) (OR = 1.99, 95% CI: 1.06-3.75, P = 3.24 × 10⁻²), and formin like 1 (FMNL1) (OR = 0.15, 95% CI: 0.04-0.58, P = 5.59 × 10⁻³) were replicated in the proteome-wide MR analyses. In Drugbank, two potential therapeutic drugs, Acamprosate (GABBR1 inhibitor) and Bryostatin 1 (CASP8 inhibitor), were inferred as potential influencers of autism. Knockout mouse models suggested the involvement of the CASP8, GABBR1, and PLEKHM1 genes in neurological processes. Our findings suggest 17 candidate therapeutic targets for ASD and provide novel drug targets for therapy development and critical drug repurposing opportunities.
Affiliation(s)
- Rui Jiang
  - Department of Gastroenterology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, No. 106, Zhongshan 2nd Road, Guangzhou 510080, China
  - The Second School of Clinical Medicine, Southern Medical University, No. 1023 Shatainan Road, Guangzhou 510515, China
  - School of Medicine, South China University of Technology, No. 230, West Waihuan Road, Higher Education Mega Centre, Panyu District, Guangzhou 510006, China
- Wentao Huang
  - Department of Gastroenterology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, No. 106, Zhongshan 2nd Road, Guangzhou 510080, China
  - The Second School of Clinical Medicine, Southern Medical University, No. 1023 Shatainan Road, Guangzhou 510515, China
- Xinqi Qiu
  - Cancer Prevention Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, No. 651 Dongfeng East Road, Guangzhou 510060, China
- Jianyi Chen
  - Department of Gastroenterology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, No. 106, Zhongshan 2nd Road, Guangzhou 510080, China
  - School of Medicine, South China University of Technology, No. 230, West Waihuan Road, Higher Education Mega Centre, Panyu District, Guangzhou 510006, China
- Ruibang Luo
  - Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong 999077, China
- Ruijie Zeng
  - Department of Gastroenterology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, No. 106, Zhongshan 2nd Road, Guangzhou 510080, China
- Shuangshuang Tong
  - Department of Gastroenterology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, No. 106, Zhongshan 2nd Road, Guangzhou 510080, China
  - Shantou University Medical College, Shantou University, No. 22 Xinling Road, Shantou 515041, Guangdong, China
- Yanlin Lyu
  - Department of Gastroenterology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, No. 106, Zhongshan 2nd Road, Guangzhou 510080, China
  - Shantou University Medical College, Shantou University, No. 22 Xinling Road, Shantou 515041, Guangdong, China
- Panpan Sun
  - Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, No. 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen 518055, China
- Qizhou Lian
  - Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, No. 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen 518055, China
  - Cord Blood Bank, Guangzhou Institute of Eugenics and Perinatology, Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, No. 9 Jinsui Road, Guangzhou 510623, China
  - State Key Laboratory of Pharmaceutical Biotechnology, The University of Hong Kong, Pokfulam Road, Hong Kong 999077, China
- Felix W Leung
  - Sepulveda Ambulatory Care Center, VA Greater Los Angeles Healthcare System, 16111 Plummer Street, Los Angeles 91343, California, United States
  - University of California Los Angeles David Geffen School of Medicine, 10833 Le Conte Avenue, Los Angeles 90095, California, United States
- Yufeng Liu
  - Center for Medical Research on Innovation and Translation, Guangzhou First People's Hospital, the Second Affiliated Hospital of South China University of Technology, No 1 Panfu Road, Guangzhou 510000, China
- Weihong Sha
  - Department of Gastroenterology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, No. 106, Zhongshan 2nd Road, Guangzhou 510080, China
  - The Second School of Clinical Medicine, Southern Medical University, No. 1023 Shatainan Road, Guangzhou 510515, China
  - School of Medicine, South China University of Technology, No. 230, West Waihuan Road, Higher Education Mega Centre, Panyu District, Guangzhou 510006, China
  - Shantou University Medical College, Shantou University, No. 22 Xinling Road, Shantou 515041, Guangdong, China
- Hao Chen
  - Department of Gastroenterology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, No. 106, Zhongshan 2nd Road, Guangzhou 510080, China
  - The Second School of Clinical Medicine, Southern Medical University, No. 1023 Shatainan Road, Guangzhou 510515, China
  - School of Medicine, South China University of Technology, No. 230, West Waihuan Road, Higher Education Mega Centre, Panyu District, Guangzhou 510006, China
  - Shantou University Medical College, Shantou University, No. 22 Xinling Road, Shantou 515041, Guangdong, China
5
Jung J, Lu Z, de Smith A, Mancuso N. Novel insight into the etiology of ischemic stroke gained by integrative multiome-wide association study. Hum Mol Genet 2024; 33:170-181. [PMID: 37824084 PMCID: PMC10772041 DOI: 10.1093/hmg/ddad174] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/13/2023] [Revised: 09/14/2023] [Accepted: 10/09/2023] [Indexed: 10/13/2023] Open
Abstract
Stroke, characterized by sudden neurological deficits, is the second leading cause of death worldwide. Although genome-wide association studies (GWAS) have successfully identified many genomic regions associated with ischemic stroke (IS), the genes underlying risk and their regulatory mechanisms remain elusive. Here, we integrate a large-scale GWAS (N = 1 296 908) for IS together with molecular QTL data, including mRNA, splicing, enhancer RNA (eRNA), and protein expression data from up to 50 tissues (total N = 11 588). We identify 136 genes/eRNA/proteins associated with IS risk across 60 independent genomic regions and find IS risk is most enriched for eQTLs in arterial and brain-related tissues. Focusing on IS-relevant tissues, we prioritize 9 genes/proteins using probabilistic fine-mapping TWAS analyses. In addition, we discover that blood cell traits, particularly reticulocyte cells, have shared genetic contributions with IS using TWAS-based pheWAS and genetic correlation analysis. Lastly, we integrate our findings with a large-scale pharmacological database and identify a secondary bile acid, deoxycholic acid, as a potential therapeutic component. Our work highlights IS risk genes/splicing-sites/enhancer activity/proteins with their phenotypic consequences using relevant tissues and identifies potential therapeutic candidates for IS.
Affiliation(s)
- Junghyun Jung
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, 1450 Biggy Street, Los Angeles, CA 90033, United States
- Zeyun Lu
  - Biostatistics Division, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, 2001 North Soto Street, Los Angeles, CA 90033, United States
- Adam de Smith
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, 1450 Biggy Street, Los Angeles, CA 90033, United States
- Nicholas Mancuso
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, 1450 Biggy Street, Los Angeles, CA 90033, United States
  - Biostatistics Division, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, 2001 North Soto Street, Los Angeles, CA 90033, United States
  - Department of Quantitative and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, United States
6
Chen C, Liu Y, Luo M, Yang J, Chen Y, Wang R, Zhou J, Zang Y, Diao L, Han L. PancanQTLv2.0: a comprehensive resource for expression quantitative trait loci across human cancers. Nucleic Acids Res 2024; 52:D1400-D1406. [PMID: 37870463 PMCID: PMC10767806 DOI: 10.1093/nar/gkad916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 09/29/2023] [Accepted: 10/06/2023] [Indexed: 10/24/2023] Open
Abstract
Expression quantitative trait locus (eQTL) analysis is a powerful tool used to investigate genetic variations in complex diseases, including cancer. We previously developed a comprehensive database, PancanQTL, to characterize cancer eQTLs using The Cancer Genome Atlas (TCGA) dataset, and linked eQTLs with patient survival and GWAS risk variants. Here, we present an updated version, PancanQTLv2.0 (https://hanlaboratory.com/PancanQTLv2/), with advancements in fine-mapping causal variants for eQTLs, updating eQTLs overlapping with GWAS linkage disequilibrium regions and identifying eQTLs associated with drug response and immune infiltration. Through fine-mapping analysis, we identified 58 747 fine-mapped eQTL credible sets, providing mechanistic insights into gene regulation in cancer. We further integrated the latest GWAS Catalog and identified a total of 84 592 135 linkage associations between eQTLs and the existing GWAS loci, which represents a remarkable ∼50-fold increase compared to the previous version. Additionally, PancanQTLv2.0 uncovered 659 516 associations between eQTLs and drug response and identified 146 948 associations between eQTLs and immune cell abundance, suggesting the potential clinical utility of eQTLs in cancer therapy. PancanQTLv2.0 expanded the resources available for investigating gene expression regulation in human cancers, leading to advancements in cancer research and precision oncology.
Affiliation(s)
- Chengxuan Chen
  - Brown Center for Immunotherapy, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
  - Department of Biostatistics and Health Data Science, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
  - Center for Epigenetics and Disease Prevention, Institute of Biosciences and Technology, Texas A&M University, Houston, TX 77030, USA
- Yuan Liu
  - Brown Center for Immunotherapy, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
  - Department of Biostatistics and Health Data Science, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
  - Center for Epigenetics and Disease Prevention, Institute of Biosciences and Technology, Texas A&M University, Houston, TX 77030, USA
- Mei Luo
  - Brown Center for Immunotherapy, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
  - Department of Biostatistics and Health Data Science, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
- Jingwen Yang
  - Brown Center for Immunotherapy, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
  - Department of Biostatistics and Health Data Science, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
- Yamei Chen
  - Brown Center for Immunotherapy, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
  - Department of Biostatistics and Health Data Science, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
- Runhao Wang
  - Brown Center for Immunotherapy, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
  - Department of Biostatistics and Health Data Science, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
- Joseph Zhou
  - Center for Epigenetics and Disease Prevention, Institute of Biosciences and Technology, Texas A&M University, Houston, TX 77030, USA
- Yong Zang
  - Department of Biostatistics and Health Data Science, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
- Lixia Diao
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Leng Han
  - Brown Center for Immunotherapy, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
  - Department of Biostatistics and Health Data Science, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
  - Center for Epigenetics and Disease Prevention, Institute of Biosciences and Technology, Texas A&M University, Houston, TX 77030, USA
7
Li Y, Zhang XO, Liu Y, Lu A. Allele-specific binding (ASB) analyzer for annotation of allele-specific binding SNPs. BMC Bioinformatics 2023; 24:464. [PMID: 38066439 PMCID: PMC10709849 DOI: 10.1186/s12859-023-05604-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Accepted: 12/05/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND: Allele-specific binding (ASB) events occur when transcription factors (TFs) bind more favorably to one of the two parental alleles at heterozygous single nucleotide polymorphisms (SNPs). Evidence suggests that ASB events could reveal the impact of sequence variations on TF binding and may have implications for the risk of diseases.
RESULTS: Here we present ASB-analyzer, a software platform that enables the users to quickly and efficiently input raw sequencing data to generate individual reports containing the cytogenetic map of ASB SNPs and their associated phenotypes. This interactive tool thereby combines ASB SNP identification, biological annotation, motif analysis, phenotype associations and report summary in one pipeline. With this pipeline, we identified 3772 ASB SNPs from thirty GM12878 ChIP-seq datasets and demonstrated that the ASB SNPs were more likely to be enriched at important sites in TF-binding domains.
CONCLUSIONS: ASB-analyzer is a user-friendly tool that enables the detection, characterization and visualization of ASB SNPs. It is implemented in Python, R and bash shell and packaged in the Conda environment. It is available as an open-source tool on GitHub at https://github.com/Liying1996/ASBanalyzer.
Affiliation(s)
- Ying Li
  - Research Center for Translational Medicine at East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, 200010, China
- Xiao-Ou Zhang
  - Research Center for Translational Medicine at East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, 200010, China
- Yan Liu
  - Research Center for Translational Medicine at East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, 200010, China
- Aiping Lu
  - Research Center for Translational Medicine at East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, 200010, China.
8
Zhu Z, Chen X, Zhang S, Yu R, Qi C, Cheng L, Zhang X. Leveraging molecular quantitative trait loci to comprehend complex diseases/traits from the omics perspective. Hum Genet 2023; 142:1543-1560. [PMID: 37755483 DOI: 10.1007/s00439-023-02602-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/15/2023] [Accepted: 09/14/2023] [Indexed: 09/28/2023]
Abstract
Comprehending the molecular basis of quantitative genetic variation is a principal goal in the study of complex diseases and traits. Molecular quantitative trait loci (molQTLs) have made it possible to investigate the effects of genetic variants hidden in large-scale omics data. A deeper understanding of molQTLs is urgently required, in light of the growing dimensionality of omics data, to more fully elucidate the pertinent biological mechanisms. Herein, we reviewed molQTLs and their corresponding resources from the omics perspective and further discussed the integrative GWAS-molQTL strategy for inferring causal effects. Subsequently, we described the opportunities and challenges encountered in molQTL research. The case studies showed that molQTLs, whether single- or multi-omics, are essential for understanding complex diseases and traits. Overall, we highlighted the functional significance of genetic variants and the use of molQTL discovery in complex diseases and traits.
Affiliation(s)
- Zijun Zhu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Xinyu Chen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Sainan Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Rui Yu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Changlu Qi
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Liang Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China.
  - NHC Key Laboratory of Molecular Probe and Targeted Diagnosis and Therapy, Harbin Medical University, Harbin, 150028, Heilongjiang, China.
- Xue Zhang
  - NHC Key Laboratory of Molecular Probe and Targeted Diagnosis and Therapy, Harbin Medical University, Harbin, 150028, Heilongjiang, China
  - McKusick-Zhang Center for Genetic Medicine, State Key Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100005, China
9
Fabo T, Khavari P. Functional characterization of human genomic variation linked to polygenic diseases. Trends Genet 2023; 39:462-490. [PMID: 36997428 PMCID: PMC11025698 DOI: 10.1016/j.tig.2023.02.014] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/01/2022] [Revised: 02/22/2023] [Accepted: 02/23/2023] [Indexed: 03/30/2023]
Abstract
The burden of human disease lies predominantly in polygenic diseases. Since the early 2000s, genome-wide association studies (GWAS) have identified genetic variants and loci associated with complex traits. These have ranged from variants in coding sequences to mutations in regulatory regions, such as promoters and enhancers, as well as mutations affecting mediators of mRNA stability and other downstream regulators, such as 5' and 3'-untranslated regions (UTRs), long noncoding RNA (lncRNA), and miRNA. Recent research advances in genetics have utilized a combination of computational techniques, high-throughput in vitro and in vivo screening modalities, and precise genome editing to impute the function of diverse classes of genetic variants identified through GWAS. In this review, we highlight the vastness of genomic variants associated with polygenic disease risk and address recent advances in how genetic tools can be used to functionally characterize them.
Affiliation(s)
- Tania Fabo
  - Program in Epithelial Biology, Stanford University, Stanford, CA, USA; Stanford Cancer Institute, Stanford University, Stanford, CA, USA; Graduate Program in Genetics, Stanford University, Stanford, CA, USA; Stanford University School of Medicine, Stanford University, Stanford, CA, USA
- Paul Khavari
  - Program in Epithelial Biology, Stanford University, Stanford, CA, USA; Stanford Cancer Institute, Stanford University, Stanford, CA, USA; Graduate Program in Genetics, Stanford University, Stanford, CA, USA; Stanford University School of Medicine, Stanford University, Stanford, CA, USA; Veterans Affairs Palo Alto Healthcare System, Palo Alto, CA, USA.
10
Jung J, Lu Z, de Smith A, Mancuso N. Novel insight into the etiology of ischemic stroke gained by integrative transcriptome-wide association study. medRxiv 2023:2023.03.30.23287918. [PMID: 37034585 PMCID: PMC10081428 DOI: 10.1101/2023.03.30.23287918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]
Abstract
Stroke, characterized by sudden neurological deficits, is the second leading cause of death worldwide. Although genome-wide association studies (GWAS) have successfully identified many genomic regions associated with ischemic stroke (IS), the genes underlying risk and their regulatory mechanisms remain elusive. Here, we integrate a large-scale GWAS (N=1,296,908) for IS together with mRNA, splicing, enhancer RNA (eRNA) and protein expression data (N=11,588) from 50 tissues. We identify 136 genes/eRNA/proteins associated with IS risk across 54 independent genomic regions and find IS risk is most enriched for eQTLs in arterial and brain-related tissues. Focusing on IS-relevant tissues, we prioritize 9 genes/proteins using probabilistic fine-mapping TWAS analyses. In addition, we discover that blood cell traits, particularly reticulocyte cells, have shared genetic contributions with IS using TWAS-based pheWAS and genetic correlation analysis. Lastly, we integrate our findings with a large-scale pharmacological database and identify a secondary bile acid, deoxycholic acid, as a potential therapeutic component. Our work highlights IS risk genes/splicing-sites/enhancer activity/proteins with their phenotypic consequences using relevant tissues and identifies potential therapeutic candidates for IS.
Affiliation(s)
- Junghyun Jung
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Zeyun Lu
  - Biostatistics Division, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Adam de Smith
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Nicholas Mancuso
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
  - Biostatistics Division, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA