1
Wang J. Deep Learning in Hematology: From Molecules to Patients. Clin Hematol Int 2024; 6:19-42. [PMID: 39417017 PMCID: PMC11477942 DOI: 10.46989/001c.124131] [Received: 05/31/2024] [Accepted: 06/29/2024] [Indexed: 10/19/2024] Open
Abstract
Deep learning (DL), a subfield of machine learning, has made remarkable strides across various aspects of medicine. This review examines DL's applications in hematology, spanning from molecular insights to patient care. The review begins by providing a straightforward introduction to the basics of DL tailored for those without prior knowledge, touching on essential concepts, principal architectures, and prevalent training methods. It then discusses the applications of DL in hematology, concentrating on elucidating the models' architecture, their applications, performance metrics, and inherent limitations. For example, at the molecular level, DL has improved the analysis of multi-omics data and protein structure prediction. For cells and tissues, DL enables the automation of cytomorphology analysis, interpretation of flow cytometry data, and diagnosis from whole slide images. At the patient level, DL's utility extends to analyzing curated clinical data, electronic health records, and clinical notes through large language models. While DL has shown promising results in various hematology applications, challenges remain in model generalizability and explainability. Moreover, the integration of novel DL architectures into hematology has been relatively slow in comparison to that in other medical fields.
Affiliation(s)
- Jiasheng Wang
- Division of Hematology, Department of Medicine, The Ohio State University Comprehensive Cancer Center
2
Liang Q, Abraham A, Capra JA, Kostka D. Disease-specific prioritization of non-coding GWAS variants based on chromatin accessibility. HGG ADVANCES 2024; 5:100310. [PMID: 38773771 PMCID: PMC11259938 DOI: 10.1016/j.xhgg.2024.100310] [Received: 10/25/2023] [Revised: 05/15/2024] [Accepted: 05/16/2024] [Indexed: 05/24/2024] Open
Abstract
Non-protein-coding genetic variants are a major driver of the genetic risk for human disease; however, identifying which non-coding variants contribute to diseases and their mechanisms remains challenging. In silico variant prioritization methods quantify a variant's severity, but for most methods, the specific phenotype and disease context of the prediction remain poorly defined. For example, many commonly used methods provide a single, organism-wide score for each variant, while other methods summarize a variant's impact in certain tissues and/or cell types. Here, we propose a complementary disease-specific variant prioritization scheme, which is motivated by the observation that variants contributing to disease often operate through specific biological mechanisms. We combine tissue/cell-type-specific variant scores (e.g., GenoSkyline, FitCons2, DNA accessibility) into disease-specific scores with a logistic regression approach and apply it to ∼25,000 non-coding variants spanning 111 diseases. We show that this disease-specific aggregation significantly improves the association of common non-coding genetic variants with disease (average precision: 0.151, baseline = 0.09), compared with organism-wide scores (GenoCanyon, LINSIGHT, GWAVA, Eigen, CADD; average precision: 0.129, baseline = 0.09). Furthermore, disease similarities based on data-driven aggregation weights highlight meaningful disease groups and provide information about the tissues and cell types that drive these similarities. We also show that the learned similarities are complementary to genetic similarities as quantified by genetic correlation. Overall, our approach demonstrates the strengths of disease-specific variant prioritization, leads to improvement in non-coding variant prioritization, and enables interpretable models that link variants to disease via specific tissues and/or cell types.
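The abstract above reports average precision against a baseline of 0.09. As a rough illustration of how such a comparison can be read, the sketch below computes average precision for a ranked list of variants and compares it with the fraction of positives, which is the expected average precision of a random ranking. The labels and scores are invented toy values, and treating the baseline as the positive fraction is an assumption; the abstract does not define it.

```python
def average_precision(labels, scores):
    """Mean of precision@k taken at each rank k where a positive (label 1) occurs."""
    # Rank variants from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / hits if hits else 0.0


# Hypothetical toy data: 1 = disease-associated variant, 0 = control variant.
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

ap = average_precision(labels, scores)
baseline = sum(labels) / len(labels)  # assumed baseline: fraction of positives
print(f"average precision = {ap:.3f}, baseline = {baseline:.3f}")
```

A score is informative to the extent that its average precision exceeds this baseline, which is how the 0.151 and 0.129 figures in the abstract can be compared with 0.09.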
Affiliation(s)
- Qianqian Liang
- Department of Computational & Systems Biology and Center for Evolutionary Biology and Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA; Department of Human Genetics, University of Pittsburgh School of Public Health, Pittsburgh, PA, USA
- Abin Abraham
- Children's Hospital of Philadelphia, Philadelphia, PA, USA
- John A Capra
- Department of Epidemiology & Biostatistics and Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Dennis Kostka
- Department of Computational & Systems Biology and Center for Evolutionary Biology and Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA.
3
Giovannetti A, Lazzari S, Mangoni M, Traversa A, Mazza T, Parisi C, Caputo V. Exploring non-coding genetic variability in ACE2: Functional annotation and in vitro validation of regulatory variants. Gene 2024; 915:148422. [PMID: 38570058 DOI: 10.1016/j.gene.2024.148422] [Received: 01/22/2024] [Revised: 02/23/2024] [Accepted: 03/13/2024] [Indexed: 04/05/2024]
Abstract
The surge in human whole-genome sequencing data has facilitated the study of non-coding region variations, yet understanding their biological significance remains a challenge. We used a computational workflow to assess the regulatory potential of non-coding variants, with a particular focus on the Angiotensin Converting Enzyme 2 (ACE2) gene. This gene is crucial in physiological processes and serves as the entry point for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus causing coronavirus disease 19 (COVID-19). In our analysis, using data from the gnomAD population database and functional annotation, we identified 17 significant Single Nucleotide Variants (SNVs) in ACE2, particularly in its enhancers, promoters, and 3' untranslated regions (UTRs). We found preliminary evidence supporting the regulatory impact of some of these variants on ACE2 expression. Our detailed examination of two SNVs, rs147718775 and rs140394675, in the ACE2 promoter revealed that these co-occurring SNVs, when mutated, significantly enhance promoter activity, suggesting a possible increase in specific ACE2 isoform expression. This method proves effective in identifying and interpreting impactful non-coding variants, aiding in further studies and enhancing understanding of molecular bases of monogenic and complex traits.
Affiliation(s)
- Agnese Giovannetti
- Clinical Genomics Laboratory, Fondazione IRCCS Casa Sollievo della Sofferenza, Viale Cappuccini, snc, 71013 S. Giovanni Rotondo (FG), Italy.
- Sara Lazzari
- Department of Experimental Medicine, Sapienza University of Rome, Viale Regina Elena, 324, 00161 Rome, Italy.
- Manuel Mangoni
- Department of Experimental Medicine, Sapienza University of Rome, Viale Regina Elena, 324, 00161 Rome, Italy; Bioinformatics Laboratory, Fondazione IRCCS Casa Sollievo della Sofferenza, Viale Cappuccini, snc, 71013 S. Giovanni Rotondo (FG), Italy.
- Alice Traversa
- Department of Experimental Medicine, Sapienza University of Rome, Viale Regina Elena, 324, 00161 Rome, Italy; Dipartimento di Scienze della Vita, della Salute e delle Professioni Sanitarie, Università degli Studi "Link Campus University", Via del Casale di San Pio V 44, 00165 Roma, Italy.
- Tommaso Mazza
- Bioinformatics Laboratory, Fondazione IRCCS Casa Sollievo della Sofferenza, Viale Cappuccini, snc, 71013 S. Giovanni Rotondo (FG), Italy.
- Chiara Parisi
- Institute of Biochemistry and Cell Biology, CNR-National Research Council, Via Ercole Ramarini, 32, 00015 Monterotondo Scalo (RM), Italy.
- Viviana Caputo
- Department of Experimental Medicine, Sapienza University of Rome, Viale Regina Elena, 324, 00161 Rome, Italy.
4
Jin W, Xia Y, Thela SR, Liu Y, Chen L. In silico generation and augmentation of regulatory variants from massively parallel reporter assay using conditional variational autoencoder. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.25.600715. [PMID: 38979263 PMCID: PMC11230389 DOI: 10.1101/2024.06.25.600715] [Indexed: 07/10/2024]
Abstract
Predicting the functional consequences of genetic variants in non-coding regions is a challenging problem. Massively parallel reporter assays (MPRAs), an in vitro high-throughput method, can simultaneously test thousands of variants by evaluating the existence of allele-specific regulatory activity. Nevertheless, the labelled variants identified by MPRAs, which show differential allelic regulatory effects on gene expression, are usually limited to the scale of hundreds, limiting their potential to be used as a training set for robust genome-wide prediction. To address this limitation, we propose a deep generative model, MpraVAE, to generate labelled variants in silico and augment the training sample size. By benchmarking on several MPRA datasets, we demonstrate that MpraVAE significantly improves the prediction performance for MPRA regulatory variants compared to the baseline method, conventional data augmentation approaches, and existing variant scoring methods. Taking autoimmune diseases as one example, we apply MpraVAE to perform a genome-wide prediction of regulatory variants and find that predicted regulatory variants are more enriched than background variants in enhancers, active histone marks, open chromatin regions in immune-related cell types, and chromatin states associated with promoter and enhancer activity and with binding sites of cMyC and Pol II that regulate gene expression. Importantly, predicted regulatory variants are found to link immune-related genes by leveraging chromatin loops and accessible chromatin, demonstrating the importance of MpraVAE in genetic and gene discovery for complex traits.
Affiliation(s)
- Weijia Jin
- Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
- Yi Xia
- Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
- Sai Ritesh Thela
- Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
- Yunlong Liu
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Li Chen
- Department of Biostatistics, University of Florida, Gainesville, FL, 32603, USA
5
Hao J, Huang C, Zhao W, Zhao L, Hu X, Zhang W, Guo L, Dou X, Jin T, Hu M. Association of NID2 SNPs with Glioma Risk and Prognosis in the Chinese Population. Neuromolecular Med 2024; 26:27. [PMID: 38935278 DOI: 10.1007/s12017-024-08795-0] [Received: 07/24/2023] [Accepted: 10/05/2023] [Indexed: 06/28/2024]
Abstract
Glioma is the most common primary intracranial tumor with high mortality and poor prognosis. The purpose of this study was to investigate how single-nucleotide polymorphisms (SNPs) of the NID2 gene affect glioma risk and prognosis. Four candidate SNPs of NID2 in 529 glioma patients and 478 healthy controls were successfully genotyped by Agena MassARRAY mass spectrometer. Logistic regression was utilized to assess the associations between NID2 SNPs and glioma risk under different genetic models. Furthermore, the relationship between risk-related SNPs in NID2 and the prognosis of glioma patients was explored through Kaplan-Meier (KM) survival curve and Cox proportional hazard regression analysis. The results showed that rs11846847 (OR 1.24, p = 0.017) and rs1874569 (OR 1.22, p = 0.026) were significantly associated with an increased risk of glioma, and rs11846847 also had a risk-increasing effect on glioma in participants ≤ 40 years old. The interaction model of rs11846847 and rs1874569 could be more suitable for forecasting glioma risk. We also discovered a significant association between rs1874569 and poor prognosis in glioma patients (HR 1.32, p = 0.039) and especially CC genotype was relevant to shorter overall survival (OS) and progression-free survival (PFS) in patients with high-grade glioma. Additionally, the study demonstrated that gross total resection or chemotherapy improve glioma prognosis in the Chinese Han population. This study is the first to provide evidence for the association of NID2 SNPs with glioma risk and prognosis, suggesting that NID2 variants might be potential factors for glioma.
Affiliation(s)
- Jie Hao
- College of Life Sciences, Northwest University, Taibai Campus, No. 229, Taibai North Road, Beilin District, Xi'an, 710069, Shaanxi, China
- Provincial Key Laboratory of Biotechnology of Shaanxi Province, Northwest University, Xi'an, Shaanxi, China
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Northwest University, Xi'an, Shaanxi, China
- Congmei Huang
- Department of Gynaecology, The Second Affiliated Hospital of Hainan Medical University, Haikou, Hainan, China
- Weiwei Zhao
- College of Life Sciences, Northwest University, Taibai Campus, No. 229, Taibai North Road, Beilin District, Xi'an, 710069, Shaanxi, China
- Provincial Key Laboratory of Biotechnology of Shaanxi Province, Northwest University, Xi'an, Shaanxi, China
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Northwest University, Xi'an, Shaanxi, China
- Lin Zhao
- College of Life Sciences, Northwest University, Taibai Campus, No. 229, Taibai North Road, Beilin District, Xi'an, 710069, Shaanxi, China
- Xiuxia Hu
- College of Life Sciences, Northwest University, Taibai Campus, No. 229, Taibai North Road, Beilin District, Xi'an, 710069, Shaanxi, China
- Provincial Key Laboratory of Biotechnology of Shaanxi Province, Northwest University, Xi'an, Shaanxi, China
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Northwest University, Xi'an, Shaanxi, China
- WenJie Zhang
- College of Life Sciences, Northwest University, Taibai Campus, No. 229, Taibai North Road, Beilin District, Xi'an, 710069, Shaanxi, China
- Provincial Key Laboratory of Biotechnology of Shaanxi Province, Northwest University, Xi'an, Shaanxi, China
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Northwest University, Xi'an, Shaanxi, China
- Le Guo
- College of Life Sciences, Northwest University, Taibai Campus, No. 229, Taibai North Road, Beilin District, Xi'an, 710069, Shaanxi, China
- Provincial Key Laboratory of Biotechnology of Shaanxi Province, Northwest University, Xi'an, Shaanxi, China
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Northwest University, Xi'an, Shaanxi, China
- Xia Dou
- College of Life Sciences, Northwest University, Taibai Campus, No. 229, Taibai North Road, Beilin District, Xi'an, 710069, Shaanxi, China
- Provincial Key Laboratory of Biotechnology of Shaanxi Province, Northwest University, Xi'an, Shaanxi, China
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Northwest University, Xi'an, Shaanxi, China
- Tianbo Jin
- College of Life Sciences, Northwest University, Taibai Campus, No. 229, Taibai North Road, Beilin District, Xi'an, 710069, Shaanxi, China.
- Provincial Key Laboratory of Biotechnology of Shaanxi Province, Northwest University, Xi'an, Shaanxi, China.
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Northwest University, Xi'an, Shaanxi, China.
- Mingjun Hu
- College of Life Sciences, Northwest University, Taibai Campus, No. 229, Taibai North Road, Beilin District, Xi'an, 710069, Shaanxi, China.
- Provincial Key Laboratory of Biotechnology of Shaanxi Province, Northwest University, Xi'an, Shaanxi, China.
- Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Northwest University, Xi'an, Shaanxi, China.
- Department of Neurosurgery, Xi'an Changan District Hospital, Xi'an, Shaanxi, China.
6
Kim Y, Jeong M, Koh IG, Kim C, Lee H, Kim JH, Yurko R, Kim IB, Park J, Werling DM, Sanders SJ, An JY. CWAS-Plus: estimating category-wide association of rare noncoding variation from whole-genome sequencing data with cell-type-specific functional data. Brief Bioinform 2024; 25:bbae323. [PMID: 38966948 PMCID: PMC11224609 DOI: 10.1093/bib/bbae323] [Received: 03/13/2024] [Revised: 06/13/2024] [Accepted: 06/18/2024] [Indexed: 07/06/2024] Open
Abstract
Variants in cis-regulatory elements link the noncoding genome to human pathology; however, detailed analytic tools for understanding the association between cell-level brain pathology and noncoding variants are lacking. CWAS-Plus, adapted from a Python package for category-wide association testing (CWAS), enhances noncoding variant analysis by integrating both whole-genome sequencing (WGS) and user-provided functional data. With simplified parameter settings and an efficient multiple testing correction method, CWAS-Plus conducts the CWAS workflow 50 times faster than CWAS, making it more accessible and user-friendly for researchers. Here, we used a single-nuclei assay for transposase-accessible chromatin with sequencing to facilitate CWAS-guided noncoding variant analysis at cell-type-specific enhancers and promoters. Examining autism spectrum disorder WGS data (n = 7280), CWAS-Plus identified noncoding de novo variant associations in transcription factor binding sites within conserved loci. Independently, in Alzheimer's disease WGS data (n = 1087), CWAS-Plus detected rare noncoding variant associations in microglia-specific regulatory elements. These findings highlight CWAS-Plus's utility in genomic disorders and scalability for processing large-scale WGS data and in multiple-testing corrections. CWAS-Plus and its user manual are available at https://github.com/joonan-lab/cwas/ and https://cwas-plus.readthedocs.io/en/latest/, respectively.
Affiliation(s)
- Yujin Kim
- Department of Integrated Biomedical and Life Science, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- Minwoo Jeong
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- In Gyeong Koh
- Department of Integrated Biomedical and Life Science, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- Chanhee Kim
- Department of Integrated Biomedical and Life Science, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- Hyeji Lee
- Department of Integrated Biomedical and Life Science, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- Jae Hyun Kim
- Department of Integrated Biomedical and Life Science, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- Ronald Yurko
- Department of Statistics and Data Science, Carnegie Mellon University, 5000 Forbes Avenue, Squirrel Hill North, Pittsburgh, PA 15213, United States
- Il Bin Kim
- Department of Psychiatry, CHA Gangnam Medical Center, CHA University School of Medicine, 566 Nonhyon-ro, Gangnam-gu, Seoul 06135, Republic of Korea
- Jeongbin Park
- School of Biomedical Convergence Engineering, Pusan National University, 49 Busandaehak-ro, Mulgeum-eup, Yangsan-si, Gyeongsangnam-do, 50612, Republic of Korea
- Donna M Werling
- Laboratory of Genetics, University of Wisconsin-Madison, 425-g Henry Mall, Madison, WI 53706, United States
- Stephan J Sanders
- Department of Paediatrics, Institute of Developmental and Regenerative Medicine, University of Oxford, Old Road Campus, Roosevelt Dr, Headington, Oxford OX3 7TY, United Kingdom
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, 1651 4th Street, San Francisco, CA 94158, United States
- Joon-Yong An
- Department of Integrated Biomedical and Life Science, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, 145 Anam-ro, Seongbuk-ku, Seoul 02841, Republic of Korea
7
Kim Y, Jeong M, Koh IG, Kim C, Lee H, Kim JH, Yurko R, Kim IB, Park J, Werling DM, Sanders SJ, An JY. CWAS-Plus: Estimating category-wide association of rare noncoding variation from whole-genome sequencing data with cell-type-specific functional data. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.04.15.24305828. [PMID: 38699372 PMCID: PMC11065022 DOI: 10.1101/2024.04.15.24305828] [Indexed: 05/05/2024]
Abstract
Variants in cis-regulatory elements link the noncoding genome to human brain pathology; however, detailed analytic tools for understanding the association between cell-level brain pathology and noncoding variants are lacking. CWAS-Plus, adapted from a Python package for category-wide association testing (CWAS), employs both whole-genome sequencing and user-provided functional data to enhance noncoding variant analysis, with a faster and more efficient execution of the CWAS workflow. Here, we used a single-nuclei assay for transposase-accessible chromatin with sequencing to facilitate CWAS-guided noncoding variant analysis at cell-type-specific enhancers and promoters. Examining autism spectrum disorder whole-genome sequencing data (n = 7,280), CWAS-Plus identified noncoding de novo variant associations in transcription factor binding sites within conserved loci. Independently, in Alzheimer's disease whole-genome sequencing data (n = 1,087), CWAS-Plus detected rare noncoding variant associations in microglia-specific regulatory elements. These findings highlight CWAS-Plus's utility in genomic disorders and scalability for processing large-scale whole-genome sequencing data and in multiple-testing corrections. CWAS-Plus and its user manual are available at https://github.com/joonan-lab/cwas/ and https://cwas-plus.readthedocs.io/en/latest/, respectively.
Affiliation(s)
- Yujin Kim
- Department of Integrated Biomedical and Life Science, Korea University, Seoul, 02841, Republic of Korea
- L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, Seoul, 02841, Republic of Korea
- Minwoo Jeong
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul, 02841, Republic of Korea
- In Gyeong Koh
- Department of Integrated Biomedical and Life Science, Korea University, Seoul, 02841, Republic of Korea
- L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, Seoul, 02841, Republic of Korea
- Chanhee Kim
- Department of Integrated Biomedical and Life Science, Korea University, Seoul, 02841, Republic of Korea
- L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, Seoul, 02841, Republic of Korea
- Hyeji Lee
- Department of Integrated Biomedical and Life Science, Korea University, Seoul, 02841, Republic of Korea
- L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, Seoul, 02841, Republic of Korea
- Jae Hyun Kim
- Department of Integrated Biomedical and Life Science, Korea University, Seoul, 02841, Republic of Korea
- L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, Seoul, 02841, Republic of Korea
- Ronald Yurko
- Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Il Bin Kim
- Department of Psychiatry, CHA Gangnam Medical Center, CHA University School of Medicine, Seoul, 06135, Republic of Korea
- Jeongbin Park
- School of Biomedical Convergence Engineering, Pusan National University, Busan, 50612, Republic of Korea
- Donna M. Werling
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Stephan J. Sanders
- Institute of Developmental and Regenerative Medicine, Department of Paediatrics, University of Oxford, Oxford, OX3 7TY, UK
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, CA 94158, USA
- Joon-Yong An
- Department of Integrated Biomedical and Life Science, Korea University, Seoul, 02841, Republic of Korea
- L-HOPE Program for Community-Based Total Learning Health Systems, Korea University, Seoul, 02841, Republic of Korea
- School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul, 02841, Republic of Korea
8
Ponomarenko I, Pasenov K, Churnosova M, Sorokina I, Aristova I, Churnosov V, Ponomarenko M, Reshetnikova Y, Reshetnikov E, Churnosov M. Obesity-Dependent Association of the rs10454142 PPP1R21 with Breast Cancer. Biomedicines 2024; 12:818. [PMID: 38672173 PMCID: PMC11048332 DOI: 10.3390/biomedicines12040818] [Received: 03/05/2024] [Revised: 03/30/2024] [Accepted: 04/02/2024] [Indexed: 04/28/2024] Open
Abstract
The purpose of this work was to find a link between the breast cancer (BC)-risk effects of sex hormone-binding globulin (SHBG)-associated polymorphisms and obesity. The study was conducted on a sample of 1498 women (358 BC; 1140 controls) who, depending on the presence/absence of obesity, were divided into two groups: obese (119 BC; 253 controls) and non-obese (239 BC; 887 controls). Genotyping of nine SHBG-associated single nucleotide polymorphisms (SNPs; rs17496332 PRMT6, rs780093 GCKR, rs10454142 PPP1R21, rs3779195 BAIAP2L1, rs440837 ZBTB10, rs7910927 JMJD1C, rs4149056 SLCO1B1, rs8023580 NR2F2, and rs12150660 SHBG) was executed, and the BC-risk impact of these loci was analyzed by logistic regression separately in each group of obese/non-obese women. We found that the BC-risk effect of rs10454142 PPP1R21, a polymorphism correlated with SHBG level in GWAS, depends on the presence/absence of obesity. The SHBG-lowering allele C of rs10454142 PPP1R21 has a risk value for BC in obese women (allelic model: C vs. T, OR = 1.52, 95% CI = 1.10-2.11, p_perm = 0.013; additive model: CC vs. TC vs. TT, OR = 1.71, 95% CI = 1.15-2.62, p_perm = 0.011; dominant model: CC + TC vs. TT, OR = 1.95, 95% CI = 1.13-3.37, p_perm = 0.017) and is not associated with the disease in women without obesity. SNP rs10454142 PPP1R21 and 10 proxy SNPs have adipose-specific regulatory effects (epigenetic modifications of promoters/enhancers, DNA interaction with 51 transcription factors, eQTL/sQTL effects on five genes (PPP1R21, RP11-460M2.1, GTF2A1L, STON1-GTF2A1L, and STON1), etc.), can be "likely cancer driver" SNPs, and are involved in cancer-significant pathways. In conclusion, our study detected an obesity-dependent association of the rs10454142 PPP1R21 with BC in women.
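For readers unfamiliar with how an allelic-model odds ratio and its 95% confidence interval are typically read, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2x2 allele-count table. The counts are hypothetical, since the abstract gives no allele counts. This is also not the authors' procedure: they report logistic regression with permutation p-values, so the sketch illustrates the quantity only.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: risk-allele count in cases     b: other-allele count in cases
    c: risk-allele count in controls  d: other-allele count in controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, chosen only for illustration.
or_value, ci_low, ci_high = odds_ratio_wald_ci(a=60, b=40, c=50, d=50)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1, as in the obese-group results quoted above, indicates an association at the chosen confidence level; an interval that spans 1 does not.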
Affiliation(s)
- Mikhail Churnosov
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia; (I.P.); (K.P.); (M.C.); (I.S.); (I.A.); (V.C.); (M.P.); (Y.R.); (E.R.)
9
Nakamura T, Ueda J, Mizuno S, Honda K, Kazuno AA, Yamamoto H, Hara T, Takata A. Topologically associating domains define the impact of de novo promoter variants on autism spectrum disorder risk. CELL GENOMICS 2024; 4:100488. [PMID: 38280381 PMCID: PMC10879036 DOI: 10.1016/j.xgen.2024.100488] [Received: 04/14/2023] [Revised: 08/24/2023] [Accepted: 01/02/2024] [Indexed: 01/29/2024]
Abstract
Whole-genome sequencing (WGS) studies of autism spectrum disorder (ASD) have demonstrated the roles of rare promoter de novo variants (DNVs). However, most promoter DNVs in ASD are not located immediately upstream of known ASD genes. In this study analyzing WGS data of 5,044 ASD probands, 4,095 unaffected siblings, and their parents, we show that promoter DNVs within topologically associating domains (TADs) containing ASD genes are significantly and specifically associated with ASD. An analysis considering TADs as functional units identified specific TADs enriched for promoter DNVs in ASD and indicated that common variants in these regions also confer ASD heritability. Experimental validation using human induced pluripotent stem cells (iPSCs) showed that likely deleterious promoter DNVs in ASD can influence multiple genes within the same TAD, resulting in overall dysregulation of ASD-associated genes. These results highlight the importance of TADs and gene-regulatory mechanisms in better understanding the genetic architecture of ASD.
Affiliation(s)
- Takumi Nakamura
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Junko Ueda
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan.
- Shota Mizuno
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Kurara Honda
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- An-A Kazuno
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Hirona Yamamoto
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan; Department of Neuropsychiatry, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8654, Japan
| | - Tomonori Hara
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan; Department of Organ Anatomy, Tohoku University Graduate School of Medicine, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan
| | - Atsushi Takata
- Laboratory for Molecular Pathology of Psychiatric Disorders, RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan; Research Institute for Diseases of Old Age, Juntendo University Graduate School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo 113-8421, Japan
| |
|
10
|
Ponomarenko I, Pasenov K, Churnosova M, Sorokina I, Aristova I, Churnosov V, Ponomarenko M, Reshetnikov E, Churnosov M. Sex-Hormone-Binding Globulin Gene Polymorphisms and Breast Cancer Risk in Caucasian Women of Russia. Int J Mol Sci 2024; 25:2182. [PMID: 38396861 PMCID: PMC10888713 DOI: 10.3390/ijms25042182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 02/07/2024] [Accepted: 02/09/2024] [Indexed: 02/25/2024] Open
Abstract
In this work, we assessed the associations of SNPs linked to sex-hormone-binding globulin (SHBG) levels in genome-wide association studies (GWAS) with the risk of breast cancer (BC) in a cohort of Caucasian women of Russia. The study was performed on a sample of 1498 women (358 BC patients and 1140 control (non-BC) subjects). Nine polymorphisms correlated with SHBG in previous GWAS were genotyped: rs780093 GCKR, rs17496332 PRMT6, rs3779195 BAIAP2L1, rs10454142 PPP1R21, rs7910927 JMJD1C, rs4149056 SLCO1B1, rs440837 ZBTB10, rs12150660 SHBG, and rs8023580 NR2F2. The BC risk effects of allelic and non-allelic interactions of SHBG-linked gene SNPs were detected by regression analysis. The genetic risk factor for developing BC is the SHBG-lowering allele variant C of rs10454142 PPP1R21 (additive genetic model: OR = 1.31; 95% CI = 1.08-1.65; p_perm = 0.024; power = 85.26%), which accounts for 0.32% of the cancer variance. Eight of the nine studied SHBG-related SNPs were involved in cancer susceptibility as part of nine different non-allelic gene interaction models; the greatest contributions were made by rs10454142 PPP1R21 (included in all nine models, 100%) and four more SNPs: rs7910927 JMJD1C (five models, 55.56%), rs17496332 PRMT6 (four models, 44.44%), rs780093 GCKR (four models, 44.44%), and rs440837 ZBTB10 (four models, 44.44%). For SHBG-related loci, pronounced functionality in the organism (including breast, liver, fibroblasts, etc.) was predicted in silico, with a direct relationship through many pathways to cancer pathophysiology. In conclusion, our results demonstrate the involvement of polymorphisms in SHBG-correlated genes in BC risk in Caucasian women in Russia.
Affiliation(s)
| | - Mikhail Churnosov
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia; (I.P.); (K.P.); (M.C.); (I.S.); (I.A.); (V.C.); (M.P.); (E.R.)
| |
|
11
|
Wang Z, Zhao G, Zhu Z, Wang Y, Xiang X, Zhang S, Luo T, Zhou Q, Qiu J, Tang B, Xia K, Li B, Li J. VarCards2: an integrated genetic and clinical database for ACMG-AMP variant-interpretation guidelines in the human whole genome. Nucleic Acids Res 2024; 52:D1478-D1489. [PMID: 37956311 PMCID: PMC10767961 DOI: 10.1093/nar/gkad1061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 10/21/2023] [Accepted: 10/25/2023] [Indexed: 11/15/2023] Open
Abstract
VarCards, an online database, combines comprehensive variant- and gene-level annotation data to streamline genetic counselling for coding variants. Recognising the increasing clinical relevance of non-coding variations, there has been an accelerated development of bioinformatics tools dedicated to interpreting non-coding variations, including single-nucleotide variants and copy number variations. Regrettably, most tools remain as either locally installed databases or command-line tools dispersed across diverse online platforms. Such a landscape poses inconveniences and challenges for genetic counsellors seeking to utilise these resources without advanced bioinformatics expertise. Consequently, we developed VarCards2, which incorporates nearly nine billion artificially generated single-nucleotide variants (including those from mitochondrial DNA) and compiles vital annotation information for genetic counselling based on ACMG-AMP variant-interpretation guidelines. These annotations include (I) functional effects; (II) minor allele frequencies; (III) comprehensive function and pathogenicity predictions covering all potential variants, such as non-synonymous substitutions, non-canonical splicing variants, and non-coding variations and (IV) gene-level information. Furthermore, VarCards2 incorporates 368 820 266 documented short insertions and deletions and 2 773 555 documented copy number variations, complemented by their corresponding annotation and prediction tools. In conclusion, VarCards2, by integrating over 150 variant- and gene-level annotation sources, significantly enhances the efficiency of genetic counselling and can be freely accessed at http://www.genemed.tech/varcards2/.
Affiliation(s)
- Zheng Wang
- National Clinical Research Center for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Hunan Key Laboratory of Molecular Precision Medicine, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Guihu Zhao
- National Clinical Research Center for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Bioinformatics Center, Furong Laboratory & Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Zhaopo Zhu
- Center for Medical Genetics & Hunan Key Laboratory, School of Life Sciences, Central South University, Changsha, Hunan 410008, China
| | - Yijing Wang
- National Clinical Research Center for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Bioinformatics Center, Furong Laboratory & Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Xudong Xiang
- National Clinical Research Center for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Shiyu Zhang
- Xiangya School of Medicine, Central South University, Changsha, Hunan 410013, China
| | - Tengfei Luo
- Center for Medical Genetics & Hunan Key Laboratory, School of Life Sciences, Central South University, Changsha, Hunan 410008, China
| | - Qiao Zhou
- National Clinical Research Center for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Bioinformatics Center, Furong Laboratory & Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Jian Qiu
- National Clinical Research Center for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Hunan Key Laboratory of Molecular Precision Medicine, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Beisha Tang
- National Clinical Research Center for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Department of Neurology, & Multi-Omics Research Center for Brain Disorders, The First Affiliated Hospital, University of South China, Hengyang, Hunan, China
| | - Kun Xia
- Center for Medical Genetics & Hunan Key Laboratory, School of Life Sciences, Central South University, Changsha, Hunan 410008, China
| | - Bin Li
- National Clinical Research Center for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Bioinformatics Center, Furong Laboratory & Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Jinchen Li
- National Clinical Research Center for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Center for Medical Genetics & Hunan Key Laboratory, School of Life Sciences, Central South University, Changsha, Hunan 410008, China
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
- Bioinformatics Center, Furong Laboratory & Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| |
|
12
|
Shirvanizadeh N, Vihinen M. VariBench, new variation benchmark categories and data sets. FRONTIERS IN BIOINFORMATICS 2023; 3:1248732. [PMID: 37795169 PMCID: PMC10546188 DOI: 10.3389/fbinf.2023.1248732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 09/08/2023] [Indexed: 10/06/2023] Open
Affiliation(s)
| | - Mauno Vihinen
- Department of Experimental Medical Science, Lund University, Lund, Sweden
| |
|
13
|
Wang Z, Zhao G, Li B, Fang Z, Chen Q, Wang X, Luo T, Wang Y, Zhou Q, Li K, Xia L, Zhang Y, Zhou X, Pan H, Zhao Y, Wang Y, Wang L, Guo J, Tang B, Xia K, Li J. Performance Comparison of Computational Methods for the Prediction of the Function and Pathogenicity of Non-coding Variants. GENOMICS, PROTEOMICS & BIOINFORMATICS 2023; 21:649-661. [PMID: 35272052 PMCID: PMC10787016 DOI: 10.1016/j.gpb.2022.02.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/26/2021] [Revised: 12/28/2021] [Accepted: 02/27/2022] [Indexed: 06/14/2023]
Abstract
Non-coding variants in the human genome significantly influence human traits and complex diseases via their regulation and modification effects. Hence, an increasing number of computational methods have been developed to predict the effects of variants in human non-coding sequences. However, it is difficult for inexperienced users to select appropriate computational methods from the dozens available. To address this issue, we assessed 12 performance metrics of 24 methods on four independent non-coding variant benchmark datasets: (1) rare germline variants from clinically relevant sequence variants (ClinVar), (2) rare somatic variants from the Catalogue Of Somatic Mutations In Cancer (COSMIC), (3) common regulatory variants from curated expression quantitative trait locus (eQTL) data, and (4) disease-associated common variants from curated genome-wide association studies (GWAS). All 24 tested methods performed differently under various conditions, indicating varying strengths and weaknesses under different scenarios. Importantly, the performance of existing methods was acceptable for rare germline variants from ClinVar, with an area under the receiver operating characteristic curve (AUROC) of 0.4481-0.8033, and poor for rare somatic variants from COSMIC (AUROC = 0.4984-0.7131), common regulatory variants from curated eQTL data (AUROC = 0.4837-0.6472), and disease-associated common variants from curated GWAS (AUROC = 0.4766-0.5188). We also compared the prediction performance of the 24 methods for non-coding de novo mutations in autism spectrum disorder, and found that the combined annotation-dependent depletion (CADD) and context-dependent tolerance score (CDTS) methods showed better performance. In summary, we assessed the performance of 24 computational methods under diverse scenarios, providing preliminary advice for proper tool selection and guiding the development of new techniques in interpreting non-coding variants.
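As a reading aid for the AUROC ranges quoted in this abstract, the following is a minimal sketch of how an AUROC can be computed for one scoring method on a labelled benchmark. It is not the authors' evaluation code; the function name `auroc`, the toy labels, and the toy scores are hypothetical, and ties are assumed to count as half a win.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random negative.

    labels: 1 for a positive (e.g. pathogenic) variant, 0 for a negative one.
    scores: the method's prediction for each variant (higher = more likely positive).
    Ties between a positive and a negative count as half a win (an assumption).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:          # compare every positive against every negative
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy benchmark: two positives, two negatives.
    toy_labels = [1, 1, 0, 0]
    toy_scores = [0.9, 0.4, 0.5, 0.1]
    print(auroc(toy_labels, toy_scores))
```

Under this definition, a value near 0.5 means the score ranks positives above negatives about as often as chance would, which is how the lower ends of the reported ranges can be read.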
Affiliation(s)
- Zheng Wang
- National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Guihu Zhao
- National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha 410008, China; Department of Neurology, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Bin Li
- National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha 410008, China; Department of Neurology, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Zhenghuan Fang
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha 410008, China
| | - Qian Chen
- National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Xiaomeng Wang
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha 410008, China
| | - Tengfei Luo
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha 410008, China
| | - Yijing Wang
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha 410008, China
| | - Qiao Zhou
- National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Kuokuo Li
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha 410008, China
| | - Lu Xia
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha 410008, China
| | - Yi Zhang
- National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Xun Zhou
- National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Hongxu Pan
- Department of Neurology, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Yuwen Zhao
- Department of Neurology, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Yige Wang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Lin Wang
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha 410008, China; Reproductive Medicine Center, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Jifeng Guo
- National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha 410008, China; Department of Neurology, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Beisha Tang
- National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha 410008, China; Department of Neurology, Xiangya Hospital, Central South University, Changsha 410008, China
| | - Kun Xia
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha 410008, China
| | - Jinchen Li
- National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha 410008, China; Department of Neurology, Xiangya Hospital, Central South University, Changsha 410008, China; Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha 410008, China
| |
|
14
|
Lu H, Ma L, Quan C, Li L, Lu Y, Zhou G, Zhang C. RegVar: Tissue-specific Prioritization of Non-coding Regulatory Variants. GENOMICS, PROTEOMICS & BIOINFORMATICS 2023; 21:385-395. [PMID: 34973416 PMCID: PMC10626172 DOI: 10.1016/j.gpb.2021.08.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/11/2021] [Revised: 06/11/2021] [Accepted: 09/27/2021] [Indexed: 06/14/2023]
Abstract
Non-coding genomic variants constitute the majority of trait-associated genome variations; however, the identification of functional non-coding variants is still a challenge in human genetics, and a method for systematically assessing the impact of regulatory variants on gene expression and linking these regulatory variants to potential target genes is still lacking. Here, we introduce a deep neural network (DNN)-based computational framework, RegVar, which can accurately predict the tissue-specific impact of non-coding regulatory variants on target genes. We show that by robustly learning the genomic characteristics of massive variant-gene expression associations in a variety of human tissues, RegVar vastly surpasses all current non-coding variant prioritization methods in predicting regulatory variants under different circumstances. The unique features of RegVar make it an excellent framework for assessing the regulatory impact of any variant on its putative target genes in a variety of tissues. RegVar is available as a web server at https://regvar.omic.tech/.
Affiliation(s)
- Hao Lu
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Luyu Ma
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Cheng Quan
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Lei Li
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Yiming Lu
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Gangqiao Zhou
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Chenggang Zhang
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| |
|
15
|
Flerlage JE, Myers JR, Maciaszek JL, Oak N, Rashkin SR, Hui Y, Wang YD, Chen W, Wu G, Chang TC, Hamilton K, Tithi SS, Goldin LR, Rotunno M, Caporaso N, Vogt A, Flamish D, Wyatt K, Liu J, Tucker M, Hahn CN, Brown AL, Scott HS, Mullighan C, Nichols KE, Metzger ML, McMaster ML, Yang JJ, Rampersaud E. Discovery of novel predisposing coding and noncoding variants in familial Hodgkin lymphoma. Blood 2023; 141:1293-1307. [PMID: 35977101 PMCID: PMC10082357 DOI: 10.1182/blood.2022016056] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/01/2022] [Revised: 07/12/2022] [Accepted: 08/02/2022] [Indexed: 11/20/2022] Open
Abstract
Familial aggregation of Hodgkin lymphoma (HL) has been demonstrated in large population studies, pointing to genetic predisposition to this hematological malignancy. To understand the genetic variants associated with the development of HL, we performed whole genome sequencing on 234 individuals with and without HL from 36 pedigrees that had 2 or more first-degree relatives with HL. Our pedigree selection criteria also required at least 1 affected individual aged <21 years, with the median age at diagnosis of 21.98 years (3-55 years). Family-based segregation analysis was performed for the identification of coding and noncoding variants using linkage and filtering approaches. Using our tiered variant prioritization algorithm, we identified 44 HL-risk variants in 28 pedigrees, of which 33 are coding and 11 are noncoding. The top 4 recurrent risk variants are a coding variant in KDR (rs56302315), a 5' untranslated region variant in KLHDC8B (rs387906223), a noncoding variant in an intron of PAX5 (rs147081110), and another noncoding variant in an intron of GATA3 (rs3824666). A newly identified splice variant in KDR (c.3849-2A>C) was observed for 1 pedigree, and high-confidence stop-gain variants affecting IRF7 (p.W238*) and EEF2KMT (p.K116*) were also observed. Multiple truncating variants in POLR1E were found in 3 independent pedigrees as well. Whereas KDR and KLHDC8B have previously been reported, PAX5, GATA3, IRF7, EEF2KMT, and POLR1E represent novel observations. Although there may be environmental factors influencing lymphomagenesis, we observed segregation of candidate germline variants likely to predispose to HL in most of the pedigrees studied.
Affiliation(s)
- Jamie E. Flerlage
- Department of Oncology, St. Jude Children’s Research Hospital and the University of Tennessee Health Sciences Center, Memphis, TN
| | - Jason R. Myers
- Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, TN
| | - Jamie L. Maciaszek
- Department of Pathology, St. Jude Children’s Research Hospital, Memphis, TN
| | - Ninad Oak
- Department of Oncology, St. Jude Children’s Research Hospital and the University of Tennessee Health Sciences Center, Memphis, TN
| | - Sara R. Rashkin
- Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, TN
| | - Yawei Hui
- Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, TN
| | - Yong-Dong Wang
- Department of Cell and Molecular Biology, St. Jude Children's Research Hospital, Memphis, TN
| | - Wenan Chen
- Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, TN
| | - Gang Wu
- Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, TN
| | - Ti-Cheng Chang
- Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, TN
| | - Kayla Hamilton
- Department of Oncology, St. Jude Children’s Research Hospital and the University of Tennessee Health Sciences Center, Memphis, TN
| | - Saima S. Tithi
- Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, TN
| | - Lynn R. Goldin
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD
| | - Melissa Rotunno
- Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, Bethesda, MD
| | - Neil Caporaso
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD
| | - Jia Liu
- Leidos Biomedical, Inc, Frederick, MD
| | - Margaret Tucker
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD
| | - Christopher N. Hahn
- Department of Genetics and Molecular Pathology, SA Pathology, Adelaide, SA, Australia
- Centre for Cancer Biology, SA Pathology and University of South Australia, Adelaide, SA, Australia
- Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia
| | - Anna L. Brown
- Department of Genetics and Molecular Pathology, SA Pathology, Adelaide, SA, Australia
- Centre for Cancer Biology, SA Pathology and University of South Australia, Adelaide, SA, Australia
- Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia
| | - Hamish S. Scott
- Department of Genetics and Molecular Pathology, SA Pathology, Adelaide, SA, Australia
- Centre for Cancer Biology, SA Pathology and University of South Australia, Adelaide, SA, Australia
- Adelaide Medical School, University of Adelaide, Adelaide, SA, Australia
- School of Biological Sciences, University of Adelaide, Adelaide, SA, Australia
| | - Charles Mullighan
- Department of Pathology, St. Jude Children’s Research Hospital, Memphis, TN
| | - Kim E. Nichols
- Department of Oncology, St. Jude Children’s Research Hospital and the University of Tennessee Health Sciences Center, Memphis, TN
| | - Monika L. Metzger
- Department of Oncology, St. Jude Children’s Research Hospital and the University of Tennessee Health Sciences Center, Memphis, TN
- Department of Global Pediatric Medicine, St. Jude Children’s Research Hospital, Memphis, TN
| | - Mary L. McMaster
- Department of Cell and Molecular Biology, St. Jude Children's Research Hospital, Memphis, TN
| | - Jun J. Yang
- Department of Oncology, St. Jude Children’s Research Hospital and the University of Tennessee Health Sciences Center, Memphis, TN
- Department of Pharmacy and Pharmaceutical Sciences, St. Jude Children’s Research Hospital, Memphis, TN
| | - Evadnie Rampersaud
- Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, TN
| |
|
16
|
An autoimmune pleiotropic SNP modulates IRF5 alternative promoter usage through ZBTB3-mediated chromatin looping. Nat Commun 2023; 14:1208. [PMID: 36869052 PMCID: PMC9984425 DOI: 10.1038/s41467-023-36897-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/17/2021] [Accepted: 02/22/2023] [Indexed: 03/05/2023] Open
Abstract
Genetic sharing is extensively observed for autoimmune diseases, but the causal variants and their underlying molecular mechanisms remain largely unknown. Through systematic investigation of autoimmune disease pleiotropic loci, we found most of these shared genetic effects are transmitted from regulatory code. We used an evidence-based strategy to functionally prioritize causal pleiotropic variants and identify their target genes. A top-ranked pleiotropic variant, rs4728142, yielded many lines of evidence as being causal. Mechanistically, the rs4728142-containing region interacts with the IRF5 alternative promoter in an allele-specific manner and orchestrates its upstream enhancer to regulate IRF5 alternative promoter usage through chromatin looping. A putative structural regulator, ZBTB3, mediates the allele-specific loop to promote IRF5-short transcript expression at the rs4728142 risk allele, resulting in IRF5 overactivation and M1 macrophage polarization. Together, our findings establish a causal mechanism between the regulatory variant and fine-scale molecular phenotype underlying the dysfunction of pleiotropic genes in human autoimmunity.
|
17
|
Babushkina NP, Kucher AN. Regulatory Potential of SNP Markers in Genes of DNA Repair Systems. Mol Biol 2023. [DOI: 10.1134/s002689332301003x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/03/2023]
|
18
|
Sobahy TM, Motwalli O, Alazmi M. AllelePred: A Simple Allele Frequencies Ensemble Predictor for Different Single Nucleotide Variants. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:796-801. [PMID: 35239491 DOI: 10.1109/tcbb.2022.3155659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
BACKGROUND & OBJECTIVE: Genomic medicine stands to be revolutionized by understanding single nucleotide variants (SNVs) and their expression in single-gene disorders (Mendelian diseases). Computational tools can play a vital role in the exploration of such variations and their pathogenicity. Consequently, we developed the ensemble prediction tool AllelePred to identify deleterious SNVs and disease-causative genes. RESULTS: The model utilizes different population genetics backgrounds and restrictive criteria for feature selection to help generate high-accuracy results. In comparison to other tools, such as Eigen, PROVEAN, and fathmm-MKL, our classifier achieves higher accuracy (98%), precision (96%), F1 score (93%), and coverage (100%) for different types of coding variants. The new method was also compared against a bioinformatics analytical workflow, which uses gnomAD overall AFs (less than 1%) and CADD (scaled C-score of at least 15). Furthermore, this research highlights the importance of genetic variant sharing and curation. We accumulated a list of highly probable deleterious variants and recommend further experimental validation before medical diagnostic usage. CONCLUSIONS: The ensemble prediction tool AllelePred enables increased accuracy in recognizing deleterious SNVs and the genetic determinants in real clinical data.
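The comparison workflow mentioned in this abstract (gnomAD overall allele frequency below 1% and a scaled CADD C-score of at least 15) can be sketched as a simple filter. This is an illustrative sketch only, not AllelePred or the authors' pipeline; the `Variant` record, its field names, the example values, and the choice to treat a variant absent from gnomAD as rare are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    variant_id: str             # hypothetical identifier, for illustration only
    gnomad_af: Optional[float]  # gnomAD overall allele frequency; None if not observed
    cadd_phred: float           # scaled CADD C-score


def passes_workflow(v: Variant, max_af: float = 0.01, min_cadd: float = 15.0) -> bool:
    """Keep variants with AF below 1% and a scaled C-score of at least 15.

    Assumption: a variant with no gnomAD record is treated as rare.
    """
    rare = v.gnomad_af is None or v.gnomad_af < max_af
    return rare and v.cadd_phred >= min_cadd


def prioritize(variants: List[Variant]) -> List[str]:
    """Return the identifiers of variants that pass both thresholds."""
    return [v.variant_id for v in variants if passes_workflow(v)]


if __name__ == "__main__":
    # Made-up example records; none of these values come from the paper.
    demo = [
        Variant("var_a", 0.0002, 27.3),  # rare and high score: kept
        Variant("var_b", 0.05, 30.0),    # too common: dropped
        Variant("var_c", None, 12.0),    # score below 15: dropped
        Variant("var_d", None, 15.0),    # absent from gnomAD, score at threshold: kept
    ]
    print(prioritize(demo))
```

A fixed-threshold filter like this is the kind of baseline an ensemble predictor is compared against; the thresholds are exposed as parameters so they can be varied.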
|
19
|
Schubach M, Nazaretyan L, Kircher M. The Regulatory Mendelian Mutation score for GRCh38. Gigascience 2022; 12:giad024. [PMID: 37083939 PMCID: PMC10120424 DOI: 10.1093/gigascience/giad024] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/02/2022] [Revised: 01/10/2023] [Accepted: 03/21/2023] [Indexed: 04/22/2023] Open
Abstract
BACKGROUND: Genome sequencing efforts for individuals with rare Mendelian disease have increased the research focus on the noncoding genome and the clinical need for methods that prioritize potentially disease-causal noncoding variants. Some tools for assessment of variant pathogenicity, as well as annotations, are not available for the current human genome build (GRCh38), whose adoption in databases, software, and pipelines has been slow. RESULTS: Here, we present an updated version of the Regulatory Mendelian Mutation (ReMM) score, retrained on features and variants derived from the GRCh38 genome build. Like its GRCh37 version, it achieves good performance on its highly imbalanced data. To improve accessibility and provide users with a toolbox to score their variant files and look up scores in the genome, we developed a website and API for easy score lookup. CONCLUSIONS: Scores of the GRCh38 genome build are highly correlated with the prior release, with a performance increase due to the better coverage of features. For prioritization of noncoding mutations in imbalanced datasets, the ReMM score performed much better than other variation scores. Prescored whole-genome files of the GRCh37 and GRCh38 genome builds are cited in the article and on the website; UCSC genome browser tracks and an API are available at https://remm.bihealth.org.
Affiliation(s)
- Max Schubach
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité–Universitätsmedizin Berlin, 10117 Berlin, Germany
| | - Lusiné Nazaretyan
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité–Universitätsmedizin Berlin, 10117 Berlin, Germany
| | - Martin Kircher
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité–Universitätsmedizin Berlin, 10117 Berlin, Germany
- Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck, 23562 Lübeck, Germany
| |
|
20
|
He Z, Liu L, Belloy ME, Le Guen Y, Sossin A, Liu X, Qi X, Ma S, Gyawali PK, Wyss-Coray T, Tang H, Sabatti C, Candès E, Greicius MD, Ionita-Laza I. GhostKnockoff inference empowers identification of putative causal variants in genome-wide association studies. Nat Commun 2022; 13:7209. [PMID: 36418338 PMCID: PMC9684164 DOI: 10.1038/s41467-022-34932-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2021] [Accepted: 11/09/2022] [Indexed: 11/27/2022] Open
Abstract
Recent advances in genome sequencing and imputation technologies provide an exciting opportunity to comprehensively study the contribution of genetic variants to complex phenotypes. However, our ability to translate genetic discoveries into mechanistic insights remains limited at this point. In this paper, we propose an efficient knockoff-based method, GhostKnockoff, for genome-wide association studies (GWAS) that leads to improved power and ability to prioritize putative causal variants relative to conventional GWAS approaches. The method requires only Z-scores from conventional GWAS and hence can be easily applied to enhance existing and future studies. The method can also be applied to meta-analysis of multiple GWAS allowing for arbitrary sample overlap. We demonstrate its performance using empirical simulations and two applications: (1) a meta-analysis for Alzheimer's disease comprising nine overlapping large-scale GWAS, whole-exome and whole-genome sequencing studies and (2) analysis of 1403 binary phenotypes from the UK Biobank data in 408,961 samples of European ancestry. Our results demonstrate that GhostKnockoff can identify putatively functional variants with weaker statistical effects that are missed by conventional association tests.
Affiliation(s)
- Zihuai He
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, 94305, USA.
- Quantitative Sciences Unit, Department of Medicine, Stanford University, Stanford, CA, 94305, USA.
| | - Linxi Liu
- Department of Statistics, University of Pittsburgh, Pittsburgh, PA, 15260, USA
| | - Michael E Belloy
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, 94305, USA
| | - Yann Le Guen
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, 94305, USA
- Institut du Cerveau - Paris Brain Institute - ICM, Paris, 75013, France
| | - Aaron Sossin
- Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
| | - Xiaoxia Liu
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, 94305, USA
| | - Xinran Qi
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, 94305, USA
| | - Shiyang Ma
- Department of Biostatistics, Columbia University, New York, NY, 10032, USA
| | - Prashnna K Gyawali
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, 94305, USA
| | - Tony Wyss-Coray
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, 94305, USA
| | - Hua Tang
- Department of Genetics, Stanford University, Stanford, CA, 94305, USA
| | - Chiara Sabatti
- Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
| | - Emmanuel Candès
- Department of Statistics, Stanford University, Stanford, CA, 94305, USA
- Department of Mathematics, Stanford University, Stanford, CA, 94305, USA
| | - Michael D Greicius
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, 94305, USA
| | | |
|
21
|
Lu F, Sossin A, Abell N, Montgomery SB, He Z. Deep learning-assisted genome-wide characterization of massively parallel reporter assays. Nucleic Acids Res 2022; 50:11442-11454. [PMID: 36350674 PMCID: PMC9723615 DOI: 10.1093/nar/gkac990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2021] [Revised: 10/04/2022] [Accepted: 10/19/2022] [Indexed: 11/10/2022]
Abstract
Massively parallel reporter assay (MPRA) is a high-throughput method that enables the study of the regulatory activities of tens of thousands of DNA oligonucleotides in a single experiment. While MPRA experiments have grown in popularity, their small sample sizes compared to the scale of the human genome limit our understanding of the regulatory effects they detect. To address this, we develop a deep learning model, MpraNet, to distinguish potential MPRA targets from the background genome. This model achieves high discriminative performance (AUROC = 0.85) at differentiating MPRA positives from a set of control variants that mimic the background genome when applied to the lymphoblastoid cell line. We observe that existing functional scores represent very distinct functional effects, and most of them fail to characterize the regulatory effect that MPRA detects. Using MpraNet, we predict potential MPRA functional variants across the genome and identify the distributions of MPRA effect relative to other characteristics of genetic variation, including allele frequency, alternative functional annotations specified by FAVOR, and phenome-wide associations. We also observed that the predicted MPRA positives are not uniformly distributed across the genome; instead, they are clumped together in active regions comprising 9.95% of the genome and inactive regions comprising 89.07% of the genome. Furthermore, we propose our model as a screen to filter MPRA experiment candidates at genome-wide scale, enabling future experiments to be more cost-efficient by increasing precision relative to that observed from previous MPRAs.
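As a reading aid for the AUROC figure quoted in this abstract: AUROC can be interpreted as the probability that a randomly chosen positive receives a higher score than a randomly chosen control. The sketch below computes it by direct pairwise comparison. The function name `auroc` and the scores are hypothetical and purely illustrative; they are not MpraNet outputs or code from the cited work.

```python
def auroc(pos_scores, neg_scores):
    """Fraction of (positive, control) pairs in which the positive
    scores higher; tied pairs count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model scores, for illustration only.
positives = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]

result = auroc(positives, controls)
print(f"AUROC = {result:.3f}")
```

The pairwise form is quadratic in the number of variants, so it suits small illustrations; rank-based implementations are the usual choice at genome scale.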
Affiliation(s)
| | | | - Nathan Abell
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Stephen B Montgomery
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Pathology, Stanford University, Stanford, CA 94305, USA
| | - Zihuai He
- To whom correspondence should be addressed. Tel: +1 718 869 4929;
| |
|
22
|
Van de Sompele S, Small KW, Cicekdal MB, Soriano VL, D'haene E, Shaya FS, Agemy S, Van der Snickt T, Rey AD, Rosseel T, Van Heetvelde M, Vergult S, Balikova I, Bergen AA, Boon CJF, De Zaeytijd J, Inglehearn CF, Kousal B, Leroy BP, Rivolta C, Vaclavik V, van den Ende J, van Schooneveld MJ, Gómez-Skarmeta JL, Tena JJ, Martinez-Morales JR, Liskova P, Vleminckx K, De Baere E. Multi-omics approach dissects cis-regulatory mechanisms underlying North Carolina macular dystrophy, a retinal enhanceropathy. Am J Hum Genet 2022; 109:2029-2048. [PMID: 36243009 PMCID: PMC9674966 DOI: 10.1016/j.ajhg.2022.09.013] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 05/26/2022] [Accepted: 09/28/2022] [Indexed: 01/26/2023]
Abstract
North Carolina macular dystrophy (NCMD) is a rare autosomal-dominant disease affecting macular development. The disease is caused by non-coding single-nucleotide variants (SNVs) in two hotspot regions near PRDM13 and by duplications in two distinct chromosomal loci, overlapping DNase I hypersensitive sites near either PRDM13 or IRX1. To unravel the mechanisms by which these variants cause disease, we first established a genome-wide multi-omics retinal database, RegRet. Integration of UMI-4C profiles we generated on adult human retina then allowed fine-mapping of the interactions of the PRDM13 and IRX1 promoters and the identification of eighteen candidate cis-regulatory elements (cCREs), the activity of which was investigated by luciferase and Xenopus enhancer assays. Next, luciferase assays showed that the non-coding SNVs located in the two hotspot regions of PRDM13 affect cCRE activity, including two NCMD-associated non-coding SNVs that we identified herein. Interestingly, the cCRE containing one of these SNVs was shown to interact with the PRDM13 promoter, demonstrated in vivo activity in Xenopus, and is active at the developmental stage when progenitor cells of the central retina exit mitosis, suggesting that this region is a PRDM13 enhancer. Finally, mining of single-cell transcriptional data of embryonic and adult retina revealed the highest expression of PRDM13 and IRX1 when amacrine cells start to synapse with retinal ganglion cells, supporting the hypothesis that altered PRDM13 or IRX1 expression impairs interactions between these cells during retinogenesis. Overall, this study provides insight into the cis-regulatory mechanisms of NCMD and supports that this condition is a retinal enhanceropathy.
Affiliation(s)
- Stijn Van de Sompele
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium; Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
| | - Kent W Small
- Macula and Retina Institute, Los Angeles and Glendale, California, USA
| | - Munevver Burcu Cicekdal
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium; Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium; Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium
| | - Víctor López Soriano
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium; Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
| | - Eva D'haene
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium; Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
| | - Fadi S Shaya
- Macula and Retina Institute, Los Angeles and Glendale, California, USA
| | - Steven Agemy
- Department of Ophthalmology, SUNY Downstate Medical Center University, Brooklyn, New York, USA
| | - Thijs Van der Snickt
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium; Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
| | - Alfredo Dueñas Rey
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium; Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
| | - Toon Rosseel
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium; Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
| | - Mattias Van Heetvelde
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium; Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
| | - Sarah Vergult
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium; Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
| | - Irina Balikova
- Department of Ophthalmology, University Hospitals Leuven, Leuven, Belgium
| | - Arthur A Bergen
- Department of Human Genetics, Amsterdam UMC, Academic Medical Center, 1105 AZ Amsterdam, The Netherlands; Queen Emma Centre of Precision Medicine, Amsterdam University Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
| | - Camiel J F Boon
- Department of Ophthalmology, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, The Netherlands; Department of Ophthalmology, Leiden University Medical Center, Leiden, The Netherlands
| | - Julie De Zaeytijd
- Department of Ophthalmology, Ghent University Hospital, Ghent, Belgium
| | - Chris F Inglehearn
- Division of Molecular Medicine, Leeds Institute of Medical Research, University of Leeds, Leeds, UK
| | - Bohdan Kousal
- Department of Ophthalmology, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic
| | - Bart P Leroy
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium; Department of Ophthalmology, Ghent University Hospital, Ghent, Belgium; Department of Head & Skin, Ghent University, Ghent, Belgium; Division of Ophthalmology & Center for Cellular & Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Carlo Rivolta
- Institute of Molecular and Clinical Ophthalmology Basel (IOB), Basel, Switzerland; Department of Ophthalmology, University of Basel, Basel, Switzerland; Department of Genetics and Genome Biology, University of Leicester, Leicester, UK
| | - Veronika Vaclavik
- University of Lausanne, Jules-Gonin Eye Hospital, Lausanne, Switzerland
| | | | - Mary J van Schooneveld
- Department of Ophthalmology, Amsterdam University Medical Centers, University of Amsterdam, Amsterdam, The Netherlands; Bartiméus, Diagnostic Center for Complex Visual Disorders, Zeist, The Netherlands
| | - José Luis Gómez-Skarmeta
- Centro Andaluz de Biología del Desarrollo, Consejo Superior de Investigaciones Científicas and Universidad Pablo de Olavide, Sevilla, Spain
| | - Juan J Tena
- Centro Andaluz de Biología del Desarrollo, Consejo Superior de Investigaciones Científicas and Universidad Pablo de Olavide, Sevilla, Spain
| | - Juan R Martinez-Morales
- Centro Andaluz de Biología del Desarrollo, Consejo Superior de Investigaciones Científicas and Universidad Pablo de Olavide, Sevilla, Spain
| | - Petra Liskova
- Department of Ophthalmology, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic; Department of Paediatrics and Inherited Metabolic Disorders, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic
| | - Kris Vleminckx
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium; Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium
| | - Elfride De Baere
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium; Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium.
| |
|
23
|
Exploration of Tools for the Interpretation of Human Non-Coding Variants. Int J Mol Sci 2022; 23:ijms232112977. [PMID: 36361767 PMCID: PMC9654743 DOI: 10.3390/ijms232112977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 10/17/2022] [Accepted: 10/23/2022] [Indexed: 02/01/2023]
Abstract
The advent of Whole Genome Sequencing (WGS) broadened the genetic variation detection range, revealing the presence of variants even in non-coding regions of the genome, which would have been missed using targeted approaches. One of the most challenging issues in WGS analysis regards the interpretation of annotated variants. This review focuses on tools suitable for the functional annotation of variants falling into non-coding regions. It couples the description of non-coding genomic areas with the results and performance of existing tools for a functional interpretation of the effect of variants in these regions. Tools were tested in a controlled genomic scenario, representing the ground-truth and allowing us to determine software performance.
|
24
|
Nayara Góes de Araújo J, Fernandes de Oliveira V, Bassani Borges J, Dagli-Hernandez C, da Silva Rodrigues Marçal E, Caroline Costa de Freitas R, Medeiros Bastos G, Marques Gonçalves R, Arpad Faludi A, Elim Jannes C, da Costa Pereira A, Dominguez Crespo Hirata R, Hiroyuki Hirata M, Ducati Luchessi A, Nogueira Silbiger V. In silico analysis of upstream variants in Brazilian patients with Familial Hypercholesterolemia. Gene X 2022; 849:146908. [PMID: 36167182 DOI: 10.1016/j.gene.2022.146908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 08/16/2022] [Accepted: 09/19/2022] [Indexed: 10/14/2022]
Abstract
Familial hypercholesterolemia (FH) is a prevalent autosomal genetic disease associated with increased risk of early cardiovascular events and death due to chronic exposure to very high levels of low-density lipoprotein cholesterol (LDL-c). Pathogenic variants in the coding regions of LDLR, APOB and PCSK9 account for most FH cases, and variants in non-coding regions may be involved in FH as well. Variants in the upstream region of LDLR, APOB and PCSK9 were screened by targeted next-generation sequencing and their effects were explored using in silico tools. Twenty-five patients without pathogenic variants in FH-related genes were selected. The 3 kb upstream regions of LDLR, APOB and PCSK9 were sequenced using the AmpliSeq (Illumina) and Miseq Reagent Nano Kit v2 (Illumina). Sequencing data were analyzed using variant discovery and functional annotation tools. Potentially regulatory variants were selected by integrating data from public databases, published data and a context-dependent regulatory prediction score. Thirty-four single nucleotide variants (SNVs) in upstream regions were identified (6 in LDLR, 15 in APOB, and 13 in PCSK9). Five SNVs were prioritized as potentially regulatory variants (rs934197, rs9282606, rs36218923, rs538300761, g.55038486A>G). APOB rs934197 was previously associated with an increased rate of transcription, which in silico analysis suggests could be due to reduced binding affinity of a transcriptional repressor. Our findings highlight the importance of variant screening outside of coding regions of all relevant genes. Further functional studies are necessary to confirm that prioritized variants could impact gene regulation and contribute to the FH phenotype.
Affiliation(s)
- Jéssica Nayara Góes de Araújo
- Northeast Biotechnology Network (RENORBIO), Graduate Program in Biotechnology, Federal University of Rio Grande do Norte, Natal 59078-900, Brazil
| | - Victor Fernandes de Oliveira
- Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of Sao Paulo, Sao Paulo 05508-000, Brazil
| | - Jéssica Bassani Borges
- Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of Sao Paulo, Sao Paulo 05508-000, Brazil; Laboratory of Molecular Research in Cardiology, Institute Dante Pazzanese of Cardiology, Sao Paulo, 04012-909, Brazil
| | - Carolina Dagli-Hernandez
- Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of Sao Paulo, Sao Paulo 05508-000, Brazil
| | | | - Renata Caroline Costa de Freitas
- Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of Sao Paulo, Sao Paulo 05508-000, Brazil
| | - Gisele Medeiros Bastos
- Laboratory of Molecular Research in Cardiology, Institute Dante Pazzanese of Cardiology, Sao Paulo, 04012-909, Brazil; Medical Clinic Division, Institute Dante Pazzanese of Cardiology, Sao Paulo 04012-909, Brazil
| | | | - André Arpad Faludi
- Medical Clinic Division, Institute Dante Pazzanese of Cardiology, Sao Paulo 04012-909, Brazil
| | - Cinthia Elim Jannes
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, University of Sao Paulo 05403-900, Brazil
| | - Alexandre da Costa Pereira
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, University of Sao Paulo 05403-900, Brazil
| | - Rosario Dominguez Crespo Hirata
- Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of Sao Paulo, Sao Paulo 05508-000, Brazil
| | - Mario Hiroyuki Hirata
- Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of Sao Paulo, Sao Paulo 05508-000, Brazil
| | - André Ducati Luchessi
- Northeast Biotechnology Network (RENORBIO), Graduate Program in Biotechnology, Federal University of Rio Grande do Norte, Natal 59078-900, Brazil; Department of Clinical and Toxicological Analyses, Federal University of Rio Grande do Norte, Natal 59012-570, Brazil
| | - Vivian Nogueira Silbiger
- Northeast Biotechnology Network (RENORBIO), Graduate Program in Biotechnology, Federal University of Rio Grande do Norte, Natal 59078-900, Brazil; Department of Clinical and Toxicological Analyses, Federal University of Rio Grande do Norte, Natal 59012-570, Brazil.
| |
|
25
|
Li K, Luo T, Zhu Y, Huang Y, Wang A, Zhang D, Dong L, Wang Y, Wang R, Tang D, Yu Z, Shen Q, Lv M, Ling Z, Fang Z, Yuan J, Li B, Xia K, He X, Li J, Zhao G. Performance evaluation of differential splicing analysis methods and splicing analytics platform construction. Nucleic Acids Res 2022; 50:9115-9126. [PMID: 35993808 PMCID: PMC9458456 DOI: 10.1093/nar/gkac686] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/09/2022] [Revised: 07/01/2022] [Accepted: 08/01/2022] [Indexed: 12/24/2022]
Abstract
A proportion of previously defined benign variants or variants of uncertain significance in humans, which are challenging to identify, may induce an abnormal splicing process. An increasing number of methods have been developed to predict splicing variants, but their performance has not been completely evaluated using independent benchmarks. Here, we manually sourced ∼50 000 positive/negative splicing variants from > 8000 studies and selected the independent splicing variants to evaluate the performance of prediction methods. These methods showed different performances in recognizing splicing variants in donor and acceptor regions, reminiscent of different weight coefficient applications to predict novel splicing variants. Of these methods, 66.67% exhibited higher specificities than sensitivities, suggesting that more moderate cut-off values are necessary to distinguish splicing variants. Moreover, the high correlation and consistent prediction ratio validated the feasibility of integration of the splicing prediction method in identifying splicing variants. We developed a splicing analytics platform called SPCards, which curates splicing variants from publications and predicts splicing scores of variants in genomes. SPCards also offers variant-level and gene-level annotation information, including allele frequency, non-synonymous prediction and comprehensive functional information. SPCards is suitable for high-throughput genetic identification of splicing variants, particularly those located in non-canonical splicing regions.
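The remark in this abstract that more moderate cut-off values may be needed when specificity exceeds sensitivity can be illustrated with a small sketch. It computes sensitivity and specificity at two thresholds on made-up scores. The function name, scores, labels, and both cut-offs are hypothetical and are not taken from the cited benchmark.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Call a variant positive when its score is >= cutoff, then return
    (sensitivity, specificity) against the true labels (1 = splicing variant)."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= cutoff
        if label and predicted_positive:
            tp += 1
        elif label:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical prediction scores: four true splicing variants, four benign.
scores = [0.95, 0.60, 0.40, 0.30, 0.55, 0.20, 0.10, 0.05]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

strict = sensitivity_specificity(scores, labels, cutoff=0.80)
moderate = sensitivity_specificity(scores, labels, cutoff=0.35)
print("strict cut-off:  ", strict)
print("moderate cut-off:", moderate)
```

On these toy data the strict cut-off misses most true positives while keeping every benign variant negative, and the moderate cut-off trades some specificity for a large gain in sensitivity.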
Affiliation(s)
| | | | - Yan Zhu
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
| | - Yuanfeng Huang
- Bioinformatics Center & National Clinical Research Centre for Geriatric Disorders & Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan, China; Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - An Wang
- Department of Obstetrics and Gynecology, The First Affiliated Hospital of Anhui Medical University, Hefei 230022, China; NHC Key Laboratory of Study on Abnormal Gametes and Reproductive Tract (Anhui Medical University), No 81 Meishan Road, Hefei 230032, Anhui, China; Key Laboratory of Population Health Across Life Cycle (Anhui Medical University), Ministry of Education of the People's Republic of China, No 81 Meishan Road, Hefei 230032, Anhui, China
| | - Di Zhang
- Department of Obstetrics and Gynecology, The First Affiliated Hospital of Anhui Medical University, Hefei 230022, China; NHC Key Laboratory of Study on Abnormal Gametes and Reproductive Tract (Anhui Medical University), No 81 Meishan Road, Hefei 230032, Anhui, China; Key Laboratory of Population Health Across Life Cycle (Anhui Medical University), Ministry of Education of the People's Republic of China, No 81 Meishan Road, Hefei 230032, Anhui, China
| | - Lijie Dong
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
| | - Yujian Wang
- Bioinformatics Center & National Clinical Research Centre for Geriatric Disorders & Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan, China; Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Rui Wang
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
| | - Dongdong Tang
- Department of Obstetrics and Gynecology, The First Affiliated Hospital of Anhui Medical University, Hefei 230022, China; NHC Key Laboratory of Study on Abnormal Gametes and Reproductive Tract (Anhui Medical University), No 81 Meishan Road, Hefei 230032, Anhui, China; Key Laboratory of Population Health Across Life Cycle (Anhui Medical University), Ministry of Education of the People's Republic of China, No 81 Meishan Road, Hefei 230032, Anhui, China
| | - Zhen Yu
- Department of Obstetrics and Gynecology, The First Affiliated Hospital of Anhui Medical University, Hefei 230022, China; NHC Key Laboratory of Study on Abnormal Gametes and Reproductive Tract (Anhui Medical University), No 81 Meishan Road, Hefei 230032, Anhui, China; Key Laboratory of Population Health Across Life Cycle (Anhui Medical University), Ministry of Education of the People's Republic of China, No 81 Meishan Road, Hefei 230032, Anhui, China
| | - Qunshan Shen
- Department of Obstetrics and Gynecology, The First Affiliated Hospital of Anhui Medical University, Hefei 230022, China; NHC Key Laboratory of Study on Abnormal Gametes and Reproductive Tract (Anhui Medical University), No 81 Meishan Road, Hefei 230032, Anhui, China; Key Laboratory of Population Health Across Life Cycle (Anhui Medical University), Ministry of Education of the People's Republic of China, No 81 Meishan Road, Hefei 230032, Anhui, China
| | - Mingrong Lv
- Department of Obstetrics and Gynecology, The First Affiliated Hospital of Anhui Medical University, Hefei 230022, China; NHC Key Laboratory of Study on Abnormal Gametes and Reproductive Tract (Anhui Medical University), No 81 Meishan Road, Hefei 230032, Anhui, China; Key Laboratory of Population Health Across Life Cycle (Anhui Medical University), Ministry of Education of the People's Republic of China, No 81 Meishan Road, Hefei 230032, Anhui, China
| | - Zhengbao Ling
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
| | - Zhenghuan Fang
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China
| | - Jing Yuan
- Department of Obstetrics and Gynecology, The First Affiliated Hospital of Anhui Medical University, Hefei 230022, China; NHC Key Laboratory of Study on Abnormal Gametes and Reproductive Tract (Anhui Medical University), No 81 Meishan Road, Hefei 230032, Anhui, China; Key Laboratory of Population Health Across Life Cycle (Anhui Medical University), Ministry of Education of the People's Republic of China, No 81 Meishan Road, Hefei 230032, Anhui, China
| | - Bin Li
- Bioinformatics Center & National Clinical Research Centre for Geriatric Disorders & Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan, China; Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China
| | - Kun Xia
- Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan, China; Hengyang Medical School, University of South China, Hengyang, Hunan, China
| | - Xiaojin He
- Correspondence may also be addressed to Xiaojin He. Tel: +86 731 8975 2406; Fax: +86 731 8432 7332;
| | - Jinchen Li
- To whom correspondence should be addressed. Tel: +86 731 8975 2406; Fax: +86 731 8432 7332;
| | - Guihu Zhao
- Correspondence may also be addressed to Guihu Zhao. Tel: +86 731 8975 2406; Fax: +86 731 8432 7332;
| |
|
26
|
Giovannetti A, Bianco SD, Traversa A, Panzironi N, Bruselles A, Lazzari S, Liorni N, Tartaglia M, Carella M, Pizzuti A, Mazza T, Caputo V. MiRLog and dbmiR: prioritization and functional annotation tools to study human microRNA sequence variants. Hum Mutat 2022; 43:1201-1215. [PMID: 35583122 PMCID: PMC9546175 DOI: 10.1002/humu.24399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Revised: 05/03/2022] [Accepted: 05/11/2022] [Indexed: 11/22/2022]
Abstract
The recent identification of noncoding variants with pathogenic effects suggests that these variations could underlie a significant number of undiagnosed cases. Several computational methods have been developed to predict the functional impact of noncoding variants, but they exhibit only partial concordance and are not integrated with functional annotation resources, making the interpretation of these variants still challenging. MicroRNAs (miRNAs) are small noncoding RNA molecules that act as fine regulators of gene expression and play crucial functions in several biological processes, such as cell proliferation and differentiation. An increasing number of studies demonstrate a significant impact of miRNA single nucleotide variants (SNVs) both in Mendelian diseases and complex traits. To predict the functional effect of miRNA SNVs, we implemented a new meta‐predictor, MiRLog, and we integrated it into a comprehensive database, dbmiR, which includes a precompiled list of all possible miRNA allelic SNVs, providing their biological annotations at nucleotide and miRNA levels. MiRLog and dbmiR were used to explore the genetic variability of miRNAs in 15,708 human genomes included in the gnomAD project, finding several ultra‐rare SNVs with a potentially deleterious effect on miRNA biogenesis and function representing putative contributors to human phenotypes.
Affiliation(s)
- Agnese Giovannetti
- Laboratory of Clinical Genomics, Fondazione IRCCS Casa Sollievo della Sofferenza, San Giovanni Rotondo (FG), Italy
| | - Salvatore Daniele Bianco
- Department of Experimental Medicine, Sapienza University of Rome, Rome, Italy; Unit of Bioinformatics, Fondazione IRCCS Casa Sollievo della Sofferenza, San Giovanni Rotondo (FG), Italy
| | - Alice Traversa
- Laboratory of Clinical Genomics, Fondazione IRCCS Casa Sollievo della Sofferenza, San Giovanni Rotondo (FG), Italy
| | - Noemi Panzironi
- Laboratory of Clinical Genomics, Fondazione IRCCS Casa Sollievo della Sofferenza, San Giovanni Rotondo (FG), Italy
| | - Alessandro Bruselles
- Department of Oncology and Molecular Medicine, Istituto Superiore di Sanità, Rome, Italy
| | - Sara Lazzari
- Department of Experimental Medicine, Sapienza University of Rome, Rome, Italy
| | - Niccolò Liorni
- Department of Experimental Medicine, Sapienza University of Rome, Rome, Italy; Unit of Bioinformatics, Fondazione IRCCS Casa Sollievo della Sofferenza, San Giovanni Rotondo (FG), Italy
| | - Marco Tartaglia
- Genetics and Rare Diseases Research Division, Ospedale Pediatrico Bambino Gesù, IRCCS, Rome, Italy
| | - Massimo Carella
- Medical Genetics Unit, Fondazione IRCCS Casa Sollievo della Sofferenza, San Giovanni Rotondo (FG), Italy
| | - Antonio Pizzuti
- Department of Experimental Medicine, Sapienza University of Rome, Rome, Italy
| | - Tommaso Mazza
- Unit of Bioinformatics, Fondazione IRCCS Casa Sollievo della Sofferenza, San Giovanni Rotondo (FG), Italy
| | - Viviana Caputo
- Department of Experimental Medicine, Sapienza University of Rome, Rome, Italy
| |
|
27
|
Shah H, Khan K, Khan N, Badshah Y, Ashraf NM, Shabbir M. Impact of deleterious missense PRKCI variants on structural and functional dynamics of protein. Sci Rep 2022; 12:3781. [PMID: 35260606 PMCID: PMC8904829 DOI: 10.1038/s41598-022-07526-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/13/2021] [Accepted: 02/08/2022] [Indexed: 11/09/2022]
Abstract
Protein kinase C iota (PKCɩ) is a novel protein containing 596 amino acids and is a member of the atypical kinase family. The role of PKCɩ has been explored in neurodegenerative diseases, neuroblastoma, and ovarian and pancreatic cancers. Single nucleotide polymorphisms (SNPs) in PKCɩ have not been studied to date. The purpose of the current study is to scrutinize the deleterious missense variants in PKCɩ and determine the effect of these variants on the stability and dynamics of the protein. The structure of PKCɩ was predicted for the first time and post-translational modifications were determined. Genetic variants of PKCɩ were retrieved from ENSEMBL, and only missense variants were further analyzed because of their linkage with diseases. The pathogenicity of missense variants, their effect on the structure and function of the protein, their association with cancer, and the conservation of the protein residues were determined through computational approaches. The C1 and pseudosubstrate regions were observed to have the highest number of pathogenic SNPs. Variations in the kinase domain of the protein are predicted to alter overall phosphorylation of the protein. Molecular dynamics simulations predicted a noteworthy change in the structural and functional dynamics of the protein because of these variants. The study revealed that nine deleterious variants can possibly contribute to malfunctioning of the protein and can be associated with diseases. This can be useful in diagnostics and in developing therapeutics for diseases related to these polymorphisms.
Affiliation(s)
- Hania Shah
- Department of Healthcare Biotechnology, Atta-Ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
| | - Khushbukhat Khan
- Department of Healthcare Biotechnology, Atta-Ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
| | - Naila Khan
- Department of Healthcare Biotechnology, Atta-Ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
| | - Yasmin Badshah
- Department of Healthcare Biotechnology, Atta-Ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
| | - Naeem Mahmood Ashraf
- Department of Biochemistry and Biotechnology, University of Gujrat, Gujrat, Pakistan
| | - Maria Shabbir
- Department of Healthcare Biotechnology, Atta-Ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan.
| |
|
28
|
Dai H, Chu X, Liang Q, Wang M, Li L, Zhou Y, Zheng Z, Wang W, Wang Z, Li H, Wang J, Zheng H, Zhao Y, Liu L, Yao H, Luo M, Wang Q, Kang S, Li Y, Wang K, Song F, Zhang R, Wu X, Cheng X, Zhang W, Wei Q, Li MJ, Chen K. Genome-wide association and functional interrogation identified a variant at 3p26.1 modulating ovarian cancer survival among Chinese women. Cell Discov 2021; 7:121. [PMID: 34930913 PMCID: PMC8688503 DOI: 10.1038/s41421-021-00342-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2021] [Accepted: 09/23/2021] [Indexed: 12/03/2022] Open
Abstract
Ovarian cancer survival varies considerably among patients, to which germline variation may also contribute in addition to mutational signatures. To identify genetic markers modulating ovarian cancer outcome, we performed a genome-wide association study in 2130 Chinese ovarian cancer patients and found a hitherto unrecognized locus at 3p26.1 to be associated with overall survival (Pcombined = 8.90 × 10⁻¹⁰). Subsequent statistical fine-mapping, functional annotation, and eQTL mapping prioritized a likely causal SNP, rs9311399, in the non-coding regulatory region. Mechanistically, rs9311399 altered its enhancer activity through allele-specific transcription factor binding and a long-range interaction with the promoter of the lncRNA BHLHE40-AS1. Deletion of the rs9311399-associated enhancer resulted in expression changes in several oncogenic signaling pathway genes and a decrease in tumor growth. Thus, we have identified a novel genetic locus that is associated with ovarian cancer survival, possibly through long-range gene regulation of oncogenic pathways.
Affiliation(s)
- Hongji Dai
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Xinlei Chu
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Qian Liang
- Department of Pharmacology, the Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Mengyun Wang
- Cancer Institute, Fudan University Shanghai Cancer Center, Fudan University, Shanghai, China.,Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Lian Li
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Yao Zhou
- Department of Pharmacology, the Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Zhanye Zheng
- Department of Pharmacology, the Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Wei Wang
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Zhao Wang
- Department of Pharmacology, the Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Haixin Li
- Cancer Biobank, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Jianhua Wang
- Department of Pharmacology, the Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Hong Zheng
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Yanrui Zhao
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Luyang Liu
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Hongcheng Yao
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, SAR, China
| | - Menghan Luo
- Department of Pharmacology, the Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Qiong Wang
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Shan Kang
- Department of Obstetrics and Gynaecology, Hebei Medical University, Fourth Hospital, Shijiazhuang, China
| | - Yan Li
- Department of Molecular Biology, Hebei Medical University, Fourth Hospital, Shijiazhuang, China
| | - Ke Wang
- Department of Gynecologic Oncology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Fengju Song
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Ruoxin Zhang
- Cancer Institute, Fudan University Shanghai Cancer Center, Fudan University, Shanghai, China.,Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Xiaohua Wu
- Cancer Institute, Fudan University Shanghai Cancer Center, Fudan University, Shanghai, China.,Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Xi Cheng
- Cancer Institute, Fudan University Shanghai Cancer Center, Fudan University, Shanghai, China.,Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
| | - Wei Zhang
- Center for Cancer Genomics and Precision Oncology, Wake Forest Baptist Comprehensive Cancer Center, Wake Forest Baptist Medical Center, Winston-Salem, NC, USA.,Department of Cancer Biology, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Qingyi Wei
- Cancer Institute, Fudan University Shanghai Cancer Center, Fudan University, Shanghai, China.,Duke Cancer Institute, Duke University Medical Center, Durham, NC, USA.,Department of Population Health Sciences, Duke University School of Medicine, Durham, NC, USA.
| | - Mulin Jun Li
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China.,Department of Pharmacology, the Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China.
| | - Kexin Chen
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China.
| |
|
29
|
Dong S, Boyle AP. Prioritization of regulatory variants with tissue-specific function in the non-coding regions of human genome. Nucleic Acids Res 2021; 50:e6. [PMID: 34648033 PMCID: PMC8754628 DOI: 10.1093/nar/gkab924] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2021] [Revised: 09/21/2021] [Accepted: 09/27/2021] [Indexed: 02/06/2023] Open
Abstract
Understanding the functional consequences of genetic variation in the non-coding regions of the human genome remains a challenge. We introduce here a computational tool, TURF, to prioritize regulatory variants with tissue-specific function by leveraging evidence from functional genomics experiments, including over 3000 functional genomics datasets from the ENCODE project provided in the RegulomeDB database. TURF is able to generate prediction scores at both organism and tissue/organ-specific levels for any non-coding variant in the genome. We show that TURF achieves top overall prediction performance using validated variants from MPRA experiments. We also demonstrate how TURF can pick out the regulatory variants with tissue-specific function from a candidate list derived from association studies. Furthermore, we found that various GWAS traits showed enrichment of regulatory variants predicted by TURF scores in the trait-relevant organs, which indicates that these variants can be a valuable source for future studies.
Affiliation(s)
- Shengcheng Dong
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Alan P Boyle
- To whom correspondence should be addressed. Tel: +1 734 763 7382; Fax: +1 734 763 7382;
| |
|
30
|
Huang D, Zhou Y, Yi X, Fan X, Wang J, Yao H, Sham PC, Hao J, Chen K, Li MJ. VannoPortal: multiscale functional annotation of human genetic variants for interrogating molecular mechanism of traits and diseases. Nucleic Acids Res 2021; 50:D1408-D1416. [PMID: 34570217 PMCID: PMC8728305 DOI: 10.1093/nar/gkab853] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2021] [Revised: 09/05/2021] [Accepted: 09/14/2021] [Indexed: 12/16/2022] Open
Abstract
Interpreting the molecular mechanisms of genomic variations and their causal relationships with diseases/traits is an important and challenging problem in human genetic studies. To provide comprehensive and context-specific variant annotations for biologists and clinicians, here, by systematically integrating over 4TB of genomic/epigenomic profiles and frequently-used annotation databases from various biological domains, we develop a variant annotation database called VannoPortal. In general, the database has the following major features: (i) it systematically integrates 40 genome-wide variant annotations and prediction scores regarding allele frequency, linkage disequilibrium, evolutionary signature, disease/trait association, tissue/cell type-specific epigenome, base-wise functional prediction, allelic imbalance and pathogenicity; (ii) it is equipped with our recent novel index system and parallel random-sweep searching algorithms for efficient management of backend databases and information extraction; (iii) it greatly expands context-dependent variant annotation to incorporate large-scale epigenomic maps and regulatory profiles (such as EpiMap) across over 33 tissue/cell types; (iv) it compiles many genome-scale base-wise prediction scores for regulatory/pathogenic variant classification beyond protein-coding regions; (v) it enables fast retrieval and direct comparison of functional evidence among linked variants using a highly interactive web panel in addition to a plain table; (vi) it introduces many visualization functions for more efficient identification and interpretation of functional variants in a single web page. VannoPortal is freely available at http://mulinlab.org/vportal.
Affiliation(s)
- Dandan Huang
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China.,Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Yao Zhou
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Xianfu Yi
- School of Biomedical Engineering, Tianjin Medical University, Tianjin 300070, China
| | - Xutong Fan
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Jianhua Wang
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Hongcheng Yao
- Centre for PanorOmic Sciences-Genomics and Bioinformatics Cores, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China
| | - Pak Chung Sham
- Centre for PanorOmic Sciences-Genomics and Bioinformatics Cores, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China
| | - Jihui Hao
- Department of Pancreatic Cancer, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin's Clinical Research Center for Cancer, Tianjin 300060, China
| | - Kexin Chen
- Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300060, China
| | - Mulin Jun Li
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China.,Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300060, China
| |
|
31
|
Huang D, Wang Z, Zhou Y, Liang Q, Sham PC, Yao H, Li MJ. vSampler: fast and annotation-based matched variant sampling tool. Bioinformatics 2021; 37:1915-1917. [PMID: 33270826 DOI: 10.1093/bioinformatics/btaa883] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2020] [Revised: 07/28/2020] [Accepted: 09/30/2020] [Indexed: 11/13/2022] Open
Abstract
SUMMARY: Sampling of control variants having matched properties with input variants is widely used in enrichment analysis of genome-wide association studies/quantitative trait loci and negative data construction for pathogenic/regulatory variant prediction methods. Spurious enrichment results caused by confounding factors, such as minor allele frequency and linkage disequilibrium pattern, can be avoided by calibration of statistical significance based on matched controls. Here, we present vSampler, which can generate sets of randomly drawn variants with comprehensive choices of matching properties, such as tissue/cell type-specific epigenomic features. Importantly, the development of a novel data structure and sampling algorithms for vSampler makes it significantly faster than existing tools. AVAILABILITY AND IMPLEMENTATION: vSampler web server and local program are available at http://mulinlab.org/vsampler. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Dandan Huang
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Zhao Wang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Yao Zhou
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Qian Liang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | | | - Hongcheng Yao
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China
| | - Mulin Jun Li
- Department of Epidemiology and Biostatistics, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| |
|
32
|
Green DJ, Lenassi E, Manning CS, McGaughey D, Sharma V, Black GC, Ellingford JM, Sergouniotis PI. North Carolina Macular Dystrophy: Phenotypic Variability and Computational Analysis of Disease-Associated Noncoding Variants. Invest Ophthalmol Vis Sci 2021; 62:16. [PMID: 34125159 PMCID: PMC8212441 DOI: 10.1167/iovs.62.7.16] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023] Open
Abstract
Purpose: North Carolina macular dystrophy (NCMD) is an autosomal dominant, congenital disorder affecting the central retina. Here, we report clinical and genetic findings in three families segregating NCMD and use epigenomic datasets from human tissues to gain insights into the effect of NCMD-implicated variants. Methods: Clinical assessment and genetic testing were performed. Publicly available transcriptomic and epigenomic datasets were analyzed and the activity-by-contact method for scoring enhancer elements and linking them to target genes was used. Results: A previously described, heterozygous, noncoding variant upstream of the PRDM13 gene was detected in all six affected study participants (chr6:100,040,987G>C [GRCh37/hg19]). Interfamilial and intrafamilial variability were observed; the visual acuity ranged from 0.0 to 1.6 LogMAR and fundoscopic findings ranged from visually insignificant, confluent, drusen-like macular deposits to coloboma-like macular lesions. Variable degrees of peripheral retinal spots (which were easily detected on widefield retinal imaging) were observed in all study subjects. Notably, a 6-year-old patient developed choroidal neovascularization and required treatment with intravitreal bevacizumab injections. Computational analysis of the five single nucleotide variants that have been implicated in NCMD revealed that these noncoding changes lie within two putative enhancer elements; these elements are predicted to interact with PRDM13 in the developing human retina. PRDM13 was found to be expressed in the fetal retina, with greatest expression in the amacrine precursor cell population. Conclusions: We provide further evidence supporting the role of PRDM13 dysregulation in the pathogenesis of NCMD and highlight the usefulness of widefield retinal imaging in individuals suspected to have this condition.
Affiliation(s)
- David J Green
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
| | - Eva Lenassi
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
- Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester University NHS Foundation Trust, Manchester, United Kingdom
- Manchester Royal Eye Hospital, Manchester University NHS Foundation Trust, Manchester, United Kingdom
| | - Cerys S Manning
- Division of Developmental Biology and Medicine, School of Medical Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
| | - David McGaughey
- Ophthalmic Genetics and Visual Function Branch, National Eye Institute, National Institutes of Health, Bethesda, Maryland, United States
| | - Vinod Sharma
- Manchester Royal Eye Hospital, Manchester University NHS Foundation Trust, Manchester, United Kingdom
| | - Graeme C Black
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
- Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester University NHS Foundation Trust, Manchester, United Kingdom
| | - Jamie M Ellingford
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
| | - Panagiotis I Sergouniotis
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
- Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester University NHS Foundation Trust, Manchester, United Kingdom
- Manchester Royal Eye Hospital, Manchester University NHS Foundation Trust, Manchester, United Kingdom
- Institute of Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
| |
|
33
|
Collobert M, Bocher O, Le Nabec A, Génin E, Férec C, Moisan S. CFTR Cooperative Cis-Regulatory Elements in Intestinal Cells. Int J Mol Sci 2021; 22:2599. [PMID: 33807548 PMCID: PMC7961337 DOI: 10.3390/ijms22052599] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2021] [Revised: 02/25/2021] [Accepted: 02/27/2021] [Indexed: 11/16/2022] Open
Abstract
About 8% of the human genome is covered with candidate cis-regulatory elements (cCREs). Disruptions of CREs, described as "cis-ruptions", have been identified as being involved in various genetic diseases. Thanks to the development of chromatin conformation study techniques, several long-range cystic fibrosis transmembrane conductance regulator (CFTR) regulatory elements were identified, but the regulatory mechanisms of the CFTR gene have yet to be fully elucidated. The aim of this work is to improve our knowledge of CFTR gene regulation and to identify factors that could impact CFTR gene expression and potentially account for the variability of the clinical presentation of cystic fibrosis as well as CFTR-related disorders. Here, we apply the robust GWAS3D score to determine which of the CFTR introns could be involved in gene regulation. This approach highlights four particular CFTR introns of interest. Using reporter gene constructs in intestinal cells, we show that two new introns display strong cooperative effects in intestinal cells. Chromatin immunoprecipitation analyses further demonstrate binding of a network of transcription factors. These results provide new insights into our understanding of CFTR gene regulation and allow us to suggest a 3D CFTR locus structure in intestinal cells. A better understanding of the regulatory mechanisms of the CFTR gene could elucidate cases of patients where the phenotype is not yet explained by the genotype. This would thus help in better diagnosis and therefore better management. These cis-acting regions may be a therapeutic challenge that could lead to the development of specific molecules capable of modulating gene expression in the future.
Affiliation(s)
- Mégane Collobert
- Univ. Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
- Correspondence: M.C.; S.M.; Tel.: +33-298-0165-67 (M.C.)
| | - Ozvan Bocher
- Univ. Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
| | - Anaïs Le Nabec
- Univ. Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
| | - Emmanuelle Génin
- Univ. Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
| | - Claude Férec
- Univ. Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
- Department of Molecular Genetics and Reproduction Biology, CHRU Brest, F-29200 Brest, France
| | - Stéphanie Moisan
- Univ. Brest, Inserm, EFS, UMR 1078, GGB, F-29200 Brest, France
- Department of Molecular Genetics and Reproduction Biology, CHRU Brest, F-29200 Brest, France
- Correspondence: M.C.; S.M.; Tel.: +33-298-0165-67 (M.C.)
| |
|
34
|
Momozawa Y, Mizukami K. Unique roles of rare variants in the genetics of complex diseases in humans. J Hum Genet 2021; 66:11-23. [PMID: 32948841 PMCID: PMC7728599 DOI: 10.1038/s10038-020-00845-2] [Citation(s) in RCA: 76] [Impact Index Per Article: 25.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2020] [Accepted: 09/06/2020] [Indexed: 12/19/2022]
Abstract
Genome-wide association studies have identified >10,000 genetic variants associated with various phenotypes and diseases. Although the majority are common variants, rare variants with a minor allele frequency of >0.1% have been investigated by imputation and by using disease-specific custom SNP arrays. Rare variants, revealed mainly by sequencing analysis, have played unique roles in the genetics of complex diseases in humans due to their distinctive features, in contrast to common variants. These unique roles are hypothesis-free evidence for gene causality, a precise target of functional analysis for understanding disease mechanisms, a new favorable target for drug development, and a genetic marker with high disease risk for personalized medicine. As whole-genome sequencing continues to identify more rare variants, the roles associated with rare variants will also increase. However, a better estimation of the functional impact of rare variants across the whole genome is needed to enhance their contribution to improvements in human health.
Affiliation(s)
- Yukihide Momozawa
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan.
- Laboratory for Molecular Science for Drug Discovery, Graduate School of Medical Life Science, Yokohama City University, Kanagawa, Japan.
| | - Keijiro Mizukami
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
| |
|
35
|
Huang D, Yi X, Zhou Y, Yao H, Xu H, Wang J, Zhang S, Nong W, Wang P, Shi L, Xuan C, Li M, Wang J, Li W, Kwan HS, Sham PC, Wang K, Li MJ. Ultrafast and scalable variant annotation and prioritization with big functional genomics data. Genome Res 2020; 30:1789-1801. [PMID: 33060171 PMCID: PMC7706736 DOI: 10.1101/gr.267997.120] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2020] [Accepted: 09/22/2020] [Indexed: 02/06/2023]
Abstract
The advances of large-scale genomics studies have enabled compilation of cell type–specific, genome-wide DNA functional elements at high resolution. With the growing volume of functional annotation data and sequencing variants, existing variant annotation algorithms lack the efficiency and scalability to process big genomic data, particularly when annotating whole-genome sequencing variants against a huge database with billions of genomic features. Here, we develop VarNote to rapidly annotate genome-scale variants in large and complex functional annotation resources. Equipped with a novel index system and a parallel random-sweep searching algorithm, VarNote shows substantial performance improvements (two to three orders of magnitude) over existing algorithms at different scales. It supports both region-based and allele-specific annotations and introduces advanced functions for the flexible extraction of annotations. By integrating massive base-wise and context-dependent annotations in the VarNote framework, we introduce three efficient and accurate pipelines to prioritize the causal regulatory variants for common diseases, Mendelian disorders, and cancers.
Affiliation(s)
- Dandan Huang
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China.,Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China.,Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Xianfu Yi
- School of Biomedical Engineering, Tianjin Medical University, Tianjin 300070, China
| | - Yao Zhou
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Hongcheng Yao
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China
| | - Hang Xu
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China; School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China
| | - Jianhua Wang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Shijie Zhang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Wenyan Nong
- School of Life Sciences, The Chinese University of Hong Kong, Hong Kong SAR 999077, China
| | - Panwen Wang
- Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, Scottsdale, Arizona 85259, USA
| | - Lei Shi
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Chenghao Xuan
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Miaoxin Li
- Center for Genome Research, Center for Precision Medicine, Zhongshan School of Medicine, First Affiliated Hospital, Sun Yat-Sen University, Guangzhou 510080, China
| | - Junwen Wang
- Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, Scottsdale, Arizona 85259, USA
| | - Weidong Li
- Department of Genetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Hoi Shan Kwan
- School of Life Sciences, The Chinese University of Hong Kong, Hong Kong SAR 999077, China
| | - Pak Chung Sham
- Centre of Genomics Sciences, Departments of Psychiatry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China
| | - Kai Wang
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA
| | - Mulin Jun Li
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China; Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China; Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
|
36
|
Wang H, Wang T, Zhao X, Wu H, You M, Sun Z, Mao F. AI-Driver: an ensemble method for identifying driver mutations in personal cancer genomes. NAR Genom Bioinform 2020; 2:lqaa084. [PMID: 33575629 PMCID: PMC7671397 DOI: 10.1093/nargab/lqaa084] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 05/02/2020] [Revised: 09/22/2020] [Accepted: 09/30/2020] [Indexed: 01/02/2023] Open
Abstract
The current challenge in cancer research is to increase the resolution of driver prediction from gene-level to mutation-level, which is more closely aligned with the goal of precision cancer medicine. Improved methods to distinguish drivers from passengers are urgently needed to identify driver mutations in the growing number of exome sequencing studies. Here, we developed an ensemble method, AI-Driver (AI-based driver classifier, https://github.com/hatchetProject/AI-Driver), to predict the driver status of somatic missense mutations based on 23 pathogenicity features. AI-Driver has the best overall performance compared with any individual tool and two cancer-specific driver predicting methods. We demonstrate the superior and stable performance of our model using four independent benchmarks. We provide pre-computed AI-Driver scores for all possible human missense variants (http://aidriver.maolab.org/) to identify driver mutations in the sea of somatic mutations discovered by personal cancer sequencing. We believe that AI-Driver, together with the pre-computed database, will play vital roles in human cancer studies, such as identification of driver mutations in personal cancer genomes, discovery of targeting sites for cancer therapeutic treatments and prediction of tumor biomarkers for early diagnosis by liquid biopsy.
Affiliation(s)
- Haoxuan Wang
- Center of Basic Medical Research, Institute of Medical Innovation and Research, Peking University Third Hospital, Beijing 100191, China
| | - Tao Wang
- Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan 410083, China
| | - Xiaolu Zhao
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Honghu Wu
- Department of Science and Technology, Second Affiliated Hospital of Nanchang University, Nanchang 330006, China
| | | | - Zhongsheng Sun
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100101, China
| | - Fengbiao Mao
- Center of Basic Medical Research, Institute of Medical Innovation and Research, Peking University Third Hospital, Beijing 100191, China
|
37
|
Wu Z, Ioannidis NM, Zou J. Predicting target genes of non-coding regulatory variants with IRT. Bioinformatics 2020; 36:4440-4448. [PMID: 32330225 DOI: 10.1093/bioinformatics/btaa254] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/18/2019] [Revised: 03/15/2020] [Accepted: 04/17/2020] [Indexed: 11/14/2022] Open
Abstract
SUMMARY: Interpreting genetic variants of unknown significance (VUS) is essential in clinical applications of genome sequencing for diagnosis and personalized care. Non-coding variants remain particularly difficult to interpret, despite making up a large majority of trait associations identified in genome-wide association studies (GWAS). Predicting the regulatory effects of non-coding variants on candidate genes is a key step in evaluating their clinical significance. Here, we develop a machine-learning algorithm, Inference of Connected expression quantitative trait loci (eQTLs) (IRT), to predict the regulatory targets of non-coding variants identified in studies of eQTLs. We assemble datasets using eQTL results from the Genotype-Tissue Expression (GTEx) project and learn to separate positive and negative pairs based on annotations characterizing the variant, gene and the intermediate sequence. IRT achieves an area under the receiver operating characteristic curve (ROC-AUC) of 0.799 using random cross-validation, and 0.700 for a more stringent position-based cross-validation. Further evaluation on rare variants and experimentally validated regulatory variants shows a significant enrichment in IRT identifying the true target genes versus negative controls. In gene-ranking experiments, IRT achieves a top-1 accuracy of 50% and top-3 accuracy of 90%. Salient features, including GC-content, histone modifications and Hi-C interactions, are further analyzed and visualized to illustrate their influences on predictions. IRT can be applied to any VUS of interest and each candidate nearby gene to output a score reflecting the likelihood of regulatory effect on the expression level. These scores can be used to prioritize variants and genes to assist in patient diagnosis and GWAS follow-up studies.
AVAILABILITY AND IMPLEMENTATION: Code and data used in this work are available at https://github.com/miaecle/eQTL_Trees.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhenqin Wu
- Department of Chemistry, Stanford University, CA 94305, USA; Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, 94305 CA, USA
| | - Nilah M Ioannidis
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, 94305 CA, USA
| | - James Zou
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, 94305 CA, USA; Chan-Zuckerberg Biohub, San Francisco, 94158 CA, USA
|
38
|
Bocher O, Génin E. Rare variant association testing in the non-coding genome. Hum Genet 2020; 139:1345-1362. [PMID: 32500240 DOI: 10.1007/s00439-020-02190-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/22/2020] [Accepted: 05/29/2020] [Indexed: 12/25/2022]
Abstract
The development of next-generation sequencing technologies has opened up new possibilities to explore the contribution of genetic variants to human diseases, in particular that of rare variants. Statistical methods have been developed to test for association with rare variants; they require the definition of testing units and, within these testing units, the selection of qualifying variants to include in the test. In the coding regions of the genome, testing units are usually the different genes and qualifying variants are selected based on their functional effects on the encoded proteins. Extending these tests to the non-coding regions of the genome is challenging. Testing units are difficult to define because the organisation of the non-coding genome is still largely unknown. Qualifying variants are difficult to select because the functional impact of non-coding variants on gene expression is hard to predict. These difficulties could explain why very few investigators so far have analysed the non-coding parts of their whole genome sequencing data. Yet these non-coding parts represent the vast majority of the genome, and some studies suggest that they could play a major role in disease susceptibility. In this review, we discuss recent experimental and statistical developments to gain knowledge on the non-coding genome and how this knowledge could be used to include rare non-coding variants in association tests. We describe the few studies that have considered variants from the non-coding genome in association tests and how they managed to define testing units and select qualifying variants.
Affiliation(s)
- Ozvan Bocher
- Génétique, Génomique Fonctionnelle Et Biotechnologies, Faculté de Médecine, Univ Brest, Inserm, Inserm UMR1078, Bâtiment E-IBRBS 2ieme étage, 22 avenue Camille Desmoulins, 29238, Brest Cedex 3, France.
| | - Emmanuelle Génin
- Génétique, Génomique Fonctionnelle Et Biotechnologies, Faculté de Médecine, Univ Brest, Inserm, Inserm UMR1078, Bâtiment E-IBRBS 2ieme étage, 22 avenue Camille Desmoulins, 29238, Brest Cedex 3, France.
- CHU Brest, Brest, France.
|
39
|
Wang J, Huang D, Zhou Y, Yao H, Liu H, Zhai S, Wu C, Zheng Z, Zhao K, Wang Z, Yi X, Zhang S, Liu X, Liu Z, Chen K, Yu Y, Sham PC, Li MJ. CAUSALdb: a database for disease/trait causal variants identified using summary statistics of genome-wide association studies. Nucleic Acids Res 2020; 48:D807-D816. [PMID: 31691819 PMCID: PMC7145620 DOI: 10.1093/nar/gkz1026] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 08/10/2019] [Revised: 10/19/2019] [Accepted: 10/21/2019] [Indexed: 12/13/2022] Open
Abstract
Genome-wide association studies (GWASs) have revolutionized the field of complex trait genetics over the past decade, yet for most of the significant genotype-phenotype associations the true causal variants remain unknown. Identifying and interpreting how causal genetic variants confer disease susceptibility is still a big challenge. Herein we introduce a new database, CAUSALdb, to integrate the most comprehensive GWAS summary statistics to date and identify credible sets of potential causal variants using uniformly processed fine-mapping. The database has six major features: it (i) curates 3052 high-quality, fine-mappable GWAS summary statistics across five human super-populations and 2629 unique traits; (ii) estimates causal probabilities of all genetic variants in GWAS significant loci using three state-of-the-art fine-mapping tools; (iii) maps the reported traits to a powerful ontology, MeSH, making it simple for users to browse studies on the trait tree; (iv) incorporates highly interactive Manhattan and LocusZoom-like plots to allow more efficient visualization of credible sets in a single web page; (v) enables online comparison of causal relations at the variant, gene and trait levels among studies with different sample sizes or populations; and (vi) offers comprehensive variant annotations by integrating massive base-wise and allele-specific functional annotations. CAUSALdb is freely available at http://mulinlab.org/causaldb.
Affiliation(s)
- Jianhua Wang
- 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China; Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Dandan Huang
- 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China; Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Yao Zhou
- 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China; Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Hongcheng Yao
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Huanhuan Liu
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Sinan Zhai
- School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
| | - Chengwei Wu
- School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
| | - Zhanye Zheng
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Ke Zhao
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Zhao Wang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Xianfu Yi
- School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
| | - Shijie Zhang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Xiaorong Liu
- Clinical laboratory, Institute of Pediatrics, Shenzhen Children's Hospital, Shenzhen, China
| | - Zipeng Liu
- Centre of Genomics Sciences, State Key Laboratory of Brain and Cognitive Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Kexin Chen
- Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Ying Yu
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Pak Chung Sham
- Centre of Genomics Sciences, State Key Laboratory of Brain and Cognitive Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Mulin Jun Li
- 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China; Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China; Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
|
40
|
Zheng Z, Huang D, Wang J, Zhao K, Zhou Y, Guo Z, Zhai S, Xu H, Cui H, Yao H, Wang Z, Yi X, Zhang S, Sham PC, Li MJ. QTLbase: an integrative resource for quantitative trait loci across multiple human molecular phenotypes. Nucleic Acids Res 2020; 48:D983-D991. [PMID: 31598699 PMCID: PMC6943073 DOI: 10.1093/nar/gkz888] [Citation(s) in RCA: 82] [Impact Index Per Article: 20.5] [Received: 08/15/2019] [Revised: 09/24/2019] [Accepted: 10/02/2019] [Indexed: 12/20/2022] Open
Abstract
Recent advances in genome sequencing and functional genomic profiling have promoted many large-scale quantitative trait locus (QTL) studies, which connect genotypes with tissue/cell type-specific cellular functions from transcriptional to post-translational level. However, no comprehensive resource can perform QTL lookup across multiple molecular phenotypes and investigate the potential cascade effect of functional variants. We developed a versatile resource, named QTLbase, for interpreting the possible molecular functions of genetic variants, as well as their tissue/cell-type specificity. Overall, QTLbase has five key functions: (i) curating and compiling genome-wide QTL summary statistics for 13 human molecular traits from 233 independent studies; (ii) mapping QTL-relevant tissue/cell types to 78 unified terms according to a standard anatomogram; (iii) normalizing variant and trait information uniformly, yielding >170 million significant QTLs; (iv) providing a rich web client that enables phenome- and tissue-wise visualization; and (v) integrating the most comprehensive genomic features and functional predictions to annotate the potential QTL mechanisms. QTLbase provides a one-stop shop for QTL retrieval and comparison across multiple tissues and multiple layers of molecular complexity, and will greatly help researchers interrogate the biological mechanism of causal variants and guide the direction of functional validation. QTLbase is freely available at http://mulinlab.org/qtlbase.
Affiliation(s)
- Zhanye Zheng
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Dandan Huang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Jianhua Wang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Ke Zhao
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Yao Zhou
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Zhenyang Guo
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China
| | - Sinan Zhai
- School of Biomedical Engineering, Tianjin Medical University, Tianjin 300070, China
| | - Hang Xu
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Hui Cui
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Hongcheng Yao
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China
| | - Zhao Wang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Xianfu Yi
- School of Biomedical Engineering, Tianjin Medical University, Tianjin 300070, China
| | - Shijie Zhang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Pak Chung Sham
- Centre of Genomics Sciences, State Key Laboratory of Brain and Cognitive Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China
| | - Mulin Jun Li
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
|