1
Murali H, Wang P, Liao EC, Wang K. Genetic variant classification by predicted protein structure: A case study on IRF6. Comput Struct Biotechnol J 2024; 23:892-904. [PMID: 38370976 PMCID: PMC10869248 DOI: 10.1016/j.csbj.2024.01.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2023] [Revised: 01/24/2024] [Accepted: 01/25/2024] [Indexed: 02/20/2024] Open
Abstract
Next-generation genome sequencing has revolutionized genetic testing, identifying numerous rare disease-associated gene variants. However, to impute pathogenicity, computational approaches remain inadequate and functional testing of gene variants is required to provide the highest level of evidence. The emergence of AlphaFold2 has transformed the field of protein structure determination, and here we outline a strategy that leverages predicted protein structure to enhance genetic variant classification. We used the gene IRF6 as a case study due to its clinical relevance, its critical role in cleft lip/palate malformation, and the availability of experimental data on the pathogenicity of IRF6 gene variants through phenotype rescue experiments in irf6-/- zebrafish. We compared results from over 30 pathogenicity prediction tools on 37 IRF6 missense variants. IRF6 lacks an experimentally derived structure, so we used predicted structures to explore associations between mutational clustering and pathogenicity. We found that among these variants, 19 of 37 were unanimously predicted as deleterious by computational tools. Comparing in silico predictions with experimental findings, 12 variants predicted as pathogenic were experimentally determined as benign. Even with the recently published AlphaMissense model, 15/18 (83%) of the predicted pathogenic variants were experimentally determined as benign. In comparison, mapping variants to the protein revealed deleterious mutation clusters around the protein binding domain, whereas N-terminal variants tend to be benign, suggesting the importance of structural information in determining pathogenicity of mutations in this gene. In conclusion, incorporating gene-specific structural features of known pathogenic/benign mutations may provide meaningful insights into pathogenicity predictions in a gene-specific manner and facilitate the interpretation of variant pathogenicity.
Affiliation(s)
- Hemma Murali
- Graduate Program in Biochemistry and Molecular Biophysics, University of Pennsylvania, Philadelphia, PA 19104, United States
- Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, United States
- Peng Wang
- Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, United States
- Master of Biotechnology Program, University of Pennsylvania, Philadelphia, PA 19104, United States
- Eric C. Liao
- Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
- Center for Craniofacial Innovation, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, United States
- Kai Wang
- Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, United States
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
2
Ye D, Garmany R, Martinez-Barrios E, Gao X, Neves RAL, Tester DJ, Bains S, Zhou W, Giudicessi JR, Ackerman MJ. Clinical Utility of Protein Language Models in Resolution of Variants of Uncertain Significance in KCNQ1, KCNH2, and SCN5A Compared With Patch-Clamp Functional Characterization. Circulation. Genomic and Precision Medicine 2024; 17:e004584. [PMID: 39119706 DOI: 10.1161/circgen.124.004584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 07/08/2024] [Indexed: 08/10/2024]
Abstract
BACKGROUND Genetic testing for cardiac channelopathies is the standard of care. However, many rare genetic variants remain classified as variants of uncertain significance (VUS) due to lack of epidemiological and functional data. Whether deep protein language models may aid in VUS resolution remains unknown. Here, we set out to compare how 2 deep protein language models perform at VUS resolution in the 3 most common long-QT syndrome-causative genes compared with the gold-standard patch clamp. METHODS A total of 72 rare nonsynonymous VUS (9 KCNQ1, 19 KCNH2, and 50 SCN5A) were engineered by site-directed mutagenesis and expressed in either HEK293 cells or TSA201 cells. Whole-cell patch-clamp technique was used to functionally characterize these variants. The protein language models, evolutionary scale modeling, version 1b and AlphaMissense, were used to predict the variant effect of missense variants and compared with patch clamp. RESULTS Considering variants in all 3 genes, the evolutionary scale modeling, version 1b model had a receiver operating characteristic curve-area under the curve of 0.75 (P=0.0003). It had a sensitivity of 88% and a specificity of 50%. AlphaMissense performed well compared with patch clamp, with a receiver operating characteristic curve-area under the curve of 0.85 (P<0.0001), sensitivity of 80%, and specificity of 76%. CONCLUSIONS Deep protein language models aid in VUS resolution with high sensitivity but lower specificity. Thus, these tools cannot fully replace functional characterization but can aid in reducing the number of variants that may require functional analysis.
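As a reading aid for the sensitivity and specificity figures quoted in this abstract, the following minimal sketch shows how the two metrics are computed from confusion-matrix counts. The function name and the counts are hypothetical illustrations, not data from the study; "positive" is assumed to mean a variant that the functional assay classifies as abnormal.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: assay-abnormal variants the predictor called pathogenic/benign.
    tn/fp: assay-normal variants the predictor called benign/pathogenic.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return sensitivity, specificity


# Hypothetical counts chosen only to illustrate the arithmetic.
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=6, fp=4)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A predictor with high sensitivity but lower specificity, as the abstract describes, misses few abnormal variants but flags a larger share of normal ones.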
Affiliation(s)
- Dan Ye
- Department of Molecular Pharmacology and Experimental Therapeutics (Windland Smith Rice Sudden Death Genomics Laboratory). Department of Cardiovascular Medicine, Division of Heart Rhythm Services (Windland Smith Rice Genetic Heart Rhythm Clinic). Department of Pediatric and Adolescent Medicine, Division of Pediatric Cardiology, Mayo Clinic
- Ramin Garmany
- Department of Molecular Pharmacology and Experimental Therapeutics (Windland Smith Rice Sudden Death Genomics Laboratory). Department of Cardiovascular Medicine, Division of Heart Rhythm Services (Windland Smith Rice Genetic Heart Rhythm Clinic). Department of Pediatric and Adolescent Medicine, Division of Pediatric Cardiology, Mayo Clinic
- Estefania Martinez-Barrios
- Department of Molecular Pharmacology and Experimental Therapeutics (Windland Smith Rice Sudden Death Genomics Laboratory). Department of Cardiovascular Medicine, Division of Heart Rhythm Services (Windland Smith Rice Genetic Heart Rhythm Clinic). Department of Pediatric and Adolescent Medicine, Division of Pediatric Cardiology, Mayo Clinic
- Xiaozhi Gao
- Department of Molecular Pharmacology and Experimental Therapeutics (Windland Smith Rice Sudden Death Genomics Laboratory). Department of Cardiovascular Medicine, Division of Heart Rhythm Services (Windland Smith Rice Genetic Heart Rhythm Clinic). Department of Pediatric and Adolescent Medicine, Division of Pediatric Cardiology, Mayo Clinic
- Raquel Almeida Lopes Neves
- Department of Molecular Pharmacology and Experimental Therapeutics (Windland Smith Rice Sudden Death Genomics Laboratory). Department of Cardiovascular Medicine, Division of Heart Rhythm Services (Windland Smith Rice Genetic Heart Rhythm Clinic). Department of Pediatric and Adolescent Medicine, Division of Pediatric Cardiology, Mayo Clinic
- David J Tester
- Department of Molecular Pharmacology and Experimental Therapeutics (Windland Smith Rice Sudden Death Genomics Laboratory). Department of Cardiovascular Medicine, Division of Heart Rhythm Services (Windland Smith Rice Genetic Heart Rhythm Clinic). Department of Pediatric and Adolescent Medicine, Division of Pediatric Cardiology, Mayo Clinic
- Sahej Bains
- Department of Molecular Pharmacology and Experimental Therapeutics (Windland Smith Rice Sudden Death Genomics Laboratory). Department of Cardiovascular Medicine, Division of Heart Rhythm Services (Windland Smith Rice Genetic Heart Rhythm Clinic). Department of Pediatric and Adolescent Medicine, Division of Pediatric Cardiology, Mayo Clinic
- Wei Zhou
- Department of Molecular Pharmacology and Experimental Therapeutics (Windland Smith Rice Sudden Death Genomics Laboratory). Department of Cardiovascular Medicine, Division of Heart Rhythm Services (Windland Smith Rice Genetic Heart Rhythm Clinic). Department of Pediatric and Adolescent Medicine, Division of Pediatric Cardiology, Mayo Clinic
- John R Giudicessi
- Department of Molecular Pharmacology and Experimental Therapeutics (Windland Smith Rice Sudden Death Genomics Laboratory). Department of Cardiovascular Medicine, Division of Heart Rhythm Services (Windland Smith Rice Genetic Heart Rhythm Clinic). Department of Pediatric and Adolescent Medicine, Division of Pediatric Cardiology, Mayo Clinic
- Michael J Ackerman
- Department of Molecular Pharmacology and Experimental Therapeutics (Windland Smith Rice Sudden Death Genomics Laboratory). Department of Cardiovascular Medicine, Division of Heart Rhythm Services (Windland Smith Rice Genetic Heart Rhythm Clinic). Department of Pediatric and Adolescent Medicine, Division of Pediatric Cardiology, Mayo Clinic
3
Kliche J, Simonetti L, Krystkowiak I, Kuss H, Diallo M, Rask E, Nilsson J, Davey NE, Ivarsson Y. Proteome-scale characterisation of motif-based interactome rewiring by disease mutations. Mol Syst Biol 2024; 20:1025-1048. [PMID: 39009827 PMCID: PMC11369174 DOI: 10.1038/s44320-024-00055-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 06/14/2024] [Accepted: 06/28/2024] [Indexed: 07/17/2024] Open
Abstract
Whole genome and exome sequencing are reporting on hundreds of thousands of missense mutations. Taking a pan-disease approach, we explored how mutations in intrinsically disordered regions (IDRs) break or generate protein interactions mediated by short linear motifs. We created a peptide-phage display library tiling ~57,000 peptides from the IDRs of the human proteome overlapping 12,301 single nucleotide variants associated with diverse phenotypes including cancer, metabolic diseases and neurological diseases. By screening 80 human proteins, we identified 366 mutation-modulated interactions, with half of the mutations diminishing binding, and half enhancing binding or creating novel interaction interfaces. The effects of the mutations were confirmed by affinity measurements. In cellular assays, the effects of motif-disruptive mutations were validated, including loss of a nuclear localisation signal in the cell division control protein CDC45 by a mutation associated with Meier-Gorlin syndrome. The study provides insights into how disease-associated mutations may perturb and rewire the motif-based interactome.
Affiliation(s)
- Johanna Kliche
- Department of Chemistry - BMC, Box 576, Husargatan 3, 751 23, Uppsala, Sweden
- Leandro Simonetti
- Department of Chemistry - BMC, Box 576, Husargatan 3, 751 23, Uppsala, Sweden
- Izabella Krystkowiak
- Division of Cancer Biology, Institute of Cancer Research, Chester Beatty Laboratories, 237 Fulham Road, SW3 6JB, Chelsea, London, UK
- Hanna Kuss
- Department of Chemistry - BMC, Box 576, Husargatan 3, 751 23, Uppsala, Sweden
- University of Münster, Institute of Pharmaceutical and Medicinal Chemistry, DE-48149, Münster, Germany
- Marcel Diallo
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3B, 2200, Copenhagen, Denmark
- Emma Rask
- Department of Chemistry - BMC, Box 576, Husargatan 3, 751 23, Uppsala, Sweden
- Jakob Nilsson
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3B, 2200, Copenhagen, Denmark
- Norman E Davey
- Division of Cancer Biology, Institute of Cancer Research, Chester Beatty Laboratories, 237 Fulham Road, SW3 6JB, Chelsea, London, UK.
- Ylva Ivarsson
- Department of Chemistry - BMC, Box 576, Husargatan 3, 751 23, Uppsala, Sweden.
4
Faraggi E, Jernigan RL, Kloczkowski A. Rapid discrimination between deleterious and benign missense mutations in the CAGI 6 experiment. Hum Genomics 2024; 18:89. [PMID: 39192324 PMCID: PMC11350969 DOI: 10.1186/s40246-024-00655-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 08/08/2024] [Indexed: 08/29/2024] Open
Abstract
We describe the machine learning tool that we applied in the CAGI 6 experiment to predict whether single residue mutations in proteins are deleterious or benign. This tool was trained using only single sequences, i.e., without multiple sequence alignments or structural information. Instead, we used global characterizations of the protein sequence. Training and testing data for human gene mutations were obtained from ClinVar (ncbi.nlm.nih.gov/pub/ClinVar/), and for non-human gene mutations from Uniprot (www.uniprot.org). Testing was done on post-training data from ClinVar. This testing yielded high AUC and Matthews correlation coefficient (MCC) for well-trained examples but low generalizability. For genes with either sparse or unbalanced training data, the prediction accuracy is poor. The resulting prediction server is available online at http://www.mamiris.com/Shoni.cagi6.
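For readers unfamiliar with the Matthews correlation coefficient (MCC) mentioned in this abstract, the sketch below computes it from confusion-matrix counts using the standard formula. The function name and the counts are hypothetical and are not taken from the paper; returning 0.0 when a marginal total is zero is a common convention assumed here, not something the abstract specifies.

```python
import math


def mcc_from_counts(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (tp*tn - fp*fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn)),
    ranging from -1 (total disagreement) to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts chosen only to illustrate the arithmetic.
mcc = mcc_from_counts(tp=8, fp=4, tn=6, fn=2)
print(f"MCC = {mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside AUC when the classes are unbalanced, the situation the abstract identifies as problematic.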
Affiliation(s)
- Eshel Faraggi
- Research and Information Systems, LLC, 1620 E. 72nd ST., Indianapolis, IN, 46240, USA.
- Physics Department, Indiana University Purdue University Indianapolis, Indianapolis, IN, 46202, USA.
- Robert L Jernigan
- Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA, 50011, USA
- Andrzej Kloczkowski
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Columbus, OH, 43205, USA
- Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, 43205, USA
- Department of Pediatrics, The Ohio State University, Columbus, OH, 43205, USA
5
Hauser BM, Place E, Huckfeldt R, Vavvas DG. A novel homozygous nonsense variant in CABP4 causing stationary cone/rod synaptic dysfunction. Ophthalmic Genet 2024:1-6. [PMID: 39148310 DOI: 10.1080/13816810.2024.2371875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 06/13/2024] [Accepted: 06/19/2024] [Indexed: 08/17/2024]
Abstract
INTRODUCTION Variants in the CABP4 gene cause a phenotype to be included in the spectrum of congenital stationary night blindness, though some reports suggest that the clinical abnormalities are more accurately categorized as a synaptic disease of the cones and rods. We report a novel homozygous nonsense variant in CABP4 in a patient complaining of non-progressive reduced visual acuity and photophobia but not nyctalopia. METHODS Complete ocular examination, fundus photographs, autofluorescence, optical coherence tomography, electroretinography, and targeted sequencing of known inherited retinal disease-associated genes. RESULTS A 25-year-old man monitored for 13 years complains of a lifelong history of stable reduced visual acuity (20/150), impaired color vision (1 of 14 plates), small-amplitude nystagmus, and photophobia without nyctalopia. He is also hyperopic (+7D), and his electroretinography shows significantly reduced rod and cone responses. Targeted genetic analysis revealed a novel homozygous variant in the CABP4 gene at c.181C>T, p.(Gln61*) underlying his clinical presentation. CONCLUSIONS A novel variant in CABP4 is associated with stationary cone and rod dysfunction resulting in decreased acuity, color deficit, and photophobia, but not nyctalopia.
Affiliation(s)
- Blake M Hauser
- Harvard Medical School Department of Ophthalmology, Retina Service, Massachusetts Eye and Ear, Boston, Massachusetts, USA
- Emily Place
- Harvard Medical School Department of Ophthalmology, Retina Service, Massachusetts Eye and Ear, Boston, Massachusetts, USA
- Rachel Huckfeldt
- Harvard Medical School Department of Ophthalmology, Retina Service, Massachusetts Eye and Ear, Boston, Massachusetts, USA
- Demetrios G Vavvas
- Harvard Medical School Department of Ophthalmology, Retina Service, Massachusetts Eye and Ear, Boston, Massachusetts, USA
6
Mahmood A, Samad A, Bano S, Umair M, Ajmal A, Ilyas I, Shah AA, Li P, Hu J. Structural and dynamics insights into the GBA variants associated with Parkinson's disease. J Biomol Struct Dyn 2024; 42:6256-6268. [PMID: 37434319 DOI: 10.1080/07391102.2023.2233617] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/02/2023] [Accepted: 07/01/2023] [Indexed: 07/13/2023]
Abstract
The GBA1 gene encodes the lysosomal enzyme glucocerebrosidase (GCase), which maintains glycosphingolipid homeostasis and regulates the autophagy process. Genomic variants of GBA1 are associated with Gaucher disease; however, several heterozygous variants of GBA (E326K, T369M, N370S, L444P) are frequent high-risk factors for Parkinson's disease (PD). The underlying mechanism of these variants has been revealed through functional and patient-centered research, but the structural and dynamical aspects of these variants have not yet been thoroughly investigated. In the current study, we used a thorough computational method to pinpoint the structural changes that GBA underwent because of genomic variants and drug binding mechanisms. According to our findings, PD-linked nsSNP variants of GBA showed structural variation and abnormal dynamics when compared to the wild type. The docking analysis demonstrated that the mutants E326K, N370S, and L444P have higher binding affinities for Ambroxol. Root mean square deviation (RMSD), root mean square fluctuation (RMSF), and MM-GBSA analyses confirmed that Ambroxol is more stable in the binding site of N370S and L444P, and that their binding affinities are stronger as compared to the wild-type and T369M variants of GBA. The evaluation of hydrogen bonds and the calculation of the free binding energy provided additional evidence in favor of this conclusion. When docked with Ambroxol, GBA demonstrated an increase in binding affinity and catalytic activity. Understanding the therapeutic efficacy and potential against the aforementioned changes in the GBA will be beneficial in order to use more efficient methods for developing novel drugs.
Affiliation(s)
- Arif Mahmood
- Center for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- Abdus Samad
- Department of Biochemistry, Abdul Wali Khan University, Mardan, Pakistan
- Shazia Bano
- Department of Optometry and Vision Sciences, University of Lahore, Lahore, Pakistan
- Muhammad Umair
- Medical Genomics Research Department, King Abdullah International Medical Research Center (KAIMRC), Ministry of National Guard Health Affairs (MNGH), King Saud Bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
- Amar Ajmal
- Department of Biochemistry, Abdul Wali Khan University, Mardan, Pakistan
- Iqra Ilyas
- National Centre of Excellence in Molecular Biology (CEMB), University of The Punjab, Lahore, Pakistan
- Abid Ali Shah
- Center for Medical Genetics and Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- Ping Li
- Institute of Biomedical Sciences, Shanxi University, Taiyuan, China
- Junjian Hu
- Department of Central Laboratory, SSL Central Hospital of Dongguan City, Affiliated Dongguan Shilong People's Hospital of Southern Medical University, Dongguan, China
7
Hauser BM, Luo Y, Nathan A, Al-Moujahed A, Vavvas DG, Comander J, Pierce EA, Place EM, Bujakowska KM, Gaiha GD, Rossin EJ. Structure-based network analysis predicts pathogenic variants in human proteins associated with inherited retinal disease. NPJ Genom Med 2024; 9:31. [PMID: 38802398 PMCID: PMC11130145 DOI: 10.1038/s41525-024-00416-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Accepted: 05/02/2024] [Indexed: 05/29/2024] Open
Abstract
Advances in gene sequencing technologies have accelerated the identification of genetic variants, but better tools are needed to understand which are causal of disease. This would be particularly useful in fields where gene therapy is a potential therapeutic modality for a disease-causing variant such as inherited retinal disease (IRD). Here, we apply structure-based network analysis (SBNA), which has been successfully utilized to identify variant-constrained amino acid residues in viral proteins, to identify residues that may cause IRD if subject to missense mutation. SBNA is based entirely on structural first principles and is not fit to specific outcome data, which makes it distinct from other contemporary missense prediction tools. In 4 well-studied human disease-associated proteins (BRCA1, HRAS, PTEN, and ERK2) with high-quality structural data, we find that SBNA scores correlate strongly with deep mutagenesis data. When applied to 47 IRD genes with available high-quality crystal structure data, SBNA scores reliably identified disease-causing variants according to phenotype definitions from the ClinVar database. Finally, we applied this approach to 63 patients at Massachusetts Eye and Ear (MEE) with IRD but for whom no genetic cause had been identified. Untrained models built using SBNA scores and BLOSUM62 scores for IRD-associated genes successfully predicted the pathogenicity of novel variants (AUC = 0.851), allowing us to identify likely causative disease variants in 40 IRD patients. Model performance was further augmented by incorporating orthogonal data from EVE scores (AUC = 0.927), which are based on evolutionary multiple sequence alignments. In conclusion, SBNA can be used to successfully identify variants as causal of disease in human proteins and may help predict variants causative of IRD in an unbiased fashion.
Affiliation(s)
- Yuyang Luo
- Harvard Medical School, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA
- Anusha Nathan
- Ragon Institute of Mass General, MIT, and Harvard, Cambridge, MA, USA
- Ahmad Al-Moujahed
- Harvard Medical School, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA
- Demetrios G Vavvas
- Harvard Medical School, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA
- Jason Comander
- Harvard Medical School, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA
- Eric A Pierce
- Harvard Medical School, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA
- Emily M Place
- Harvard Medical School, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA
- Kinga M Bujakowska
- Harvard Medical School, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA
- Gaurav D Gaiha
- Ragon Institute of Mass General, MIT, and Harvard, Cambridge, MA, USA
- Division of Gastroenterology, Massachusetts General Hospital, Boston, MA, USA
- Elizabeth J Rossin
- Harvard Medical School, Department of Ophthalmology, Massachusetts Eye and Ear, Boston, MA, USA.
8
Linga BG, Mohammed SGAA, Farrell T, Rifai HA, Al-Dewik N, Qoronfleh MW. Genomic Newborn Screening for Pediatric Cancer Predisposition Syndromes: A Holistic Approach. Cancers (Basel) 2024; 16:2017. [PMID: 38893137 PMCID: PMC11171256 DOI: 10.3390/cancers16112017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Revised: 05/23/2024] [Accepted: 05/24/2024] [Indexed: 06/21/2024] Open
Abstract
As next-generation sequencing (NGS) has become more widely used, germline and rare genetic variations responsible for inherited illnesses, including cancer predisposition syndromes (CPSs) that account for up to 10% of childhood malignancies, have been found. The CPSs are a group of germline genetic disorders that have been identified as risk factors for pediatric cancer development. Excluding a few "classic" CPSs, there is no agreement regarding when and how to conduct germline genetic diagnostic studies in children with cancer due to the constant evolution of knowledge in NGS technologies. Various clinical screening tools have been suggested to aid in the identification of individuals who are at greater risk, using diverse strategies and with varied outcomes. We present here an overview of the primary clinical and molecular characteristics of various CPSs and summarize the existing clinical genomics data on the prevalence of CPSs in pediatric cancer patients. Additionally, we discuss several ethical issues, challenges, limitations, cost-effectiveness, and integration of genomic newborn screening for CPSs into a healthcare system. Furthermore, we assess the effectiveness of commonly utilized decision-support tools in identifying patients who may benefit from genetic counseling and/or direct genetic testing. This investigation highlights a tailored and systematic approach utilizing medical newborn screening tools such as the genome sequencing of high-risk newborns for CPSs, which could be a practical and cost-effective strategy in pediatric cancer care.
Affiliation(s)
- BalaSubramani Gattu Linga
- Department of Research, Women’s Wellness and Research Center, Hamad Medical Corporation (HMC), P.O. Box 3050, Doha 0974, Qatar
- Translational and Precision Medicine Research, Women’s Wellness and Research Center (WWRC), Hamad Medical Corporation (HMC), Doha 0974, Qatar
- Thomas Farrell
- Department of Research, Women’s Wellness and Research Center, Hamad Medical Corporation (HMC), P.O. Box 3050, Doha 0974, Qatar
- Hilal Al Rifai
- Neonatal Intensive Care Unit (NICU), Newborn Screening Unit, Department of Pediatrics and Neonatology, Women’s Wellness and Research Center (WWRC), Hamad Medical Corporation (HMC), Doha 0974, Qatar
- Nader Al-Dewik
- Department of Research, Women’s Wellness and Research Center, Hamad Medical Corporation (HMC), P.O. Box 3050, Doha 0974, Qatar
- Translational and Precision Medicine Research, Women’s Wellness and Research Center (WWRC), Hamad Medical Corporation (HMC), Doha 0974, Qatar
- Neonatal Intensive Care Unit (NICU), Newborn Screening Unit, Department of Pediatrics and Neonatology, Women’s Wellness and Research Center (WWRC), Hamad Medical Corporation (HMC), Doha 0974, Qatar
- Genomics and Precision Medicine (GPM), College of Health & Life Science (CHLS), Hamad Bin Khalifa University (HBKU), Doha 0974, Qatar
- Faculty of Health and Social Care Sciences, Kingston University and St George’s University of London, Kingston upon Thames, Surrey, London KT1 2EE, UK
- M. Walid Qoronfleh
- Healthcare Research & Policy Division, Q3 Research Institute (QRI), Ann Arbor, MI 48197, USA
9
Holm LL, Doktor TK, Flugt KK, Petersen US, Petersen R, Andresen B. All exons are not created equal-exon vulnerability determines the effect of exonic mutations on splicing. Nucleic Acids Res 2024; 52:4588-4603. [PMID: 38324470 PMCID: PMC11077056 DOI: 10.1093/nar/gkae077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 01/05/2024] [Accepted: 01/26/2024] [Indexed: 02/09/2024] Open
Abstract
It is now widely accepted that aberrant splicing of constitutive exons is often caused by mutations affecting cis-acting splicing regulatory elements (SREs), but there is a misconception that all exons have an equal dependency on SREs and thus a similar vulnerability to aberrant splicing. We demonstrate that some exons are more likely to be affected by exonic splicing mutations (ESMs) due to an inherent vulnerability, which is context dependent and influenced by the strength of exon definition. We have developed VulExMap, a tool which is based on empirical data that can designate whether a constitutive exon is vulnerable. Using VulExMap, we find that only 25% of all exons can be categorized as vulnerable, whereas two-thirds of 359 previously reported ESMs in 75 disease genes are located in vulnerable exons. Because VulExMap analysis is based on empirical data on splicing of exons in their endogenous context, it includes all features important in determining the vulnerability. We believe that VulExMap will be an important tool when assessing the effect of exonic mutations by pinpointing whether they are located in exons vulnerable to ESMs.
Affiliation(s)
- Lise L Holm
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
- Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Thomas K Doktor
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
- Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Katharina K Flugt
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
- Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Ulrika S S Petersen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
- Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Rikke Petersen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
- Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Brage S Andresen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
- Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
10
Zhang S, Xu N, Fu L, Yang X, Li Y, Yang Z, Feng Y, Ma K, Jiang X, Han J, Hu R, Zhang L, de Gennaro L, Ryabov F, Meng D, He Y, Wu D, Yang C, Paparella A, Mao Y, Bian X, Lu Y, Antonacci F, Ventura M, Shepelev VA, Miga KH, Alexandrov IA, Logsdon GA, Phillippy AM, Su B, Zhang G, Eichler EE, Lu Q, Shi Y, Sun Q, Mao Y. Comparative genomics of macaques and integrated insights into genetic variation and population history. bioRxiv: the preprint server for biology 2024:2024.04.07.588379. [PMID: 38645259 PMCID: PMC11030432 DOI: 10.1101/2024.04.07.588379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
The crab-eating macaques (Macaca fascicularis) and rhesus macaques (M. mulatta) are widely studied nonhuman primates in biomedical and evolutionary research. Despite their significance, the current understanding of the complex genomic structure in macaques and the differences between species requires substantial improvement. Here, we present a complete genome assembly of a crab-eating macaque and 20 haplotype-resolved macaque assemblies to investigate the complex regions and major genomic differences between species. Segmental duplication in macaques is ∼42% lower, while centromeres are ∼3.7 times longer than those in humans. The characterization of ∼2 Mbp fixed genetic variants and ∼240 Mbp complex loci highlights potential associations with metabolic differences between the two macaque species (e.g., CYP2C76 and EHBP1L1). Additionally, hundreds of alternative splicing differences show post-transcriptional regulation divergence between these two species (e.g., PNPO). We also characterize 91 large-scale genomic differences between macaques and humans at a single-base-pair resolution and highlight their impact on gene regulation in primate evolution (e.g., FOLH1 and PIEZO2). Finally, population genetics recapitulates macaque speciation and selective sweeps, highlighting potential genetic basis of reproduction and tail phenotype differences (e.g., STAB1, SEMA3F, and HOXD13). In summary, the integrated analysis of genetic variation and population genetics in macaques greatly enhances our comprehension of lineage-specific phenotypes, adaptation, and primate evolution, thereby improving their biomedical applications in human diseases.
|
11
|
Mróz J, Pelc M, Mitusińska K, Chorostowska-Wynimko J, Jezela-Stanek A. Computational Tools to Assist in Analyzing Effects of the SERPINA1 Gene Variation on Alpha-1 Antitrypsin (AAT). Genes (Basel) 2024; 15:340. [PMID: 38540399 PMCID: PMC10970068 DOI: 10.3390/genes15030340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2024] [Revised: 02/28/2024] [Accepted: 03/04/2024] [Indexed: 06/14/2024] Open
Abstract
In the rapidly advancing field of bioinformatics, the development and application of computational tools to predict the effects of single nucleotide variants (SNVs) are shedding light on the molecular mechanisms underlying disorders. Also, they hold promise for guiding therapeutic interventions and personalized medicine strategies in the future. A comprehensive understanding of the impact of SNVs in the SERPINA1 gene on alpha-1 antitrypsin (AAT) protein structure and function requires integrating bioinformatic approaches. Here, we provide a guide for clinicians to navigate through the field of computational analyses which can be applied to describe a novel genetic variant. Predicting the clinical significance of SERPINA1 variation allows clinicians to tailor treatment options for individuals with alpha-1 antitrypsin deficiency (AATD) and related conditions, ultimately improving the patient's outcome and quality of life. This paper explores the various bioinformatic methodologies and cutting-edge approaches dedicated to the assessment of molecular variants of genes and their product proteins using SERPINA1 and AAT as an example.
Affiliation(s)
- Jakub Mróz
  - Tunneling Group, Biotechnology Center, Silesian University of Technology, Krzywoustego St. 8, 44-100 Gliwice, Poland
- Magdalena Pelc
  - Department of Genetics and Clinical Immunology, National Institute of Tuberculosis and Lung Diseases, 26 Plocka St., 01-138 Warsaw, Poland
- Karolina Mitusińska
  - Tunneling Group, Biotechnology Center, Silesian University of Technology, Krzywoustego St. 8, 44-100 Gliwice, Poland
- Joanna Chorostowska-Wynimko
  - Department of Genetics and Clinical Immunology, National Institute of Tuberculosis and Lung Diseases, 26 Plocka St., 01-138 Warsaw, Poland
- Aleksandra Jezela-Stanek
  - Department of Genetics and Clinical Immunology, National Institute of Tuberculosis and Lung Diseases, 26 Plocka St., 01-138 Warsaw, Poland
|
12
|
Yuan X, Su J, Wang J, Dai B, Sun Y, Zhang K, Li Y, Chuan J, Tang C, Yu Y, Gong Q. Refined preferences of prioritizers improve intelligent diagnosis for Mendelian diseases. Sci Rep 2024; 14:2845. [PMID: 38310124 PMCID: PMC10838329 DOI: 10.1038/s41598-024-53461-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 01/31/2024] [Indexed: 02/05/2024] Open
Abstract
Phenotype-guided gene prioritizers have proved a highly efficient approach to identifying causal genes for Mendelian diseases. In our previous study, we preliminarily evaluated the performance of ten prioritizers. However, all the selected software was run with default settings in singleton mode. With a large-scale family dataset from the Deciphering Developmental Disorders (DDD) project (N = 305) and an in-house trio cohort (N = 152), the four optimal performers in our prior study, Exomiser, PhenIX, AMELIE, and LIRICAL, were further assessed through parameter optimization and/or the utilization of trio mode. The in-depth assessment revealed high diagnostic yields of the four prioritizers with refined preferences, each alone or together: (1) 83.3-91.8% of the causal genes were presented among the first ten candidates in the final ranking lists of the four tools; (2) over 97.7% of the causal genes were successfully captured within the top 50 by any of the four tools. Exomiser did best in directly hitting the target (ranking the causal gene at the very top), while LIRICAL displayed a predominant overall detection capability. In addition, cases affected by low-penetrance and high-frequency pathogenic variants were found to be misjudged during the automated prioritization process. The discovery of these limitations sheds light on specific directions for future enhancement of causal-gene ranking tools.
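The top-10 and top-50 figures quoted in this abstract are top-k hit rates: the share of cases whose causal gene appears within the first k ranked candidates. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `top_k_hit_rate`, the case identifiers, and the gene labels are all hypothetical and chosen only for illustration.

```python
from typing import Mapping, Sequence


def top_k_hit_rate(
    rankings: Mapping[str, Sequence[str]],
    causal: Mapping[str, str],
    k: int,
) -> float:
    """Fraction of cases whose causal gene is ranked within the first k candidates."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not causal:
        raise ValueError("at least one case with a known causal gene is required")
    # A case with no ranking at all counts as a miss.
    hits = sum(
        1 for case, gene in causal.items() if gene in list(rankings.get(case, []))[:k]
    )
    return hits / len(causal)


# Hypothetical toy data: ranked candidate genes per case (best candidate first).
rankings = {
    "case1": ["GENE_A", "GENE_B", "GENE_C"],
    "case2": ["GENE_D", "GENE_E", "GENE_F"],
    "case3": ["GENE_G", "GENE_H", "GENE_I"],
    "case4": ["GENE_J", "GENE_K", "GENE_L"],
}
# Hypothetical known causal gene per case; case4's gene was never ranked.
causal = {"case1": "GENE_A", "case2": "GENE_E", "case3": "GENE_I", "case4": "GENE_Z"}

for k in (1, 2, 3):
    print(f"top-{k} hit rate: {top_k_hit_rate(rankings, causal, k):.2f}")
```

With real prioritizer output, the same function could be evaluated at k = 1, 10, and 50 to compare tools on "direct hit" versus overall detection.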
Affiliation(s)
- Xiao Yuan
  - Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
- Jieqiong Su
  - Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
- Jing Wang
  - Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
- Bing Dai
  - Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
- Yanfang Sun
  - Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
- Keke Zhang
  - Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
- Yinghua Li
  - Guangzhou Kingmed Center for Clinical Laboratory, Guangzhou, Guangdong, China
- Jun Chuan
  - Genetalks Biotech. Co., Ltd., Changsha, Hunan, China
- Chunyan Tang
  - Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
- Yan Yu
  - Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
- Qiang Gong
  - Changsha Kingmed Center for Clinical Laboratory, Lutian Road 28, Changsha, 410000, Hunan, China
|
13
|
Zech M, Winkelmann J. Next-generation sequencing and bioinformatics in rare movement disorders. Nat Rev Neurol 2024; 20:114-126. [PMID: 38172289 DOI: 10.1038/s41582-023-00909-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/29/2023] [Indexed: 01/05/2024]
Abstract
The ability to sequence entire exomes and genomes has revolutionized molecular testing in rare movement disorders, and genomic sequencing is becoming an integral part of routine diagnostic workflows for these heterogeneous conditions. However, interpretation of the extensive genomic variant information that is being generated presents substantial challenges. In this Perspective, we outline multidimensional strategies for genetic diagnosis in patients with rare movement disorders. We examine bioinformatics tools and computational metrics that have been developed to facilitate accurate prioritization of disease-causing variants. Additionally, we highlight community-driven data-sharing and case-matchmaking platforms, which are designed to foster the discovery of new genotype-phenotype relationships. Finally, we consider how multiomic data integration might optimize diagnostic success by combining genomic, epigenetic, transcriptomic and/or proteomic profiling to enable a more holistic evaluation of variant effects. Together, the approaches that we discuss offer pathways to the improved understanding of the genetic basis of rare movement disorders.
Affiliation(s)
- Michael Zech
  - Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
  - Institute of Neurogenomics, Helmholtz Zentrum München, Munich, Germany
  - Institute for Advanced Study, Technical University of Munich, Garching, Germany
- Juliane Winkelmann
  - Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany
  - Institute of Neurogenomics, Helmholtz Zentrum München, Munich, Germany
  - Munich Cluster for Systems Neurology, SyNergy, Munich, Germany
  - DZPG, Deutsches Zentrum für Psychische Gesundheit, Munich, Germany
|
14
|
Cui H, Srinivasan S, Gao Z, Korkin D. The Extent of Edgetic Perturbations in the Human Interactome Caused by Population-Specific Mutations. Biomolecules 2023; 14:40. [PMID: 38254640 PMCID: PMC11154503 DOI: 10.3390/biom14010040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 11/30/2023] [Accepted: 12/03/2023] [Indexed: 01/24/2024] Open
Abstract
Until recently, efforts in population genetics have been focused primarily on people of European ancestry. To attenuate this bias, global population studies, such as the 1000 Genomes Project, have revealed differences in genetic variation across ethnic groups. How many of these differences can be attributed to population-specific traits? To answer this question, the mutation data must be linked with functional outcomes. A new "edgotype" concept has been proposed, which emphasizes the interaction-specific, "edgetic", perturbations caused by mutations in the interacting proteins. In this work, we performed systematic in silico edgetic profiling of ~50,000 non-synonymous SNVs (nsSNVs) from the 1000 Genomes Project by leveraging our semi-supervised learning approach SNP-IN tool on a comprehensive set of over 10,000 protein interaction complexes. We interrogated the functional roles of the variants and their impact on the human interactome and compared the results with the pathogenic variants disrupting PPIs in the same interactome. Our results demonstrated that a considerable number of nsSNVs from healthy populations could rewire the interactome. We also showed that the proteins enriched with interaction-disrupting mutations were associated with diverse functions and had implications in a broad spectrum of diseases. Further analysis indicated that distinct gene edgetic profiles among major populations could shed light on the molecular mechanisms behind the population phenotypic variances. Finally, the network analysis revealed that the disease-associated modules surprisingly harbored a higher density of interaction-disrupting mutations from healthy populations. The variation in the cumulative network damage within these modules could potentially account for the observed disparities in disease susceptibility, which are distinctly specific to certain populations. 
Our work demonstrates the feasibility of a large-scale in silico edgetic study, and reveals insights into the orchestrated play of population-specific mutations in the human interactome.
Affiliation(s)
- Hongzhu Cui
  - Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, MA 01609, USA
  - Chromatography and Mass Spectrometry Division, Thermo Fisher Scientific, San Jose, CA 95134, USA
- Suhas Srinivasan
  - Data Science Program, Worcester Polytechnic Institute, Worcester, MA 01609, USA
  - Program in Epithelial Biology, Stanford School of Medicine, Stanford, CA 94305, USA
  - Center for Personal Dynamic Regulomes, Stanford School of Medicine, Stanford, CA 94305, USA
- Ziyang Gao
  - Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, MA 01609, USA
- Dmitry Korkin
  - Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, MA 01609, USA
  - Data Science Program, Worcester Polytechnic Institute, Worcester, MA 01609, USA
  - Computer Science Department, Worcester Polytechnic Institute, Worcester, MA 01609, USA
|
15
|
Kyriazis CC, Robinson JA, Lohmueller KE. Using Computational Simulations to Model Deleterious Variation and Genetic Load in Natural Populations. Am Nat 2023; 202:737-752. [PMID: 38033186 PMCID: PMC10897732 DOI: 10.1086/726736] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 12/02/2023]
Abstract
Deleterious genetic variation is abundant in wild populations, and understanding the ecological and conservation implications of such variation is an area of active research. Genomic methods are increasingly used to quantify the impacts of deleterious variation in natural populations; however, these approaches remain limited by an inability to accurately predict the selective and dominance effects of mutations. Computational simulations of deleterious variation offer a complementary tool that can help overcome these limitations, although such approaches have yet to be widely employed. In this perspective article, we aim to encourage ecological and conservation genomics researchers to adopt greater use of computational simulations to aid in deepening our understanding of deleterious variation in natural populations. We first provide an overview of the components of a simulation of deleterious variation, describing the key parameters involved in such models. Next, we discuss several approaches for validating simulation models. Finally, we compare and validate several recently proposed deleterious mutation models, demonstrating that models based on estimates of selection parameters from experimental systems are biased toward highly deleterious mutations. We describe a new model that is supported by multiple orthogonal lines of evidence and provide example scripts for implementing this model (https://github.com/ckyriazis/simulations_review).
|
16
|
Aweidah H, Xi Z, Sahel JA, Byrne LC. PRPF31-retinitis pigmentosa: Challenges and opportunities for clinical translation. Vision Res 2023; 213:108315. [PMID: 37714045 PMCID: PMC10872823 DOI: 10.1016/j.visres.2023.108315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 08/23/2023] [Accepted: 08/24/2023] [Indexed: 09/17/2023]
Abstract
Mutations in pre-mRNA processing factor 31 cause autosomal dominant retinitis pigmentosa (PRPF31-RP), for which there is currently no efficient treatment, making this disease a prime target for the development of novel therapeutic strategies. PRPF31-RP exhibits incomplete penetrance due to haploinsufficiency, in which reduced levels of gene expression from the mutated allele result in disease. A variety of model systems have been used in the investigation of disease etiology and therapy development. In this review, we discuss recent advances in both in vivo and in vitro model systems, evaluating their advantages and limitations in the context of therapy development for PRPF31-RP. Additionally, we describe the latest approaches for treatment, including AAV-mediated gene augmentation, genome editing, and late-stage therapies such as optogenetics, cell transplantation, and retinal prostheses.
Affiliation(s)
- Hamzah Aweidah
  - Department of Ophthalmology, University of Pittsburgh, Pittsburgh, PA, USA
- Zhouhuan Xi
  - Department of Ophthalmology, University of Pittsburgh, Pittsburgh, PA, USA; Department of Ophthalmology, Eye Center, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- José-Alain Sahel
  - Department of Ophthalmology, University of Pittsburgh, Pittsburgh, PA, USA; Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA
- Leah C Byrne
  - Department of Ophthalmology, University of Pittsburgh, Pittsburgh, PA, USA; Department of Neurobiology, University of Pittsburgh, Pittsburgh, PA, USA; Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA, USA
|
17
|
Staklinski SJ, Scheben A, Siepel A, Kilberg MS. Utility of AlphaMissense predictions in Asparagine Synthetase deficiency variant classification. bioRxiv 2023:2023.10.30.564808. [PMID: 37961642 PMCID: PMC10634951 DOI: 10.1101/2023.10.30.564808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
AlphaMissense is a recently developed method that is designed to classify missense variants into pathogenic, benign, or ambiguous categories across the entire human proteome. Asparagine Synthetase Deficiency (ASNSD) is a developmental disorder associated with severe symptoms, including congenital microcephaly, seizures, and premature death. Diagnosing ASNSD relies on identifying mutations in the asparagine synthetase (ASNS) gene through DNA sequencing and determining whether these variants are pathogenic or benign. Pathogenic ASNS variants are predicted to disrupt the protein's structure and/or function, leading to asparagine depletion within cells and inhibition of cell growth. AlphaMissense offers a promising solution for the rapid classification of ASNS variants established by DNA sequencing and provides a community resource of pathogenicity scores and classifications for newly diagnosed ASNSD patients. Here, we assessed AlphaMissense's utility in ASNSD by benchmarking it against known critical residues in ASNS and evaluating its performance against a list of previously reported ASNSD-associated variants. We also present a pipeline to calculate AlphaMissense scores for any protein in the UniProt database. AlphaMissense accurately attributed a high average pathogenicity score to known critical residues within the two ASNS active sites and the connecting intramolecular tunnel. The program successfully categorized 78.9% of known ASNSD-associated missense variants as pathogenic. The remaining variants were primarily labeled as ambiguous, with a smaller proportion classified as benign. This study underscores the potential role of AlphaMissense in classifying ASNS variants in suspected cases of ASNSD, potentially providing clarity to patients and their families grappling with ongoing diagnostic uncertainty.
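The three-way labelling described in this abstract can be sketched as a simple thresholding step over per-variant scores. This is an illustrative sketch only, not the authors' pipeline: the cutoffs 0.34 and 0.564 are assumed here to be the default class boundaries published with AlphaMissense, and the function names and toy scores are hypothetical.

```python
from typing import Iterable

# Assumed default AlphaMissense class boundaries (scores lie in [0, 1]).
LIKELY_BENIGN_BELOW = 0.34
LIKELY_PATHOGENIC_ABOVE = 0.564


def classify(score: float) -> str:
    """Map a pathogenicity score to one of three classes."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    if score < LIKELY_BENIGN_BELOW:
        return "likely_benign"
    if score > LIKELY_PATHOGENIC_ABOVE:
        return "likely_pathogenic"
    return "ambiguous"


def fraction_pathogenic(scores: Iterable[float]) -> float:
    """Share of variants classified as likely pathogenic."""
    labels = [classify(s) for s in scores]
    if not labels:
        raise ValueError("no scores supplied")
    return labels.count("likely_pathogenic") / len(labels)


# Hypothetical scores for four disease-associated variants.
toy_scores = [0.9, 0.7, 0.6, 0.2]
print([classify(s) for s in toy_scores])
print(fraction_pathogenic(toy_scores))
```

Applied to a list of previously reported disease-associated variants, `fraction_pathogenic` gives the kind of summary percentage the abstract reports.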
Affiliation(s)
- Stephen J. Staklinski
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724
  - School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724
- Armin Scheben
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724
- Adam Siepel
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724
- Michael S. Kilberg
  - Department of Biochemistry and Molecular Biology, University of Florida College of Medicine, Box 100245, Gainesville, FL 326010-0245
|
18
|
Boulogne F, Claus LR, Wiersma H, Oelen R, Schukking F, de Klein N, Li S, Westra HJ, van der Zwaag B, van Reekum F, Sierks D, Schönauer R, Li Z, Bijlsma EK, Bos WJW, Halbritter J, Knoers NVAM, Besse W, Deelen P, Franke L, van Eerde AM. KidneyNetwork: using kidney-derived gene expression data to predict and prioritize novel genes involved in kidney disease. Eur J Hum Genet 2023; 31:1300-1308. [PMID: 36807342 PMCID: PMC10620423 DOI: 10.1038/s41431-023-01296-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/18/2022] [Revised: 11/24/2022] [Accepted: 01/18/2023] [Indexed: 02/22/2023] Open
Abstract
Genetic testing in patients with suspected hereditary kidney disease may not reveal the genetic cause for the disorder, as potentially pathogenic variants can reside in genes that are not yet known to be involved in kidney disease. We have developed KidneyNetwork, which utilizes tissue-specific expression to inform candidate gene prioritization specifically for kidney diseases. KidneyNetwork is a novel method constructed by integrating a kidney RNA-sequencing co-expression network of 878 samples with a multi-tissue network of 31,499 samples. It uses expression patterns and established gene-phenotype associations to predict which genes could be related to which (disease) phenotypes in an unbiased manner. We applied KidneyNetwork to rare variants in exome sequencing data from 13 kidney disease patients without a genetic diagnosis to prioritize candidate genes. KidneyNetwork can accurately predict kidney-specific gene functions and (kidney disease) phenotypes for disease-associated genes. The intersection of prioritized genes with genes carrying rare variants in a patient with kidney and liver cysts identified ALG6 as a plausible candidate gene. We strengthen this plausibility by identifying ALG6 variants in several cystic kidney and liver disease cases without alternative genetic explanation. We present KidneyNetwork, a publicly available kidney-specific co-expression network with optimized gene-phenotype predictions for kidney disease phenotypes. We designed an easy-to-use online interface that allows clinicians and researchers to use gene expression and co-regulation data and gene-phenotype connections to accelerate advances in hereditary kidney disease diagnosis and research.
TRANSLATIONAL STATEMENT: Genetic testing in patients with suspected hereditary kidney disease may not reveal the genetic cause for the patient's disorder. Potentially pathogenic variants can reside in genes not yet known to be involved in kidney disease, making it difficult to interpret the relevance of these variants. This reveals a clear need for methods to predict the phenotypic consequences of genetic variation in an unbiased manner. Here we describe KidneyNetwork, a tool that utilizes tissue-specific expression to predict kidney-specific gene functions. Applying KidneyNetwork to a group of undiagnosed cases identified ALG6 as a candidate gene in cystic kidney and liver disease. In summary, KidneyNetwork can aid the interpretation of genetic variants and can therefore be of value in translational nephrogenetics and help improve the diagnostic yield in kidney disease patients.
Affiliation(s)
- Floranne Boulogne
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
- Laura R Claus
  - Department of Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
- Henry Wiersma
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Roy Oelen
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
- Floor Schukking
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Niek de Klein
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Shuang Li
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Genomics Coordination Center, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Harm-Jan Westra
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
- Bert van der Zwaag
  - Department of Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
- Franka van Reekum
  - Department of Nephrology, University Medical Center Utrecht, Utrecht, The Netherlands
- Dana Sierks
  - Medical Department III - Endocrinology, Nephrology, Rheumatology Department of Internal Medicine, Division of Nephrology, University of Leipzig Medical Center, Leipzig, Germany
- Ria Schönauer
  - Medical Department III - Endocrinology, Nephrology, Rheumatology Department of Internal Medicine, Division of Nephrology, University of Leipzig Medical Center, Leipzig, Germany
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Zhigui Li
  - Department of Internal Medicine (Nephrology), Yale School of Medicine, New Haven, CT, USA
- Emilia K Bijlsma
  - Department of Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Willem Jan W Bos
  - Department of Internal Medicine, St Antonius Hospital, Nieuwegein, The Netherlands
  - Department of Internal Medicine, Leiden University Medical Center, Leiden, The Netherlands
- Jan Halbritter
  - Medical Department III - Endocrinology, Nephrology, Rheumatology Department of Internal Medicine, Division of Nephrology, University of Leipzig Medical Center, Leipzig, Germany
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Nine V A M Knoers
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Whitney Besse
  - Department of Internal Medicine (Nephrology), Yale School of Medicine, New Haven, CT, USA
- Patrick Deelen
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
  - Department of Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
- Lude Franke
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
- Albertien M van Eerde
  - Department of Genetics, University Medical Center Utrecht, Utrecht, The Netherlands
|
19
|
Nigenda-Morales SF, Lin M, Nuñez-Valencia PG, Kyriazis CC, Beichman AC, Robinson JA, Ragsdale AP, Urbán R J, Archer FI, Viloria-Gómora L, Pérez-Álvarez MJ, Poulin E, Lohmueller KE, Moreno-Estrada A, Wayne RK. The genomic footprint of whaling and isolation in fin whale populations. Nat Commun 2023; 14:5465. [PMID: 37699896 PMCID: PMC10497599 DOI: 10.1038/s41467-023-40052-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/08/2022] [Accepted: 07/10/2023] [Indexed: 09/14/2023] Open
Abstract
Twentieth century industrial whaling pushed several species to the brink of extinction, with fin whales being the most impacted. However, a small, resident population in the Gulf of California was not targeted by whaling. Here, we analyzed 50 whole-genomes from the Eastern North Pacific (ENP) and Gulf of California (GOC) fin whale populations to investigate their demographic history and the genomic effects of natural and human-induced bottlenecks. We show that the two populations diverged ~16,000 years ago, after which the ENP population expanded and then suffered a 99% reduction in effective size during the whaling period. In contrast, the GOC population remained small and isolated, receiving less than one migrant per generation. However, this low level of migration has been crucial for maintaining its viability. Our study exposes the severity of whaling, emphasizes the importance of migration, and demonstrates the use of genome-based analyses and simulations to inform conservation strategies.
Affiliation(s)
- Sergio F Nigenda-Morales
  - Advanced Genomics Unit, National Laboratory of Genomics for Biodiversity (Langebio), Center for Research and Advanced Studies (Cinvestav), Irapuato, Guanajuato, 36824, Mexico
  - Department of Biological Sciences, California State University San Marcos, San Marcos, CA, 92096, USA
- Meixi Lin
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, 94305, USA
- Paulina G Nuñez-Valencia
  - Advanced Genomics Unit, National Laboratory of Genomics for Biodiversity (Langebio), Center for Research and Advanced Studies (Cinvestav), Irapuato, Guanajuato, 36824, Mexico
  - Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México (UNAM), Cuernavaca, Morelos, México
- Christopher C Kyriazis
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Annabel C Beichman
  - Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
- Jacqueline A Robinson
  - Institute for Human Genetics, University of California, San Francisco (UCSF), San Francisco, CA, 94143, USA
- Aaron P Ragsdale
  - Advanced Genomics Unit, National Laboratory of Genomics for Biodiversity (Langebio), Center for Research and Advanced Studies (Cinvestav), Irapuato, Guanajuato, 36824, Mexico
  - Department of Integrative Biology, University of Wisconsin, Madison, WI, 53706, USA
- Jorge Urbán R
  - Departamento de Ciencias Marinas y Costeras, Universidad Autónoma de Baja California Sur (UABCS), La Paz, Baja California Sur, Mexico
- Frederick I Archer
  - Marine Mammal and Turtle Division, Southwest Fisheries Science Center, La Jolla, CA, 92037, USA
- Lorena Viloria-Gómora
  - Departamento de Ciencias Marinas y Costeras, Universidad Autónoma de Baja California Sur (UABCS), La Paz, Baja California Sur, Mexico
- María José Pérez-Álvarez
  - Escuela de Medicina Veterinaria, Facultad de Medicina y Ciencias de la Salud, Universidad Mayor, Santiago, Chile
  - Millennium Institute Biodiversity of Antarctic and Subantarctic Ecosystems (BASE), Universidad de Chile, Santiago, Chile
- Elie Poulin
  - Millennium Institute Biodiversity of Antarctic and Subantarctic Ecosystems (BASE), Universidad de Chile, Santiago, Chile
- Kirk E Lohmueller
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, CA, 90095, USA
  - Interdepartmental Program in Bioinformatics, University of California, Los Angeles, CA, 90095, USA
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Andrés Moreno-Estrada
  - Advanced Genomics Unit, National Laboratory of Genomics for Biodiversity (Langebio), Center for Research and Advanced Studies (Cinvestav), Irapuato, Guanajuato, 36824, Mexico
- Robert K Wayne
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, CA, 90095, USA
|
20
|
Miller SE, Sheehan MJ. Sex differences in deleterious genetic variants in a haplodiploid social insect. Mol Ecol 2023; 32:4546-4556. [PMID: 37350360 PMCID: PMC10528523 DOI: 10.1111/mec.17057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Revised: 06/01/2023] [Accepted: 06/12/2023] [Indexed: 06/24/2023]
Abstract
Deleterious variants are selected against but can linger in populations at low frequencies for long periods of time, decreasing fitness and contributing to disease burden in humans and other species. Deleterious variants occur at low frequency but distinguishing deleterious variants from low-frequency neutral variation is challenging based on population genomics data alone. As a result, we have little sense of the number and identity of deleterious variants in wild populations. For haplodiploid species, it has been hypothesised that deleterious alleles will be directly exposed to selection in haploid males, but selection can be masked in diploid females when deleterious variants are recessive, resulting in more efficient purging of deleterious mutations in males. Therefore, comparisons of the differences between haploid and diploid genomes from the same population may be a useful method for inferring rare deleterious variants. This study provides the first formal test of this hypothesis. Using wild populations of Northern paper wasps (Polistes fuscatus), we find that males have fewer missense and nonsense variants per generation than females from the same population. Allele frequency differences are especially pronounced for rare missense and nonsense variants and these differences lead to a lower mutational load in males than females. Based on these data we infer that many highly deleterious mutations are segregating in the paper wasp population. Stronger selection against deleterious alleles in haploid males may have implications for adaptation in other haplodiploid insects and provides evidence that wild populations harbour abundant deleterious variants.
Affiliation(s)
- Sara E. Miller
  - Laboratory for Animal Social Evolution and Recognition, Department of Neurobiology and Behavior, Cornell University, Ithaca, NY, USA
  - Department of Biology, University of Missouri St. Louis, St. Louis, MO, USA
- Michael J. Sheehan
  - Laboratory for Animal Social Evolution and Recognition, Department of Neurobiology and Behavior, Cornell University, Ithaca, NY, USA
21
Hauser BM, Luo Y, Nathan A, Gaiha GD, Vavvas D, Comander J, Pierce EA, Place EM, Bujakowska KM, Rossin EJ. Structure-based network analysis predicts mutations associated with inherited retinal disease. medRxiv 2023:2023.07.05.23292247. [PMID: 37461650 PMCID: PMC10350150 DOI: 10.1101/2023.07.05.23292247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2023]
Abstract
With continued advances in gene sequencing technologies comes the need to develop better tools to understand which mutations cause disease. Here we validate structure-based network analysis (SBNA)1,2 in well-studied human proteins and report results of using SBNA to identify critical amino acids that may cause retinal disease if subject to missense mutation. We computed SBNA scores for genes with high-quality structural data, starting with validating the method using 4 well-studied human disease-associated proteins. We then analyzed 47 inherited retinal disease (IRD) genes. We compared SBNA scores to phenotype data from the ClinVar database and found a significant difference between benign and pathogenic mutations with respect to network score. Finally, we applied this approach to 65 patients at Massachusetts Eye and Ear (MEE) who were diagnosed with IRD but for whom no genetic cause was found. Multivariable logistic regression models built using SBNA scores for IRD-associated genes successfully predicted pathogenicity of novel mutations, allowing us to identify likely causative disease variants in 37 patients with IRD from our clinic. In conclusion, SBNA can be meaningfully applied to human proteins and may help predict mutations causative of IRD.
Affiliation(s)
- Yuyang Luo
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Anusha Nathan
  - Ragon Institute of Mass General, MIT, and Harvard, Cambridge, MA
- Gaurav D. Gaiha
  - Ragon Institute of Mass General, MIT, and Harvard, Cambridge, MA
- Demetrios Vavvas
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Jason Comander
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Eric A. Pierce
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Emily M. Place
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Kinga M. Bujakowska
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
- Elizabeth J. Rossin
  - Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA
22
Kumaran M, Devarajan B. eyeVarP: A computational framework for the identification of pathogenic variants specific to eye disease. Genet Med 2023; 25:100862. [PMID: 37092535 DOI: 10.1016/j.gim.2023.100862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2022] [Revised: 04/11/2023] [Accepted: 04/13/2023] [Indexed: 04/25/2023] Open
Abstract
PURPOSE: Disease-specific pathogenic variant prediction tools that differentiate pathogenic variants from benign ones have recently been improved through disease specificity. However, they have not been evaluated on disease-specific pathogenic variants compared with other diseases, which would help to prioritize disease-specific variants from several genes or novel genes. Thus, we hypothesize that features of pathogenic variants alone would provide a better model. METHODS: We developed an eye disease-specific variant prioritization tool (eyeVarP), which applied the random forest algorithm to the data set of pathogenic variants of eye diseases and other diseases. We also developed the VarP tool and a generalized pipeline to filter missense and insertion-deletion variants and predict their pathogenicity from exome or genome sequencing data, thereby providing a complete computational procedure. RESULTS: eyeVarP outperformed pan disease-specific tools in identifying eye disease-specific pathogenic variants under the top 10. VarP outperformed 12 pathogenicity prediction tools with an accuracy of 95% in correctly identifying the pathogenicity of missense and insertion-deletion variants. The complete pipeline would help to develop disease-specific tools for other genetic disorders. CONCLUSION: eyeVarP performs better in identifying eye disease-specific pathogenic variants using pathogenic variant features and gene features. Implementing such a complete computational procedure would significantly improve the clinical variant interpretation for specific diseases.
Affiliation(s)
- Manojkumar Kumaran
  - Department of Bioinformatics, Aravind Medical Research Foundation, Madurai, Tamil Nadu, India; School of Chemical and Biotechnology, SASTRA (Deemed to be a university), Thanjavur, Tamil Nadu, India
- Bharanidharan Devarajan
  - Department of Bioinformatics, Aravind Medical Research Foundation, Madurai, Tamil Nadu, India
23
Liu Y, Mao L, Huang H, Li W, Man J, Zhang W, Wang L, Li L, Sun Y, Zhai T, Guo X, Du L, Huang J, Li H, Wan Y, Wei X. Clinical diagnosis of genetic disorders at both single-nucleotide and chromosomal levels based on BGISEQ-500 platform. Hum Genome Var 2023; 10:15. [PMID: 37217505 PMCID: PMC10203365 DOI: 10.1038/s41439-023-00238-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Revised: 02/05/2023] [Accepted: 02/19/2023] [Indexed: 05/24/2023] Open
Abstract
Most variations in the human genome are single-nucleotide variations (SNVs), small fragment insertions and deletions, and genomic copy number variations (CNVs). Many human diseases, including genetic disorders, are associated with variations in the genome. These disorders are often difficult to diagnose because of their complex clinical conditions; therefore, an effective detection method is needed to facilitate clinical diagnosis and prevent birth defects. With the development of high-throughput sequencing technology, targeted sequence capture chips have been extensively used owing to their high throughput, high accuracy, fast speed, and low cost. In this study, we designed a chip that potentially captures the coding regions of 3043 genes associated with 4013 monogenic diseases, along with 148 chromosomal abnormalities that can be identified by targeting specific regions. To assess its efficiency, the designed chip was combined with the BGISEQ-500 sequencing platform to screen variants in 63 patients. Eventually, 67 disease-associated variants were found, 31 of which were novel. The results of the evaluation test also show that this combined strategy complies with the requirements of clinical testing and has proper clinical application value.
Affiliation(s)
- Yanqiu Liu
  - Department of Genetics, Jiangxi Maternal and Child Health Hospital, 330006, Nanchang, China
- Liangwei Mao
  - BGI-Anhui Clinical Laboratory, BGI-Shenzhen, 236000, Fuyang, China
  - The State Key Laboratory of Biocatalysis and Enzyme Engineering, College of Life Sciences, Hubei University, 430062, Wuhan, China
- Hui Huang
  - BGI Genomics, BGI-Shenzhen, 518083, Shenzhen, China
- Wei Li
  - BGI Genomics, BGI-Shenzhen, 518083, Shenzhen, China
- Jianfen Man
  - BGI-Wuhan Clinical Laboratory, BGI-Shenzhen, 430074, Wuhan, China
- Wenqian Zhang
  - BGI Genomics, BGI-Shenzhen, 518083, Shenzhen, China
  - BGI-Wuhan Clinical Laboratory, BGI-Shenzhen, 430074, Wuhan, China
  - Department of Biology, University of Copenhagen, Copenhagen, DK-2200, Denmark
- Lina Wang
  - BGI-Wuhan Clinical Laboratory, BGI-Shenzhen, 430074, Wuhan, China
- Long Li
  - BGI-Wuhan Clinical Laboratory, BGI-Shenzhen, 430074, Wuhan, China
- Yan Sun
  - BGI Genomics, BGI-Shenzhen, 518083, Shenzhen, China
- Teng Zhai
  - BGI Genomics, BGI-Shenzhen, 518083, Shenzhen, China
- Xueqin Guo
  - BGI-Wuhan Clinical Laboratory, BGI-Shenzhen, 430074, Wuhan, China
- Lique Du
  - BGI-Wuhan Clinical Laboratory, BGI-Shenzhen, 430074, Wuhan, China
- Jin Huang
  - BGI-Wuhan Clinical Laboratory, BGI-Shenzhen, 430074, Wuhan, China
- Hao Li
  - BGI-Anhui Clinical Laboratory, BGI-Shenzhen, 236000, Fuyang, China
- Yang Wan
  - Department of Obstetrics and Gynecology, Fuyang People's Hospital, 236000, Fuyang, China
- Xiaoming Wei
  - BGI-Wuhan Clinical Laboratory, BGI-Shenzhen, 430074, Wuhan, China
24
Sullivan PF, Meadows JRS, Gazal S, Phan BN, Li X, Genereux DP, Dong MX, Bianchi M, Andrews G, Sakthikumar S, Nordin J, Roy A, Christmas MJ, Marinescu VD, Wang C, Wallerman O, Xue J, Yao S, Sun Q, Szatkiewicz J, Wen J, Huckins LM, Lawler A, Keough KC, Zheng Z, Zeng J, Wray NR, Li Y, Johnson J, Chen J, Paten B, Reilly SK, Hughes GM, Weng Z, Pollard KS, Pfenning AR, Forsberg-Nilsson K, Karlsson EK, Lindblad-Toh K, Andrews G, Armstrong JC, Bianchi M, Birren BW, Bredemeyer KR, Breit AM, Christmas MJ, Clawson H, Damas J, Di Palma F, Diekhans M, Dong MX, Eizirik E, Fan K, Fanter C, Foley NM, Forsberg-Nilsson K, Garcia CJ, Gatesy J, Gazal S, Genereux DP, Goodman L, Grimshaw J, Halsey MK, Harris AJ, Hickey G, Hiller M, Hindle AG, Hubley RM, Hughes GM, Johnson J, Juan D, Kaplow IM, Karlsson EK, Keough KC, Kirilenko B, Koepfli KP, Korstian JM, Kowalczyk A, Kozyrev SV, Lawler AJ, Lawless C, Lehmann T, Levesque DL, Lewin HA, Li X, Lind A, Lindblad-Toh K, Mackay-Smith A, Marinescu VD, Marques-Bonet T, Mason VC, Meadows JRS, Meyer WK, Moore JE, Moreira LR, Moreno-Santillan DD, Morrill KM, Muntané G, Murphy WJ, Navarro A, Nweeia M, Ortmann S, Osmanski A, Paten B, Paulat NS, Pfenning AR, Phan BN, Pollard KS, Pratt HE, Ray DA, Reilly SK, Rosen JR, Ruf I, Ryan L, Ryder OA, Sabeti PC, Schäffer DE, Serres A, Shapiro B, Smit AFA, Springer M, Srinivasan C, Steiner C, Storer JM, Sullivan KAM, Sullivan PF, Sundström E, Supple MA, Swofford R, Talbot JE, Teeling E, Turner-Maier J, Valenzuela A, Wagner F, Wallerman O, Wang C, Wang J, Weng Z, Wilder AP, Wirthlin ME, Xue JR, Zhang X. Leveraging base-pair mammalian constraint to understand genetic variation and human disease. Science 2023; 380:eabn2937. 
[PMID: 37104612 PMCID: PMC10259825 DOI: 10.1126/science.abn2937] [Citation(s) in RCA: 35] [Impact Index Per Article: 35.0] [Received: 11/23/2021] [Accepted: 02/09/2023] [Indexed: 04/29/2023]
Abstract
Thousands of genomic regions have been associated with heritable human diseases, but attempts to elucidate biological mechanisms are impeded by an inability to discern which genomic positions are functionally important. Evolutionary constraint is a powerful predictor of function, agnostic to cell type or disease mechanism. Single-base phyloP scores from 240 mammals identified 3.3% of the human genome as significantly constrained and likely functional. We compared phyloP scores to genome annotation, association studies, copy-number variation, clinical genetics findings, and cancer data. Constrained positions are enriched for variants that explain common disease heritability more than other functional annotations. Our results improve variant annotation but also highlight that the regulatory landscape of the human genome still needs to be further explored and linked to disease.
Affiliation(s)
- Patrick F Sullivan
  - Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institute, 17177 Stockholm, Sweden
- Jennifer R S Meadows
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
- Steven Gazal
  - Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
  - Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA
- BaDoi N Phan
  - Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Xue Li
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Diane P Genereux
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Michael X Dong
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
- Matteo Bianchi
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
- Gregory Andrews
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Sharadha Sakthikumar
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
  - Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Jessika Nordin
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
- Ananya Roy
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, 75185 Uppsala, Sweden
- Matthew J Christmas
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
- Voichita D Marinescu
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
- Chao Wang
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
- Ola Wallerman
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
- James Xue
  - Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
  - Center for System Biology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Shuyang Yao
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institute, 17177 Stockholm, Sweden
- Quan Sun
  - Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Jin Szatkiewicz
  - Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Jia Wen
  - Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Laura M Huckins
  - Department of Genetic and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Alyssa Lawler
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Kathleen C Keough
  - Gladstone Institutes, San Francisco, CA 94158, USA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94158, USA
- Zhili Zheng
  - Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
- Jian Zeng
  - Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
- Naomi R Wray
  - Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia
- Yun Li
  - Department of Genetics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Jessica Johnson
  - Department of Genetic and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Jiawen Chen
  - Department of Biostatistics, University of North Carolina Medical School, Chapel Hill, NC 27599, USA
- Benedict Paten
  - UC Santa Cruz Genomics Institute, Santa Cruz, CA 95064, USA
- Steven K Reilly
  - Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Graham M Hughes
  - School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Zhiping Weng
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Katherine S Pollard
  - Gladstone Institutes, San Francisco, CA 94158, USA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94158, USA
  - Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
- Andreas R Pfenning
  - Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Karin Forsberg-Nilsson
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, 75185 Uppsala, Sweden
  - Biodiscovery Institute, University of Nottingham, Nottingham NG7 2RD, UK
- Elinor K Karlsson
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
  - Program in Molecular Medicine, UMass Chan Medical School, Worcester, MA 01605, USA
- Kerstin Lindblad-Toh
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, 75132 Uppsala, Sweden
  - Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
25
Sullivan PF, Meadows JRS, Gazal S, Phan BN, Li X, Genereux DP, Dong MX, Bianchi M, Andrews G, Sakthikumar S, Nordin J, Roy A, Christmas MJ, Marinescu VD, Wallerman O, Xue JR, Li Y, Yao S, Sun Q, Szatkiewicz J, Wen J, Huckins LM, Lawler AJ, Keough KC, Zheng Z, Zeng J, Wray NR, Johnson J, Chen J, Paten B, Reilly SK, Hughes GM, Weng Z, Pollard KS, Pfenning AR, Forsberg-Nilsson K, Karlsson EK, Lindblad-Toh K. Leveraging Base Pair Mammalian Constraint to Understand Genetic Variation and Human Disease. bioRxiv 2023:2023.03.10.531987. [PMID: 36945512 PMCID: PMC10028973 DOI: 10.1101/2023.03.10.531987] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 03/13/2023]
Abstract
Although thousands of genomic regions have been associated with heritable human diseases, attempts to elucidate biological mechanisms are impeded by a general inability to discern which genomic positions are functionally important. Evolutionary constraint is a powerful predictor of function that is agnostic to cell type or disease mechanism. Here, single base phyloP scores from the whole genome alignment of 240 placental mammals identified 3.5% of the human genome as significantly constrained, and likely functional. We compared these scores to large-scale genome annotation, genome-wide association studies (GWAS), copy number variation, clinical genetics findings, and cancer data sets. Evolutionarily constrained positions are enriched for variants explaining common disease heritability (more than any other functional annotation). Our results improve variant annotation but also highlight that the regulatory landscape of the human genome still needs to be further explored and linked to disease.
Affiliation(s)
- Patrick F. Sullivan
  - Department of Genetics, University of North Carolina Medical School; Chapel Hill, NC 27599, USA
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet; Stockholm, Sweden
- Jennifer R. S. Meadows
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
- Steven Gazal
  - Keck School of Medicine, University of Southern California; Los Angeles, CA 90033, USA
- BaDoi N. Phan
  - Department of Computational Biology, School of Computer Science, Carnegie Mellon University; Pittsburgh, PA 15213, USA
  - Medical Scientist Training Program, University of Pittsburgh School of Medicine; Pittsburgh, PA 15261, USA
  - Neuroscience Institute, Carnegie Mellon University; Pittsburgh, PA 15213, USA
- Xue Li
  - Broad Institute of MIT and Harvard; Cambridge, MA 02139, USA
  - Morningside Graduate School of Biomedical Sciences, UMass Chan Medical School; Worcester, MA 01605, USA
  - Program in Bioinformatics and Integrative Biology, UMass Chan Medical School; Worcester, MA 01605, USA
- Michael X. Dong
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
- Matteo Bianchi
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
- Gregory Andrews
  - Program in Bioinformatics and Integrative Biology, UMass Chan Medical School; Worcester, MA 01605, USA
- Sharadha Sakthikumar
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
  - Broad Institute of MIT and Harvard; Cambridge, MA 02139, USA
- Jessika Nordin
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University; Uppsala, 751 85, Sweden
- Ananya Roy
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University; Uppsala, 751 85, Sweden
- Matthew J. Christmas
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
- Voichita D. Marinescu
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
- Ola Wallerman
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
- James R. Xue
  - Broad Institute of MIT and Harvard; Cambridge, MA 02139, USA
  - Department of Organismic and Evolutionary Biology, Harvard University; Cambridge, MA 02138, USA
- Yun Li
  - Department of Genetics, University of North Carolina Medical School; Chapel Hill, NC 27599, USA
- Shuyang Yao
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet; Stockholm, Sweden
- Quan Sun
  - Department of Biostatistics, University of North Carolina at Chapel Hill; Chapel Hill, NC, USA
- Jin Szatkiewicz
  - Department of Genetics, University of North Carolina Medical School; Chapel Hill, NC 27599, USA
- Jia Wen
  - Department of Genetics, University of North Carolina Medical School; Chapel Hill, NC 27599, USA
- Laura M. Huckins
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai; New York, NY 10029, USA
- Alyssa J. Lawler
  - Neuroscience Institute, Carnegie Mellon University; Pittsburgh, PA 15213, USA
  - Broad Institute of MIT and Harvard; Cambridge, MA 02139, USA
  - Department of Biological Sciences, Mellon College of Science, Carnegie Mellon University; Pittsburgh, PA 15213, USA
- Kathleen C. Keough
  - Department of Epidemiology & Biostatistics, University of California San Francisco; San Francisco, CA 94158, USA
  - Fauna Bio Incorporated; Emeryville, CA 94608, USA
  - Gladstone Institutes; San Francisco, CA 94158, USA
- Zhili Zheng
  - Institute for Molecular Bioscience, University of Queensland; Brisbane, Queensland, Australia
- Jian Zeng
  - Institute for Molecular Bioscience, University of Queensland; Brisbane, Queensland, Australia
- Naomi R. Wray
  - Institute for Molecular Bioscience, University of Queensland; Brisbane, Queensland, Australia
  - Queensland Brain Institute, University of Queensland; Brisbane, Queensland, Australia
- Jessica Johnson
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai; New York, NY 10029, USA
- Jiawen Chen
  - Department of Biostatistics, University of North Carolina at Chapel Hill; Chapel Hill, NC, USA
- Benedict Paten
  - Genomics Institute, University of California Santa Cruz; Santa Cruz, CA 95064, USA
- Steven K. Reilly
  - Department of Genetics, Yale School of Medicine; New Haven, CT 06510, USA
- Graham M. Hughes
  - School of Biology and Environmental Science, University College Dublin; Belfield, Dublin 4, Ireland
- Zhiping Weng
  - Program in Bioinformatics and Integrative Biology, UMass Chan Medical School; Worcester, MA 01605, USA
- Katherine S. Pollard
  - Department of Epidemiology & Biostatistics, University of California San Francisco; San Francisco, CA 94158, USA
  - Gladstone Institutes; San Francisco, CA 94158, USA
  - Chan Zuckerberg Biohub; San Francisco, CA 94158, USA
- Andreas R. Pfenning
  - Department of Computational Biology, School of Computer Science, Carnegie Mellon University; Pittsburgh, PA 15213, USA
  - Neuroscience Institute, Carnegie Mellon University; Pittsburgh, PA 15213, USA
- Karin Forsberg-Nilsson
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University; Uppsala, 751 85, Sweden
  - Biodiscovery Institute, University of Nottingham; Nottingham, UK
- Elinor K. Karlsson
  - Broad Institute of MIT and Harvard; Cambridge, MA 02139, USA
  - Program in Bioinformatics and Integrative Biology, UMass Chan Medical School; Worcester, MA 01605, USA
  - Program in Molecular Medicine, UMass Chan Medical School; Worcester, MA 01605, USA
- Kerstin Lindblad-Toh
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University; Uppsala, 751 32, Sweden
  - Broad Institute of MIT and Harvard; Cambridge, MA 02139, USA
26
Boßelmann CM, Hedrich UBS, Lerche H, Pfeifer N. Predicting functional effects of ion channel variants using new phenotypic machine learning methods. PLoS Comput Biol 2023; 19:e1010959. [PMID: 36877742 PMCID: PMC10019634 DOI: 10.1371/journal.pcbi.1010959] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/06/2022] [Revised: 03/16/2023] [Accepted: 02/19/2023] [Indexed: 03/07/2023] Open
Abstract
Missense variants in genes encoding ion channels are associated with a spectrum of severe diseases. Variant effects on biophysical function correlate with clinical features and can be categorized as gain- or loss-of-function. This information enables a timely diagnosis, facilitates precision therapy, and guides prognosis. Functional characterization presents a bottleneck in translational medicine. Machine learning models may be able to rapidly generate supporting evidence by predicting variant functional effects. Here, we describe a multi-task multi-kernel learning framework capable of harmonizing functional results and structural information with clinical phenotypes. This novel approach extends the human phenotype ontology towards kernel-based supervised machine learning. Our gain- or loss-of-function classifier achieves high performance (mean accuracy 0.853, SD 0.016; mean AU-ROC 0.912, SD 0.025), outperforming both conventional baseline and state-of-the-art methods. Performance is robust across different phenotypic similarity measures and largely insensitive to phenotypic noise or sparsity. Localized multi-kernel learning offered biological insight and interpretability by highlighting channels with implicit genotype-phenotype correlations or latent task similarity for downstream analysis.
Affiliation(s)
- Christian Malte Boßelmann
  - Department of Neurology and Epileptology, Hertie Institute for Clinical Brain Research, University of Tuebingen, Tuebingen, Germany
  - Methods in Medical Informatics, Department of Computer Science, University of Tuebingen, Tuebingen, Germany
- Ulrike B. S. Hedrich
  - Department of Neurology and Epileptology, Hertie Institute for Clinical Brain Research, University of Tuebingen, Tuebingen, Germany
- Holger Lerche
  - Department of Neurology and Epileptology, Hertie Institute for Clinical Brain Research, University of Tuebingen, Tuebingen, Germany
- Nico Pfeifer
  - Methods in Medical Informatics, Department of Computer Science, University of Tuebingen, Tuebingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tuebingen, Tuebingen, Germany
27
Kyriazis CC, Beichman AC, Brzeski KE, Hoy SR, Peterson RO, Vucetich JA, Vucetich LM, Lohmueller KE, Wayne RK. Genomic Underpinnings of Population Persistence in Isle Royale Moose. Mol Biol Evol 2023; 40:msad021. [PMID: 36729989 PMCID: PMC9927576 DOI: 10.1093/molbev/msad021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Revised: 01/20/2023] [Accepted: 01/25/2023] [Indexed: 02/03/2023] Open
Abstract
Island ecosystems provide natural laboratories to assess the impacts of isolation on population persistence. However, most studies of persistence have focused on a single species, without comparisons to other organisms they interact with in the ecosystem. The case study of moose and gray wolves on Isle Royale allows for a direct contrast of genetic variation in isolated populations that have experienced dramatically differing population trajectories over the past decade. Whereas the Isle Royale wolf population recently declined nearly to extinction due to severe inbreeding depression, the moose population has thrived and continues to persist, despite having low genetic diversity and being isolated for ∼120 years. Here, we examine the patterns of genomic variation underlying the continued persistence of the Isle Royale moose population. We document high levels of inbreeding in the population, roughly as high as the wolf population at the time of its decline. However, inbreeding in the moose population manifests in the form of intermediate-length runs of homozygosity suggestive of historical inbreeding and purging, contrasting with the long runs of homozygosity observed in the smaller wolf population. Using simulations, we confirm that substantial purging has likely occurred in the moose population. However, we also document notable increases in genetic load, which could eventually threaten population viability over the long term. Overall, our results demonstrate a complex relationship between inbreeding, genetic diversity, and population viability that highlights the use of genomic datasets and computational simulation tools for understanding the factors enabling persistence in isolated populations.
Affiliation(s)
- Christopher C Kyriazis
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA
- Kristin E Brzeski
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI
- Sarah R Hoy
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI
- Rolf O Peterson
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI
- John A Vucetich
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI
- Leah M Vucetich
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI
- Kirk E Lohmueller
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA
  - Interdepartmental Program in Bioinformatics, University of California, Los Angeles, CA
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA
- Robert K Wayne
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA
28
Eng ZH, Abdullah MI, Ng KL, Abdul Aziz A, Arba’ie NH, Mat Rashid N, Mat Junit S. Whole-exome sequencing and bioinformatic analyses revealed differences in gene mutation profiles in papillary thyroid cancer patients with and without benign thyroid goitre background. Front Endocrinol (Lausanne) 2023; 13:1039494. [PMID: 36686473 PMCID: PMC9846740 DOI: 10.3389/fendo.2022.1039494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Accepted: 12/07/2022] [Indexed: 01/05/2023] Open
Abstract
Background: Papillary thyroid cancer (PTC) is the most common thyroid malignancy. Concurrent presence of cytomorphological benign thyroid goitre (BTG) and PTC lesion is often detected. Aberrant protein profiles were previously reported in patients with and without BTG cytomorphological background. This study aimed to evaluate gene mutation profiles to further understand the molecular mechanism underlying BTG, PTC without BTG background and PTC with BTG background.
Methods: Patients were grouped according to the histopathological examination results: (i) BTG patients (n = 9), (ii) PTC patients without BTG background (PTCa, n = 8), and (iii) PTC patients with BTG background (PTCb, n = 5). Whole-exome sequencing (WES) was performed on genomic DNA extracted from thyroid tissue specimens. Nonsynonymous and splice-site variants with MAF of ≤ 1% in the 1000 Genomes Project were subjected to principal component analysis (PCA). PTC-specific SNVs were filtered against OncoKB and COSMIC, while novel SNVs were screened through the dbSNP and COSMIC databases. Functional impacts of the SNVs were predicted using PolyPhen-2 and SIFT. Protein-protein interaction (PPI) enrichment of the tumour-related genes was analysed using Metascape and the MCODE algorithm.
Results: PCA plots showed distinctive SNV profiles among the three groups. OncoKB and COSMIC database screening identified 36 tumour-related genes, including BRCA2 and FANCD2, in all groups. BRAF and 19 additional genes were found only in PTCa and PTCb. "Pathways in cancer", "DNA repair" and "Fanconi anaemia pathway" were among the top networks shared by all groups. However, signalling pathways related to tyrosine kinases were the most significantly enriched in PTCa, while "Jak-STAT signalling pathway" and "Notch signalling pathway" were the only pathways significantly enriched in PTCb. Ten SNVs were PTC-specific, of which two were novel: DCTN1 c.2786C>G (p.Ala929Gly) and TRRAP c.8735G>C (p.Ser2912Thr). Four of the ten SNVs were unique to PTCa.
Conclusion: Distinctive gene mutation patterns detected in this study corroborated the previous protein profile findings. We hypothesised that the PTCa and PTCb subtypes differed in the underlying molecular mechanisms involving tyrosine kinase, Jak-STAT and Notch signalling pathways. The potential applications of the SNVs in differentiating the benign from the PTC subtypes require further validation in a larger sample size.
Affiliation(s)
- Zing Hong Eng
  - Department of Molecular Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, Malaysia
- Mardiaty Iryani Abdullah
  - Department of Molecular Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, Malaysia
  - Department of Biomedical Science, Kulliyyah of Allied Health Sciences, International Islamic University Malaysia, Kuantan, Pahang, Malaysia
- Khoon Leong Ng
  - Department of Surgery, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, Malaysia
- Azlina Abdul Aziz
  - Department of Molecular Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, Malaysia
- Nurul Hannis Arba’ie
  - Department of Surgery, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, Malaysia
- Nurullainy Mat Rashid
  - Department of Molecular Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, Malaysia
- Sarni Mat Junit
  - Department of Molecular Medicine, Faculty of Medicine, Universiti Malaya, Kuala Lumpur, Malaysia
29
Chan AP, Choi Y, Rangan A, Zhang G, Podder A, Berens M, Sharma S, Pirrotte P, Byron S, Duggan D, Schork NJ. Interrogating the Human Diplome: Computational Methods, Emerging Applications, and Challenges. Methods Mol Biol 2023; 2590:1-30. [PMID: 36335489 DOI: 10.1007/978-1-0716-2819-5_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/17/2023]
Abstract
Human DNA sequencing protocols have revolutionized human biology, biomedical science, and clinical practice, but still have very important limitations. One limitation is that most protocols do not separate or assemble (i.e., "phase") the nucleotide content of each of the maternally and paternally derived chromosomal homologs making up the 22 autosomal pairs and the chromosomal pair making up the pseudo-autosomal region of the sex chromosomes. This has led to a dearth of studies and a consequent underappreciation of many phenomena of fundamental importance to basic and clinical genomic science. We discuss a few protocols for obtaining phase information as well as their limitations, including those that could be used in tumor phasing settings. We then describe a number of biological and clinical phenomena that require phase information. These include phenomena that require precise knowledge of the nucleotide sequence in a chromosomal segment from germline or somatic cells, such as DNA binding events, and insight into unique cis vs. trans-acting functionally impactful variant combinations (for example, variants implicated in a phenotype governed by compound heterozygosity). In addition, we also comment on the need for reliable and consensus-based diploid-context computational workflows for variant identification as well as the need for laboratory-based functional verification strategies for validating cis vs. trans effects of variant combinations. We also briefly describe available resources, example studies, as well as areas of further research, and ultimately argue that the science behind the study of human diploidy, referred to as "diplomics," which will be enabled by nucleotide-level resolution of phased genomes, is a logical next step in the analysis of human genome biology.
Affiliation(s)
- Agnes P Chan
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
- Yongwook Choi
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
- Aditya Rangan
  - Courant Institute of Mathematical Sciences at New York University, New York, NY, USA
- Guangfa Zhang
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
- Avijit Podder
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
- Michael Berens
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
  - The City of Hope National Medical Center, Duarte, CA, USA
- Sunil Sharma
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
  - The City of Hope National Medical Center, Duarte, CA, USA
- Patrick Pirrotte
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
  - The City of Hope National Medical Center, Duarte, CA, USA
- Sara Byron
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
  - The City of Hope National Medical Center, Duarte, CA, USA
- Dave Duggan
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA
  - The City of Hope National Medical Center, Duarte, CA, USA
- Nicholas J Schork
  - The Translational Genomics Research Institute (TGen), part of the City of Hope National Medical Center, Phoenix, AZ, USA.
  - The City of Hope National Medical Center, Duarte, CA, USA.
30
Block T, Zezulinski D, Kaplan DE, Lu J, Zanine S, Zhan T, Doria C, Sayeed A. Circulating messenger RNA variants as a potential biomarker for surveillance of hepatocellular carcinoma. Front Oncol 2022; 12:963641. [PMID: 36582804 PMCID: PMC9793749 DOI: 10.3389/fonc.2022.963641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 11/18/2022] [Indexed: 12/15/2022] Open
Abstract
Background and rationale: Liver-derived messenger ribonucleic acid (mRNA) transcripts were reported to be elevated in the circulation of hepatocellular carcinoma (HCC) patients. We now report the detection of high-risk mRNA variants exclusively in the circulation of HCC patients. Numerous genomic alleles, such as single nucleotide polymorphisms (SNPs), nucleotide insertions and deletions (called Indels), and splicing variants in many genes, have been associated with elevated risk of cancer. Our findings potentially offer a novel non-invasive platform for HCC surveillance and early detection.
Approach: RNAseq analysis was carried out in the plasma of 14 individuals with a diagnosis of HCC, 8 with liver cirrhosis (LC) and no HCC, and 6 with no liver disease diagnosis. RNA from 6 matching tumors and 5 circulating extracellular vesicle (EV) samples from 14 of those with HCC was also analyzed. Specimens from two cholangiocarcinoma (CCA) patients were also included in our study. HCC-specific SNPs and Indels, referred to as "variants", were identified using GATK HaplotypeCaller and annotated by SnpEff to filter out high-risk variants.
Results: The variant calling on all RNA samples enabled the detection of 5.2 million SNPs, 0.91 million insertions and 0.81 million deletions. RNAseq analyses in tumors, normal liver tissue, plasma, and plasma-derived EVs led to the detection of 5480 high-risk tumor-specific mRNA variants in the circulation of HCC patients. These variants are concurrently detected in tumors and plasma samples or tumors and EVs from HCC patients, but none of these were detected in normal liver, plasma of LC patients or normal healthy individuals. Our results demonstrate selective detection of concordant high-risk HCC-specific mRNA variants in free plasma, plasma-derived EVs and tumors of HCC patients. The variants comprise splicing, frameshift, fusion and single nucleotide alterations and correspond to cancer and tumor metabolism pathways.
Detection of these high-risk variants in matching specimens from the same subjects, with an enrichment in circulating EVs, is remarkable. Validation of these HCC-selective ctmRNA variants in larger patient cohorts is likely to identify a predictive set of ctmRNA with high diagnostic performance and thus offer a novel non-invasive serology-based biomarker for HCC.
Affiliation(s)
- Timothy Block
  - Department of Translational Medicine, Baruch S. Blumberg Institute, Doylestown, PA, United States
- Daniel Zezulinski
  - Department of Translational Medicine, Baruch S. Blumberg Institute, Doylestown, PA, United States
- David E. Kaplan
  - Division of Gastroenterology and Hepatology, University of Pennsylvania Perelman School of Medicine and The Corporal Michael J. Crescenz Veterans Administration Hospital, Philadelphia, PA, United States
- Jingqiao Lu
  - Ray Biotech Life Inc., Peachtree Corners, GA, United States
- Samantha Zanine
  - Department of Mechanical Engineering, Pennsylvania State University, PA, United States
- Tingting Zhan
  - Division of Biostatistics, Department of Pharmacology and Experimental Therapeutics, Thomas Jefferson University, Philadelphia PA, United States
- Cataldo Doria
  - CHS Liver and Pancreas Centers of Excellence, Capital Health Cancer Center, One Capital Way, Pennington, NJ, United States
- Aejaz Sayeed
  - Department of Translational Medicine, Baruch S. Blumberg Institute, Doylestown, PA, United States
31
Bentz EJ, Ophir AG. Chromosome-scale genome assembly of the African giant pouched rat (Cricetomys ansorgei) and evolutionary analysis reveals evidence of olfactory specialization. Genomics 2022; 114:110521. [PMID: 36351561 DOI: 10.1016/j.ygeno.2022.110521] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/08/2022] [Revised: 10/28/2022] [Accepted: 11/04/2022] [Indexed: 11/07/2022]
Abstract
The Southern giant pouched rat, Cricetomys ansorgei, is a large rodent best known for its ability to detect landmines using its impressive sense of smell. Their powerful chemosensory abilities enable subtle discrimination of chemical social signals, and female pouched rats demonstrate a unique reproductive physiology hypothesized to be mediated by pheromonal mechanisms. Thus, C. ansorgei represents a novel mammalian model for chemosensory physiology, social behavior, and pheromonal control of reproductive physiology. We present the first chromosome-scale genomic sequence of the pouched rat encoding 22,671 protein coding genes, including 1571 olfactory receptors, and provide a glance into the evolutionary history of this species. Functional enrichment analysis reveals genetic expansions specific to the pouched rat are enriched for functions related to olfactory specialization. Overall, this assembly is of reference-quality, and will serve as a useful and informative genomic sequence on which we can confidently base future molecular research involving the pouched rat.
Affiliation(s)
- Ehren J Bentz
  - Department of Psychology, Cornell University, Ithaca, NY, USA.
32
Katsonis P, Wilhelm K, Williams A, Lichtarge O. Genome interpretation using in silico predictors of variant impact. Hum Genet 2022; 141:1549-1577. [PMID: 35488922 PMCID: PMC9055222 DOI: 10.1007/s00439-022-02457-6] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 08/02/2021] [Accepted: 04/17/2022] [Indexed: 02/06/2023]
Abstract
Estimating the effects of variants found in disease driver genes opens the door to personalized therapeutic opportunities. Clinical associations and laboratory experiments can only characterize a tiny fraction of all the available variants, leaving the majority as variants of unknown significance (VUS). In silico methods bridge this gap by providing instant estimates on a large scale, most often based on the numerous genetic differences between species. Despite concerns that these methods may lack reliability in individual subjects, their numerous practical applications over cohorts suggest they are already helpful and have a role to play in genome interpretation when used at the proper scale and context. In this review, we aim to gain insights into the training and validation of these variant effect predicting methods and illustrate representative types of experimental and clinical applications. Objective performance assessments using various datasets that are not yet published indicate the strengths and limitations of each method. These show that cautious use of in silico variant impact predictors is essential for addressing genome interpretation challenges.
Affiliation(s)
- Panagiotis Katsonis
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Kevin Wilhelm
  - Graduate School of Biomedical Sciences, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Amanda Williams
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Olivier Lichtarge
  - Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
  - Department of Biochemistry, Human Genetics and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
  - Department of Pharmacology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
  - Computational and Integrative Biomedical Research Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
33
Liu Y, Yeung WSB, Chiu PCN, Cao D. Computational approaches for predicting variant impact: An overview from resources, principles to applications. Front Genet 2022; 13:981005. [PMID: 36246661 PMCID: PMC9559863 DOI: 10.3389/fgene.2022.981005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022] Open
Abstract
One objective of human genetics is to unveil the variants that contribute to human diseases. With the rapid development and wide use of next-generation sequencing (NGS), massive genomic sequence data have been created, making personal genetic information available. Conventional experimental evidence is critical in establishing the relationship between sequence variants and phenotype, but it is obtained with low efficiency. Because comprehensive databases and resources presenting clinical and experimental evidence on genotype-phenotype relationships are lacking, and because variants found by NGS keep accumulating, many computational tools that predict the impact of variants on phenotype have been developed to bridge the gap. In this review, we present a brief introduction to and discussion of the computational approaches for variant impact prediction. Taking a novel approach, we mainly focus on methods for predicting the impact of non-synonymous variants (nsSNVs) and categorize them into six classes. Their underlying rationale and constraints, together with the concerns and remedies raised by comparative studies, are discussed. We also present how the predictive approaches are employed in different studies. Although diverse constraints exist, computational predictive approaches are indispensable in exploring genotype-phenotype relationships.
Affiliation(s)
- Ye Liu
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
- William S. B. Yeung
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
  - Department of Obstetrics and Gynaecology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Philip C. N. Chiu
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
  - Department of Obstetrics and Gynaecology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Dandan Cao
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
34
Tuteja S, Kadri S, Yap KL. A performance evaluation study: Variant annotation tools - The enigma of clinical next generation sequencing (NGS) based genetic testing. J Pathol Inform 2022; 13:100130. [PMID: 36268089 PMCID: PMC9577137 DOI: 10.1016/j.jpi.2022.100130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Revised: 07/25/2022] [Accepted: 07/25/2022] [Indexed: 12/03/2022] Open
Abstract
Dramatically expanding our ability for clinical genetic testing for inherited conditions and complex diseases such as cancer, next generation sequencing (NGS) technologies are allowing for rapid interrogation of thousands of genes and identification of millions of variants. Variant annotation, the process of assigning functional information to DNA variants based on the standardized Human Genome Variation Society (HGVS) nomenclature, is a fundamental challenge in the analysis of NGS data that has led to the development of many bioinformatic algorithms. In this study, we evaluated the performance of 3 variant annotation tools: Alamut® Batch, Ensembl Variant Effect Predictor (VEP), and ANNOVAR, benchmarked by a manually curated ground-truth set of 298 variants from the medical exome database at the Molecular Diagnostics Laboratory at Lurie Children's Hospital. Of the 3 tools, VEP produces the most accurate variant annotations (HGVS nomenclature for 297 of the 298 variants) due to usage of updated gene transcript versions within the algorithm. Alamut® Batch called 296 of the 298 variants correctly; strikingly, ANNOVAR exhibited the greatest number of discrepancies (20 of the 298 variants, 93.3% concordance with ground-truth set). Adoption of validated methods of variant annotation is critical in post-analytical phases of clinical testing.
Affiliation(s)
- Sachleen Tuteja
  - Illinois Mathematics and Science Academy, 1500 Sullivan Road, Aurora, IL 60506, USA
- Sabah Kadri
  - Department of Pathology and Laboratory Medicine, Ann and Robert H. Lurie Children's Hospital of Chicago, 225 E. Chicago Ave, Chicago, IL 60611, USA
  - Department of Pathology, Northwestern University Feinberg School of Medicine, 420 E. Superior St, Chicago, IL 60611, USA
- Kai Lee Yap
  - Department of Pathology and Laboratory Medicine, Ann and Robert H. Lurie Children's Hospital of Chicago, 225 E. Chicago Ave, Chicago, IL 60611, USA
  - Department of Pathology, Northwestern University Feinberg School of Medicine, 420 E. Superior St, Chicago, IL 60611, USA
  - Corresponding author at: Molecular Diagnostics, Department of Pathology & Laboratory Medicine, Ann & Robert H. Lurie Children's Hospital of Chicago, Northwestern Feinberg School of Medicine, 225 E. Chicago Ave, Box 82, Chicago, IL 60611, USA.
35
Boßelmann CM, Hedrich UBS, Müller P, Sonnenberg L, Parthasarathy S, Helbig I, Lerche H, Pfeifer N. Predicting the functional effects of voltage-gated potassium channel missense variants with multi-task learning. EBioMedicine 2022; 81:104115. [PMID: 35759918 PMCID: PMC9250003 DOI: 10.1016/j.ebiom.2022.104115] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 12/16/2021] [Revised: 05/30/2022] [Accepted: 05/31/2022] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND Variants in genes encoding voltage-gated potassium channels are associated with a broad spectrum of neurological diseases including epilepsy, ataxia, and intellectual disability. Knowledge of the resulting functional changes, characterized as overall ion channel gain- or loss-of-function, is essential to guide clinical management including precision medicine therapies. However, for an increasing number of variants, little to no experimental data is available. New tools are needed to evaluate variant functional effects. METHODS We catalogued a comprehensive dataset of 959 functional experiments across 19 voltage-gated potassium channels, leveraging data from 782 unique disease-associated and synthetic variants. We used these data to train a taxonomy-based multi-task learning support vector machine (MTL-SVM), and compared performance to several baseline methods. FINDINGS MTL-SVM maintains channel family structure during model training, improving overall predictive performance (mean balanced accuracy 0·718 ± 0·041, AU-ROC 0·761 ± 0·063) over baseline (mean balanced accuracy 0·620 ± 0·045, AU-ROC 0·711 ± 0·022). We can obtain meaningful predictions even for channels with few known variants (KCNC1, KCNQ5). INTERPRETATION Our model enables functional variant prediction for voltage-gated potassium channels. It may assist in tailoring current and future precision therapies for the increasing number of patients with ion channel disorders. FUNDING This work was supported by intramural funding of the Medical Faculty, University of Tuebingen (PATE F.1315137.1), the Federal Ministry for Education and Research (Treat-ION, 01GM1907A/B/G/H) and the German Research Foundation (FOR-2715, Le1030/16-2, He8155/1-2).
Affiliation(s)
- Christian Malte Boßelmann
  - Department of Neurology and Epileptology, Hertie Institute for Clinical Brain Research, University of Tuebingen, Hoppe-Seyler-Str. 3, D-72076 Tuebingen, Germany; Methods in Medical Informatics, Department of Computer Science, University of Tuebingen, Sand 14, D-72076 Tuebingen, Germany
- Ulrike B S Hedrich
  - Department of Neurology and Epileptology, Hertie Institute for Clinical Brain Research, University of Tuebingen, Hoppe-Seyler-Str. 3, D-72076 Tuebingen, Germany
- Peter Müller
  - Department of Neurology and Epileptology, Hertie Institute for Clinical Brain Research, University of Tuebingen, Hoppe-Seyler-Str. 3, D-72076 Tuebingen, Germany
- Lukas Sonnenberg
  - Institute for Neurobiology, University of Tuebingen, Tuebingen, Germany
- Shridhar Parthasarathy
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Ingo Helbig
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA; The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Neurology, University of Pennsylvania, Philadelphia, PA, USA
- Holger Lerche
  - Department of Neurology and Epileptology, Hertie Institute for Clinical Brain Research, University of Tuebingen, Hoppe-Seyler-Str. 3, D-72076 Tuebingen, Germany.
- Nico Pfeifer
  - Methods in Medical Informatics, Department of Computer Science, University of Tuebingen, Sand 14, D-72076 Tuebingen, Germany; Interfaculty Institute for Biomedical Informatics (IBMI), University of Tuebingen, Tuebingen, Germany; Faculty of Medicine, University of Tuebingen, Tuebingen, Germany; German Center for Infection Research, Partner Site Tuebingen, Tuebingen, Germany.
36
Li B, Roden DM, Capra JA. The 3D mutational constraint on amino acid sites in the human proteome. Nat Commun 2022; 13:3273. [PMID: 35672414 PMCID: PMC9174330 DOI: 10.1038/s41467-022-30936-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/05/2021] [Accepted: 05/19/2022] [Indexed: 12/16/2022] Open
Abstract
Quantification of the tolerance of protein sites to genetic variation has become a cornerstone of variant interpretation. We hypothesize that the constraint on missense variation at individual amino acid sites is largely shaped by direct interactions with 3D neighboring sites. To quantify this constraint, we introduce a framework called COntact Set MISsense tolerance (or COSMIS) and comprehensively map the landscape of 3D mutational constraint on 6.1 million amino acid sites covering 16,533 human proteins. We show that 3D mutational constraint is pervasive and that the level of constraint is strongly associated with disease relevance both at the site and the protein level. We demonstrate that COSMIS performs significantly better at variant interpretation tasks than other population-based constraint metrics while also providing structural insight into the functional roles of constrained sites. We anticipate that COSMIS will facilitate the interpretation of protein-coding variation in evolution and prioritization of sites for mechanistic investigation.
Affiliation(s)
- Bian Li
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN, 37203, USA.
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA.
- Dan M Roden
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
  - Departments of Pharmacology and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- John A Capra
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN, 37203, USA.
  - Bakar Computational Health Sciences Institute and Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, 94143, USA.
37
Ozturk K, Carter H. Predicting functional consequences of mutations using molecular interaction network features. Hum Genet 2022; 141:1195-1210. [PMID: 34432150 PMCID: PMC8873243 DOI: 10.1007/s00439-021-02329-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/20/2021] [Accepted: 07/31/2021] [Indexed: 12/13/2022]
Abstract
Variant interpretation remains a central challenge for precision medicine. Missense variants are particularly difficult to understand as they change only a single amino acid in a protein sequence yet can have large and varied effects on protein activity. Numerous tools have been developed to identify missense variants with putative disease consequences from protein sequence and structure. However, biological function arises through higher order interactions among proteins and molecules within cells. We therefore sought to capture information about the potential of missense mutations to perturb protein interaction networks by integrating protein structure and interaction data. We developed 16 network-based annotations for missense mutations that provide orthogonal information to features classically used to prioritize variants. We then evaluated them in the context of a proven machine-learning framework for variant effect prediction across multiple benchmark datasets to demonstrate their potential to improve variant classification. Interestingly, network features resulted in larger performance gains for classifying somatic mutations than for germline variants, possibly due to different constraints on what mutations are tolerated at the cellular versus organismal level. Our results suggest that modeling variant potential to perturb context-specific interactome networks is a fruitful strategy to advance in silico variant effect prediction.
Affiliation(s)
- Kivilcim Ozturk
- Division of Medical Genetics, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA
- Hannah Carter
- Division of Medical Genetics, Department of Medicine, University of California San Diego, La Jolla, CA, USA.
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, USA.
- Moores Cancer Center, University of California San Diego, La Jolla, CA, USA.
38
Li B, Jin B, Capra JA, Bush WS. Integration of Protein Structure and Population-Scale DNA Sequence Data for Disease Gene Discovery and Variant Interpretation. Annu Rev Biomed Data Sci 2022; 5:141-161. [PMID: 35508071 DOI: 10.1146/annurev-biodatasci-122220-112147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
The experimental and computational techniques for capturing information about protein structures and genetic variation within the human genome have advanced dramatically in the past 20 years, generating extensive new data resources. In this review, we discuss these advances, along with new approaches for determining the impact a genetic variant has on protein function. We focus on the potential of new methods that integrate human genetic variation into protein structures to discover relationships to disease, including the discovery of mutational hotspots in cancer-related proteins, the localization of protein-altering variants within protein regions for common complex diseases, and the assessment of variants of unknown significance for Mendelian traits. We expect that approaches that integrate these data sources will play increasingly important roles in disease gene discovery and variant interpretation. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 5 is August 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Bian Li
- Department of Biological Sciences and Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, USA
- Bowen Jin
- Graduate Program in Systems Biology and Bioinformatics, Department of Nutrition, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
- John A Capra
- Bakar Computational Health Sciences Institute and Department of Epidemiology and Biostatistics, University of California, San Francisco, California, USA
- William S Bush
- Cleveland Institute for Computational Biology, Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, Ohio, USA
39
Insights into National Laboratory Newborn Screening and Future Prospects. Medicina (Kaunas) 2022; 58:272. [PMID: 35208595 PMCID: PMC8879506 DOI: 10.3390/medicina58020272] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/02/2022] [Revised: 01/26/2022] [Accepted: 02/09/2022] [Indexed: 11/17/2022] Open
Abstract
Newborn screening (NBS) is a group of tests that check all newborns for certain rare conditions, covering several genetic or metabolic disorders. Laboratory NBS is performed through blood testing. However, the conditions that newborn babies are screened for vary from one country to another. Since NBS began in the 1960s, technological advances have enabled its expansion to include an increasing number of disorders, and there is a national trend to further expand the NBS program. The use of mass spectrometry (MS) for the diagnosis of inborn errors of metabolism (IEM) helps in the expansion of the screening panels. This technology allows the detection of different metabolic disorders in a single run, replacing traditional techniques. Analysis of the targeted pathogenic gene variant is a routine molecular technique in the NBS program, used as confirmatory testing for positive laboratory screening results. Recently, many molecular investigations, such as next-generation sequencing (NGS), have been introduced in the routine NBS program. Nowadays, NGS techniques are widely used in the diagnosis of IMD, where their results are rapid, confirmed, and reliable; however, owing to their uncertainties and the nature of IEM, a holistic approach to diagnosis is necessary. Nevertheless, various characteristics of NGS make it a potentially powerful tool for NBS. A range of disorders can be analyzed directly with a single assay, which can reduce costs and can largely be automated. For the implementation of a robust technology such as NGS in a mass NBS program, the focus should not be solely technological; the technology should also be assessed for its long- and short-term impact on the family and the child. The crucial question is whether large-scale genomic sequencing can provide useful medical information beyond what current NBS already provides, and at what economic and emotional cost.
Currently, newborn genome sequencing as a public health initiative remains contentious. Thus, this article seeks to answer the question: NGS for newborn screening, are we there yet?
40
Novel PRMT7 mutation in a rare case of dysmorphism and intellectual disability. J Hum Genet 2022; 67:19-26. [PMID: 34244600 DOI: 10.1038/s10038-021-00955-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/05/2021] [Revised: 06/05/2021] [Accepted: 06/20/2021] [Indexed: 02/06/2023]
Abstract
Protein arginine N-methyltransferase 7 (PRMT7) encodes an arginine methyltransferase central to a number of fundamental biological processes, mutations in which result in an autosomal recessive developmental disorder characterized by short stature, brachydactyly, intellectual developmental disability and seizures (SBIDDS). To date, fewer than 15 patients with biallelic mutations in PRMT7 have been documented. Here we report brothers from a consanguineous Iraqi family presenting with a developmental disorder characterized by global developmental delay, shortened stature, facial dysmorphisms, brachydactyly, and kidney dysfunction. In both affected brothers, whole genome sequencing (WGS) identified a novel homozygous substitution in PRMT7 (ENST00000339507.5), c.1097G>A (p.Cys366Tyr), considered to account for the majority of the phenotypic presentation. Rare compound heterozygous mutations in the dysplasia-associated perlecan-encoding HSPG2 gene (ENST00000374695.3) were also found (c.10721-2dupA, p.Ser71Asn and c.212G>A), potentially accounting for the kidney dysfunction. In addition to expanding the known mutational spectrum of variably expressive PRMT7 mutations alongside potential digenic inheritance with HSPG2, this report underlines the diagnostic utility of a WGS-guided analysis in the detection of rare genetic disorders.
41
Zeng Z, Aptekmann AA, Bromberg Y. Decoding the effects of synonymous variants. Nucleic Acids Res 2021; 49:12673-12691. [PMID: 34850938 PMCID: PMC8682775 DOI: 10.1093/nar/gkab1159] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 06/23/2021] [Revised: 11/02/2021] [Accepted: 11/08/2021] [Indexed: 12/12/2022] Open
Abstract
Synonymous single nucleotide variants (sSNVs) are common in the human genome but are often overlooked. However, sSNVs can have significant biological impact and may lead to disease. Existing computational methods for evaluating the effect of sSNVs suffer from the lack of gold-standard training/evaluation data and exhibit over-reliance on sequence conservation signals. We developed synVep (synonymous Variant effect predictor), a machine learning-based method that overcomes both of these limitations. Our training data was a combination of variants reported by gnomAD (observed) and those unreported, but possible in the human genome (generated). We used positive-unlabeled learning to purify the generated variant set of any likely unobservable variants. We then trained two sequential extreme gradient boosting models to identify subsets of the remaining variants putatively enriched and depleted in effect. Our method attained 90% precision/recall on a previously unseen set of variants. Furthermore, although synVep does not explicitly use conservation, its scores correlated with evolutionary distances between orthologs in cross-species variation analysis. synVep was also able to differentiate pathogenic vs. benign variants, as well as splice-site disrupting variants (SDV) vs. non-SDVs. Thus, synVep provides an important improvement in annotation of sSNVs, allowing users to focus on variants that most likely harbor effects.
Affiliation(s)
- Zishuo Zeng
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
- Ariel A Aptekmann
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
- Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
- Department of Genetics, Rutgers University, Piscataway, NJ 08854, USA
42
Ruscheinski A, Reimler AL, Ewald R, Uhrmacher AM. VPMBench: a test bench for variant prioritization methods. BMC Bioinformatics 2021; 22:543. [PMID: 34749640 PMCID: PMC8576923 DOI: 10.1186/s12859-021-04458-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2021] [Accepted: 10/23/2021] [Indexed: 11/18/2022] Open
Abstract
Background: Clinical diagnostics of whole-exome and whole-genome sequencing data requires geneticists to consider thousands of genetic variants for each patient. Various variant prioritization methods have been developed over the last years to aid clinicians in identifying variants that are likely disease-causing. Each time a new method is developed, its effectiveness must be evaluated and compared to other approaches based on the most recently available evaluation data. Doing so in an unbiased, systematic, and replicable manner requires significant effort. Results: The open-source test bench “VPMBench” automates the evaluation of variant prioritization methods. VPMBench introduces a standardized interface for prioritization methods and provides a plugin system that makes it easy to evaluate new methods. It supports different input data formats and custom output data preparation. VPMBench exploits declaratively specified information about the methods, e.g., the variants supported by the methods. Plugins may also be provided in a technology-agnostic manner via containerization. Conclusions: VPMBench significantly simplifies the evaluation of both custom and published variant prioritization methods. As we expect variant prioritization methods to become ever more critical with the advent of whole-genome sequencing in clinical diagnostics, such tool support is crucial to facilitate methodological research.
Affiliation(s)
- Andreas Ruscheinski
- Modeling and Simulation Group, Institute for Visual and Analytic Computing, University of Rostock, Albert-Einstein-Straße 22, 18051, Rostock, Germany.
- Anna Lena Reimler
- Modeling and Simulation Group, Institute for Visual and Analytic Computing, University of Rostock, Albert-Einstein-Straße 22, 18051, Rostock, Germany
- Roland Ewald
- Limbus Medical Technologies GmbH, Lindenstraße 2, 18055, Rostock, Germany
- Adelinde M Uhrmacher
- Modeling and Simulation Group, Institute for Visual and Analytic Computing, University of Rostock, Albert-Einstein-Straße 22, 18051, Rostock, Germany
43
Tangaro MA, Mandreoli P, Chiara M, Donvito G, Antonacci M, Parisi A, Bianco A, Romano A, Bianchi DM, Cangelosi D, Uva P, Molineris I, Nosi V, Calogero RA, Alessandri L, Pedrini E, Mordenti M, Bonetti E, Sangiorgi L, Pesole G, Zambelli F. Laniakea@ReCaS: exploring the potential of customisable Galaxy on-demand instances as a cloud-based service. BMC Bioinformatics 2021; 22:544. [PMID: 34749633 PMCID: PMC8574934 DOI: 10.1186/s12859-021-04401-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/12/2021] [Accepted: 09/24/2021] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Improving the availability and usability of data and analytical tools is a critical precondition for further advancing modern biological and biomedical research. For instance, one of the many ramifications of the COVID-19 global pandemic has been to make even more evident the importance of having bioinformatics tools and data readily actionable by researchers through convenient access points and supported by adequate IT infrastructures. One of the most successful efforts in improving the availability and usability of bioinformatics tools and data is represented by the Galaxy workflow manager and its thriving community. In 2020 we introduced Laniakea, a software platform conceived to streamline the configuration and deployment of "on-demand" Galaxy instances over the cloud. By facilitating the set-up and configuration of Galaxy web servers, Laniakea provides researchers with a powerful and highly customisable platform for executing complex bioinformatics analyses. The system can be accessed through a dedicated and user-friendly web interface that allows the Galaxy web server's initial configuration and deployment. RESULTS "Laniakea@ReCaS", the first instance of a Laniakea-based service, is managed by ELIXIR-IT and was officially launched in February 2020, after about one year of development and testing that involved several users. Researchers can request access to Laniakea@ReCaS through an open-ended call for use-cases. Ten project proposals have been accepted since then, totalling 18 Galaxy on-demand virtual servers that employ ~ 100 CPUs, ~ 250 GB of RAM and ~ 5 TB of storage and serve several different communities and purposes. Herein, we present eight use cases demonstrating the versatility of the platform. 
CONCLUSIONS During this first year of activity, the Laniakea-based service emerged as a flexible platform that facilitated the rapid development of bioinformatics tools, the efficient delivery of training activities, and the provision of public bioinformatics services in different settings, including food safety and clinical research. Laniakea@ReCaS provides a proof of concept of how enabling access to appropriate, reliable IT resources and ready-to-use bioinformatics tools can considerably streamline researchers' work.
Affiliation(s)
- Marco Antonio Tangaro
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy
- National Institute for Nuclear Physics (INFN), Section of Bari, Via Orabona 4, 70126, Bari, Italy
- Pietro Mandreoli
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milano, Italy
- Matteo Chiara
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milano, Italy
- Giacinto Donvito
- National Institute for Nuclear Physics (INFN), Section of Bari, Via Orabona 4, 70126, Bari, Italy
- Marica Antonacci
- National Institute for Nuclear Physics (INFN), Section of Bari, Via Orabona 4, 70126, Bari, Italy
- Antonio Parisi
- Istituto Zooprofilattico Sperimentale Della Puglia e Della Basilicata, Via Manfredonia 20, 71121, Foggia, Italy
- Angelica Bianco
- Istituto Zooprofilattico Sperimentale Della Puglia e Della Basilicata, Via Manfredonia 20, 71121, Foggia, Italy
- Angelo Romano
- National Reference Laboratory for Coagulase-Positive Staphylococci Including Staphylococcus Aureus, Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
- Daniela Manila Bianchi
- National Reference Laboratory for Coagulase-Positive Staphylococci Including Staphylococcus Aureus, Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
- Davide Cangelosi
- Clinical Bioinformatics Unit, Scientific Direction, IRCCS Istituto Giannina Gaslini, Via Gerolamo Gaslini 5, 16147, Genova, Italy
- Paolo Uva
- Clinical Bioinformatics Unit, Scientific Direction, IRCCS Istituto Giannina Gaslini, Via Gerolamo Gaslini 5, 16147, Genova, Italy
- Italian Institute of Technology, Via Morego 30, 16163, Genova, Italy
- Ivan Molineris
- Department of Life Science and System Biology, University of Turin, Via Accademia Albertina, 13-1023, Turin, Italy
- Vladimir Nosi
- Department of Computer Science, University of Turin, Via Pessinetto 12, 10049, Turin, Italy
- Raffaele A Calogero
- Department of Molecular Biotechnology and Health Sciences, Via Nizza 52, 10126, Turin, Italy
- Luca Alessandri
- Department of Molecular Biotechnology and Health Sciences, Via Nizza 52, 10126, Turin, Italy
- Elena Pedrini
- Department of Rare Skeletal Disorders, IRCCS Istituto Ortopedico Rizzoli, Via di Barbiano 1/10, 40136, Bologna, Italy
- Marina Mordenti
- Department of Rare Skeletal Disorders, IRCCS Istituto Ortopedico Rizzoli, Via di Barbiano 1/10, 40136, Bologna, Italy
- Emanuele Bonetti
- Department of Rare Skeletal Disorders, IRCCS Istituto Ortopedico Rizzoli, Via di Barbiano 1/10, 40136, Bologna, Italy
- Department of Experimental Oncology, European Institute of Oncology, Via Adamello 16, 20139, Milan, Italy
- Luca Sangiorgi
- Department of Rare Skeletal Disorders, IRCCS Istituto Ortopedico Rizzoli, Via di Barbiano 1/10, 40136, Bologna, Italy
- Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy.
- Department of Biosciences, Biotechnologies and Biopharmaceutics, University of Bari, Via Orabona 4, 70126, Bari, Italy.
- Federico Zambelli
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR), Via Giovanni Amendola 122/O, 70126, Bari, Italy.
- Department of Biosciences, University of Milan, Via Celoria 26, 20133, Milano, Italy.
44
Vieira SRL, Schapira AHV. Glucocerebrosidase mutations: A paradigm for neurodegeneration pathways. Free Radic Biol Med 2021; 175:42-55. [PMID: 34450264 DOI: 10.1016/j.freeradbiomed.2021.08.230] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/18/2021] [Revised: 08/06/2021] [Accepted: 08/12/2021] [Indexed: 02/07/2023]
Abstract
Biallelic (homozygous or compound heterozygous) glucocerebrosidase gene (GBA) mutations cause Gaucher disease, whereas heterozygous mutations are numerically the most important genetic risk factor for Parkinson disease (PD) and are associated with the development of other synucleinopathies, notably Dementia with Lewy Bodies. This phenomenon is not limited to GBA, with converging evidence highlighting further examples of autosomal recessive disease genes increasing neurodegeneration risk in heterozygous mutation carriers. Nevertheless, despite extensive research, the cellular mechanisms by which mutations in GBA, encoding lysosomal enzyme β-glucocerebrosidase (GCase), predispose to neurodegeneration remain incompletely understood. Alpha-synuclein (A-SYN) accumulation, autophagic lysosomal dysfunction, mitochondrial abnormalities, ER stress and neuroinflammation have been proposed as candidate pathogenic pathways in GBA-linked PD. The observation of GCase and A-SYN interactions in PD initiated the development and evaluation of GCase-targeted therapeutics in PD clinical trials.
Affiliation(s)
- Sophia R L Vieira
- Department of Clinical and Movement Neurosciences, University College London Queen Square Institute of Neurology, London, United Kingdom
- Anthony H V Schapira
- Department of Clinical and Movement Neurosciences, University College London Queen Square Institute of Neurology, London, United Kingdom.
45
Vorsteveld EE, Hoischen A, van der Made CI. Next-Generation Sequencing in the Field of Primary Immunodeficiencies: Current Yield, Challenges, and Future Perspectives. Clin Rev Allergy Immunol 2021; 61:212-225. [PMID: 33666867 PMCID: PMC7934351 DOI: 10.1007/s12016-021-08838-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Accepted: 01/07/2021] [Indexed: 12/18/2022]
Abstract
Primary immunodeficiencies comprise a group of inborn errors of immunity that display significant clinical and genetic heterogeneity. Next-generation sequencing techniques and predominantly whole exome sequencing have revolutionized the understanding of the genetic and molecular basis of genetic diseases, thereby also leading to a sharp increase in the discovery of new genes associated with primary immunodeficiencies. In this review, we discuss the current diagnostic yield of this generic diagnostic approach by evaluating the studies that have employed next-generation sequencing techniques in cohorts of patients with primary immunodeficiencies. The average diagnostic yield for primary immunodeficiencies is determined to be 29% (range 10-79%) and 38% specifically for whole-exome sequencing (range 15-70%). The significant variation between studies is mainly the result of differences in clinical characteristics of the studied cohorts but is also influenced by varying sequencing approaches and (in silico) gene panel selection. We further discuss other factors contributing to the relatively low yield, including the inherent limitations of whole-exome sequencing, challenges in the interpretation of novel candidate genetic variants, and promises of exploring the non-coding part of the genome. We propose strategies to improve the diagnostic yield leading the way towards expanded personalized treatment in PIDs.
Affiliation(s)
- Emil E Vorsteveld
- Department of Human Genetics, Radboud University Medical Center, P.O. Box 9101, 6500 HB, Nijmegen, The Netherlands
- Alexander Hoischen
- Department of Human Genetics, Radboud University Medical Center, P.O. Box 9101, 6500 HB, Nijmegen, The Netherlands.
- Department of Internal Medicine, Radboudumc Center for Infectious Diseases (RCI), Radboudumc, Nijmegen, The Netherlands.
- Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands.
- Caspar I van der Made
- Department of Human Genetics, Radboud University Medical Center, P.O. Box 9101, 6500 HB, Nijmegen, The Netherlands
- Department of Internal Medicine, Radboudumc Center for Infectious Diseases (RCI), Radboudumc, Nijmegen, The Netherlands
- Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
46
Hsu RH, Chien YH, Hwu WL, Lee NC. Diversity in heritable disorders of connective tissue at a single center. Connect Tissue Res 2021; 62:580-585. [PMID: 32862725 DOI: 10.1080/03008207.2020.1816994] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/03/2023]
Abstract
BACKGROUND Heritable disorders of connective tissue (HDCT) are a heterogeneous group of conditions caused by defects in genes responsible for extracellular matrix elements. Although next-generation sequencing (NGS) technology can be used to analyze many genes at a time, precisely diagnosing HDCT is still challenging because of the overlapping phenotypes and genotypes. METHODS A 67-gene NGS targeted panel or whole-exome sequencing was employed for the diagnosis of HDCT over 4 years. Phenotypes and genotypes of patients were analyzed retrospectively. RESULTS Mutations in 16 genes were discovered in 34 patients with suspected Ehlers-Danlos syndrome (n = 7), Marfan syndrome (n = 2), osteogenesis imperfecta (n = 3), skeletal dysplasia (n = 18), and others (n = 4). Eighteen patients were found to have mutations in collagen genes, three had SERPINF1 mutations, two had TRPV4 mutations, two had FBN1 mutations, two had COMP mutations, and mutations in seven other genes were found in one patient each. The eight patients with COL1A1 mutations had a wide variation in phenotype. Patients with COL3A1 and COL5A1 mutations presented with classic EDS, those with SERPINF1 mutations presented with typical OI type VI, those with TRPV4 mutations presented with severe spinal deformity, and those with COL2A1 mutations presented with syndromic or nonsyndromic bone dysplasia or only short stature. CONCLUSION A wide diversity in HDCT was observed. Therefore, knowledge about the phenotype-genotype correlation in HDCT is still crucial in the diagnosis of this group of diseases, and an improvement in the screening tool will be needed.
Affiliation(s)
- Rai-Hseng Hsu
- Department of Pediatrics and Medical Genetics, National Taiwan University Hospital, Taipei, Taiwan
- Department of Pediatrics, Taipei Medical University Hospital, Taipei, Taiwan
- Yin-Hsiu Chien
- Department of Pediatrics and Medical Genetics, National Taiwan University Hospital, Taipei, Taiwan
- Wuh-Liang Hwu
- Department of Pediatrics and Medical Genetics, National Taiwan University Hospital, Taipei, Taiwan
- Ni-Chung Lee
- Department of Pediatrics and Medical Genetics, National Taiwan University Hospital, Taipei, Taiwan
47
Torabi Dalivandan S, Plummer J, Gayther SA. Risks and Function of Breast Cancer Susceptibility Alleles. Cancers (Basel) 2021; 13:3953. [PMID: 34439109 PMCID: PMC8393346 DOI: 10.3390/cancers13163953] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/07/2021] [Revised: 07/30/2021] [Accepted: 07/31/2021] [Indexed: 12/22/2022] Open
Abstract
Family history remains one of the strongest risk factors for breast cancer. It is well established that women with a first-degree relative affected by breast cancer are twice as likely to develop the disease themselves. Twin studies indicate that this is most likely due to shared genetics rather than shared epidemiological/lifestyle risk factors. Linkage and targeted sequencing studies have shown that rare high- and moderate-penetrance germline variants in genes involved in the DNA damage response (DDR) including BRCA1, BRCA2, PALB2, ATM, and TP53 are responsible for a proportion of breast cancer cases. However, breast cancer is a heterogeneous disease, and there is now strong evidence that different risk alleles can predispose to different subtypes of breast cancer. Here, we review the associations between the different genes and subtype-specificity of breast cancer based on the most comprehensive genetic studies published. Genome-wide association studies (GWAS) have also been used to identify an additional hereditary component of breast cancer, and have identified hundreds of common, low-penetrance susceptibility alleles. The combination of these low-penetrance risk variants, summed as a polygenic risk score (PRS), can identify individuals across the spectrum of disease risk. However, there remains a substantial bottleneck between the discovery of GWAS risk variants and their contribution to tumorigenesis, mainly because the majority of these variants map to the non-protein-coding genome. A range of functional genomic approaches are needed to identify the causal risk variants and target susceptibility genes and establish their underlying role in disease biology.
We discuss how the application of these multidisciplinary approaches to understand genetic risk for breast cancer can be used to identify individuals in the population that may benefit from clinical interventions including screening for early detection and prevention, and treatment strategies to reduce breast cancer-related mortalities.
Affiliation(s)
- Simon A. Gayther
- Center for Bioinformatics and Functional Genomics, Department of Biomedical Sciences, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA; (S.T.D.); (J.P.)
48
Arani AA, Sehhati M, Tabatabaiefar MA. Genetic variant effect prediction by supervised nonnegative matrix tri-factorization. Mol Omics 2021; 17:740-751. [PMID: 34164638 DOI: 10.1039/d1mo00038a] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022]
Abstract
Discriminating between deleterious and neutral mutations among numerous non-synonymous single nucleotide variants (nsSNVs) that may be observed through whole exome sequencing (WES) is considered a great challenge. In this regard, many machine learning methods have been developed for the prediction of variant consequences based on the analysis of either protein amino acid sequences or protein structures or their integration with features extracted from various gene level data and phenotype information. Due to the availability of a high number of features and heterogeneity of sources, implementing a suitable integration method plays an important role in predictive models. In this study, we proposed a novel supervised nonnegative matrix tri-factorization (sNMTF) algorithm to integrate current variant prediction scores into the gene level data and disease networks. In this regard, a new feature space was constructed by the integration of all input data using sNMTF to provide appropriate inputs for training a classifier. For the assessment of the proposed model, we utilized two benchmark datasets. The first one contained 11 207 deleterious and 19 839 neutral nsSNPs, whereas the other contained 4416 deleterious and 4960 neutral nsSNPs. In general, the evaluation of our proposed supervised NMTF method on both datasets indicated that it outperformed the existing nsSNV effect prediction approaches, whether ensemble-based or not, with a prediction accuracy on average 15% higher than that of other ensemble scores. In addition, excluding any kind of data that were integrated into the final model led to a substantial decrease in deleterious variant prediction. The proposed model can be used as an extensible framework for integrating more heterogeneous sources.
Affiliation(s)
- Asieh Amousoltani Arani
- Department of Bioelectric and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Mohammadreza Sehhati
- Department of Bioinformatics, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran.
- Mohammad Amin Tabatabaiefar
- Department of Genetics and Molecular Biology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran and GTaC Corp., Deputy of Research and Technology, Isfahan University of Medical Sciences, Isfahan, Iran
49
Richard D, Capellini TD. Shifting epigenetic contexts influence regulatory variation and disease risk. Aging (Albany NY) 2021; 13:15699-15749. [PMID: 34138751 PMCID: PMC8266365 DOI: 10.18632/aging.203194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2021] [Accepted: 06/01/2021] [Indexed: 11/25/2022]
Abstract
Epigenetic shifts are a hallmark of aging that impact transcriptional networks at the regulatory level. These shifts may modify the effects of genetic regulatory variants during aging and contribute to disease pathomechanisms. However, these shifts occur against the backdrop of epigenetic changes experienced throughout an individual's development into adulthood; thus, the phenotypic, and ultimately fitness, effects of regulatory variants subject to developmental- versus aging-related epigenetic shifts may differ considerably. Natural selection therefore may act differently on variants depending on their changing epigenetic context, which we propose as a novel lens through which to consider regulatory sequence evolution and phenotypic effects. Here, we define genomic regions subjected to altered chromatin accessibility as tissues transition from their fetal to adult forms, and subsequently from early to late adulthood. Based on these epigenomic datasets, we examine patterns of evolutionary constraint and potential functional impacts of sequence variation (e.g., genetic disease risk associations). We find that while the signals observed with developmental epigenetic changes are consistent with stronger fitness consequences (i.e., negative selection pressures), they tend to have weaker effects on genetic risk associations for aging-related diseases. Conversely, we see stronger effects of variants with increased local accessibility in adult tissues, strongest in young adults when compared with old. We propose a model for how the epigenetic status of a region may influence the effects of evolutionarily relevant sequence variation, and suggest that such a perspective on gene regulatory networks may advance our understanding of aging biology.
Affiliation(s)
- Daniel Richard
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Terence D Capellini
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
|
50
|
van Spelde AM, Schroeder H, Kjellström A, Lidén K. Approaches to osteoporosis in paleopathology: How did methodology shape bone loss research? International Journal of Paleopathology 2021; 33:245-257. [PMID: 34044198 DOI: 10.1016/j.ijpp.2021.05.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/20/2021] [Revised: 05/03/2021] [Accepted: 05/03/2021] [Indexed: 06/12/2023]
Abstract
OBJECTIVE: This paper will review how different methods employed to study bone loss in the past were used to explore different questions and aspects of bone loss, how methodology has changed over time, and how these different approaches have informed our understanding of bone loss in the past. MATERIALS AND METHODS: A review and discussion is conducted on research protocols and results of 84 paleopathology publications on bone loss in archaeological skeletal collections published between 1969 and 2021. CONCLUSIONS: The variety in research protocols confounds accurate meta-analysis of previously published research; however, more recent publications incorporate a combination of bone mass- and bone quality-based methods. Biased sample selection has resulted in a predominance of European and Medieval publications, limiting more general observations on bone loss in the past. Collection of dietary or paleopathological covariables is underemployed in the effort to interpret bone loss patterns. SIGNIFICANCE: Paleopathology publications have demonstrated differences in bone loss between distinct archaeological populations, between sex and age groups, and have suggested factors underlying observed differences. However, a lack of a gold standard has encouraged the use of a wide range of methods. Understanding how this array of methods affects results is crucial in contextualizing our knowledge of bone loss in the past. LIMITATIONS: The development of a research protocol is also influenced by available expertise, available equipment, restrictions imposed by the curator, and site-specific taphonomic aspects. These factors will likely continue to cause (minor) biases even if a best practice can be established. SUGGESTIONS FOR FUTURE RESEARCH: Greater effort to develop uniform terminology and operational definitions of osteoporosis in skeletal remains, as well as the expansion of time scale and geographical areas studied. The Next-Generation Sequencing revolution has also opened up the possibility of ancient DNA analyses to study genetic predisposition to bone loss in the past.
Affiliation(s)
- Anne-Marijn van Spelde
  - Archaeological Research Laboratory, Department of Archaeology and Classical Studies, Stockholm University, Lilla Frescativägen 7, 114 18 Stockholm, Sweden; The Globe Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5, 1353 Copenhagen, Denmark
- Hannes Schroeder
  - The Globe Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5, 1353 Copenhagen, Denmark
- Anna Kjellström
  - Osteological Research Laboratory, Department of Archaeology and Classical Studies, Stockholm University, Lilla Frescativägen 7, 114 18 Stockholm, Sweden
- Kerstin Lidén
  - Archaeological Research Laboratory, Department of Archaeology and Classical Studies, Stockholm University, Lilla Frescativägen 7, 114 18 Stockholm, Sweden
|