1
Argov CM, Shneyour A, Jubran J, Sabag E, Mansbach A, Sepunaru Y, Filtzer E, Gruber G, Volozhinsky M, Yogev Y, Birk O, Chalifa-Caspi V, Rokach L, Yeger-Lotem E. Tissue-aware interpretation of genetic variants advances the etiology of rare diseases. Mol Syst Biol 2024:10.1038/s44320-024-00061-6. [PMID: 39285047 DOI: 10.1038/s44320-024-00061-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 08/08/2024] [Accepted: 08/09/2024] [Indexed: 09/19/2024] Open
Abstract
Pathogenic variants underlying Mendelian diseases often disrupt the normal physiology of a few tissues and organs. However, variant effect prediction tools that aim to identify pathogenic variants are typically oblivious to tissue contexts. Here we report a machine-learning framework, denoted "Tissue Risk Assessment of Causality by Expression for variants" (TRACEvar, https://netbio.bgu.ac.il/TRACEvar/), that offers two advancements. First, TRACEvar predicts pathogenic variants that disrupt the normal physiology of specific tissues. This was achieved by creating 14 tissue-specific models that were trained on over 14,000 variants and combined 84 attributes of genetic variants with 495 attributes derived from tissue omics. TRACEvar outperformed 10 well-established and tissue-oblivious variant effect prediction tools. Second, the resulting models are interpretable, thereby illuminating variants' mode of action. Application of TRACEvar to variants of 52 rare-disease patients highlighted pathogenicity mechanisms and relevant disease processes. Lastly, the interpretation of all tissue models revealed that top-ranking determinants of pathogenicity included attributes of disease-affected tissues, particularly cellular process activities. Collectively, these results show that tissue contexts and interpretable machine-learning models can greatly enhance the etiology of rare diseases.
Affiliation(s)
- Chanan M Argov
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel
- Ariel Shneyour
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel
- Juman Jubran
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel
- Eric Sabag
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel
- Avigdor Mansbach
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel
- Yair Sepunaru
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel
- Emmi Filtzer
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel
- Gil Gruber
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel
- Miri Volozhinsky
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel
- Yuval Yogev
- Morris Kahn Laboratory of Human Genetics and the Genetics Institute at Soroka Medical Center, Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, 84105, Israel
- Ohad Birk
- Morris Kahn Laboratory of Human Genetics and the Genetics Institute at Soroka Medical Center, Faculty of Health Sciences, Ben Gurion University of the Negev, Beer Sheva, 84105, Israel
- The National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel
- Vered Chalifa-Caspi
- Ilse Katz Institute for Nanoscale Science & Technology, Ben-Gurion University of the Negev, Beer-Sheva, 84105, Israel
- Lior Rokach
- Department of Software & Information Systems Engineering, Faculty of Engineering Sciences, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel
- Esti Yeger-Lotem
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel.
- The National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, 84105, Israel.
2
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: trends from three decades of genetic variant impact predictors. Hum Genomics 2024; 18:90. [PMID: 39198917 PMCID: PMC11360829 DOI: 10.1186/s40246-024-00663-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2024] [Accepted: 08/19/2024] [Indexed: 09/01/2024] Open
Abstract
BACKGROUND Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). RESULTS The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past three decades, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 190 VIPs, resulting in a total of 407 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. CONCLUSIONS VIPdb version 2 summarizes 407 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. VIPdb is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
- Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
- Arul S Menon
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
- College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA
- Zhiqiang Hu
- Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA
- Illumina, Foster City, CA, 94404, USA
- Steven E Brenner
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA.
- Center for Computational Biology, University of California, Berkeley, CA, 94720, USA.
- College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA.
- Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA.
3
Tammen I, Mather M, Leeb T, Nicholas FW. Online Mendelian Inheritance in Animals (OMIA): a genetic resource for vertebrate animals. Mamm Genome 2024:10.1007/s00335-024-10059-y. [PMID: 39143381 DOI: 10.1007/s00335-024-10059-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2024] [Accepted: 08/01/2024] [Indexed: 08/16/2024]
Abstract
Online Mendelian Inheritance in Animals (OMIA) is a freely available curated knowledgebase that contains information and facilitates research on inherited traits and diseases in animals. For the past 29 years, OMIA has been used by animal geneticists, breeders, and veterinarians worldwide as a definitive source of information. Recent increases in curation capacity and funding for software engineering support have resulted in software upgrades and commencement of several initiatives, which include the enhancement of variant information and links to human data resources, and the introduction of ontology-based breed information and categories. We provide an overview of current information and recent enhancements to OMIA and discuss how we are expanding the integration of OMIA into other resources and databases via the use of ontologies and the adaptation of tools used in human genetics.
Affiliation(s)
- Imke Tammen
- Sydney School of Veterinary Science, The University of Sydney, Sydney, NSW, 2006, Australia.
- Marius Mather
- Sydney Informatics Hub, The University of Sydney, Sydney, NSW, 2006, Australia
- Tosso Leeb
- Institute of Genetics, Vetsuisse Faculty, University of Bern, Bern, 3001, Switzerland
- Frank W Nicholas
- Sydney School of Veterinary Science, The University of Sydney, Sydney, NSW, 2006, Australia
4
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: Trends from 25 years of genetic variant impact predictors. bioRxiv 2024:2024.06.25.600283. [PMID: 38979289 PMCID: PMC11230257 DOI: 10.1101/2024.06.25.600283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Background Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). Results The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past 25 years, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 186 VIPs, resulting in a total of 403 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. Conclusions VIPdb version 2 summarizes 403 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. Availability VIPdb version 2 is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- Center for Computational Biology, University of California, Berkeley, California 94720, USA
- Arul S. Menon
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
- Zhiqiang Hu
- Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Currently at: Illumina, Foster City, California 94404, USA
- Steven E. Brenner
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- Center for Computational Biology, University of California, Berkeley, California 94720, USA
- College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
- Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
5
Lee B, Nasanovsky L, Shen L, Maglinte DT, Pan Y, Gai X, Schmidt RJ, Raca G, Biegel JA, Roytman M, An P, Saunders CJ, Farrow EG, Shams S, Ji J. Significance Associated with Phenotype Score Aids in Variant Prioritization for Exome Sequencing Analysis. J Mol Diagn 2024; 26:337-348. [PMID: 38360210 DOI: 10.1016/j.jmoldx.2024.01.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 12/04/2023] [Accepted: 01/29/2024] [Indexed: 02/17/2024] Open
Abstract
Several in silico annotation-based methods have been developed to prioritize variants in exome sequencing analysis. This study introduced a novel metric Significance Associated with Phenotypes (SAP) score, which generates a statistical score by comparing an individual's observed phenotypes against existing gene-phenotype associations. To evaluate the SAP score, a retrospective analysis was performed on 219 exomes. Among them, 82 family-based and 35 singleton exomes had at least one disease-causing variant that explained the patient's clinical features. SAP scores were calculated, and the rank of the disease-causing variant was compared with a known method, Exomiser. Using the SAP score, the known causative variant was ranked in the top 10 retained variants for 94% (77 of 82) of the family-based exomes and in first place for 73% of these cases. For singleton exomes, the SAP score analysis ranked the known pathogenic variants within the top 10 for 80% (28 of 35) of cases. The SAP score, which is independent of detected variants, demonstrates comparable performance with Exomiser, which considers both phenotype and variant-level evidence simultaneously. Among 102 cases with negative results or variants of uncertain significance, SAP score analysis revealed two cases with a potential new diagnosis based on rank. The SAP score, a phenotypic quantitative metric, can be used in conjunction with standard variant filtration and annotation to enhance variant prioritization in exome analysis.
Affiliation(s)
- Brian Lee
- Bionano Genomics, San Diego, California
- Lishuang Shen
- Center for Personalized Medicine, Department of Pathology and Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, California
- Dennis T Maglinte
- Center for Personalized Medicine, Department of Pathology and Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, California
- Yachen Pan
- Center for Personalized Medicine, Department of Pathology and Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, California
- Xiaowu Gai
- Center for Personalized Medicine, Department of Pathology and Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, California; Department of Pathology, Keck School of Medicine, University of Southern California, Los Angeles, California
- Ryan J Schmidt
- Center for Personalized Medicine, Department of Pathology and Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, California; Department of Pathology, Keck School of Medicine, University of Southern California, Los Angeles, California
- Gordana Raca
- Center for Personalized Medicine, Department of Pathology and Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, California; Department of Pathology, Keck School of Medicine, University of Southern California, Los Angeles, California
- Jaclyn A Biegel
- Center for Personalized Medicine, Department of Pathology and Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, California; Department of Pathology, Keck School of Medicine, University of Southern California, Los Angeles, California
- Paul An
- Bionano Genomics, San Diego, California
- Carol J Saunders
- Department of Pathology and Laboratory Medicine, Children's Mercy Hospital, Kansas City, Missouri; University of Missouri-Kansas City School of Medicine, Kansas City, Missouri
- Emily G Farrow
- Department of Pathology and Laboratory Medicine, Children's Mercy Hospital, Kansas City, Missouri; University of Missouri-Kansas City School of Medicine, Kansas City, Missouri
- Jianling Ji
- Center for Personalized Medicine, Department of Pathology and Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, California; Department of Pathology, Keck School of Medicine, University of Southern California, Los Angeles, California.
6
Jerschow E, Dubin R, Chen CC, iAkushev A, Sehanobish E, Asad M, Chiarella SE, Porcelli SA, Greally J. Aspirin-exacerbated respiratory disease is associated with variants in filaggrin, epithelial integrity, and cellular interactions. J Allergy Clin Immunol Glob 2024; 3:100205. [PMID: 38317805 PMCID: PMC10838899 DOI: 10.1016/j.jacig.2024.100205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Revised: 11/15/2023] [Accepted: 12/01/2023] [Indexed: 02/07/2024]
Abstract
Background Previous studies have determined that up to 6% of patients with aspirin-exacerbated respiratory disease (AERD) have family history of AERD, indicating a possible link with genetic polymorphisms. However, whole exome sequencing (WES) studies of such associations are currently lacking. Objectives We sought to examine whether WES can identify pathogenic variants associated with AERD. Methods Diagnoses of AERD were confirmed in patients with nasal polyps and asthma. WES was performed using an Illumina sequencing platform. Human Phenotype Ontology terms were used to define the patients' phenotypes. Exomiser was used to annotate, filter, and prioritize possible disease-causing genetic variants. Results Of 39 patients with AERD, 41% reported a family history of asthma and 5% reported a family history of AERD. Pathogenic exome variants in the filaggrin gene (FLG) were found in 2 patients (5%). Other variants not known to be pathogenic were detected in an additional 16 patients (41%) in genes related to epithelial integrity and cellular interactions, including genes encoding desmoglein 3 (DSG3), dynein axonemal heavy chain 9 (DNAH9), collagen type VII alpha 1 chain (COL7A1), collagen type XVII alpha 1 chain (COL17A1), chromodomain helicase DNA binding protein-7 (CHD7), TSC complex subunit 2/tuberous sclerosis-2 protein (TSC2), P-selectin (SELP), and platelet-derived growth factor receptor-alpha (PDGFRA). Conclusion WES identified a monogenic susceptibility to AERD in 5% of patients with FLG pathogenic variants. Other variants not previously identified as pathogenic were found in genes relevant to epithelial integrity and cellular interactions and may further reveal genetic factors that contribute to this condition.
Affiliation(s)
- Elina Jerschow
- Mayo Clinic, Rochester, Minn
- Albert Einstein College of Medicine, Bronx, NY
7
Schmidtke J, Koch S, Krawczak M. Diagnostic elusiveness of pathogenic variants in cases of autosomal recessive diseases. Eur J Hum Genet 2024; 32:474-476. [PMID: 38443546 PMCID: PMC11061158 DOI: 10.1038/s41431-024-01574-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 02/20/2024] [Indexed: 03/07/2024] Open
Affiliation(s)
- Jörg Schmidtke
- Hannover Medical School, Carl-Neuberg-Str. 1, 30625, Hannover, Germany.
- amedes MVZ wagnerstibbe, Georgstrasse 50, 30159, Hannover, Germany.
- Sebastian Koch
- Institute of Medical Informatics and Statistics, Kiel University, Brunswiker Strasse 10, 24105, Kiel, Germany
- Michael Krawczak
- Institute of Medical Informatics and Statistics, Kiel University, Brunswiker Strasse 10, 24105, Kiel, Germany
8
Kingsmore SF, Nofsinger R, Ellsworth K. Rapid genomic sequencing for genetic disease diagnosis and therapy in intensive care units: a review. NPJ Genom Med 2024; 9:17. [PMID: 38413639 PMCID: PMC10899612 DOI: 10.1038/s41525-024-00404-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 02/15/2024] [Indexed: 02/29/2024] Open
Abstract
Single locus (Mendelian) diseases are a leading cause of childhood hospitalization, intensive care unit (ICU) admission, mortality, and healthcare cost. Rapid genome sequencing (RGS), ultra-rapid genome sequencing (URGS), and rapid exome sequencing (RES) are diagnostic tests for genetic diseases for ICU patients. In 44 studies of children in ICUs with diseases of unknown etiology, 37% received a genetic diagnosis, 26% had consequent changes in management, and net healthcare costs were reduced by $14,265 per child tested by URGS, RGS, or RES. URGS outperformed RGS and RES with faster time to diagnosis, and higher rate of diagnosis and clinical utility. Diagnostic and clinical outcomes will improve as methods evolve, costs decrease, and testing is implemented within precision medicine delivery systems attuned to ICU needs. URGS, RGS, and RES are currently performed in <5% of the ~200,000 children likely to benefit annually due to lack of payor coverage, inadequate reimbursement, hospital policies, hospitalist unfamiliarity, under-recognition of possible genetic diseases, and current formatting as tests rather than as a rapid precision medicine delivery system. The gap between actual and optimal outcomes in children in ICUs is currently increasing since expanded use of URGS, RGS, and RES lags growth in those likely to benefit through new therapies. There is sufficient evidence to conclude that URGS, RGS, or RES should be considered in all children with diseases of uncertain etiology at ICU admission. Minimally, diagnostic URGS, RGS, or RES should be ordered early during admissions of critically ill infants and children with suspected genetic diseases.
Affiliation(s)
- Stephen F Kingsmore
- Rady Children's Institute for Genomic Medicine, Rady Children's Hospital, San Diego, CA, USA.
- Russell Nofsinger
- Rady Children's Institute for Genomic Medicine, Rady Children's Hospital, San Diego, CA, USA
- Kasia Ellsworth
- Rady Children's Institute for Genomic Medicine, Rady Children's Hospital, San Diego, CA, USA
9
Putman TE, Schaper K, Matentzoglu N, Rubinetti V, Alquaddoomi F, Cox C, Caufield JH, Elsarboukh G, Gehrke S, Hegde H, Reese J, Braun I, Bruskiewich R, Cappelletti L, Carbon S, Caron A, Chan L, Chute C, Cortes K, De Souza V, Fontana T, Harris N, Hartley E, Hurwitz E, Jacobsen JB, Krishnamurthy M, Laraway B, McLaughlin J, McMurry J, Moxon ST, Mullen K, O’Neil S, Shefchek K, Stefancsik R, Toro S, Vasilevsky N, Walls R, Whetzel P, Osumi-Sutherland D, Smedley D, Robinson P, Mungall C, Haendel M, Munoz-Torres M. The Monarch Initiative in 2024: an analytic platform integrating phenotypes, genes and diseases across species. Nucleic Acids Res 2024; 52:D938-D949. [PMID: 38000386 PMCID: PMC10767791 DOI: 10.1093/nar/gkad1082] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/18/2023] [Revised: 10/21/2023] [Accepted: 11/02/2023] [Indexed: 11/26/2023] Open
Abstract
Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app.
Affiliation(s)
- Tim E Putman
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Kevin Schaper
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Vincent P Rubinetti
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Faisal S Alquaddoomi
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Corey Cox
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- J Harry Caufield
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Glass Elsarboukh
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Sarah Gehrke
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Harshad Hegde
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Justin T Reese
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Ian Braun
- Data Collaboration Center, Critical Path Institute, Tucson, AZ 85718, USA
- Seth Carbon
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Anita R Caron
- European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
- Lauren E Chan
- College of Public Health and Human Sciences, Oregon State University, Corvallis, OR 97331, USA
- Christopher G Chute
- Schools of Medicine, Public Health, and Nursing, Johns Hopkins University, Baltimore, MD 21205, USA
- Katherina G Cortes
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Tommaso Fontana
- Dipartimento di Informatica, Università degli Studi di Milano Statale, Milano, Italy
- Nomi L Harris
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Emily L Hartley
- Data Collaboration Center, Critical Path Institute, Tucson, AZ 85718, USA
- Eric Hurwitz
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Julius O B Jacobsen
- William Harvey Research Institute, Queen Mary University of London, London EC1M 6BQ, UK
- Madan Krishnamurthy
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Bryan J Laraway
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Julie A McMurry
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Sierra A T Moxon
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Kathleen R Mullen
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Shawn T O’Neil
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Kent A Shefchek
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Ray Stefancsik
- European Bioinformatics Institute (EMBL-EBI), Hinxton CB10 1SD, UK
- Sabrina Toro
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Ramona L Walls
- Data Collaboration Center, Critical Path Institute, Tucson, AZ 85718, USA
- Patricia L Whetzel
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Damian Smedley
- William Harvey Research Institute, Queen Mary University of London, London EC1M 6BQ, UK
- Peter N Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Christopher J Mungall
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Melissa A Haendel
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Monica C Munoz-Torres
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
10
Lin Z, Liu L, Li X, Huang S, Zhao H, Zeng S, Yang H, Xie Y, Zhang R. Phenotype-driven reanalysis reveals five novel pathogenic variants in 40 exome-negative families with Charcot-Marie-Tooth Disease. J Neurol 2024; 271:497-503. [PMID: 37776383 DOI: 10.1007/s00415-023-11991-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2023] [Revised: 09/01/2023] [Accepted: 09/06/2023] [Indexed: 10/02/2023]
Abstract
BACKGROUND To identify genetic causes in 40 whole exome sequencing (WES)-negative Charcot-Marie-Tooth (CMT) families and provide a summary of the clinical and genetic features of the diagnosed patients. METHODS The clinical information and sequencing data of 40 WES-negative families out of 131 CMT families were collected, and phenotype-driven reanalysis was conducted using the Exomiser software. RESULTS The molecular diagnosis was regained in 4 families, increasing the overall diagnosis rate by 3.0%. One family with adolescent-onset pure CMT1 was diagnosed [POLR3B: c.2810G>A (p.R937Q)] due to the novel genotype-phenotype association. One infantile-onset, severe CMT1 family with deep sensory disturbance was diagnosed by screening the BAM file and harbored c.1174C>T (p.R392*) and 875_927delinsCTGCCCACTCTGCCCACTCTGCCCACTCTG (p.V292Afs53) of PRX. Two families were diagnosed due to characteristic phenotypes, including an infantile-onset ICMT family with renal dysfunction harboring c.213_233delinsGAGGAGCA (p.S72Rfs34) of INF2 and an adolescent-onset CMT2 family with optic atrophy harboring c.560C>T (p.P187L) and c.616A>G (p.K206E) of SLC25A46. According to the American College of Medical Genetics and Genomics (ACMG) guidelines, the variants of POLR3B and SLC25A46 were classified as likely pathogenic, and the variants of INF2 and PRX were pathogenic. All these variants were first reported worldwide except for p.R392* of PRX. CONCLUSIONS We identified five novel pathogenic variants in POLR3B, PRX, INF2, and SLC25A46, which broaden their phenotypic and genotypic spectrums. Regular phenotype-driven reanalysis is a powerful strategy for increasing the diagnostic yield of WES-negative CMT patients, and long-term follow-up and screening BAM files for contiguous deletion and missense variants are both essential for reanalysis.
Affiliation(s)
- Zhiqiang Lin
  - Department of Neurology, The Third Xiangya Hospital, Central South University, Changsha, 410013, China
  - Department of Neurology, Shenzhen Hospital, Southern Medical University, Shenzhen, China
- Lei Liu
  - Department of Neurology, The Third Xiangya Hospital, Central South University, Changsha, 410013, China
  - Health Management Center, The Third Xiangya Hospital, Central South University, Changsha, China
- Xiaobo Li
  - Department of Neurology, The Third Xiangya Hospital, Central South University, Changsha, 410013, China
- Shunxiang Huang
  - Department of Neurology, The Third Xiangya Hospital, Central South University, Changsha, 410013, China
- Huadong Zhao
  - Department of Neurology, The Third Xiangya Hospital, Central South University, Changsha, 410013, China
- Sen Zeng
  - Department of Neurology, The Third Xiangya Hospital, Central South University, Changsha, 410013, China
- Honglan Yang
  - Department of Neurology, The Third Xiangya Hospital, Central South University, Changsha, 410013, China
- Yongzhi Xie
  - Department of Neurology, The Third Xiangya Hospital, Central South University, Changsha, 410013, China
  - Department of Radiology, The Third Xiangya Hospital, Central South University, Changsha, China
- Ruxu Zhang
  - Department of Neurology, The Third Xiangya Hospital, Central South University, Changsha, 410013, China
11
|
Danzi MC, Dohrn MF, Fazal S, Beijer D, Rebelo AP, Cintra V, Züchner S. Deep structured learning for variant prioritization in Mendelian diseases. Nat Commun 2023; 14:4167. [PMID: 37443090 PMCID: PMC10345112 DOI: 10.1038/s41467-023-39306-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 05/11/2022] [Accepted: 06/07/2023] [Indexed: 07/15/2023] Open
Abstract
Effective computer-aided or automated variant evaluations for monogenic diseases will expedite clinical diagnostic and research efforts of known and novel disease-causing genes. Here we introduce MAVERICK: a Mendelian Approach to Variant Effect pRedICtion built in Keras. MAVERICK is an ensemble of transformer-based neural networks that can classify a wide range of protein-altering single nucleotide variants (SNVs) and indels and assess whether a variant would be pathogenic in the context of dominant or recessive inheritance. We demonstrate that MAVERICK outperforms all other major programs that assess pathogenicity in a Mendelian context. In a cohort of 644 previously solved patients with Mendelian diseases, MAVERICK ranks the causative pathogenic variant within the top five variants in over 95% of cases. Seventy-six percent of cases were solved by the top-ranked variant. MAVERICK ranks the causative pathogenic variant in hitherto novel disease genes within the first five candidate variants in 70% of cases. MAVERICK has already facilitated the identification of a novel disease gene causing a degenerative motor neuron disease. These results represent a significant step towards automated identification of causal variants in patients with Mendelian diseases.
Affiliation(s)
- Matt C Danzi
  - Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- Maike F Dohrn
  - Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
  - Department of Neurology, Medical Faculty of the RWTH Aachen University, Aachen, Germany
- Sarah Fazal
  - Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- Danique Beijer
  - Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- Adriana P Rebelo
  - Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- Vivian Cintra
  - Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- Stephan Züchner
  - Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
12
|
Henry OJ, Stödberg T, Båtelson S, Rasi C, Stranneheim H, Wedell A. Individualised human phenotype ontology gene panels improve clinical whole exome and genome sequencing analytical efficacy in a cohort of developmental and epileptic encephalopathies. Mol Genet Genomic Med 2023; 11:e2167. [PMID: 36967109 PMCID: PMC10337286 DOI: 10.1002/mgg3.2167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 02/21/2023] [Accepted: 03/01/2023] [Indexed: 07/20/2023] Open
Abstract
BACKGROUND: The majority of genetic epilepsies remain unsolved in terms of specific genotype. Phenotype-based genomic analyses have shown potential to strengthen genomic analysis in various ways, including improving analytical efficacy. METHODS: We have tested a standardised phenotyping method termed 'Phenomodels' for integrating deep-phenotyping information with our in-house developed clinical whole exome/genome sequencing analytical pipeline. Phenomodels includes a user-friendly epilepsy phenotyping template and an objective measure for selecting which template terms to include in individualised Human Phenotype Ontology (HPO) gene panels. In a pilot study of 38 previously solved cases of developmental and epileptic encephalopathies, we compared the sensitivity and specificity of the individualised HPO gene panels with the clinical epilepsy gene panel. RESULTS: The Phenomodels template showed high sensitivity for capturing relevant phenotypic information, where 37/38 individuals' HPO gene panels included the causative gene. The HPO gene panels also had far fewer variants to assess than the epilepsy gene panel. CONCLUSION: We have demonstrated a viable approach for incorporating standardised phenotype information into clinical genomic analyses, which may enable more efficient analysis.
Affiliation(s)
- Olivia J. Henry
  - Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
- Tommy Stödberg
  - Department of Women's and Children's Health, Karolinska Institutet, Stockholm, Sweden
  - Department of Pediatric Neurology, Karolinska University Hospital, Stockholm, Sweden
- Sofia Båtelson
  - Department of Pediatric Neurology, Karolinska University Hospital, Stockholm, Sweden
- Chiara Rasi
  - Science for Life Laboratory, Department of Microbiology, Tumour and Cell Biology, Karolinska Institutet, Stockholm, Sweden
- Henrik Stranneheim
  - Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
  - Science for Life Laboratory, Department of Microbiology, Tumour and Cell Biology, Karolinska Institutet, Stockholm, Sweden
  - Centre for Inherited Metabolic Diseases, Karolinska University Hospital, Stockholm, Sweden
- Anna Wedell
  - Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
  - Centre for Inherited Metabolic Diseases, Karolinska University Hospital, Stockholm, Sweden
13
|
Reiley J, Botas P, Miller CE, Zhao J, Malone Jenkins S, Best H, Grubb PH, Mao R, Isla J, Brunelli L. Open-Source Artificial Intelligence System Supports Diagnosis of Mendelian Diseases in Acutely Ill Infants. Children (Basel) 2023; 10:991. [PMID: 37371223 PMCID: PMC10296792 DOI: 10.3390/children10060991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Revised: 05/18/2023] [Accepted: 05/30/2023] [Indexed: 06/29/2023]
Abstract
Mendelian disorders are prevalent in neonatal and pediatric intensive care units and are a leading cause of morbidity and mortality in these settings. Current diagnostic pipelines that integrate phenotypic and genotypic data are expert-dependent and time-intensive. Artificial intelligence (AI) tools may help address these challenges. Dx29 is an open-source AI tool designed for use by clinicians. It analyzes the patient's phenotype and genotype to generate a ranked differential diagnosis. We used Dx29 to retrospectively analyze 25 acutely ill infants who had been diagnosed with a Mendelian disorder, using a targeted panel of ~5000 genes. For each case, a trio (proband and both parents) file containing gene variant information was analyzed, alongside patient phenotype, which was provided to Dx29 by three approaches: (1) AI extraction from medical records, (2) AI extraction with manual review/editing, and (3) manual entry. We then identified the rank of the correct diagnosis in Dx29's differential diagnosis. With these three approaches, Dx29 ranked the correct diagnosis in the top 10 in 92-96% of cases. These results suggest that non-expert use of Dx29's automated phenotyping and subsequent data analysis may compare favorably to standard workflows utilized by bioinformatics experts to analyze genomic data and diagnose Mendelian diseases.
Affiliation(s)
- Joseph Reiley
  - Division of Neonatology, Department of Pediatrics, University of Utah School of Medicine, Salt Lake City, UT 84108, USA
- Pablo Botas
  - Foundation Twenty-Nine, 28223 Madrid, Spain
  - Nostos Genomics, 10625 Berlin, Germany
- Christine E. Miller
  - ARUP Laboratories, University of Utah Health Sciences Center, Salt Lake City, UT 84108, USA
  - Valley Children’s Healthcare, Madera, CA 93636, USA
- Jian Zhao
  - ARUP Laboratories, University of Utah Health Sciences Center, Salt Lake City, UT 84108, USA
  - Department of Pathology, University of Utah School of Medicine, Salt Lake City, UT 84132, USA
- Sabrina Malone Jenkins
  - Division of Neonatology, Department of Pediatrics, University of Utah School of Medicine, Salt Lake City, UT 84108, USA
- Hunter Best
  - ARUP Laboratories, University of Utah Health Sciences Center, Salt Lake City, UT 84108, USA
  - Department of Pathology, University of Utah School of Medicine, Salt Lake City, UT 84132, USA
- Peter H. Grubb
  - Division of Neonatology, Department of Pediatrics, University of Utah School of Medicine, Salt Lake City, UT 84108, USA
- Rong Mao
  - ARUP Laboratories, University of Utah Health Sciences Center, Salt Lake City, UT 84108, USA
  - Department of Pathology, University of Utah School of Medicine, Salt Lake City, UT 84132, USA
- Luca Brunelli
  - Division of Neonatology, Department of Pediatrics, University of Utah School of Medicine, Salt Lake City, UT 84108, USA
14
|
Fan Y, Zhou Y, Liu H, Luo X, Xu T, Sun Y, Yang T, Chen L, Gu X, Yu Y. Improving variant prioritization in exome analysis by entropy-weighted ensemble of multiple tools. Clin Genet 2023; 103:190-199. [PMID: 36309956 DOI: 10.1111/cge.14257] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/08/2022] [Revised: 10/09/2022] [Accepted: 10/22/2022] [Indexed: 01/07/2023]
Abstract
Variant prioritization is a crucial step in the analysis of exome and genome sequencing. Multiple phenotype-driven tools have been developed to automate the variant prioritization process, but the efficacy of these tools in clinical settings with fuzzy phenotypic information, and whether an ensemble of these tools could outperform a single algorithm, remain to be assessed. A large rare disease cohort with heterogeneous phenotypic information, comprising a primary cohort of 1614 patients and a replication cohort of 1904 patients referred for exome sequencing, was recruited to assess the efficacy of variant prioritization tools and their ensemble. Three freely available tools (Exomiser, Xrare, and DeepPVP) and their ensemble were evaluated. The performance of all three tools was influenced by the attributes of the phenotypic input. When the three tools were combined by the weighted-sum entropy method (EWE3), the ensemble outperformed any single algorithm, ranking the diagnostic variant in the top 3 in 78% of cases (a 13% improvement over the best single tool; Exomiser: 63%, Xrare: 65%, DeepPVP: 51%), in the top 10 in 88%, and in the top 30 in 96%. The results were replicated in an independent cohort. Our study supports using an entropy-weighted ensemble of multiple tools to improve variant prioritization and accelerate molecular diagnosis in exome/genome sequencing.
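To illustrate the idea of an entropy-weighted ensemble, the sketch below applies the textbook entropy weight method to per-tool variant scores. It is a minimal illustration under stated assumptions, not the EWE3 implementation: the abstract does not specify the normalization or the inputs, so the function names (`entropy_weights`, `ensemble_rank`), the score matrix layout, and the assumption that each tool emits a non-negative score where higher means more likely causal are all hypothetical.

```python
import math

def entropy_weights(scores):
    """Entropy weight method (generic textbook form, assumed here).

    scores: list of rows, one per candidate variant; each row holds one
    non-negative score per tool. Returns one weight per tool.
    """
    n = len(scores)
    m = len(scores[0])
    if n < 2:
        raise ValueError("need at least two candidates to compute entropy")
    divergences = []
    for j in range(m):
        column = [row[j] for row in scores]
        total = sum(column)
        if total <= 0:
            # A tool that scores every variant as zero carries no information.
            divergences.append(0.0)
            continue
        entropy = 0.0
        for x in column:
            p = x / total
            if p > 0:
                entropy -= p * math.log(p)
        entropy /= math.log(n)  # normalize entropy to the range [0, 1]
        # Low entropy means the tool discriminates strongly, so it gets more weight.
        divergences.append(max(0.0, 1.0 - entropy))
    d_sum = sum(divergences)
    if d_sum == 0:
        return [1.0 / m] * m  # fall back to equal weights
    return [d / d_sum for d in divergences]

def ensemble_rank(variants, scores):
    """Rank variants by the entropy-weighted sum of their per-tool scores."""
    weights = entropy_weights(scores)
    combined = [sum(w * s for w, s in zip(weights, row)) for row in scores]
    order = sorted(range(len(variants)), key=lambda i: combined[i], reverse=True)
    return [variants[i] for i in order], weights

# Hypothetical scores from two tools for three candidate variants.
ranking, weights = ensemble_rank(
    ["v1", "v2", "v3"],
    [[0.9, 0.5], [0.1, 0.5], [0.0, 0.5]],
)
print(ranking, weights)
```

In this toy input the second tool gives every variant the same score, so its entropy is maximal and nearly all of the weight goes to the first tool. A real pipeline would also need to put the tools' outputs on a comparable scale before weighting.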
Affiliation(s)
- Yanjie Fan
  - Shanghai Institute of Pediatric Research, Xinhua Hospital affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China
- Huili Liu
  - Shanghai Institute of Pediatric Research, Xinhua Hospital affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China
- Xiaomei Luo
  - Shanghai Institute of Pediatric Research, Xinhua Hospital affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China
- Ting Xu
  - Shanghai Institute of Pediatric Research, Xinhua Hospital affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China
- Yu Sun
  - Shanghai Institute of Pediatric Research, Xinhua Hospital affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China
- Tingting Yang
  - Shanghai Institute of Pediatric Research, Xinhua Hospital affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China
- Linlin Chen
  - Shanghai Institute of Pediatric Research, Xinhua Hospital affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China
- Xuefan Gu
  - Shanghai Institute of Pediatric Research, Xinhua Hospital affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China
- Yongguo Yu
  - Shanghai Institute of Pediatric Research, Xinhua Hospital affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China
15
|
Tosco-Herrera E, Muñoz-Barrera A, Jáspez D, Rubio-Rodríguez LA, Mendoza-Alvarez A, Rodriguez-Perez H, Jou J, Iñigo-Campos A, Corrales A, Ciuffreda L, Martinez-Bugallo F, Prieto-Morin C, García-Olivares V, González-Montelongo R, Lorenzo-Salazar JM, Marcelino-Rodriguez I, Flores C. Evaluation of a whole-exome sequencing pipeline and benchmarking of causal germline variant prioritizers. Hum Mutat 2022; 43:2010-2020. [PMID: 36054330 DOI: 10.1002/humu.24459] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/29/2022] [Revised: 08/20/2022] [Accepted: 08/30/2022] [Indexed: 01/25/2023]
Abstract
Most causal variants of Mendelian diseases are exonic. Whole-exome sequencing (WES) has become the diagnostic gold standard, but causative variant prioritization constitutes a bottleneck. Here we assessed an in-house sample-to-sequence pipeline and benchmarked free prioritization tools for germline causal variants from WES data. WES of 61 unselected patients with a known genetic disease cause was obtained. Variant prioritizations were performed by diverse tools and recorded to obtain a diagnostic yield when the causal variant was ranked first, within the top five, or within the top 10. A fraction of causal variants was not captured by WES (8.2%) or did not pass the quality control criteria (13.1%). Most of the applications inspected were unavailable or had technical limitations, leaving nine tools for complete evaluation. Exomiser performed best for top-1 rankings, while LIRICAL led for top-5 rankings. Based on the more conservative top-10 rankings, Xrare had the highest diagnostic yield, followed by a three-way tie among Exomiser, LIRICAL, and PhenIX, then followed by AMELIE, TAPES, Phen-Gen, AIVar, and VarNote-PAT. Xrare, Exomiser, LIRICAL, and PhenIX are the most efficient options for variant prioritization in real patient WES data.
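The top-1, top-5, and top-10 diagnostic yields used in benchmarks of this kind can be computed as in the sketch below. It assumes each solved case is summarized by the rank a tool assigned to the known causal variant, with `None` when the tool did not report it. The function name `top_k_yield` and the example ranks are illustrative and are not taken from the study.

```python
def top_k_yield(causal_ranks, ks=(1, 5, 10)):
    """Fraction of cases whose causal variant is ranked within the top k.

    causal_ranks: one entry per solved case; the 1-based rank a tool gave
    the known causal variant, or None if the variant was not reported.
    """
    n = len(causal_ranks)
    if n == 0:
        raise ValueError("no cases supplied")
    return {
        k: sum(1 for r in causal_ranks if r is not None and r <= k) / n
        for k in ks
    }

# Hypothetical ranks for eight cases from one prioritization tool.
ranks = [1, 3, 7, None, 12, 1, 5, 10]
print(top_k_yield(ranks))  # {1: 0.25, 5: 0.5, 10: 0.75}
```

Cases in which the causal variant was missed entirely stay in the denominator, so failures to capture or report a variant lower the yield and are not silently dropped.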
Affiliation(s)
- Eva Tosco-Herrera
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain; Escuela de Doctorado y Estudios de Posgrado de la Universidad de La Laguna (EDEPULL), Universidad de La Laguna (ULL), San Cristóbal de La Laguna, Spain
- Adrián Muñoz-Barrera
  - Escuela de Doctorado y Estudios de Posgrado de la Universidad de La Laguna (EDEPULL), Universidad de La Laguna (ULL), San Cristóbal de La Laguna, Spain; Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- David Jáspez
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- Luis A Rubio-Rodríguez
  - Escuela de Doctorado y Estudios de Posgrado de la Universidad de La Laguna (EDEPULL), Universidad de La Laguna (ULL), San Cristóbal de La Laguna, Spain; Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- Alejandro Mendoza-Alvarez
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain; Escuela de Doctorado y Estudios de Posgrado de la Universidad de La Laguna (EDEPULL), Universidad de La Laguna (ULL), San Cristóbal de La Laguna, Spain
- Hector Rodriguez-Perez
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain; Escuela de Doctorado y Estudios de Posgrado de la Universidad de La Laguna (EDEPULL), Universidad de La Laguna (ULL), San Cristóbal de La Laguna, Spain
- Jonathan Jou
  - Department of Surgery, University of Illinois College of Medicine, Peoria, Illinois, USA
- Antonio Iñigo-Campos
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- Almudena Corrales
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain; CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Laura Ciuffreda
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain
- Francisco Martinez-Bugallo
  - Clinical Analysis Service, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain
- Carol Prieto-Morin
  - Clinical Analysis Service, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain
- Víctor García-Olivares
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- Jose Miguel Lorenzo-Salazar
  - Escuela de Doctorado y Estudios de Posgrado de la Universidad de La Laguna (EDEPULL), Universidad de La Laguna (ULL), San Cristóbal de La Laguna, Spain; Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- Carlos Flores
  - Research Unit, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain; Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain; CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain; Facultad de Ciencias de la Salud, Universidad Fernando Pessoa Canarias, Las Palmas de Gran Canaria, Spain
16
|
Phenotype-aware prioritisation of rare Mendelian disease variants. Trends Genet 2022; 38:1271-1283. [PMID: 35934592 PMCID: PMC9950798 DOI: 10.1016/j.tig.2022.07.002] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 03/31/2022] [Revised: 06/06/2022] [Accepted: 07/05/2022] [Indexed: 01/24/2023]
Abstract
A molecular diagnosis from the analysis of sequencing data in rare Mendelian diseases has a huge impact on the management of patients and their families. Numerous patient phenotype-aware variant prioritisation (VP) tools have been developed to help automate this process and shorten the diagnostic odyssey, but performance statistics on real patient data are limited. Here we identify, assess, and compare the performance of all up-to-date, freely available, and programmatically accessible tools using a whole-exome, retinal disease dataset from 134 individuals with a molecular diagnosis. All tools were able to identify around two-thirds of the genetic diagnoses as the top-ranked candidate, with LIRICAL performing best overall. Finally, we discuss the challenges that must be overcome, given that most cases remain undiagnosed after current state-of-the-art practices.
17
|
Raimondi D, Orlando G, Verplaetse N, Fariselli P, Moreau Y. Editorial: Towards genome interpretation: Computational methods to model the genotype-phenotype relationship. Front Bioinform 2022; 2:1098941. [PMID: 36530385 PMCID: PMC9749061 DOI: 10.3389/fbinf.2022.1098941] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/15/2022] [Accepted: 11/17/2022] [Indexed: 11/12/2023] Open
Affiliation(s)
- Piero Fariselli
  - Department of Medical Sciences, University of Torino, Torino, Italy
18
|
Exploration of Tools for the Interpretation of Human Non-Coding Variants. Int J Mol Sci 2022; 23:ijms232112977. [PMID: 36361767 PMCID: PMC9654743 DOI: 10.3390/ijms232112977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 10/17/2022] [Accepted: 10/23/2022] [Indexed: 02/01/2023] Open
Abstract
The advent of Whole Genome Sequencing (WGS) broadened the genetic variation detection range, revealing the presence of variants even in non-coding regions of the genome, which would have been missed using targeted approaches. One of the most challenging issues in WGS analysis regards the interpretation of annotated variants. This review focuses on tools suitable for the functional annotation of variants falling into non-coding regions. It couples the description of non-coding genomic areas with the results and performance of existing tools for a functional interpretation of the effect of variants in these regions. Tools were tested in a controlled genomic scenario, representing the ground-truth and allowing us to determine software performance.
19
|
Johnson B, Ouyang K, Frank L, Truty R, Rojahn S, Morales A, Aradhya S, Nykamp K. Systematic use of phenotype evidence in clinical genetic testing reduces the frequency of variants of uncertain significance. Am J Med Genet A 2022; 188:2642-2651. [PMID: 35570716 PMCID: PMC9544038 DOI: 10.1002/ajmg.a.62779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Accepted: 04/23/2022] [Indexed: 01/24/2023]
Abstract
Guidelines for variant interpretation include criteria for incorporating phenotype evidence, but this evidence is inconsistently applied. Systematic approaches to using phenotype evidence are needed. We developed a method for curating disease phenotypes as highly or moderately predictive of variant pathogenicity based on the frequency of their association with disease-causing variants. To evaluate this method's accuracy, we retrospectively reviewed variants with clinical classifications that had evolved from uncertain to definitive in genes associated with curated predictive phenotypes. To demonstrate the clinical validity and utility of this approach, we compared variant classifications determined with and without predictive phenotype evidence. The curation method was accurate for 93%-98% of eligible variants. Among variants interpreted using highly predictive phenotype evidence, the percentage classified as pathogenic or likely pathogenic was 80%, compared with 46%-54% had the evidence not been used. Positive results among individuals harboring variants with highly predictive phenotype-guided interpretations would have been missed in 25%-37% of diagnostic tests and 39%-50% of carrier screens had other approaches to phenotype evidence been used. In summary, predictive phenotype evidence associated with specific curated genes can be systematically incorporated into variant interpretation to reduce uncertainty and increase the clinical utility of genetic testing.
Affiliation(s)
- Ana Morales
  - Invitae Corporation, San Francisco, California, USA
20
|
Liu C, Ta CN, Havrilla JM, Nestor JG, Spotnitz ME, Geneslaw AS, Hu Y, Chung WK, Wang K, Weng C. OARD: Open annotations for rare diseases and their phenotypes based on real-world data. Am J Hum Genet 2022; 109:1591-1604. [PMID: 35998640 PMCID: PMC9502051 DOI: 10.1016/j.ajhg.2022.08.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Accepted: 08/01/2022] [Indexed: 11/23/2022] Open
Abstract
Diagnosis of rare genetic diseases often relies on phenotype-driven methods, which hinge on the accuracy and completeness of the rare disease phenotypes in the underlying annotation knowledgebase. Existing knowledgebases are often manually curated with additional annotations found in published case reports. Despite their potential, real-world data such as electronic health records (EHRs) have not been fully exploited to derive rare disease annotations. Here, we present open annotation for rare diseases (OARD), a real-world-data-derived resource with annotation for rare-disease-related phenotypes. This resource is derived from the EHRs of two academic health institutions containing more than 10 million individuals spanning wide age ranges and different disease subgroups. By leveraging ontology mapping and advanced natural-language-processing (NLP) methods, OARD automatically and efficiently extracts concepts for both rare diseases and their phenotypic traits from billing codes and lab tests as well as over 100 million clinical narratives. The rare disease prevalence derived by OARD is highly correlated with that annotated in the original rare disease knowledgebase. By performing association analysis, we identified more than 1 million novel disease-phenotype association pairs that were previously missed by human annotation, and >60% were confirmed as true associations via manual review of a list of sampled pairs. Compared to the manually curated annotation, OARD is 100% data driven and its pipeline can be shared across different institutions. By supporting privacy-preserving sharing of aggregated summary statistics, such as term frequencies and disease-phenotype associations, it fills an important gap to facilitate data-driven research in the rare disease community.
Affiliation(s)
- Cong Liu
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Casey N Ta
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Jim M Havrilla
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Jordan G Nestor
  - Division of Nephrology, Department of Medicine, Columbia University, New York, NY 10032, USA
- Matthew E Spotnitz
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Andrew S Geneslaw
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Yu Hu
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Wendy K Chung
  - Department of Pediatrics, Columbia University Irving Medical Center, New York, NY 10032, USA
- Kai Wang
  - Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
21
|
Chin HL, Gazzaz N, Huynh S, Handra I, Warnock L, Moller-Hansen A, Boerkoel P, Jacobsen JOB, du Souich C, Zhang N, Shefchek K, Prentice LM, Washington N, Haendel M, Armstrong L, Clarke L, Li WL, Smedley D, Robinson PN, Boerkoel CF. The Clinical Variant Analysis Tool: Analyzing the evidence supporting reported genomic variation in clinical practice. Genet Med 2022; 24:1512-1522. [PMID: 35442193 PMCID: PMC9363005 DOI: 10.1016/j.gim.2022.03.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/20/2021] [Revised: 03/15/2022] [Accepted: 03/16/2022] [Indexed: 01/03/2023] Open
Abstract
PURPOSE: Genomic test results, regardless of laboratory variant classification, require clinical practitioners to judge the applicability of a variant for medical decisions. Teaching and standardizing clinical interpretation of genomic variation calls for a methodology or tool. METHODS: To generate such a tool, we distilled the Clinical Genome Resource framework of causality and the American College of Medical Genetics/Association of Molecular Pathology and Quest Diagnostic Laboratory scoring of variant deleteriousness into the Clinical Variant Analysis Tool (CVAT). Applying this to 289 clinical exome reports, we compared the performance of junior practitioners with that of experienced medical geneticists and assessed the utility of reported variants. RESULTS: CVAT enabled performance comparable to that of experienced medical geneticists. In total, 124 of 289 (42.9%) exome reports and 146 of 382 (38.2%) reported variants supported a diagnosis. Overall, 10.5% (1 pathogenic [P] or likely pathogenic [LP] variant and 39 variants of uncertain significance [VUS]) of variants were reported in genes without established disease association; 20.2% (23 P/LP and 54 VUS) were in genes without sufficient phenotypic concordance; 7.3% (15 P/LP and 13 VUS) conflicted with the known molecular disease mechanism; and 24% (91 VUS) had insufficient evidence for deleteriousness. CONCLUSION: Implementation of CVAT standardized clinical interpretation of genomic variation and emphasized the need for collaborative and transparent reporting of genomic variation.
Affiliation(s)
- Hui-Lin Chin
- Department of Medical Genetics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Provincial Medical Genetics Program, Women's Hospital of British Columbia, Vancouver, British Columbia, Canada; Khoo Teck Puat-National University Children's Medical Institute, National University Hospital, Singapore, Singapore
- Nour Gazzaz
- Department of Medical Genetics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Provincial Medical Genetics Program, Women's Hospital of British Columbia, Vancouver, British Columbia, Canada; Department of Pediatrics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Department of Pediatrics, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Stephanie Huynh
- Department of Medical Genetics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Provincial Medical Genetics Program, Women's Hospital of British Columbia, Vancouver, British Columbia, Canada
- Iulia Handra
- Department of Medical Genetics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Provincial Medical Genetics Program, Women's Hospital of British Columbia, Vancouver, British Columbia, Canada
- Lynn Warnock
- Provincial Medical Genetics Program, Women's Hospital of British Columbia, Vancouver, British Columbia, Canada
- Ashley Moller-Hansen
- Provincial Medical Genetics Program, Women's Hospital of British Columbia, Vancouver, British Columbia, Canada
- Pierre Boerkoel
- MD Undergraduate Program, Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Julius O B Jacobsen
- William Harvey Research Institute, Barts & The London School of Medicine & Dentistry, Queen Mary University of London, London, United Kingdom
- Christèle du Souich
- Department of Medical Genetics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada
- Kent Shefchek
- Oregon Clinical and Translational Science Institute, Oregon Health & Science University, Portland, OR
- Leah M Prentice
- Provincial Laboratory Medicine Services, Provincial Health Services Authority, Vancouver, British Columbia, Canada
- Melissa Haendel
- Oregon Clinical and Translational Science Institute, Oregon Health & Science University, Portland, OR
- Linlea Armstrong
- Department of Medical Genetics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Provincial Medical Genetics Program, Women's Hospital of British Columbia, Vancouver, British Columbia, Canada
- Lorne Clarke
- Department of Medical Genetics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Provincial Medical Genetics Program, Women's Hospital of British Columbia, Vancouver, British Columbia, Canada
- Damian Smedley
- William Harvey Research Institute, Barts & The London School of Medicine & Dentistry, Queen Mary University of London, London, United Kingdom
- Cornelius F Boerkoel
- Department of Medical Genetics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Provincial Medical Genetics Program, Women's Hospital of British Columbia, Vancouver, British Columbia, Canada.
22
Alghamdi SM, Schofield PN, Hoehndorf R. How much do model organism phenotypes contribute to the computational identification of human disease genes? Dis Model Mech 2022; 15:275986. [PMID: 35758016 PMCID: PMC9366895 DOI: 10.1242/dmm.049441] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/24/2021] [Accepted: 06/13/2022] [Indexed: 12/04/2022] Open
Abstract
Computing phenotypic similarity helps identify new disease genes and diagnose rare diseases. Genotype–phenotype data from orthologous genes in model organisms can compensate for lack of human data and increase genome coverage. In the past decade, cross-species phenotype comparisons have proven valuable, and several ontologies have been developed for this purpose. The relative contribution of different model organisms to computational identification of disease-associated genes is not fully explored. We used phenotype ontologies to semantically relate phenotypes resulting from loss-of-function mutations in model organisms to disease-associated phenotypes in humans. Semantic machine learning methods were used to measure the contribution of different model organisms to the identification of known human gene–disease associations. We found that mouse genotype–phenotype data provided the most important dataset in the identification of human disease genes by semantic similarity and machine learning over phenotype ontologies. Other model organisms' data did not improve identification over that obtained using the mouse alone, and therefore did not contribute significantly to this task. Our work has implications for the development of integrated phenotype ontologies, as well as for the use of model organism phenotypes in human genetic variant interpretation. This article has an associated First Person interview with the first author of the paper. Editor's choice: We investigated the use of model organism phenotypes in the computational identification of disease genes, identifying several data biases and concluding that mouse model phenotypes contribute most to computational disease gene identification.
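To make the idea of ranking genes by phenotypic similarity concrete, here is a minimal sketch that scores candidate genes by the Jaccard overlap of ancestor-closed phenotype term sets. It is an illustration only and not the semantic machine learning method used in the paper; the toy ontology, term names, gene names, and function names are all invented for the example.

```python
# Toy is-a hierarchy (child -> parents); every term name here is invented.
TOY_ONTOLOGY = {
    "abnormal heart morphology": ["cardiovascular phenotype"],
    "abnormal heart rhythm": ["cardiovascular phenotype"],
    "cardiovascular phenotype": ["phenotype"],
    "abnormal gait": ["nervous system phenotype"],
    "nervous system phenotype": ["phenotype"],
    "phenotype": [],
}

def with_ancestors(terms, ontology):
    """Return the given terms plus all of their ancestors in the ontology."""
    closed, stack = set(), list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(ontology.get(term, []))
    return closed

def jaccard_similarity(terms_a, terms_b, ontology):
    """Jaccard index of the two ancestor-closed term sets (0.0 if both are empty)."""
    a = with_ancestors(terms_a, ontology)
    b = with_ancestors(terms_b, ontology)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_genes(disease_terms, gene_to_terms, ontology):
    """Order genes from most to least similar to the disease phenotype set."""
    scored = [
        (jaccard_similarity(disease_terms, terms, ontology), gene)
        for gene, terms in gene_to_terms.items()
    ]
    return [gene for _, gene in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]

# Hypothetical disease and model-organism gene annotations.
disease = {"abnormal heart morphology"}
model_genes = {
    "geneA": {"abnormal heart rhythm"},
    "geneB": {"abnormal gait"},
}
ranking = rank_genes(disease, model_genes, TOY_ONTOLOGY)
```

Closing each term set under its ancestors lets two different but related terms share credit through their common parent, which is why the cardiac gene ranks above the gait gene in this toy example even though neither matches the disease term exactly.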
Affiliation(s)
- Sarah M Alghamdi
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, 4700 KAUST, 23955 Thuwal, Saudi Arabia
- Paul N Schofield
- Department of Physiology, Development & Neuroscience, University of Cambridge, Downing Street, CB2 3EG, Cambridge, UK
- Robert Hoehndorf
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, 4700 KAUST, 23955 Thuwal, Saudi Arabia
23
A Formative Study of the Implementation of Whole Genome Sequencing in Northern Ireland. Genes (Basel) 2022; 13:genes13071104. [PMID: 35885887 PMCID: PMC9316942 DOI: 10.3390/genes13071104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 06/13/2022] [Accepted: 06/14/2022] [Indexed: 02/05/2023] Open
Abstract
Background: The UK 100,000 Genomes Project was a transformational research project which facilitated whole genome sequencing (WGS) diagnostics for rare diseases. We evaluated experiences of introducing WGS in Northern Ireland, providing recommendations for future projects. Methods: This formative evaluation included (1) an appraisal of the logistics of implementing and delivering WGS, (2) a survey of participant self-reported views and experiences, (3) semi-structured interviews with healthcare staff as key informants who were involved in the delivery of WGS and (4) a workshop discussion about interprofessional collaboration with respect to molecular diagnostics. Results: We engaged with >400 participants, with detailed reflections obtained from 74 participants including patients, caregivers, key National Health Service (NHS) informants, and researchers (patient survey n = 42; semi-structured interviews n = 19; attendees of the discussion workshop n = 13). Overarching themes included the need to improve rare disease awareness, education, and support services, as well as interprofessional collaboration being central to an effective, mainstreamed molecular diagnostic service. Conclusions: Recommendations for streamlining precision medicine for patients with rare diseases include administrative improvements (e.g., streamlining of the consent process), educational improvements (e.g., rare disease training provided from undergraduate to postgraduate education alongside genomics training for non-genetic specialists) and analytical improvements (e.g., multidisciplinary collaboration and improved computational infrastructure).
24
Cohen ASA, Farrow EG, Abdelmoity AT, Alaimo JT, Amudhavalli SM, Anderson JT, Bansal L, Bartik L, Baybayan P, Belden B, Berrios CD, Biswell RL, Buczkowicz P, Buske O, Chakraborty S, Cheung WA, Coffman KA, Cooper AM, Cross LA, Curran T, Dang TTT, Elfrink MM, Engleman KL, Fecske ED, Fieser C, Fitzgerald K, Fleming EA, Gadea RN, Gannon JL, Gelineau-Morel RN, Gibson M, Goldstein J, Grundberg E, Halpin K, Harvey BS, Heese BA, Hein W, Herd SM, Hughes SS, Ilyas M, Jacobson J, Jenkins JL, Jiang S, Johnston JJ, Keeler K, Korlach J, Kussmann J, Lambert C, Lawson C, Le Pichon JB, Leeder JS, Little VC, Louiselle DA, Lypka M, McDonald BD, Miller N, Modrcin A, Nair A, Neal SH, Oermann CM, Pacicca DM, Pawar K, Posey NL, Price N, Puckett LMB, Quezada JF, Raje N, Rowell WJ, Rush ET, Sampath V, Saunders CJ, Schwager C, Schwend RM, Shaffer E, Smail C, Soden S, Strenk ME, Sullivan BR, Sweeney BR, Tam-Williams JB, Walter AM, Welsh H, Wenger AM, Willig LK, Yan Y, Younger ST, Zhou D, Zion TN, Thiffault I, Pastinen T. Genomic answers for children: Dynamic analyses of >1000 pediatric rare disease genomes. Genet Med 2022; 24:1336-1348. [PMID: 35305867 DOI: 10.1016/j.gim.2022.02.007] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 10/15/2021] [Revised: 02/05/2022] [Accepted: 02/07/2022] [Indexed: 12/17/2022] Open
Abstract
PURPOSE: This study aimed to provide comprehensive diagnostic and candidate analyses in a pediatric rare disease cohort through the Genomic Answers for Kids program. METHODS: Extensive analyses of 960 families with suspected genetic disorders included short-read exome sequencing and short-read genome sequencing (srGS); PacBio HiFi long-read genome sequencing (HiFi-GS); variant calling for single nucleotide variants (SNVs), structural variants (SVs), and repeat variants; and machine-learning variant prioritization. Structured phenotypes, prioritized variants, and pedigrees were stored in a PhenoTips database, with data sharing through the controlled-access database of Genotypes and Phenotypes. RESULTS: Diagnostic rates ranged from 11% in patients with prior negative genetic testing to 34.5% in naive patients. Incorporating SVs from genome sequencing added up to 13% of new diagnoses in previously unsolved cases. HiFi-GS yielded an increased discovery rate, with >4-fold more rare coding SVs compared with srGS. Variants and genes of unknown significance remain the most common finding (58% of nondiagnostic cases). CONCLUSION: Computational prioritization is efficient for diagnostic SNVs. Thorough identification of non-SNVs remains challenging and is partly mitigated using HiFi-GS sequencing. Importantly, community research is supported by sharing real-time data to accelerate gene validation and by providing HiFi variant (SNV/SV) resources from >1000 human alleles to facilitate implementation of new sequencing platforms for rare disease diagnoses.
Affiliation(s)
- Ana S A Cohen
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO; Department of Pathology and Laboratory Medicine, Children's Mercy Kansas City, Kansas City, MO; UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO
- Emily G Farrow
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO; UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Joseph T Alaimo
- Department of Pathology and Laboratory Medicine, Children's Mercy Kansas City, Kansas City, MO; UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO
- Shivarajan M Amudhavalli
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- John T Anderson
- Department of Orthopaedic Surgery, Children's Mercy Kansas City, Kansas City, MO
- Lalit Bansal
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Lauren Bartik
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Bradley Belden
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Rebecca L Biswell
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Warren A Cheung
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Keith A Coffman
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Ashley M Cooper
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Laura A Cross
- Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Tom Curran
- Children's Mercy Research Institute, Kansas City, MO
- Thuy Tien T Dang
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Mary M Elfrink
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Erin D Fecske
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Cynthia Fieser
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Keely Fitzgerald
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Emily A Fleming
- Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Randi N Gadea
- Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Rose N Gelineau-Morel
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Margaret Gibson
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Jeffrey Goldstein
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Elin Grundberg
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Kelsee Halpin
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Brian S Harvey
- Department of Orthopaedic Surgery, Children's Mercy Kansas City, Kansas City, MO
- Bryce A Heese
- Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Wendy Hein
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Suzanne M Herd
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Susan S Hughes
- Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Mohammed Ilyas
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Jill Jacobson
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Janda L Jenkins
- Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Kathryn Keeler
- Department of Orthopaedic Surgery, Children's Mercy Kansas City, Kansas City, MO
- Jonas Korlach
- Pacific Biosciences of California, Inc, Menlo Park, CA
- Caitlin Lawson
- Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Vicki C Little
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Neil Miller
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO; UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Division of Allergy Immunology Pulmonary and Sleep Medicine, Children's Mercy Kansas City, Kansas City, MO
- Ann Modrcin
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Annapoorna Nair
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Shelby H Neal
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Donna M Pacicca
- Department of Orthopaedic Surgery, Children's Mercy Kansas City, Kansas City, MO
- Kailash Pawar
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Nyshele L Posey
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Nigel Price
- Department of Orthopaedic Surgery, Children's Mercy Kansas City, Kansas City, MO
- Laura M B Puckett
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Julio F Quezada
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Nikita Raje
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Division of Neonatology, Children's Mercy Kansas City, Kansas City, MO
- Eric T Rush
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Division of Genetics, Children's Mercy Kansas City, Kansas City, MO; Department of Internal Medicine, University of Kansas School of Medicine, Kansas City, MO
- Venkatesh Sampath
- Division of Neonatology, Children's Mercy Hospital Kansas City, Kansas City, MO
- Carol J Saunders
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO; Department of Pathology and Laboratory Medicine, Children's Mercy Kansas City, Kansas City, MO; UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO
- Caitlin Schwager
- Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Richard M Schwend
- Department of Orthopaedic Surgery, Children's Mercy Kansas City, Kansas City, MO
- Elizabeth Shaffer
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Craig Smail
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Sarah Soden
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Meghan E Strenk
- Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Brooke R Sweeney
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Adam M Walter
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Holly Welsh
- Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Laurel K Willig
- Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Yun Yan
- UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO
- Scott T Younger
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO
- Dihong Zhou
- Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Tricia N Zion
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO; UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Department of Pediatrics, Children's Mercy Kansas City, Kansas City, MO; Division of Genetics, Children's Mercy Kansas City, Kansas City, MO
- Isabelle Thiffault
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO; Department of Pathology and Laboratory Medicine, Children's Mercy Kansas City, Kansas City, MO; UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO.
- Tomi Pastinen
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO; UKMC School of Medicine, University of Missouri Kansas City, Kansas City, MO; Children's Mercy Research Institute, Kansas City, MO.
25
Jacobsen JOB, Kelly C, Cipriani V, Robinson PN, Smedley D. Evaluation of phenotype-driven gene prioritization methods for Mendelian diseases. Brief Bioinform 2022; 23:6589867. [PMID: 35595299 PMCID: PMC9487604 DOI: 10.1093/bib/bbac188] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/28/2022] [Revised: 03/18/2022] [Accepted: 04/25/2022] [Indexed: 11/14/2022] Open
Abstract
Yuan et al. recently described an independent evaluation of several phenotype-driven gene prioritization methods for Mendelian disease on two separate clinical datasets. Although they attempted to use default settings for each tool, we describe three key differences from those we currently recommend for our Exomiser and PhenIX tools. These influence how variant frequency, quality and predicted pathogenicity are used for filtering and prioritization. We propose that these differences account for much of the discrepancy in performance between that reported by them (15–26% of diagnoses ranked top by Exomiser) and previously published reports by us and others (72–77%). On a set of 161 singleton samples, we show that using these settings increases performance from 34% to 72%, and we suggest that a reassessment of Exomiser and PhenIX on their datasets using these settings would show a similar uplift.
Affiliation(s)
- Julius O B Jacobsen
- William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, EC1M 6BQ London, UK
- Catherine Kelly
- William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, EC1M 6BQ London, UK
- Valentina Cipriani
- William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, EC1M 6BQ London, UK
- Peter N Robinson
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
- Damian Smedley
- William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, EC1M 6BQ London, UK
26
Marques P, Korbonits M. Approach to the Patient With Pseudoacromegaly. J Clin Endocrinol Metab 2022; 107:1767-1788. [PMID: 34792134 DOI: 10.1210/clinem/dgab789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2021] [Indexed: 11/19/2022]
Abstract
Pseudoacromegaly encompasses a heterogeneous group of conditions in which patients have clinical features of acromegaly or gigantism, but no excess of GH or IGF-1. Acromegaloid physical features or accelerated growth in a patient may prompt referral to endocrinologists. Because pseudoacromegaly conditions are rare and heterogeneous, often with overlapping clinical features, the underlying diagnosis may be challenging to establish. As many of these conditions have a genetic origin, such as pachydermoperiostosis, Sotos syndrome, Weaver syndrome, or Cantú syndrome, collaboration with clinical geneticists is key in the diagnosis of these patients. Although these conditions are rare, awareness of them and of their characteristic features will help their timely recognition.
Affiliation(s)
- Pedro Marques
- Centre for Endocrinology, William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, EC1M 6BQ London, UK
- Endocrinology Department, Hospital de Santa Maria, Centro Hospitalar Universitário de Lisboa Norte, Lisboa, Portugal
- Márta Korbonits
- Centre for Endocrinology, William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, EC1M 6BQ London, UK
27
Austin-Tse CA, Jobanputra V, Perry DL, Bick D, Taft RJ, Venner E, Gibbs RA, Young T, Barnett S, Belmont JW, Boczek N, Chowdhury S, Ellsworth KA, Guha S, Kulkarni S, Marcou C, Meng L, Murdock DR, Rehman AU, Spiteri E, Thomas-Wilson A, Kearney HM, Rehm HL. Best practices for the interpretation and reporting of clinical whole genome sequencing. NPJ Genom Med 2022; 7:27. [PMID: 35395838 PMCID: PMC8993917 DOI: 10.1038/s41525-022-00295-z] [Citation(s) in RCA: 52] [Impact Index Per Article: 26.0] [Received: 08/04/2021] [Accepted: 02/17/2022] [Indexed: 01/19/2023] Open
Abstract
Whole genome sequencing (WGS) shows promise as a first-tier diagnostic test for patients with rare genetic disorders. However, standards addressing the definition and deployment practice of a best-in-class test are lacking. To address these gaps, the Medical Genome Initiative, a consortium of leading health care and research organizations in the US and Canada, was formed to expand access to high quality clinical WGS by convening experts and publishing best practices. Here, we present best practice recommendations for the interpretation and reporting of clinical diagnostic WGS, including discussion of challenges and emerging approaches that will be critical to harness the full potential of this comprehensive test.
Affiliation(s)
- Christina A Austin-Tse
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA.
- Laboratory for Molecular Medicine, Mass General Brigham Personalized Medicine, Cambridge, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Vaidehi Jobanputra
- Molecular Diagnostics Laboratory, New York Genome Center, New York, NY, USA
- Department of Pathology and Cell Biology, Columbia University Irving Medical Center, New York, NY, USA
- David Bick
- HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA
- Eric Venner
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Richard A Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Ted Young
- Genome Diagnostics, Department of Paediatric Laboratory Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Sarah Barnett
- Division of Laboratory Genetics and Genomics, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
- Nicole Boczek
- Division of Laboratory Genetics and Genomics, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
- Center for Individualized Medicine, College of Medicine, Mayo Clinic, Rochester, MN, USA
- Shimul Chowdhury
- Rady Children's Institute for Genomic Medicine, San Diego, CA, USA
- Saurav Guha
- Molecular Diagnostics Laboratory, New York Genome Center, New York, NY, USA
- Shashikant Kulkarni
- Baylor Genetics and Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Cherisse Marcou
- Division of Laboratory Genetics and Genomics, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
- Linyan Meng
- Baylor Genetics and Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- David R Murdock
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Atteeq U Rehman
- Molecular Diagnostics Laboratory, New York Genome Center, New York, NY, USA
- Elizabeth Spiteri
- Department of Pathology, Stanford Medicine, Stanford University, Stanford, CA, USA
- Hutton M Kearney
- Division of Laboratory Genetics and Genomics, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
- Heidi L Rehm
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
28
Marwaha S, Knowles JW, Ashley EA. A guide for the diagnosis of rare and undiagnosed disease: beyond the exome. Genome Med 2022; 14:23. [PMID: 35220969 PMCID: PMC8883622 DOI: 10.1186/s13073-022-01026-w] [Citation(s) in RCA: 113] [Impact Index Per Article: 56.5] [Received: 06/09/2021] [Accepted: 02/10/2022] [Indexed: 02/07/2023] Open
Abstract
Rare diseases affect 30 million people in the USA and more than 300-400 million worldwide, often causing chronic illness, disability, and premature death. Traditional diagnostic techniques rely heavily on heuristic approaches, coupling clinical experience from prior rare disease presentations with the medical literature. A large number of rare disease patients remain undiagnosed for years and many even die without an accurate diagnosis. In recent years, gene panels, microarrays, and exome sequencing have helped to identify the molecular cause of such rare and undiagnosed diseases. These technologies have allowed diagnoses for a sizable proportion (25-35%) of undiagnosed patients, often with actionable findings. However, a large proportion of these patients remain undiagnosed. In this review, we focus on technologies that can be adopted if exome sequencing is unrevealing. We discuss the benefits of sequencing the whole genome and the additional benefit that may be offered by long-read technology, pan-genome reference, transcriptomics, metabolomics, proteomics, and methyl profiling. We highlight computational methods to help identify regionally distant patients with similar phenotypes or similar genetic mutations. Finally, we describe approaches to automate and accelerate genomic analysis. The strategies discussed here are intended to serve as a guide for clinicians and researchers in the next steps when encountering patients with non-diagnostic exomes.
Affiliation(s)
- Shruti Marwaha
- Department of Medicine, Division of Cardiovascular Medicine, School of Medicine, Stanford University, Stanford, CA, USA.
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA.
- Joshua W Knowles
- Department of Medicine, Division of Cardiovascular Medicine, School of Medicine, Stanford University, Stanford, CA, USA
- Department of Medicine, Diabetes Research Center, Cardiovascular Institute and Prevention Research Center, Stanford, CA, USA
- Euan A Ashley
- Department of Medicine, Division of Cardiovascular Medicine, School of Medicine, Stanford University, Stanford, CA, USA.
- Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA.
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA.
29
Methods to Improve Molecular Diagnosis in Genomic Cold Cases in Pediatric Neurology. Genes (Basel) 2022; 13:genes13020333. [PMID: 35205378 PMCID: PMC8871714 DOI: 10.3390/genes13020333] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/03/2022] [Revised: 02/06/2022] [Accepted: 02/07/2022] [Indexed: 02/04/2023] Open
Abstract
During the last decade, genetic testing has emerged as an important etiological diagnostic tool for Mendelian diseases, including pediatric neurological conditions. A genetic diagnosis has a considerable impact on disease management and treatment; however, many cases remain undiagnosed after applying standard diagnostic sequencing techniques. This review discusses various methods to improve the molecular diagnostic rates in these genomic cold cases. We discuss extended analysis methods to consider, including non-Mendelian inheritance models, mosaicism, dual/multiple diagnoses, periodic re-analysis, artificial intelligence tools, and deep phenotyping, in addition to integrating various omics methods to improve variant prioritization. Last, novel genomic technologies, including long-read sequencing, artificial long-read sequencing, and optical genome mapping, are discussed. In conclusion, a more comprehensive molecular analysis and a timely re-analysis of unsolved cases are imperative to improve diagnostic rates. In addition, our current understanding of the human genome is still limited due to restrictions in technologies. Novel technologies are now available that improve upon some of these limitations and can capture all human genomic variation more accurately. Last, we recommend a more routine implementation of high molecular weight DNA extraction methods that are coherent with the ability to use and/or optimally benefit from these novel genomic methods.
30
Leung ML, Ji J, Baker S, Buchan JG, Sivakumaran TA, Krock BL, Hutchins R, Bayrak-Toydemir P, Pfeifer J, Cremona ML, Funke B, Santani AB. A Framework of Critical Considerations in Clinical Exome Reanalyses by Clinical and Laboratory Standards Institute. J Mol Diagn 2022; 24:177-188. [PMID: 35074075 DOI: 10.1016/j.jmoldx.2021.11.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/05/2021] [Revised: 10/20/2021] [Accepted: 11/02/2021] [Indexed: 11/28/2022]
Abstract
Exome reanalysis is useful for providing molecular diagnoses for previously uninformative samples. However, challenges exist in implementing a practical solution for clinicians and laboratories. This study complements the current literature by providing practical considerations for patient-level and cohort-level reanalyses. The Clinical and Laboratory Standards Institute assembled the Document Development Committee and an interpretation working group that developed the framework for reevaluation of exome-based data. We describe two distinct but complementary approaches toward exome reanalyses: clinician-initiated patient-level reanalysis, and laboratory-initiated cohort-level reanalysis. We highlight the advantages and constraints for both approaches, and provide a high-level conceptual guide for ordering clinicians and laboratories through the critical decision pathways. Because clinical exome sequencing continues to be the standard of care in genetics, exome reanalysis would be critical in increasing the overall diagnostic yield. A systematic guide will facilitate the efficient adoption of reevaluation of exome data for laboratories, health care professionals, genetic counselors, and clinicians.
Affiliation(s)
- Marco L Leung
- Departments of Pathology and Pediatrics, The Ohio State University College of Medicine, Columbus, Ohio; The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, Ohio; Department of Pathology and Laboratory Medicine, Nationwide Children's Hospital, Columbus, Ohio
- Jianling Ji
- Department of Pathology, Keck School of Medicine, University of Southern California, Los Angeles, California; Center for Personalized Medicine, Department of Pathology and Laboratory Medicine, Children's Hospital Los Angeles, Los Angeles, California
- Samuel Baker
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- Jillian G Buchan
- Department of Laboratory Medicine and Pathology, University of Washington, Seattle, Washington
- Theru A Sivakumaran
- Division of Pathology and Laboratory Medicine, Phoenix Children's Hospital, Phoenix, Arizona
- Pinar Bayrak-Toydemir
- Department of Pathology, The University of Utah, Salt Lake City, Utah; ARUP Laboratories, Salt Lake City, Utah
- John Pfeifer
- Department of Pathology and Immunology, Washington University School of Medicine in St. Louis, St. Louis, Missouri
- Avni B Santani
- Center for Applied Genomics, Children's Hospital of Philadelphia, Pennsylvania; Veritas Genetics, Boston, Massachusetts.
31
Raimondi D, Corso M, Fariselli P, Moreau Y. From genotype to phenotype in Arabidopsis thaliana: in-silico genome interpretation predicts 288 phenotypes from sequencing data. Nucleic Acids Res 2021; 50:e16. [PMID: 34792168 PMCID: PMC8860592 DOI: 10.1093/nar/gkab1099] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/31/2021] [Revised: 10/06/2021] [Accepted: 10/22/2021] [Indexed: 01/09/2023]
Abstract
In many cases, the unprecedented availability of data provided by high-throughput sequencing has shifted the bottleneck from a data availability issue to a data interpretation issue, thus delaying the promised breakthroughs in genetics and precision medicine, as regards human genetics, and in phenotype prediction to improve plant adaptation to climate change and resistance to bioaggressors, as regards plant sciences. In this paper, we propose a novel Genome Interpretation paradigm, which aims at directly modeling the genotype-to-phenotype relationship, and we focus on A. thaliana since it is the best-studied model organism in plant genetics. Our model, called Galiana, is the first end-to-end Neural Network (NN) approach following the genomes in/phenotypes out paradigm, and it is trained to predict 288 real-valued Arabidopsis thaliana phenotypes from whole-genome sequencing data. We show that 75 of these phenotypes are predicted with a Pearson correlation ≥0.4, and are mostly related to flowering traits. We show that our end-to-end NN approach achieves better performance and larger phenotype coverage than models predicting single phenotypes from the GWAS-derived known associated genes. Galiana is also fully interpretable, thanks to gradient-based saliency map approaches. We followed this interpretation approach to identify 36 novel genes that are likely to be associated with flowering traits, finding evidence for 6 of them in the existing literature.
Affiliation(s)
- Massimiliano Corso
- Institut Jean-Pierre Bourgin, Université Paris-Saclay, INRAE, AgroParisTech, 78000 Versailles, France
- Piero Fariselli
- Department of Medical Sciences, University of Torino, 10123 Torino, Italy
- Yves Moreau
- ESAT-STADIUS, KU Leuven, 3001 Leuven, Belgium
32
De La Vega FM, Chowdhury S, Moore B, Frise E, McCarthy J, Hernandez EJ, Wong T, James K, Guidugli L, Agrawal PB, Genetti CA, Brownstein CA, Beggs AH, Löscher BS, Franke A, Boone B, Levy SE, Õunap K, Pajusalu S, Huentelman M, Ramsey K, Naymik M, Narayanan V, Veeraraghavan N, Billings P, Reese MG, Yandell M, Kingsmore SF. Artificial intelligence enables comprehensive genome interpretation and nomination of candidate diagnoses for rare genetic diseases. Genome Med 2021; 13:153. [PMID: 34645491 PMCID: PMC8515723 DOI: 10.1186/s13073-021-00965-0] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 03/22/2021] [Accepted: 08/27/2021] [Indexed: 01/08/2023]
Abstract
BACKGROUND Clinical interpretation of genetic variants in the context of the patient's phenotype is becoming the largest component of cost and time expenditure for genome-based diagnosis of rare genetic diseases. Artificial intelligence (AI) holds promise to greatly simplify and speed genome interpretation by integrating predictive methods with the growing knowledge of genetic disease. Here we assess the diagnostic performance of Fabric GEM, a new, AI-based, clinical decision support tool for expediting genome interpretation. METHODS We benchmarked GEM in a retrospective cohort of 119 probands, mostly NICU infants, diagnosed with rare genetic diseases, who received whole-genome or whole-exome sequencing (WGS, WES). We replicated our analyses in a separate cohort of 60 cases collected from five academic medical centers. For comparison, we also analyzed these cases with current state-of-the-art variant prioritization tools. Included in the comparisons were trio, duo, and singleton cases. Variants underpinning diagnoses spanned diverse modes of inheritance and types, including structural variants (SVs). Patient phenotypes were extracted from clinical notes by two means: manually and using an automated clinical natural language processing (CNLP) tool. Finally, 14 previously unsolved cases were reanalyzed. RESULTS GEM ranked over 90% of the causal genes among the top or second candidate and prioritized for review a median of 3 candidate genes per case, using either manually curated or CNLP-derived phenotype descriptions. Ranking of trios and duos was unchanged when analyzed as singletons. In 17 of 20 cases with diagnostic SVs, GEM identified the causal SVs as the top candidate and in 19/20 within the top five, irrespective of whether SV calls were provided or inferred ab initio by GEM using its own internal SV detection algorithm. GEM showed similar performance in absence of parental genotypes. Analysis of 14 previously unsolved cases resulted in a novel finding for one case, candidates ultimately not advanced upon manual review for 3 cases, and no new findings for 10 cases. CONCLUSIONS GEM enabled diagnostic interpretation inclusive of all variant types through automated nomination of a very short list of candidate genes and disorders for final review and reporting. In combination with deep phenotyping by CNLP, GEM enables substantial automation of genetic disease diagnosis, potentially decreasing cost and expediting case review.
Affiliation(s)
- Francisco M. De La Vega
- Fabric Genomics Inc., Oakland, CA USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA USA
- Current Address: Tempus Labs Inc., Redwood City, CA 94065 USA
- Shimul Chowdhury
- Rady Children’s Institute for Genomic Medicine, San Diego, CA USA
- Barry Moore
- Department of Human Genetics, Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT USA
- Edgar Javier Hernandez
- Department of Human Genetics, Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT USA
- Terence Wong
- Rady Children’s Institute for Genomic Medicine, San Diego, CA USA
- Kiely James
- Rady Children’s Institute for Genomic Medicine, San Diego, CA USA
- Lucia Guidugli
- Rady Children’s Institute for Genomic Medicine, San Diego, CA USA
- Pankaj B. Agrawal
- Division of Genetics and Genomics, The Manton Center for Orphan Disease Research, Boston Children’s Hospital, Harvard Medical School, Boston, MA USA
- Division of Newborn Medicine, Boston Children’s Hospital, Boston, MA USA
- Casie A. Genetti
- Division of Genetics and Genomics, The Manton Center for Orphan Disease Research, Boston Children’s Hospital, Harvard Medical School, Boston, MA USA
- Catherine A. Brownstein
- Division of Genetics and Genomics, The Manton Center for Orphan Disease Research, Boston Children’s Hospital, Harvard Medical School, Boston, MA USA
- Alan H. Beggs
- Division of Genetics and Genomics, The Manton Center for Orphan Disease Research, Boston Children’s Hospital, Harvard Medical School, Boston, MA USA
- Britt-Sabina Löscher
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel & University Hospital Schleswig-Holstein, Kiel, Germany
- Andre Franke
- Institute of Clinical Molecular Biology, Christian-Albrechts-University of Kiel & University Hospital Schleswig-Holstein, Kiel, Germany
- Braden Boone
- HudsonAlpha Institute for Biotechnology, Huntsville, AL USA
- Shawn E. Levy
- HudsonAlpha Institute for Biotechnology, Huntsville, AL USA
- Katrin Õunap
- Department of Clinical Genetics, United Laboratories, Tartu University Hospital, Tartu, Estonia
- Department of Clinical Genetics, Institute of Clinical Medicine, University of Tartu, Tartu, Estonia
- Sander Pajusalu
- Department of Clinical Genetics, United Laboratories, Tartu University Hospital, Tartu, Estonia
- Department of Clinical Genetics, Institute of Clinical Medicine, University of Tartu, Tartu, Estonia
- Matt Huentelman
- Center for Rare Childhood Disorders, Translational Genomics Research Institute, Phoenix, AZ USA
- Keri Ramsey
- Center for Rare Childhood Disorders, Translational Genomics Research Institute, Phoenix, AZ USA
- Marcus Naymik
- Center for Rare Childhood Disorders, Translational Genomics Research Institute, Phoenix, AZ USA
- Vinodh Narayanan
- Center for Rare Childhood Disorders, Translational Genomics Research Institute, Phoenix, AZ USA
- Mark Yandell
- Fabric Genomics Inc., Oakland, CA USA
- Department of Human Genetics, Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT USA
33
Hay E, Wilson LC, Hoskins B, Samuels M, Munot P, Rahman S. Biallelic P4HTM variants associated with HIDEA syndrome and mitochondrial respiratory chain complex I deficiency. Eur J Hum Genet 2021; 29:1536-1541. [PMID: 34285383 PMCID: PMC8484625 DOI: 10.1038/s41431-021-00932-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/15/2021] [Revised: 06/06/2021] [Accepted: 06/21/2021] [Indexed: 02/07/2023]
Abstract
We report a patient with profound congenital hypotonia, central hypoventilation, poor visual behaviour with retinal hypopigmentation, and significantly decreased mitochondrial respiratory chain complex I activity in muscle, who died at 7 months of age having made minimal developmental progress. Biallelic predicted truncating P4HTM variants were identified following trio whole-genome sequencing, consistent with a diagnosis of hypotonia, hypoventilation, intellectual disability, dysautonomia, epilepsy and eye abnormalities (HIDEA) syndrome. Very few patients with HIDEA syndrome have been reported previously and mitochondrial abnormalities were observed in three of four previous cases who had a muscle biopsy, suggesting the possibility that HIDEA syndrome represents a primary mitochondrial disorder. P4HTM encodes a transmembrane prolyl 4-hydroxylase with putative targets including hypoxia inducible factors, RNA polymerase II and activating transcription factor 4, which has been implicated in the integrated stress response observed in cell and animal models of mitochondrial disease, and may explain the mitochondrial dysfunction observed in HIDEA syndrome.
Affiliation(s)
- Eleanor Hay
- Department of Clinical Genetics, Great Ormond Street Hospital, London, UK (grid.420468.c)
- Louise C. Wilson
- Department of Clinical Genetics, Great Ormond Street Hospital, London, UK (grid.420468.c)
- Bethan Hoskins
- North Thames Regional Genetic Laboratory, Great Ormond Street Hospital, London, UK (grid.420468.c)
- Martin Samuels
- Department of Respiratory Medicine, Great Ormond Street Hospital, London, UK (grid.420468.c)
- Pinki Munot
- Department of Neurosciences, Dubowitz Neuromuscular Centre, Great Ormond Street Hospital, London, UK (grid.420468.c)
- Shamima Rahman
- UCL Great Ormond Street Institute of Child Health, UCL, 30 Guilford Street, London, WC1N 1EH, UK (grid.83440.3b, ISNI 0000000121901201)
34
Kafkas Ş, Althubaiti S, Gkoutos GV, Hoehndorf R, Schofield PN. Linking common human diseases to their phenotypes; development of a resource for human phenomics. J Biomed Semantics 2021; 12:17. [PMID: 34425897 PMCID: PMC8383460 DOI: 10.1186/s13326-021-00249-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/18/2021] [Accepted: 07/30/2021] [Indexed: 11/11/2022]
Abstract
Background In recent years a large volume of clinical genomics data has become available due to rapid advances in sequencing technologies. Efficient exploitation of this genomics data requires linkage to patient phenotype profiles. Current resources providing disease-phenotype associations are not comprehensive, and they often do not have broad coverage of the disease terminologies, particularly ICD-10, which is still the primary terminology used in clinical settings. Methods We developed two approaches to gather disease-phenotype associations. First, we used a text mining method that utilizes semantic relations in phenotype ontologies, and applies statistical methods to extract associations between diseases in ICD-10 and phenotype ontology classes from the literature. Second, we developed a semi-automatic way to collect ICD-10–phenotype associations from existing resources containing known relationships. Results We generated four datasets. Two of them are independent datasets linking diseases to their phenotypes based on text mining and semi-automatic strategies. The remaining two datasets are generated from these datasets and cover a subset of ICD-10 classes of common diseases contained in UK Biobank. We extensively validated our text mined and semi-automatically curated datasets by: comparing them against an expert-curated validation dataset containing disease–phenotype associations, measuring their similarity to disease–phenotype associations found in public databases, and assessing how well they could be used to recover gene–disease associations using phenotype similarity. Conclusion We find that our text mining method can produce phenotype annotations of diseases that are correct but often too general to have significant information content, or too specific to accurately reflect the typical manifestations of the sporadic disease. On the other hand, the datasets generated from integrating multiple knowledgebases are more complete (i.e., cover more of the required phenotype annotations for a given disease). We make all data freely available at 10.5281/zenodo.4726713. Supplementary Information The online version contains supplementary material available at (10.1186/s13326-021-00249-x).
Affiliation(s)
- Şenay Kafkas
- Computational Bioscience Research Center (CBRC), Computer, Electrical, and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955, Saudi Arabia
- Sara Althubaiti
- Computational Bioscience Research Center (CBRC), Computer, Electrical, and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955, Saudi Arabia
- Georgios V Gkoutos
- Health Data Research UK, Midlands site, Edgbaston, Birmingham, B15 2TT, United Kingdom; Institute of Cancer and Genomic Sciences, University of Birmingham, Edgbaston, Birmingham, B15 2TT, United Kingdom
- Robert Hoehndorf
- Computational Bioscience Research Center (CBRC), Computer, Electrical, and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955, Saudi Arabia
- Paul N Schofield
- Department of Physiology, Development & Neuroscience, University of Cambridge, Downing Street, Cambridge, CB2 3EG, United Kingdom