1
Huang DL, Zeng Q, Xiong Y, Liu S, Pang C, Xia M, Fang T, Ma Y, Qiang C, Zhang Y, Zhang Y, Li H, Yuan Y. A Combined Manual Annotation and Deep-Learning Natural Language Processing Study on Accurate Entity Extraction in Hereditary Disease Related Biomedical Literature. Interdiscip Sci 2024:10.1007/s12539-024-00605-2. [PMID: 38340264 DOI: 10.1007/s12539-024-00605-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 01/02/2024] [Accepted: 01/03/2024] [Indexed: 02/12/2024]
Abstract
We report a combined manual annotation and deep-learning natural language processing study aimed at accurate entity extraction from hereditary disease-related biomedical literature. A total of 400 full articles were manually annotated based on published guidelines by experienced genetic interpreters at Beijing Genomics Institute (BGI). The performance of our manual annotations was assessed by comparing our re-annotated results with those publicly available. The overall Jaccard index was calculated to be 0.866 for the four entity types: gene, variant, disease, and species. Both a BERT-based large named entity recognition (NER) model and a DistilBERT-based simplified NER model were trained, validated, and tested. Because the manually annotated corpus is limited, these NER models were fine-tuned in two phases. The F1-scores of BERT-based NER for gene, variant, disease and species are 97.28%, 93.52%, 92.54% and 95.76%, respectively, while those of DistilBERT-based NER are 95.14%, 86.26%, 91.37% and 89.92%, respectively. Most importantly, the entity type of variant has been extracted by a large language model for the first time, and an F1-score comparable with that of the state-of-the-art variant extraction model tmVar has been achieved.
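The two agreement measures named in this abstract can be illustrated with a minimal Python sketch. It assumes that each annotation is an exact-match `(start, end, entity_type)` span; the abstract does not say how spans were matched, so the matching rule, the function names, and the toy spans below are all illustrative assumptions rather than the authors' procedure.

```python
def jaccard_index(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two annotation sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # two empty annotation sets agree trivially
    return len(a & b) / len(union)


def f1_score(gold, predicted):
    """F1 = harmonic mean of precision and recall under exact span matching."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical (start, end, entity_type) spans, for illustration only.
reannotated = {(0, 5, "gene"), (10, 18, "variant"), (25, 40, "disease")}
public = {(0, 5, "gene"), (10, 18, "variant"), (42, 47, "species")}

agreement = jaccard_index(reannotated, public)  # 2 shared spans out of 4 distinct
f1 = f1_score(public, reannotated)              # precision = recall = 2/3
```

With exact matching, a span whose boundaries differ by one character counts as a miss, so a relaxed (overlap-based) matching rule would generally give higher scores.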
Affiliation(s)
- Dao-Ling Huang
  - BGI Research, Shenzhen, 518083, China
  - Clinical Laboratory of BGI Health, BGI-Shenzhen, Shenzhen, 518083, China
- Quanlei Zeng
  - BGI-Wuhan Clinical Laboratories, BGI-Shenzhen, Wuhan, 430074, China
- Yun Xiong
  - BGI-Wuhan Clinical Laboratories, BGI-Shenzhen, Wuhan, 430074, China
- Shuixia Liu
  - BGI-Wuhan Clinical Laboratories, BGI-Shenzhen, Wuhan, 430074, China
- Chaoqun Pang
  - BGI-Wuhan Clinical Laboratories, BGI-Shenzhen, Wuhan, 430074, China
- Menglei Xia
  - BGI-Wuhan Clinical Laboratories, BGI-Shenzhen, Wuhan, 430074, China
- Ting Fang
  - BGI-Wuhan Clinical Laboratories, BGI-Shenzhen, Wuhan, 430074, China
- Yanli Ma
  - BGI-Wuhan Clinical Laboratories, BGI-Shenzhen, Wuhan, 430074, China
- Cuicui Qiang
  - BGI-Wuhan Clinical Laboratories, BGI-Shenzhen, Wuhan, 430074, China
- Yi Zhang
  - BGI-Wuhan Clinical Laboratories, BGI-Shenzhen, Wuhan, 430074, China
- Yu Zhang
  - BGI-Wuhan Clinical Laboratories, BGI-Shenzhen, Wuhan, 430074, China
- Hong Li
  - BGI-Wuhan Clinical Laboratories, BGI-Shenzhen, Wuhan, 430074, China
- Yuying Yuan
  - Clinical Laboratory of BGI Health, BGI-Shenzhen, Shenzhen, 518083, China
2
Henderson ML, Zieba JK, Li X, Campbell DB, Williams MR, Vogt DL, Bupp CP, Edgerly YM, Rajasekaran S, Hartog NL, Prokop JW, Krueger JM. Gene Therapy for Genetic Syndromes: Understanding the Current State to Guide Future Care. BIOTECH 2024; 13:1. [PMID: 38247731 PMCID: PMC10801589 DOI: 10.3390/biotech13010001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 12/08/2023] [Accepted: 12/21/2023] [Indexed: 01/23/2024] [Open Access]
Abstract
Gene therapy holds promise as a life-changing option for individuals with genetic variants that give rise to disease. FDA-approved gene therapies for Spinal Muscular Atrophy (SMA), cerebral adrenoleukodystrophy, β-Thalassemia, hemophilia A/B, retinal dystrophy, and Duchenne Muscular Dystrophy have generated buzz around the ability to change the course of genetic syndromes. However, this excitement risks over-expansion into areas of genetic disease that may not fit the current state of gene therapy. While in situ (targeted to an area) and ex vivo (removal of cells, delivery, and administration of cells) approaches show promise, they have a limited target ability. Broader in vivo gene therapy trials have shown various continued challenges, including immune response, use of immune suppressants correlating to secondary infections, unknown outcomes of overexpression, and challenges in driving tissue-specific corrections. Viral delivery systems can be associated with adverse outcomes such as hepatotoxicity and lethality if uncontrolled. In some cases, these risks are far outweighed by the potentially lethal syndromes for which these systems are being developed. Therefore, it is critical to evaluate the field of genetic diseases to perform cost-benefit analyses for gene therapy. In this work, we present the current state while setting forth tools and resources to guide informed directions to avoid foreseeable issues in gene therapy that could prevent the field from continued success.
Affiliation(s)
- Marian L. Henderson
  - The Department of Biology, Calvin University, Grand Rapids, MI 49546, USA
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 48824, USA
- Jacob K. Zieba
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 48824, USA
- Xiaopeng Li
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 48824, USA
- Daniel B. Campbell
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 48824, USA
- Michael R. Williams
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 48824, USA
- Daniel L. Vogt
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 48824, USA
- Caleb P. Bupp
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 48824, USA
  - Medical Genetics, Corewell Health, Grand Rapids, MI 49503, USA
- Surender Rajasekaran
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 48824, USA
  - Office of Research, Corewell Health, Grand Rapids, MI 49503, USA
  - Pediatric Intensive Care Unit, Helen DeVos Children’s Hospital, Corewell Health, Grand Rapids, MI 49503, USA
- Nicholas L. Hartog
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 48824, USA
  - Allergy & Immunology, Corewell Health, Grand Rapids, MI 49503, USA
- Jeremy W. Prokop
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 48824, USA
  - Office of Research, Corewell Health, Grand Rapids, MI 49503, USA
- Jena M. Krueger
  - Department of Pediatrics and Human Development, College of Human Medicine, Michigan State University, Grand Rapids, MI 48824, USA
  - Department of Neurology, Helen DeVos Children’s Hospital, Corewell Health, Grand Rapids, MI 49503, USA
3
Sinha R, Pal RK, De RK. ENLIGHTENMENT: A Scalable Annotated Database of Genomics and NGS-Based Nucleotide Level Profiles. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:155-168. [PMID: 38055361 DOI: 10.1109/tcbb.2023.3340067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]
Abstract
The revolution in sequencing technologies has enabled human genomes to be sequenced quickly and at very low cost, leading to exponential growth in the availability of whole-genome sequences. However, a complete understanding of our genome and its association with cancer remains a long way off. Researchers are striving to detect new variants and find their association with diseases, which in turn gives rise to the need to aggregate this Big Data into a common, standard, scalable platform. In this work, a database named Enlightenment has been implemented that makes genomic data integrated from eight public databases, together with DNA sequencing profiles of H. sapiens, available on a single platform. Annotated results with respect to cancer-specific biomarkers, pharmacogenetic biomarkers and their association with variability in drug response, and DNA profiles along with novel copy number variants are computed and stored, and are accessible through a web interface. To overcome the challenge of storing and processing NGS-based whole-genome DNA sequences, Enlightenment has been extended and deployed to HBase, a flexible and horizontally scalable database distributed over a Hadoop cluster, which would enable the integration of other omics data into the database, lighting the path towards the eradication of cancer.
4
Lahiri S, Reys B, Wunder J, Pirzadeh-Miller S. Genetic variants with discordant classifications: An assessment of genetic counselor attitudes and practices. J Genet Couns 2023; 32:100-110. [PMID: 35978490 DOI: 10.1002/jgc4.1626] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/17/2021] [Revised: 07/22/2022] [Accepted: 07/27/2022] [Indexed: 11/11/2022]
Abstract
Discordant variant classifications (DVCs) can impact patient care and pose challenges for clinicians. A survey-based study was conducted to examine genetic counselor (GC) attitudes and practices related to DVCs. Most GCs (202/229, 88%) in the study provide direct patient care across clinical specialties; review patients' genetic test results to determine if reported genetic variants have DVCs (176/202, 88%); and inform patients of known DVCs that impact medical management (165/202, 82%). DVC review, which takes 41 min (range: 5-240) on average per week, is typically prompted by the identification of a variant of uncertain significance (VUS) (160/176, 90%) and is primarily conducted using public databases (176/176, 100%). While most GCs felt it would not be ethical to knowingly provide different medical management recommendations to patients with the same genetic variant (152/229, 66%), they also stated they would rely on the variant classification on the test report (141/229, 61%) and/or the patient's personal/family history (188/229, 82%) to determine which classification to follow if a DVC is identified. Both factors are patient-specific and, inherently, could lead to differing recommendations. When posed with a hypothetical scenario in which two patients have the same genetic variant, but test reports show a DVC (pathogenic vs VUS), most GCs (179/229, 78.2%) stated they would make the same recommendation for both patients regardless of management guidelines. Nearly one-third (52/179, 29.1%) stated that patient-specific factors, such as personal/family history, would impact their recommendations. Disagreements about whether the pathogenic or VUS classification should be used to make medical management recommendations were noted. Differing practices and opinions on how to manage patients with DVCs, as well as the fact that most GCs (209/229, 91.3%) have consulted with colleagues on this matter, highlight the need for more professional guidance to ensure equitable patient care.
Affiliation(s)
- Sayoni Lahiri
  - Cancer Genetics Program, UT Southwestern Medical Center, Dallas, Texas, USA
- Brian Reys
  - Cancer Genetics Program, UT Southwestern Medical Center, Dallas, Texas, USA
- Julia Wunder
  - Oncology-Abstraction, Tempus Labs, Inc., Chicago, Illinois, USA
5
Berisha SZ, Shetty S, Prior TW, Mitchell AL. Cytogenetic and molecular diagnostic testing associated with prenatal and postnatal birth defects. Birth Defects Res 2021; 112:293-306. [PMID: 32115903 PMCID: PMC9290954 DOI: 10.1002/bdr2.1648] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/07/2020] [Accepted: 01/08/2020] [Indexed: 11/23/2022]
Abstract
Genetic testing is beneficial for patients and providers searching for answers to medical problems related to the prenatal or early postnatal period. It can help to identify the cause or confirm a diagnosis associated with developmental delay, intellectual disability, dysmorphic features, heart defects, multiple malformations, short stature, stillbirth, neonatal death, or fertility problems. Genetic testing can be used to rule out single-gene or chromosome abnormalities. Different diagnostic cytogenetic and molecular genetic techniques are applied in clinical genetics laboratories, from conventional ones to state-of-the-art chromosomal microarrays and next-generation sequencing. Each genetic technique or method has its strengths and limitations; however, different methods complement each other in identifying the genetic variation(s) responsible for a medical condition, especially those related to birth defects.
Affiliation(s)
- Stela Z Berisha
  - Center for Human Genetics, University Hospitals Cleveland Medical Center, Cleveland, Ohio
- Shashi Shetty
  - Center for Human Genetics, University Hospitals Cleveland Medical Center, Cleveland, Ohio
  - Department of Pathology, Case Western Reserve University, University Hospitals, Cleveland, Ohio
- Thomas W Prior
  - Center for Human Genetics, University Hospitals Cleveland Medical Center, Cleveland, Ohio
  - Department of Pathology, Case Western Reserve University, University Hospitals, Cleveland, Ohio
- Anna L Mitchell
  - Center for Human Genetics, University Hospitals Cleveland Medical Center, Cleveland, Ohio
  - Department of Genetics and Genome Sciences, Case Western Reserve University, University Hospitals, Cleveland, Ohio
6
Chung SS, Ng JCF, Laddach A, Thomas NSB, Fraternali F. Short loop functional commonality identified in leukaemia proteome highlights crucial protein sub-networks. NAR Genom Bioinform 2021; 3:lqab010. [PMID: 33709075 PMCID: PMC7936661 DOI: 10.1093/nargab/lqab010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2020] [Revised: 12/19/2020] [Accepted: 01/26/2021] [Indexed: 11/13/2022] [Open Access]
Abstract
Direct drug targeting of mutated proteins in cancer is not always possible and efficacy can be nullified by compensating protein-protein interactions (PPIs). Here, we establish an in silico pipeline to identify specific PPI sub-networks containing mutated proteins as potential targets, which we apply to mutation data of four different leukaemias. Our method is based on extracting cyclic interactions of a small number of proteins topologically and functionally linked in the Protein-Protein Interaction Network (PPIN), which we call short loop network motifs (SLM). We uncover a new property of PPINs named 'short loop commonality' to measure indirect PPIs occurring via common SLM interactions. This detects 'modules' of PPI networks enriched with annotated biological functions of proteins containing mutation hotspots, exemplified by FLT3 and other receptor tyrosine kinase proteins. We further identify functional dependency or mutual exclusivity of short loop commonality pairs in large-scale cellular CRISPR-Cas9 knockout screening data. Our pipeline provides a new strategy for identifying new therapeutic targets for drug discovery.
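As a rough illustration of the "cyclic interactions of a small number of proteins" described in this abstract, the following Python sketch enumerates simple cycles of length 3 or 4 in an undirected interaction graph. The length bound, the function names, and the protein labels `P1` to `P6` are assumptions made for the example; the abstract does not give the authors' exact definition of a short loop or of short loop commonality, so this is not their pipeline.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    graph = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            graph[u].add(v)
            graph[v].add(u)
    return graph


def short_loops(graph, max_len=4):
    """Return simple cycles with 3..max_len nodes, each reported once.

    A cycle is stored as a tuple that starts at its smallest node and runs
    in the direction whose second node is smaller than its last node, so the
    same cycle is never recorded twice.
    """
    loops = set()

    def extend(path):
        start, last = path[0], path[-1]
        for nxt in graph[last]:
            if nxt == start and len(path) >= 3:
                if path[1] < path[-1]:  # keep one of the two orientations
                    loops.add(tuple(path))
            elif nxt > start and nxt not in path and len(path) < max_len:
                extend(path + [nxt])

    for node in sorted(graph):
        extend([node])
    return loops


# Hypothetical interaction network: one triangle and one 4-cycle sharing P3.
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P1"),
         ("P3", "P4"), ("P4", "P5"), ("P5", "P6"), ("P6", "P3")]
graph = build_graph(edges)
loops = short_loops(graph)
```

In this toy network `P3` sits in both loops, which is the kind of shared membership a loop-based measure of indirect interaction could build on.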
Affiliation(s)
- Sun Sook Chung
  - Department of Haematological Medicine, King's College London, London, SE5 9NU, UK
- Joseph C F Ng
  - Randall Centre for Cell and Molecular Biophysics, King's College London, London, SE1 1UL, UK
- Anna Laddach
  - Randall Centre for Cell and Molecular Biophysics, King's College London, London, SE1 1UL, UK
- N Shaun B Thomas
  - Department of Haematological Medicine, King's College London, London, SE5 9NU, UK
- Franca Fraternali
  - Randall Centre for Cell and Molecular Biophysics, King's College London, London, SE1 1UL, UK
7
Rivera-Muñoz EA, Milko LV, Harrison SM, Azzariti DR, Kurtz CL, Lee K, Mester JL, Weaver MA, Currey E, Craigen W, Eng C, Funke B, Hegde M, Hershberger RE, Mao R, Steiner RD, Vincent LM, Martin CL, Plon SE, Ramos E, Rehm HL, Watson M, Berg JS. ClinGen Variant Curation Expert Panel experiences and standardized processes for disease and gene-level specification of the ACMG/AMP guidelines for sequence variant interpretation. Hum Mutat 2019; 39:1614-1622. [PMID: 30311389 DOI: 10.1002/humu.23645] [Citation(s) in RCA: 107] [Impact Index Per Article: 21.4] [Received: 04/30/2018] [Revised: 08/09/2018] [Accepted: 08/30/2018] [Indexed: 01/09/2023]
Abstract
Genome-scale sequencing creates vast amounts of genomic data, increasing the challenge of clinical sequence variant interpretation. The demand for high-quality interpretation requires multiple specialties to join forces to accelerate the interpretation of sequence variant pathogenicity. With over 600 international members including clinicians, researchers, and laboratory diagnosticians, the Clinical Genome Resource (ClinGen), funded by the National Institutes of Health, is forming expert groups to systematically evaluate variants in clinically relevant genes. Here, we describe the first ClinGen variant curation expert panels (VCEPs), development of consistent and streamlined processes for establishing new VCEPs, and creation of standard operating procedures for VCEPs to define application of the ACMG/AMP guidelines for sequence variant interpretation in specific genes or diseases. Additionally, ClinGen has created user interfaces to enhance reliability of curation and a Sequence Variant Interpretation Working Group (SVI WG) to harmonize guideline specifications and ensure consistency between groups. The expansion of VCEPs represents the primary mechanism by which curation of a substantial fraction of genomic variants can be accelerated and ultimately undertaken systematically and comprehensively. We welcome groups to utilize our resources and become involved in our effort to create a publicly accessible, centralized resource for clinically relevant genes and variants.
Affiliation(s)
- Edgar A Rivera-Muñoz
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina
- Laura V Milko
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina
- Steven M Harrison
  - Partners HealthCare Laboratory for Molecular Medicine, Cambridge, Massachusetts
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts
- Danielle R Azzariti
  - Partners HealthCare Laboratory for Molecular Medicine, Cambridge, Massachusetts
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts
- C Lisa Kurtz
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina
- Kristy Lee
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina
- Meredith A Weaver
  - American College of Medical Genetics and Genomics, Bethesda, Maryland
- Erin Currey
  - Division of Genomic Medicine, National Human Genome Research Institute (NHGRI), NIH, Bethesda, Maryland
- William Craigen
  - Baylor College of Medicine, Departments of Molecular and Human Genetics, and Pediatrics, Houston, Texas
- Charis Eng
  - Genomic Medicine Institute, Cleveland Clinic, Cleveland, Ohio
- Birgit Funke
  - Partners HealthCare Laboratory for Molecular Medicine, Cambridge, Massachusetts
  - Veritas Genetics, Danvers, Massachusetts
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts
- Madhuri Hegde
  - PerkinElmer, Global Laboratory Services, Waltham, Massachusetts
  - Emory University, Department of Human Genetics, Atlanta, Georgia
- Ray E Hershberger
  - Divisions of Human Genetics and Cardiovascular Medicine, The Ohio State University Wexner Medical Center, Columbus, Ohio
- Rong Mao
  - Department of Pathology, University of Utah, Salt Lake City, Utah
  - Department of Molecular Genetics and Genomics, ARUP Laboratories, Salt Lake City, Utah
- Robert D Steiner
  - Departments of Pediatrics and Genetics, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin
  - Prevention Genetics, Marshfield, Wisconsin
- Christa L Martin
  - Autism & Developmental Medicine Institute, Geisinger, Danville, PA
- Sharon E Plon
  - Baylor College of Medicine, Departments of Molecular and Human Genetics, and Pediatrics, Houston, Texas
- Erin Ramos
  - Division of Genomic Medicine, National Human Genome Research Institute (NHGRI), NIH, Bethesda, Maryland
- Heidi L Rehm
  - Partners HealthCare Laboratory for Molecular Medicine, Cambridge, Massachusetts
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts
- Michael Watson
  - American College of Medical Genetics and Genomics, Bethesda, Maryland
- Jonathan S Berg
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina
8
Abstract
With rapid advances in genetics and genomics, the commercialization of and access to new applications have become more widespread throughout biomedical research. Thus, more and more patients will have personal genomic information they may share with primary care providers (PCPs) to better understand the clinical significance of the data. To be able to respond to patient inquiries about genomic data, variant interpretation, disease risk, and other issues, PCPs will need to increase or refresh their awareness of genetics and genomics, and identify reliable resources to use or to refer patients to. While provider educational efforts have increased, the rapid advances in the field mean that ongoing efforts will be needed to prepare PCPs to manage patient needs, integrate results into care, and refer as indicated.
Affiliation(s)
- Susanne B Haga
  - Center for Applied Genomics and Precision Medicine, Duke University School of Medicine, Durham, NC, 27708, USA
9
Pawliczek P, Patel RY, Ashmore LR, Jackson AR, Bizon C, Nelson T, Powell B, Freimuth RR, Strande N, Shah N, Paithankar S, Wright MW, Dwight S, Zhen J, Landrum M, McGarvey P, Babb L, Plon SE, Milosavljevic A. ClinGen Allele Registry links information about genetic variants. Hum Mutat 2018; 39:1690-1701. [PMID: 30311374 PMCID: PMC6519371 DOI: 10.1002/humu.23637] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 04/30/2018] [Revised: 08/01/2018] [Accepted: 08/28/2018] [Indexed: 11/18/2022]
Abstract
Effective exchange of information about genetic variants is currently hampered by the lack of readily available globally unique variant identifiers that would enable aggregation of information from different sources. The ClinGen Allele Registry addresses this problem by providing (1) globally unique "canonical" variant identifiers (CAids) on demand, either individually or in large batches; (2) access to variant-identifying information in a searchable Registry; (3) links to allele-related records in many commonly used databases; and (4) services for adding links to information about registered variants in external sources. A core element of the Registry is a canonicalization service, implemented using an in-memory sequence alignment-based index, which groups variant identifiers denoting the same nucleotide variant and assigns unique and dereferenceable CAids. More than 650 million distinct variants are currently registered, including those from gnomAD, ExAC, dbSNP, and ClinVar, as well as a small number of variants registered by Registry users. The Registry is accessible both via a web interface and programmatically via well-documented Hypertext Transfer Protocol (HTTP) Representational State Transfer Application Programming Interfaces (REST-APIs). For programmatic interoperability, the Registry content is accessible in the JavaScript Object Notation for Linked Data (JSON-LD) format. We present several use cases and demonstrate how the linked information may provide raw material for reasoning about a variant's pathogenicity.
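The idea of grouping identifiers that denote the same variant under one canonical identifier can be sketched in Python as below. This is only a toy: it replaces the Registry's alignment-based index with a hand-supplied list of known equivalences merged by union-find, and the identifier strings and the `DEMO0001`-style labels are hypothetical placeholders, not real CAids or real Registry API calls.

```python
def canonicalize(identifiers, equivalences):
    """Assign one canonical label to each group of equivalent identifiers.

    identifiers:  list of identifier strings (order fixes label numbering)
    equivalences: pairs of identifiers assumed to denote the same variant
    """
    parent = {ident: ident for ident in identifiers}

    def find(x):
        # Follow parent links to the group representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in equivalences:
        parent[find(a)] = find(b)  # merge the two groups

    labels = {}   # group representative -> canonical label
    result = {}   # identifier -> canonical label
    for ident in identifiers:
        root = find(ident)
        if root not in labels:
            labels[root] = f"DEMO{len(labels) + 1:04d}"  # hypothetical format
        result[ident] = labels[root]
    return result


# Hypothetical identifiers from three sources; the first three are assumed
# to describe the same nucleotide change, the fourth a different one.
identifiers = ["sourceA:v1", "sourceB:v7", "sourceC:vX", "sourceA:v2"]
equivalences = [("sourceA:v1", "sourceB:v7"), ("sourceB:v7", "sourceC:vX")]
canonical = canonicalize(identifiers, equivalences)
```

Once every source record carries the shared label, information from different databases can be joined on that label instead of on source-specific names.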
Affiliation(s)
- Piotr Pawliczek
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Ronak Y. Patel
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Lillian R. Ashmore
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Andrew R. Jackson
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Chris Bizon
  - Renaissance Computing Institute, University of North Carolina, Chapel Hill, North Carolina
- Tristan Nelson
  - Geisinger's Autism and Developmental Medicine, Lewisburg, Pennsylvania
- Bradford Powell
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina
- Natasha Strande
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina
- Neethu Shah
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Sameer Paithankar
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Matt W. Wright
  - Department of Biomedical Data Sciences, Stanford University School of Medicine, Palo Alto, California
- Selina Dwight
  - Department of Biomedical Data Sciences, Stanford University School of Medicine, Palo Alto, California
- Jimmy Zhen
  - Department of Biomedical Data Sciences, Stanford University School of Medicine, Palo Alto, California
- Melissa Landrum
  - National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland
- Peter McGarvey
  - Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington, District of Columbia
- Larry Babb
  - Sunquest Information Systems Company, Boston, Massachusetts
- Sharon E. Plon
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas
10
Chakravorty S, Hegde M. Inferring the effect of genomic variation in the new era of genomics. Hum Mutat 2018; 39:756-773. [PMID: 29633501 DOI: 10.1002/humu.23427] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 11/13/2017] [Revised: 03/20/2018] [Accepted: 03/28/2018] [Indexed: 12/11/2022]
Abstract
Accurate and detailed understanding of the effects of variants in the coding and noncoding regions of the genome is the next big challenge in the new genomic era of personalized medicine, especially to tackle newer findings of genetic and phenotypic heterogeneity of diseases. This is necessary to resolve the gene-variant-disease relationship, the pathogenic variant spectrum of genes, pathogenic variants with variable clinical consequences, and multiloci diseases. In turn, this will facilitate patient recruitment for relevant clinical trials. In this review, we describe the trends in research at the intersection of basic and clinical genomics aiming to (a) overcome molecular diagnostic challenges and increase the clinical utility of next-generation sequencing (NGS) platforms, (b) elucidate variants associated with disease, and (c) determine overall genomic complexity, including epistasis, complex inheritance patterns such as "synergistic heterozygosity," digenic/multigenic inheritance, modifier effects, and rare variant load. We describe the newly emerging field of integrated functional genomics, in vivo or in vitro large-scale functional approaches, and statistical bioinformatics algorithms that support NGS genomics data in interpreting variants for timely clinical diagnostics and disease management, thus facilitating the discovery of new therapeutic or biomarker options and of their roles in the future of personalized medicine.
Affiliation(s)
- Samya Chakravorty
  - Department of Human Genetics, Emory University School of Medicine, Whitehead Biomedical Research Building Suite 301, Atlanta, Georgia
- Madhuri Hegde
  - Department of Human Genetics, Emory University School of Medicine, Whitehead Biomedical Research Building Suite 301, Atlanta, Georgia
11
Saghira C, Bis DM, Stanek D, Strickland A, Herrmann DN, Reilly MM, Scherer SS, Shy ME, Züchner S. Variant pathogenicity evaluation in the community-driven Inherited Neuropathy Variant Browser. Hum Mutat 2018; 39:635-642. [PMID: 29473246 DOI: 10.1002/humu.23412] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/09/2017] [Revised: 02/09/2018] [Accepted: 02/16/2018] [Indexed: 12/13/2022]
Abstract
Charcot-Marie-Tooth disease (CMT) is an umbrella term for inherited neuropathies affecting an estimated one in 2,500 people. Over 120 CMT and related genes have been identified and clinical gene panels often contain more than 100 genes. Such a large genomic space will invariably yield variants of uncertain clinical significance (VUS) in nearly any person tested. This rise in the number of VUS creates major challenges for genetic counseling. Additionally, fewer individual variants in known genes are being published as the academic merit is decreasing, and most testing now happens in clinical laboratories, which typically do not correlate their variants with clinical phenotypes. For CMT, we aim to encourage and facilitate the global capture of variant data to gain a large collection of alleles in CMT genes, ideally in conjunction with phenotypic information. The Inherited Neuropathy Variant Browser provides user-friendly open access to currently reported variation in CMT genes. Geneticists, physicians, and genetic counselors can enter variants detected by clinical tests or in research studies in addition to genetic variation gathered from published literature, which are then submitted to ClinVar biannually. Active participation of the broader CMT community will provide an advance over existing resources for interpretation of CMT genetic variation.
Affiliation(s)
- Cima Saghira
- Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami, Miami, Florida
- Hussman Institute for Human Genomics, University of Miami, Miami, Florida
- Dana M Bis
- Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami, Miami, Florida
- Hussman Institute for Human Genomics, University of Miami, Miami, Florida
- David Stanek
- Department of Paediatric Neurology, Charles University, Prague, Czech Republic
- Alleene Strickland
- Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami, Miami, Florida
- Hussman Institute for Human Genomics, University of Miami, Miami, Florida
- David N Herrmann
- Department of Neurology, University of Rochester, Rochester, New York
- Mary M Reilly
- MRC Centre for Neuromuscular Diseases, UCL Institute of Neurology, Queen Square, London, UK
- Steven S Scherer
- Department of Neurology, University of Pennsylvania, Philadelphia, Pennsylvania
- Michael E Shy
- Department of Neurology, University of Iowa, Iowa City, Iowa
- Stephan Züchner
- Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami, Miami, Florida
- Hussman Institute for Human Genomics, University of Miami, Miami, Florida
12
Bean LJH, Hegde MR. Clinical implications and considerations for evaluation of in silico algorithms for use with ACMG/AMP clinical variant interpretation guidelines. Genome Med 2017; 9:111. [PMID: 29254502 PMCID: PMC5733812 DOI: 10.1186/s13073-017-0508-z] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 01/22/2023] Open
Abstract
Clinical genetics laboratories have recently adopted guidelines for the interpretation of sequence variants set by the American College of Medical Genetics (ACMG) and Association for Molecular Pathology (AMP). The use of in silico algorithms to predict whether amino acid substitutions result in human disease is inconsistent across clinical laboratories. The clinical genetics community must carefully consider how in silico predictions can be incorporated into variant interpretation in clinical practice. Please see related Research article: https://doi.org/10.1186/s13059-017-1353-5
Affiliation(s)
- Lora J H Bean
- Department of Human Genetics, Emory University, Atlanta, GA, USA; EGL Genetics, Tucker, GA, USA
- Madhuri R Hegde
- Department of Human Genetics, Emory University, Atlanta, GA, USA; School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA; PerkinElmer Genetics, Pittsburgh, PA, USA
13
Abstract
Next-generation or massively parallel sequencing has transformed the landscape of genetic testing for cancer susceptibility. Panel-based genetic tests evaluate multiple genes simultaneously and rapidly. Because these tests are frequently offered in clinical settings, understanding their clinical validity and utility is critical. When evaluating the inherited risk of breast and ovarian cancers, panel-based tests provide incremental benefit compared with BRCA1/2 genetic testing. For inherited risk of other cancers, such as colon cancer and pheochromocytoma-paraganglioma, the clinical utility and yield of panel-based testing are higher; in fact, simultaneous evaluation of multiple genes has been the historical standard for these diseases. Evaluating inherited risk with panel-based testing has recently entered clinical practice for prostate and pancreatic cancers, with potential therapeutic implications. The resulting variants of uncertain significance and mutations with unclear actionability pose challenges to service providers and patients, underscoring the importance of genetic counseling and data-sharing initiatives. This review explores the evolving merits, challenges, and nuances of panel-based testing for cancer susceptibility.
Affiliation(s)
- Payal D Shah
- Division of Hematology and Oncology, Department of Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania 19104
- Katherine L Nathanson
- Division of Translational Medicine and Human Genetics, Department of Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania 19104; Abramson Cancer Center, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania 19104
14
Hoskinson DC, Dubuc AM, Mason-Suares H. The current state of clinical interpretation of sequence variants. Curr Opin Genet Dev 2017; 42:33-39. [PMID: 28157586 DOI: 10.1016/j.gde.2017.01.001] [Citation(s) in RCA: 59] [Impact Index Per Article: 8.4] [Received: 09/12/2016] [Revised: 11/20/2016] [Accepted: 01/09/2017] [Indexed: 01/19/2023]
Abstract
Accurate and consistent variant classification is required for precision medicine, but clinical variant classification remains in its infancy. While recent guidelines put forth jointly by the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) for the classification of Mendelian variants have advanced the field, the degree of subjectivity allowed by these guidelines can still lead to inconsistent classification across clinical molecular genetic laboratories. In addition, there are currently no such guidelines for somatic cancer variants, only published institutional practices. Additional variant classification guidelines, including disease- or gene-specific criteria, along with inter-laboratory data sharing, are critical for accurate and consistent variant interpretation.
Affiliation(s)
- Derick C Hoskinson
- Laboratory for Molecular Medicine, Partners HealthCare Personalized Medicine, 65 Landsdowne Str., Cambridge, MA 02115 USA
- Adrian M Dubuc
- Department of Pathology, Harvard Medical School and Brigham and Women's Hospital, 75 Francis Str., Boston, MA 02115 USA
- Heather Mason-Suares
- Laboratory for Molecular Medicine, Partners HealthCare Personalized Medicine, 65 Landsdowne Str., Cambridge, MA 02115 USA; Department of Pathology, Harvard Medical School and Brigham and Women's Hospital, 75 Francis Str., Boston, MA 02115 USA
15
Reassessment of Genomic Sequence Variation to Harmonize Interpretation for Personalized Medicine. Am J Hum Genet 2016; 99:1140-1149. [PMID: 27843123 DOI: 10.1016/j.ajhg.2016.09.015] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Received: 06/08/2016] [Accepted: 09/21/2016] [Indexed: 01/25/2023] Open
Abstract
Accurate interpretation of DNA sequence variation is a prerequisite for implementing personalized medicine. Discrepancies in interpretation between testing laboratories impede the effective use of genetic test results in clinical medicine. To better understand the underpinnings of these discrepancies, we quantified differences in variant classification internally over time and those between our diagnostic laboratory and other laboratories and resources. We assessed the factors that contribute to these discrepancies and those that facilitate their resolution. Our process resolved 72% of nearly 300 discrepancies between pairs of laboratories to within a one-step classification difference and identified key sources of data that facilitate changes in variant interpretation. The identification and harmonization of variant discrepancies will maximize the clinical use of genetic information; these processes will be fostered by the accumulation of additional population data as well as the sharing of data between diagnostic laboratories.
16
Pinard A, Miltgen M, Blanchard A, Mathieu H, Desvignes JP, Salgado D, Fabre A, Arnaud P, Barré L, Krahn M, Grandval P, Olschwang S, Zaffran S, Boileau C, Béroud C, Collod-Béroud G. Actionable Genes, Core Databases, and Locus-Specific Databases. Hum Mutat 2016; 37:1299-1307. [PMID: 27600092 DOI: 10.1002/humu.23112] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 04/11/2016] [Accepted: 08/31/2016] [Indexed: 01/04/2023]
Abstract
Adoption of next-generation sequencing (NGS) in a diagnostic context raises numerous questions with regard to the identification and reporting of secondary variants (SVs) in actionable genes. To better understand the whys and wherefores of these questions, it is necessary to understand how SVs are selected during the filtering process and how their proportion can be estimated. It is likely that SVs are underestimated and that our capacity to label all true SVs can be improved. In this context, locus-specific databases (LSDBs) can be key by providing a wealth of information and enabling variant classification. We illustrate this issue by analyzing 318 SVs in 23 actionable genes involved in cancer susceptibility syndromes identified through sequencing of 572 participants selected for a range of atherosclerosis phenotypes. Among these 318 SVs, only 43.4% are reported in Human Gene Mutation Database (HGMD) Professional versus 71.4% in LSDBs. In addition, 23.9% of HGMD Professional variants are reported as pathogenic versus 4.8% for LSDBs. These data underline the benefits of LSDBs to annotate SVs and minimize overinterpretation of mutations thanks to their efficient curation process and collection of unpublished data.
Affiliation(s)
- Aurélie Fabre
- Aix Marseille Univ, INSERM, GMGF, Marseille, France; APHM, Hôpital Timone Enfants, Laboratoire de Génétique Moléculaire, Marseille, 13385, France
- Pauline Arnaud
- AP-HP, Hôpital Bichat, Centre National de Référence pour le syndrome de Marfan et apparentés, Paris, France; UFR de Médecine, Diderot Paris Université Paris 7, Paris, France; Inserm, U1148, Paris, France
- Laura Barré
- Aix Marseille Univ, INSERM, GMGF, Marseille, France
- Martin Krahn
- Aix Marseille Univ, INSERM, GMGF, Marseille, France; APHM, Hôpital Timone Enfants, Laboratoire de Génétique Moléculaire, Marseille, 13385, France
- Philippe Grandval
- Aix Marseille Univ, INSERM, GMGF, Marseille, France; AP-HM, Hôpital de la Timone, Gastroentérologie, Marseille, France
- Sylviane Olschwang
- Aix Marseille Univ, INSERM, GMGF, Marseille, France; APHM, Hôpital Timone Enfants, Laboratoire de Génétique Moléculaire, Marseille, 13385, France; Hôpital Clairval, Ramsay Générale de Santé, Marseille, France; Hôpital Européen, Fondation Ambroise Paré, Marseille, France
- Catherine Boileau
- AP-HP, Hôpital Bichat, Centre National de Référence pour le syndrome de Marfan et apparentés, Paris, France; UFR de Médecine, Diderot Paris Université Paris 7, Paris, France; Inserm, U1148, Paris, France
- Christophe Béroud
- Aix Marseille Univ, INSERM, GMGF, Marseille, France; APHM, Hôpital Timone Enfants, Laboratoire de Génétique Moléculaire, Marseille, 13385, France
17
Affiliation(s)
- Garry R. Cutting
- Institute of Genetic Medicine and Department of Pediatrics and Medicine; Johns Hopkins University School of Medicine; Baltimore Maryland
- Haig H. Kazazian
- Institute of Genetic Medicine and Department of Pediatrics and Medicine; Johns Hopkins University School of Medicine; Baltimore Maryland