1
Soussi T, Baliakas P. Landscape of TP53 Alterations in Chronic Lymphocytic Leukemia via Data Mining Mutation Databases. Front Oncol 2022; 12:808886. [PMID: 35251978 PMCID: PMC8890000 DOI: 10.3389/fonc.2022.808886] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/10/2021] [Accepted: 01/20/2022] [Indexed: 11/16/2022]
Abstract
Locus-specific databases are invaluable tools for both basic and clinical research. The extensive information they contain is gathered from the literature and manually curated by experts. Cancer genome sequencing projects generate an immense amount of data, which are stored directly in large repositories (cancer genome databases). The presence of a TP53 defect (17p deletion and/or TP53 mutations) is an independent prognostic factor in chronic lymphocytic leukemia (CLL) and TP53 status analysis has been adopted in routine clinical practice. For that reason, TP53 mutation databases have become essential for the validation of the plethora of TP53 variants detected in tumor samples. TP53 profiles in CLL are characterized by a great number of subclonal TP53 mutations with low variant allelic frequencies and the presence of multiple minor subclones harboring different TP53 mutations. In this review, we describe the various characteristics of the multiple levels of heterogeneity of TP53 variants in CLL through the analysis of TP53 mutation databases and the utility of their diagnosis in the clinic.
Affiliation(s)
- Thierry Soussi
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden.,Sorbonne Université, UPMC Univ Paris 06, Paris, France
| | - Panagiotis Baliakas
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
2
Zhou Y. DNA Epidemic Model Construction and Dynamics Optimization. International Journal of Cognitive Informatics and Natural Intelligence 2020. [DOI: 10.4018/ijcini.2020070105] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
To solve some complex optimization problems, the SIR-DNA algorithm was constructed based on the DNA-based SIR (susceptible-infectious-recovered) infectious disease model. Since infectious diseases attack a very small part of the individual's genes, the number of variables per treatment is small; thus, the natural dimensionality reduction of the algorithm is achieved. Based on the DNA-SIR infectious disease model, different infections can be distinguished in the pathogenesis of viruses. The mechanisms of disease transmission are described by the SIR model, and these are used to construct operators such as SS, SI, II, IR, RR, and RS, so that individuals can naturally exchange information through disease transmission. The test results show that the algorithm has strong search ability and a high convergence speed for solving complex optimization problems.
3
Endoplasmic reticulum stress and proteasome pathway involvement in human podocyte injury with a truncated COL4A3 mutation. Chin Med J (Engl) 2020; 132:1823-1832. [PMID: 31306228 PMCID: PMC6759124 DOI: 10.1097/cm9.0000000000000294] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/16/2023]
Abstract
Background: Collagen type IV (COL4)-related nephropathy includes a variety of kidney diseases that occur with or without extra-renal manifestations caused by COL4A3-5 mutations. Previous studies revealed several novel mutations, including three COL4A3 missense mutations (G619R, G801R, and C1616Y) and the COL4A3 chr:228172489delA c.4317delA p.Thr1440ProfsX87 frameshift mutation that resulted in a truncated NC1 domain (hereafter named COL4A3 c.4317delA); however, the mutation mechanisms that lead to podocyte injury remain unclear. This study aimed to further explore the mutation mechanisms that lead to podocyte injury. Methods: Wild-type (WT) and four mutant COL4A3 segments were cloned into a lentiviral plasmid and then stably transfected into human podocytes. Real-time polymerase chain reaction and Western blotting were applied to detect endoplasmic reticulum stress (ERS)- and apoptosis-related mRNA and protein levels. Then, human podocytes were treated with MG132 (a proteasome inhibitor) and brefeldin A (a transport protein inhibitor). The human podocyte findings were verified by the establishment of a mus-Col4a3 knockout mouse monoclonal podocyte using clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9) technology. Results: Our data showed that COL4A3 mRNA was significantly overexpressed in the lentivirus stably transfected podocytes. Moreover, the COL4A3 protein level was significantly increased in all groups except the COL4A3 c.4317delA group. Compared to the other test groups, the COL4A3 c.4317delA group showed excessive ERS and apoptosis. Podocytes treated with MG132 showed remarkably increased intra-cellular expression of the COL4A3 c.4317delA mutation. MG132 intervention ameliorated the elevated ERS and apoptosis levels in the COL4A3 c.4317delA group. Mouse monoclonal podocytes with COL4A3 chr:82717932insA c.4852insA p.Arg1618ThrfsX4 were successfully acquired; this NC1-truncated mutation was associated with a higher level of ERS and a relatively pronounced level of apoptosis compared with the WT group. Conclusions: We demonstrated that excessive ERS and ERS-induced apoptosis were involved in the podocyte injury caused by the NC1-truncated COL4A3 mutation. Furthermore, proteasome pathway intervention might become a potential treatment for collagen type IV-related nephropathy caused by a severely truncated COL4A3 mutation.
4
Manzoni C, Kia DA, Vandrovcova J, Hardy J, Wood NW, Lewis PA, Ferrari R. Genome, transcriptome and proteome: the rise of omics data and their integration in biomedical sciences. Brief Bioinform 2019; 19:286-302. [PMID: 27881428 PMCID: PMC6018996 DOI: 10.1093/bib/bbw114] [Citation(s) in RCA: 388] [Impact Index Per Article: 77.6] [Received: 07/28/2016] [Indexed: 02/07/2023]
Abstract
Advances in the technologies and informatics used to generate and process large biological data sets (omics data) are promoting a critical shift in the study of biomedical sciences. While genomics, transcriptomics and proteomics, coupled with bioinformatics and biostatistics, are gaining momentum, they are still, for the most part, assessed individually with distinct approaches generating monothematic rather than integrated knowledge. As other areas of biomedical sciences, including metabolomics, epigenomics and pharmacogenomics, are moving towards the omics scale, we are witnessing the rise of inter-disciplinary data integration strategies to support a better understanding of biological systems and eventually the development of successful precision medicine. This review cuts across the boundaries between genomics, transcriptomics and proteomics, summarizing how omics data are generated, analysed and shared, and provides an overview of the current strengths and weaknesses of this global approach. This work intends to target students and researchers seeking knowledge outside of their field of expertise and fosters a leap from the reductionist to the global-integrative analytical approach in research.
Affiliation(s)
- Claudia Manzoni
- School of Pharmacy, University of Reading, Whiteknights, Reading, United Kingdom.,Department Molecular Neuroscience, UCL Institute of Neurology, London, United Kingdom
| | - Demis A Kia
- Department Molecular Neuroscience, UCL Institute of Neurology, London, United Kingdom
| | - Jana Vandrovcova
- Department Molecular Neuroscience, UCL Institute of Neurology, London, United Kingdom
| | - John Hardy
- Department Molecular Neuroscience, UCL Institute of Neurology, London, United Kingdom
| | - Nicholas W Wood
- Department Molecular Neuroscience, UCL Institute of Neurology, London, United Kingdom
| | - Patrick A Lewis
- School of Pharmacy, University of Reading, Whiteknights, Reading, United Kingdom.,Department Molecular Neuroscience, UCL Institute of Neurology, London, United Kingdom
| | - Raffaele Ferrari
- Department Molecular Neuroscience, UCL Institute of Neurology, London, United Kingdom
5
Feldman K, Johnson RA, Chawla NV. The State of Data in Healthcare: Path Towards Standardization. Journal of Healthcare Informatics Research 2018; 2:248-271. [PMID: 35415409 PMCID: PMC8982788 DOI: 10.1007/s41666-018-0019-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/09/2017] [Revised: 03/21/2018] [Accepted: 03/29/2018] [Indexed: 12/23/2022]
Abstract
Coupled with the rise of data science and machine learning, the increasing availability of digitized health and wellness data has provided an exciting opportunity for complex analyses of problems throughout the healthcare domain. Whereas many early works focused on a particular aspect of patient care, often drawing on data from a specific clinical or administrative source, it has become clear such a single-source approach is insufficient to capture the complexity of the human condition. Instead, adequately modeling health and wellness problems requires the ability to draw upon data spanning multiple facets of an individual's biology, their care, and the social aspects of their life. Although such an awareness has greatly expanded the breadth of health and wellness data collected, the diverse array of data sources and intended uses often leave researchers and practitioners with a scattered and fragmented view of any particular patient. As a result, there exists a clear need to catalogue and organize the range of healthcare data available for analysis. This work represents an effort at developing such an organization, presenting a patient-centric framework deemed the Healthcare Data Spectrum (HDS). Comprised of six layers, the HDS begins with the innermost micro-level omics and macro-level demographic data that directly characterize a patient, and extends at its outermost to aggregate population-level data derived from attributes of care for each individual patient. For each level of the HDS, this manuscript will examine the specific types of constituent data, provide examples of how the data aid in a broad set of research problems, and identify the primary terminology and standards used to describe the data.
Affiliation(s)
- Keith Feldman
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46656 USA
- iCeNSA, University of Notre Dame, Notre Dame, IN 46656 USA
| | - Reid A. Johnson
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46656 USA
- iCeNSA, University of Notre Dame, Notre Dame, IN 46656 USA
| | - Nitesh V. Chawla
- Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46656 USA
- iCeNSA, University of Notre Dame, Notre Dame, IN 46656 USA
6
Abstract
The rise of genomically targeted therapies and immunotherapy has revolutionized the practice of oncology in the last 10–15 years. At the same time, new technologies and the electronic health record (EHR) in particular have permeated the oncology clinic. Initially designed as billing and clinical documentation systems, EHR systems have not anticipated the complexity and variety of genomic information that needs to be reviewed, interpreted, and acted upon on a daily basis. Improved integration of cancer genomic data with EHR systems will help guide clinician decision making, support secondary uses, and ultimately improve patient care within oncology clinics. Some of the key factors relating to the challenge of integrating cancer genomic data into EHRs include: the bioinformatics pipelines that translate raw genomic data into meaningful, actionable results; the role of human curation in the interpretation of variant calls; and the need for consistent standards with regard to genomic and clinical data. Several emerging paradigms for integration are discussed in this review, including: non-standardized efforts between individual institutions and genomic testing laboratories; “middleware” products that portray genomic information, albeit outside of the clinical workflow; and application programming interfaces that have the potential to work within clinical workflow. The critical need for clinical-genomic knowledge bases, which can be independent or integrated into the aforementioned solutions, is also discussed.
Affiliation(s)
- Jeremy L Warner
- Department of Medicine, Division of Hematology/Oncology, Vanderbilt University, Nashville, TN, USA.,Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, 37232, USA.,Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
| | - Sandeep K Jain
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, 37232, USA.,Vanderbilt University School of Medicine, Nashville, TN, 37232, USA
| | - Mia A Levy
- Department of Medicine, Division of Hematology/Oncology, Vanderbilt University, Nashville, TN, USA.,Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, 37232, USA.,Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
7
Chen R, Shi L, Hakenberg J, Naughton B, Sklar P, Zhang J, Zhou H, Tian L, Prakash O, Lemire M, Sleiman P, Cheng WY, Chen W, Shah H, Shen Y, Fromer M, Omberg L, Deardorff MA, Zackai E, Bobe JR, Levin E, Hudson TJ, Groop L, Wang J, Hakonarson H, Wojcicki A, Diaz GA, Edelmann L, Schadt EE, Friend SH. Analysis of 589,306 genomes identifies individuals resilient to severe Mendelian childhood diseases. Nat Biotechnol 2016; 34:531-8. [PMID: 27065010 DOI: 10.1038/nbt.3514] [Citation(s) in RCA: 209] [Impact Index Per Article: 26.1] [Received: 07/29/2015] [Accepted: 02/12/2016] [Indexed: 12/21/2022]
Abstract
Genetic studies of human disease have traditionally focused on the detection of disease-causing mutations in afflicted individuals. Here we describe a complementary approach that seeks to identify healthy individuals resilient to highly penetrant forms of genetic childhood disorders. A comprehensive screen of 874 genes in 589,306 genomes led to the identification of 13 adults harboring mutations for 8 severe Mendelian conditions, with no reported clinical manifestation of the indicated disease. Our findings demonstrate the promise of broadening genetic studies to systematically search for well individuals who are buffering the effects of rare, highly penetrant, deleterious mutations. They also indicate that incomplete penetrance for Mendelian diseases is likely more common than previously believed. The identification of resilient individuals may provide a first step toward uncovering protective genetic variants that could help elucidate the mechanisms of Mendelian diseases and new therapeutic strategies.
Affiliation(s)
- Rong Chen
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Lisong Shi
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Jörg Hakenberg
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | | | - Pamela Sklar
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Friedman Brain Institute and Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | | | | | - Lifeng Tian
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Om Prakash
- Department of Clinical Sciences, Diabetes & Endocrinology, Lund University Diabetes Center, Skåne University Hospital, Lund University, Malmö, Sweden
| | - Mathieu Lemire
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Patrick Sleiman
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Wei-Yi Cheng
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | | | - Hardik Shah
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | | | - Menachem Fromer
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Friedman Brain Institute and Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | | | - Matthew A Deardorff
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Elaine Zackai
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | - Jason R Bobe
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Elissa Levin
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Thomas J Hudson
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Leif Groop
- Department of Clinical Sciences, Diabetes & Endocrinology, Lund University Diabetes Center, Skåne University Hospital, Lund University, Malmö, Sweden
| | | | - Hakon Hakonarson
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
| | | | - George A Diaz
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Lisa Edelmann
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Eric E Schadt
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Stephen H Friend
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA.,Sage Bionetworks, Seattle, Washington, USA
8
Dalgleish R. LSDBs and How They Have Evolved. Hum Mutat 2016; 37:532-9. [DOI: 10.1002/humu.22979] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/23/2015] [Accepted: 02/18/2016] [Indexed: 01/10/2023]
Affiliation(s)
- Raymond Dalgleish
- Department of Genetics; University of Leicester; Leicester United Kingdom
9
Tack V, Deans ZC, Wolstenholme N, Patton S, Dequeker EMC. What's in a Name? A Coordinated Approach toward the Correct Use of a Uniform Nomenclature to Improve Patient Reports and Databases. Hum Mutat 2016; 37:570-5. [DOI: 10.1002/humu.22975] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 10/26/2015] [Accepted: 02/04/2016] [Indexed: 11/11/2022]
Affiliation(s)
- Véronique Tack
- Department of Public Health and Primary Care, Biomedical Quality Assurance Research Unit; KU Leuven; Leuven Belgium
| | - Zandra C. Deans
- Department of Laboratory Medicine, UK NEQAS for Molecular Genetics, UK NEQAS Edinburgh; The Royal Infirmary of Edinburgh; Edinburgh UK
| | - Nicola Wolstenholme
- EMQN, Manchester Centre for Genomic Medicine; St Mary's Hospital; Manchester M13 9WL UK
| | - Simon Patton
- EMQN, Manchester Centre for Genomic Medicine; St Mary's Hospital; Manchester M13 9WL UK
| | - Elisabeth M. C. Dequeker
- Department of Public Health and Primary Care, Biomedical Quality Assurance Research Unit; KU Leuven; Leuven Belgium
10
Liquori A, Vaché C, Baux D, Blanchet C, Hamel C, Malcolm S, Koenig M, Claustres M, Roux AF. Whole USH2A Gene Sequencing Identifies Several New Deep Intronic Mutations. Hum Mutat 2015; 37:184-93. [PMID: 26629787 DOI: 10.1002/humu.22926] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Received: 06/22/2015] [Accepted: 10/19/2015] [Indexed: 01/01/2023]
Abstract
Deep intronic mutations leading to pseudoexon (PE) insertions are underestimated and most of these splicing alterations have been identified by transcript analysis, for instance, the first deep intronic mutation in USH2A, the gene most frequently involved in Usher syndrome type II (USH2). Unfortunately, analyzing USH2A transcripts is challenging and for 1.8%-19% of USH2 individuals carrying a single USH2A recessive mutation, a second mutation is yet to be identified. We have developed and validated a DNA next-generation sequencing approach to identify deep intronic variants in USH2A and evaluated their consequences on splicing. Three distinct novel deep intronic mutations have been identified. All were predicted to affect splicing and resulted in the insertion of PEs, as shown by minigene assays. We present a new and attractive strategy to identify deep intronic mutations, when RNA analyses are not possible. Moreover, the bioinformatics pipeline developed is independent of the gene size, implying the possible application of this approach to any disease-linked gene. Finally, an antisense morpholino oligonucleotide tested in vitro for its ability to restore splicing caused by the c.9959-4159A>G mutation provided high inhibition rates, which are indicative of its potential for molecular therapy.
Affiliation(s)
- Alessandro Liquori
- Laboratoire de Génétique de Maladies Rares EA 7402, Université de Montpellier, Montpellier, France
| | - Christel Vaché
- Laboratoire de Génétique de Maladies Rares EA 7402, Université de Montpellier, Montpellier, France.,Laboratoire de Génétique Moléculaire, CHRU Montpellier, Montpellier, France
| | - David Baux
- Laboratoire de Génétique de Maladies Rares EA 7402, Université de Montpellier, Montpellier, France.,Laboratoire de Génétique Moléculaire, CHRU Montpellier, Montpellier, France
| | - Catherine Blanchet
- Service ORL, CHRU Montpellier, Montpellier, France.,CHU Montpellier, Centre National de Référence Maladies Rares "Affections Sensorielles Génétiques", France
| | - Christian Hamel
- CHU Montpellier, Centre National de Référence Maladies Rares "Affections Sensorielles Génétiques", France
| | - Sue Malcolm
- Genetics and Genomic Medicine Programme, Institute of Child Health, UCL, London, UK
| | - Michel Koenig
- Laboratoire de Génétique de Maladies Rares EA 7402, Université de Montpellier, Montpellier, France.,Laboratoire de Génétique Moléculaire, CHRU Montpellier, Montpellier, France
| | - Mireille Claustres
- Laboratoire de Génétique de Maladies Rares EA 7402, Université de Montpellier, Montpellier, France.,Laboratoire de Génétique Moléculaire, CHRU Montpellier, Montpellier, France
| | - Anne-Françoise Roux
- Laboratoire de Génétique de Maladies Rares EA 7402, Université de Montpellier, Montpellier, France.,Laboratoire de Génétique Moléculaire, CHRU Montpellier, Montpellier, France
11
Improving the Sequence Ontology terminology for genomic variant annotation. J Biomed Semantics 2015; 6:32. [PMID: 26229585 PMCID: PMC4520272 DOI: 10.1186/s13326-015-0030-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 02/19/2013] [Accepted: 07/22/2015] [Indexed: 11/11/2022]
Abstract
Background: The Genome Variant Format (GVF) uses the Sequence Ontology (SO) to enable detailed annotation of sequence variation. The annotation includes SO terms for the type of sequence alteration, the genomic features that are changed and the effect of the alteration. The SO maintains and updates the specification and provides the underlying ontological structure. Methods: A requirements analysis was undertaken to gather terms missing in the SO release at the time, but needed to adequately describe the effects of sequence alteration on a set of variant genomic annotations. We have extended and remodeled the SO to include and define all terms that describe the effect of variation upon reference genomic features in the Ensembl variation databases. Results: The new terminology was used to annotate the human reference genome with a set of variants from both COSMIC and dbSNP. A GVF file containing 170,853 sequence alterations was generated using the SO terminology to annotate the kinds of alteration, the effect of the alteration and the reference feature changed. There are four kinds of alteration and 24 kinds of effect seen in this dataset. (Ensembl Variation annotates 34 different SO consequence terms: http://www.ensembl.org/info/docs/variation/predicted_data.html). Conclusions: We explain the updates to the Sequence Ontology to describe the effect of variation on existing reference features. We have provided a set of annotations using this terminology, and the well-defined GVF specification. We have also provided a provisional exploration of this large annotation dataset.
12
Cheng DT, Mitchell TN, Zehir A, Shah RH, Benayed R, Syed A, Chandramohan R, Liu ZY, Won HH, Scott SN, Brannon AR, O'Reilly C, Sadowska J, Casanova J, Yannes A, Hechtman JF, Yao J, Song W, Ross DS, Oultache A, Dogan S, Borsu L, Hameed M, Nafa K, Arcila ME, Ladanyi M, Berger MF. Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT): A Hybridization Capture-Based Next-Generation Sequencing Clinical Assay for Solid Tumor Molecular Oncology. J Mol Diagn 2015; 17:251-64. [PMID: 25801821 DOI: 10.1016/j.jmoldx.2014.12.006] [Citation(s) in RCA: 1516] [Impact Index Per Article: 168.4] [Received: 09/18/2014] [Revised: 11/20/2014] [Accepted: 12/18/2014] [Indexed: 01/17/2023]
Abstract
The identification of specific genetic alterations as key oncogenic drivers and the development of targeted therapies are together transforming clinical oncology and creating a pressing need for increased breadth and throughput of clinical genotyping. Next-generation sequencing assays allow the efficient and unbiased detection of clinically actionable mutations. To enable precision oncology in patients with solid tumors, we developed Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT), a hybridization capture-based next-generation sequencing assay for targeted deep sequencing of all exons and selected introns of 341 key cancer genes in formalin-fixed, paraffin-embedded tumors. Barcoded libraries from patient-matched tumor and normal samples were captured, sequenced, and subjected to a custom analysis pipeline to identify somatic mutations. The sensitivity, specificity, and reproducibility of MSK-IMPACT were assessed through extensive analytical validation. We tested 284 tumor samples with previously known point mutations and insertions/deletions in 47 exons of 19 cancer genes. All known variants were accurately detected, and there was high reproducibility of inter- and intrarun replicates. The detection limit for low-frequency variants was approximately 2% for hotspot mutations and 5% for nonhotspot mutations. Copy number alterations and structural rearrangements were also reliably detected. MSK-IMPACT profiles oncogenic DNA alterations in clinical solid tumor samples with high accuracy and sensitivity. Paired analysis of tumors and patient-matched normal samples enables unambiguous detection of somatic mutations to guide treatment decisions.
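As an illustration of the detection limits quoted in this abstract, the following minimal sketch computes a variant allele fraction (VAF) from read counts and compares it with the approximate 2% (hotspot) and 5% (non-hotspot) limits. The `VariantCall` record, the function names, and the use of a simple read-count ratio are assumptions made for illustration; they are not the MSK-IMPACT analysis pipeline, whose filtering logic is not described here.

```python
from dataclasses import dataclass

# Approximate detection limits quoted in the abstract.
HOTSPOT_MIN_VAF = 0.02      # about 2% for hotspot mutations
NONHOTSPOT_MIN_VAF = 0.05   # about 5% for non-hotspot mutations


@dataclass
class VariantCall:
    """Hypothetical record for one candidate variant (not the assay's schema)."""
    alt_reads: int      # reads supporting the variant allele
    total_reads: int    # all reads covering the position
    is_hotspot: bool    # whether the site is a known hotspot


def variant_allele_fraction(call: VariantCall) -> float:
    """Fraction of reads at the position that support the variant allele."""
    if call.total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return call.alt_reads / call.total_reads


def above_detection_limit(call: VariantCall) -> bool:
    """Compare the VAF with the limit that applies to this kind of site."""
    limit = HOTSPOT_MIN_VAF if call.is_hotspot else NONHOTSPOT_MIN_VAF
    return variant_allele_fraction(call) >= limit
```

Under these assumptions, a variant seen in 15 of 500 reads (a VAF of 3%) would clear the hotspot limit but not the non-hotspot limit.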
Affiliation(s)
- Donavan T Cheng
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Talia N Mitchell
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Ahmet Zehir
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Ronak H Shah
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Ryma Benayed
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Aijazuddin Syed
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Raghu Chandramohan
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Zhen Yu Liu
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Helen H Won
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Sasinya N Scott
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - A Rose Brannon
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Catherine O'Reilly
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Justyna Sadowska
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Jacklyn Casanova
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Angela Yannes
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Jaclyn F Hechtman
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Jinjuan Yao
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Wei Song
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Dara S Ross
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Alifya Oultache
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Snjezana Dogan
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Laetitia Borsu
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Meera Hameed
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Khedoudja Nafa
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Maria E Arcila
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Marc Ladanyi
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Michael F Berger
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, New York.
|
13
|
Welch BM, Rodriguez Loya S, Eilbeck K, Kawamoto K. A proposed clinical decision support architecture capable of supporting whole genome sequence information. J Pers Med 2015; 4:176-99. [PMID: 25411644 PMCID: PMC4234046 DOI: 10.3390/jpm4020176] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 11/16/2022] Open
Abstract
Whole genome sequence (WGS) information may soon be widely available to help clinicians personalize the care and treatment of patients. However, considerable barriers exist, which may hinder the effective utilization of WGS information in a routine clinical care setting. Clinical decision support (CDS) offers a potential solution to overcome such barriers and to facilitate the effective use of WGS information in the clinic. However, genomic information is complex and will require significant considerations when developing CDS capabilities. As such, this manuscript lays out a conceptual framework for a CDS architecture designed to deliver WGS-guided CDS within the clinical workflow. To handle the complexity and breadth of WGS information, the proposed CDS framework leverages service-oriented capabilities and orchestrates the interaction of several independently-managed components. These independently-managed components include the genome variant knowledge base, the genome database, the CDS knowledge base, a CDS controller and the electronic health record (EHR). A key design feature is that genome data can be stored separately from the EHR. This paper describes in detail: (1) each component of the architecture; (2) the interaction of the components; and (3) how the architecture attempts to overcome the challenges associated with WGS information. We believe that service-oriented CDS capabilities will be essential to using WGS information for personalized medicine.
Affiliation(s)
- Brandon M. Welch
- Program in Personalized Health Care, University of Utah, 15 North 2030 East, EIHG Room 2110, Salt Lake City, UT 84112, USA
- Department of Biomedical Informatics, University of Utah, 26 South 2000 East, Room 5775 HSEB, Salt Lake City, UT 84112, USA
- Author to whom correspondence should be addressed; Tel.: +1-585-455-0461
| | - Salvador Rodriguez Loya
- School of Engineering and Informatics, University of Sussex, Shawcross Building, Room Gc4, Falmer, Brighton, East Sussex, BN1 9QT, UK
| | - Karen Eilbeck
- Department of Biomedical Informatics, University of Utah, 26 South 2000 East, Room 5775 HSEB, Salt Lake City, UT 84112, USA
| | - Kensaku Kawamoto
- Department of Biomedical Informatics, University of Utah, 26 South 2000 East, Room 5775 HSEB, Salt Lake City, UT 84112, USA
|
14
|
Welch BM, Rodriguez-Loya S, Eilbeck K, Kawamoto K. Clinical decision support for whole genome sequence information leveraging a service-oriented architecture: a prototype. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2014; 2014:1188-1197. [PMID: 25954430 PMCID: PMC4419907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
Whole genome sequence (WGS) information could soon be routinely available to clinicians to support the personalized care of their patients. At such time, clinical decision support (CDS) integrated into the clinical workflow will likely be necessary to support genome-guided clinical care. Nevertheless, developing CDS capabilities for WGS information presents many unique challenges that need to be overcome for such approaches to be effective. In this manuscript, we describe the development of a prototype CDS system that is capable of providing genome-guided CDS at the point of care and within the clinical workflow. To demonstrate the functionality of this prototype, we implemented a clinical scenario of a hypothetical patient at high risk for Lynch Syndrome based on his genomic information. We demonstrate that this system can effectively use service-oriented architecture principles and standards-based components to deliver point of care CDS for WGS information in real-time.
Affiliation(s)
- Brandon M Welch
- Medical University of South Carolina, Charleston, SC; University of Utah, Salt Lake City, UT
|
15
|
Pharmacogenomics for Precision Medicine in the Era of Collaborative Co-creation and Crowdsourcing. CURRENT GENETIC MEDICINE REPORTS 2014. [DOI: 10.1007/s40142-014-0041-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
|
16
|
Soussi T. Locus-Specific Databases in Cancer: What Future in a Post-Genomic Era? The TP53 LSDB paradigm. Hum Mutat 2014; 35:643-53. [DOI: 10.1002/humu.22518] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 11/13/2013] [Accepted: 01/16/2014] [Indexed: 11/08/2022]
Affiliation(s)
- Thierry Soussi
- Department of Oncology-Pathology, Cancer Center Karolinska (CCK), Karolinska Institute, Stockholm, Sweden
- Université Pierre et Marie Curie Paris 6, Paris, France
|
17
|
Patrinos GP, Cooper DN, van Mulligen E, Gkantouna V, Tzimas G, Tatum Z, Schultes E, Roos M, Mons B. Microattribution and nanopublication as means to incentivize the placement of human genome variation data into the public domain. Hum Mutat 2012; 33:1503-12. [PMID: 22736453 DOI: 10.1002/humu.22144] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.8] [Received: 02/02/2012] [Accepted: 05/23/2012] [Indexed: 11/07/2022]
Abstract
The advances in bioinformatics required to annotate human genomic variants and to place them in public data repositories have not kept pace with their discovery. Moreover, a law of diminishing returns has begun to operate both in terms of data publication and submission. Although the continued deposition of such data in the public domain is essential to maximize both their scientific and clinical utility, rewards for data sharing are few, representing a serious practical impediment to data submission. To date, two main strategies have been adopted as a means to encourage the submission of human genomic variant data: (1) database journal linkups involving the affiliation of a scientific journal with a publicly available database and (2) microattribution, involving the unambiguous linkage of data to their contributors via a unique identifier. The latter could in principle lead to the establishment of a microcitation-tracking system that acknowledges individual endeavor and achievement. Both approaches could incentivize potential data contributors, thereby encouraging them to share their data with the scientific community. Here, we summarize and critically evaluate approaches that have been proposed to address current deficiencies in data attribution and discuss ways in which they could become more widely adopted as novel scientific publication modalities.
Affiliation(s)
- George P Patrinos
- Department of Pharmacy, School of Health Sciences, University of Patras, Patras, Greece.
|
18
|
Abstract
Background Single nucleotide polymorphisms (SNPs), the most common genetic variations among humans, are believed to be a promising route towards personalized medicine. As more and more research on SNPs is conducted, non-standard nomenclatures may cause problems. The most serious issue is that researchers cannot cross-reference SNPs among different databases, so that tracking SNPs requires more time and resources, to the detriment of the entire academic community. Results UASIS (Universal Automated SNP Identification System) is a web-based server for SNP nomenclature standardization and translation at the DNA level. Three utilities are available: UASIS Aligner, Universal SNP Name Generator and SNP Name Mapper. UASIS efficiently maps SNPs from different databases, including dbSNP, GWAS, HapMap and JSNP, into a uniform view using a proposed universal nomenclature and state-of-the-art alignment algorithms. UASIS is freely available at http://www.uasis.tk with no log-in required. Conclusions UASIS is a helpful platform for SNP cross-referencing and tracking. By providing an informative, unique and unambiguous nomenclature based on the unique position of a SNP, we aim to resolve the ambiguity of the SNP nomenclatures currently in use. Our universal nomenclature is a good complement to mainstream SNP notations such as rs# and the HGVS guidelines. UASIS acts as a bridge connecting heterogeneous representations of SNPs.
Affiliation(s)
- Danny C C Poo
- Department of Information Systems, School of Computing, National University of Singapore, 13 Computing Drive, Singapore 117417.
|
19
|
Vaché C, Besnard T, le Berre P, García-García G, Baux D, Larrieu L, Abadie C, Blanchet C, Bolz HJ, Millan J, Hamel C, Malcolm S, Claustres M, Roux AF. Usher syndrome type 2 caused by activation of an USH2A pseudoexon: implications for diagnosis and therapy. Hum Mutat 2011; 33:104-8. [PMID: 22009552 DOI: 10.1002/humu.21634] [Citation(s) in RCA: 85] [Impact Index Per Article: 6.5] [Received: 07/27/2011] [Accepted: 10/07/2011] [Indexed: 11/09/2022]
Abstract
USH2A sequencing in three affected members of a large family, referred for the recessive USH2 syndrome, identified a single pathogenic alteration in one of them and a different mutation in the two affected nieces. As the patients carried a common USH2A haplotype, they likely shared a mutation not found by standard sequencing techniques. Analysis of RNA from nasal cells in one affected individual identified an additional pseudoexon (PE) resulting from a deep intronic mutation. This was confirmed by minigene assay. This is the first example in Usher syndrome (USH) with a mutation causing activation of a PE. The finding of this alteration in eight other individuals of mixed European origin emphasizes the importance of including RNA analysis in a comprehensive diagnostic service. Finally, this mutation, which would not have been found by whole-exome sequencing, could offer, for the first time in USH, the possibility of therapeutic correction by antisense oligonucleotides (AONs).
Affiliation(s)
- Christel Vaché
- CHU Montpellier, Laboratoire de Génétique Moléculaire, Montpellier, France
|
20
|
Abstract
TP53 mutations are the most frequent genetic alterations found in human cancer. For more than 20 years, TP53 mutation databases have collected over 30,000 somatic mutations from various types of cancer. Analyses of these mutations have led to many types of studies and have improved our knowledge about the TP53 protein and its function. The recent advances in sequencing methodologies and the various cancer genome sequencing projects will lead to a profound shift in database curation and data management. In this paper, we will review the current status of the TP53 mutation database, its application to various fields of research, and how data quality and curation can be improved. We will also discuss how the genetic data will be stored and handled in the future and the consequences for database management.
|
21
|
Affiliation(s)
- A C Goodeve
- Haemostasis Research Group, University of Sheffield, Sheffield, UK.
|
22
|
Soussi T, Hamroun D, Hjortsberg L, Rubio-Nevado JM, Fournier JL, Béroud C. MUT-TP53 2.0: a novel versatile matrix for statistical analysis of TP53 mutations in human cancer. Hum Mutat 2010; 31:1020-5. [PMID: 20572016 DOI: 10.1002/humu.21313] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 01/04/2023]
Abstract
Analysis of the literature reporting p53 mutations shows that 8% of reports display typographical mistakes, with a notable increase in recent years. These errors are sometimes isolated, but in some cases they concern several or even all mutations described in a single article. Furthermore, some works report unusual profiles of p53 mutations whose accuracy is difficult to assess. To handle these problems, we have developed MUT-TP53 2.0, an accurate and powerful tool that automatically handles p53 mutations and generates publication-ready tables, lowering the risk of typographical errors. Furthermore, using functional and statistical information from the UMD p53 database, it makes it possible to assess the biological activity and the likelihood of every p53 mutant.
Affiliation(s)
- Thierry Soussi
- Karolinska Institute, Department of Oncology-Pathology, Cancer Center Karolinska, Stockholm, Sweden.
|
23
|
Laurila JB, Naderi N, Witte R, Riazanov A, Kouznetsov A, Baker CJO. Algorithms and semantic infrastructure for mutation impact extraction and grounding. BMC Genomics 2010; 11 Suppl 4:S24. [PMID: 21143808 PMCID: PMC3005927 DOI: 10.1186/1471-2164-11-s4-s24] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 02/02/2023] Open
Abstract
Background Mutation impact extraction is a hitherto unaccomplished task in state of the art mutation extraction systems. Protein mutations and their impacts on protein properties are hidden in scientific literature, making them poorly accessible for protein engineers and inaccessible for phenotype-prediction systems that currently depend on manually curated genomic variation databases. Results We present the first rule-based approach for the extraction of mutation impacts on protein properties, categorizing their directionality as positive, negative or neutral. Furthermore protein and mutation mentions are grounded to their respective UniProtKB IDs and selected protein properties, namely protein functions to concepts found in the Gene Ontology. The extracted entities are populated to an OWL-DL Mutation Impact ontology facilitating complex querying for mutation impacts using SPARQL. We illustrate retrieval of proteins and mutant sequences for a given direction of impact on specific protein properties. Moreover we provide programmatic access to the data through semantic web services using the SADI (Semantic Automated Discovery and Integration) framework. Conclusion We address the problem of access to legacy mutation data in unstructured form through the creation of novel mutation impact extraction methods which are evaluated on a corpus of full-text articles on haloalkane dehalogenases, tagged by domain experts. Our approaches show state of the art levels of precision and recall for Mutation Grounding and respectable level of precision but lower recall for the task of Mutant-Impact relation extraction. The system is deployed using text mining and semantic web technologies with the goal of publishing to a broad spectrum of consumers.
Affiliation(s)
- Jonas B Laurila
- Department of Computer Science & Applied Statistics, University of New Brunswick, Saint John, New Brunswick, Canada.
|
24
|
Vaché C, Besnard T, Blanchet C, Baux D, Larrieu L, Faugère V, Mondain M, Hamel C, Malcolm S, Claustres M, Roux AF. Nasal epithelial cells are a reliable source to study splicing variants in Usher syndrome. Hum Mutat 2010; 31:734-41. [PMID: 20513143 DOI: 10.1002/humu.21255] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 12/11/2022]
Abstract
We have shown that nasal ciliated epithelium, which can be easily biopsied under local anesthetic, provides a good source of RNA transcripts from eight of the nine known genes that cause Usher syndrome, namely, MYO7A, USH1C, CDH23, PCDH15, USH1G for Usher type 1, and USH2A, GPR98, WHRN for Usher type 2. Furthermore, the known or predicted effect on mRNA splicing of eight variants was faithfully reproduced in the biopsied sample as measured by nested RT-PCR. These included changes at the canonical acceptor site, changes within the noncanonical acceptor site and both synonymous and nonsynonymous amino acid changes. This shows that mRNA analysis by this method will help in assessing the pathogenic effect of variants, which is a major problem in the molecular diagnosis of Usher syndrome.
Affiliation(s)
- Christel Vaché
- CHU Montpellier, Laboratoire de Génétique Moléculaire, Montpellier, France
|
25
|
Küntzer J, Eggle D, Klostermann S, Burtscher H. Human variation databases. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2010; 2010:baq015. [PMID: 20639550 PMCID: PMC2911800 DOI: 10.1093/database/baq015] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 02/07/2023]
Abstract
More than 100 000 human genetic variations have been described in various genes that are associated with a wide variety of diseases. Such data provides invaluable information for both clinical medicine and basic science. A number of locus-specific databases have been developed to exploit this huge amount of data. However, the scope, format and content of these databases differ strongly and as no standard for variation databases has yet been adopted, the way data is presented varies enormously. This review aims to give an overview of current resources for human variation data in public and commercial resources.
Affiliation(s)
- Jan Küntzer
- Pharma Research and Early Development, pRED Informatics, Roche Diagnostics GmbH, Penzberg, Germany.
|
26
|
Winnenburg R, Plake C, Schroeder M. Improved mutation tagging with gene identifiers applied to membrane protein stability prediction. BMC Bioinformatics 2009; 10 Suppl 8:S3. [PMID: 19758467 PMCID: PMC2745585 DOI: 10.1186/1471-2105-10-s8-s3] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/23/2022] Open
Abstract
Background The automated retrieval and integration of information about protein point mutations in combination with structure, domain and interaction data from literature and databases promises to be a valuable approach to study structure-function relationships in biomedical data sets. Results We developed a rule- and regular expression-based protein point mutation retrieval pipeline for PubMed abstracts, which shows an F-measure of 87% for the mutation retrieval task on a benchmark dataset. In order to link mutations to their proteins, we utilize a named entity recognition algorithm for the identification of gene names co-occurring in the abstract, and establish links based on sequence checks. Conversely, we could show that gene recognition improved from 77% to 91% F-measure when considering mutation information given in the text. To demonstrate practical relevance, we utilize mutation information from text to evaluate a novel solvation energy based model for the prediction of stabilizing regions in membrane proteins. For five G protein-coupled receptors we identified 35 relevant single mutations and associated phenotypes, of which none had been annotated in the UniProt or PDB database. In 71% of these, the reported phenotypes were in compliance with the model predictions, supporting a relation between mutations and stability issues in membrane proteins. Conclusion We present a reliable approach for the retrieval of protein mutations from PubMed abstracts for any set of genes or proteins of interest. We further demonstrate how amino acid substitution information from text can be utilized for protein structure stability studies on the basis of a novel energy model.
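The general idea of regular expression-based mutation retrieval, followed by a sequence check and an F-measure evaluation, can be sketched as follows. This is an illustrative toy and not the authors' pipeline: the two patterns, the function names, and the example sentence in the usage note are assumptions, and a real system would need extra rules to reject look-alikes such as cell line names.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_LETTER = "".join(sorted(THREE_TO_ONE.values()))
THREE_LETTER = "|".join(THREE_TO_ONE)

# Illustrative patterns (assumptions): "R175H" and "Arg248Gln" styles only.
SHORT_FORM = re.compile(rf"\b([{ONE_LETTER}])(\d+)([{ONE_LETTER}])\b")
LONG_FORM = re.compile(rf"\b({THREE_LETTER})(\d+)({THREE_LETTER})\b")


def extract_point_mutations(text):
    """Return a set of (wild_type, position, mutant) tuples found in text."""
    found = set()
    for wt, pos, mut in SHORT_FORM.findall(text):
        if wt != mut:
            found.add((wt, int(pos), mut))
    for wt, pos, mut in LONG_FORM.findall(text):
        if wt != mut:
            found.add((THREE_TO_ONE[wt], int(pos), THREE_TO_ONE[mut]))
    return found


def matches_sequence(mutation, sequence):
    """Sequence check: is the wild-type residue at that 1-based position?"""
    wt, pos, _ = mutation
    return 1 <= pos <= len(sequence) and sequence[pos - 1] == wt


def f_measure(predicted, gold):
    """Harmonic mean of precision and recall over two sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

For example, calling `extract_point_mutations` on the made-up sentence "We observed p.R175H and Arg248Gln." yields the two tuples `("R", 175, "H")` and `("R", 248, "Q")`, which can then be screened with `matches_sequence` against a candidate protein sequence.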
Affiliation(s)
- Rainer Winnenburg
- Biotechnology Center, Technische Universität Dresden, Tatzberg, Germany.
|
27
|
Abstract
Increasingly, the molecular genetics laboratory has to assess the biological significance of changes (variants) in a DNA sequence. Using the large genes BRCA1 and BRCA2 as examples, some approaches used to determine the biological significance of DNA variants are described. These include the characterization of the variant through a review of the literature and the various databases to assess if it has previously been described. Potential difficulties with the various databases that are available are described. Other considerations include the co-inheritance of the variant with other DNA changes, and its evolutionary conservation. Determining the possible effect of the variant on protein function is described in terms of the Grantham assessment as well as identifying functional domains. Studies looking at the distribution of the variant in both the population and the family can also help in assessing its significance. Loss of the variant in a tumor sample would imply that it is not deleterious. Ultimately, no single parameter determines a DNA variant's biological significance; this usually requires multiple lines of evidence.
|
28
|
Khan S, Vihinen M. Spectrum of disease-causing mutations in protein secondary structures. BMC STRUCTURAL BIOLOGY 2007; 7:56. [PMID: 17727703 PMCID: PMC1995201 DOI: 10.1186/1472-6807-7-56] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.0] [Received: 09/28/2006] [Accepted: 08/29/2007] [Indexed: 11/10/2022]
Abstract
BACKGROUND Most genetic disorders are linked to missense mutations as even minor changes in the size or properties of an amino acid can alter or prevent the function of the protein. Further, the effect of a mutation is also dependent on the sequence and structure context of the alteration. RESULTS We investigated the spectrum of disease-causing missense mutations in secondary structure elements in proteins with numerous known mutations and for which an experimentally defined three-dimensional structure is available. We obtained a comprehensive map of the differences in mutation frequencies, location and contact energies, and the changes in residue volume and charge - both in the mutated (original) amino acids and in the mutant amino acids in the different secondary structure types. We collected information for 44 different proteins involved in a large number of diseases. The studied proteins contained a total of 2413 mutations of which 1935 (80%) appeared in secondary structures. Differences in mutation patterns between secondary structures and whole proteins were generally not statistically significant whereas within the secondary structural elements numerous highly significant features were observed. CONCLUSION Numerous trends in mutated and mutant amino acids are apparent. Among the original residues, arginine clearly has the highest relative mutability. The overall relative mutability among mutant residues is highest for cysteine and tryptophan. The mutability values are higher for mutated residues than for mutant residues. Arginine and glycine are among the most mutated residues in all secondary structures whereas the other amino acids have large variations in mutability between structure types. Statistical analysis was used to reveal trends in different secondary structural elements, residue types as well as for the charge and volume changes.
Affiliation(s)
- Sofia Khan
- Institute of Medical Technology, FI-33014 University of Tampere, Finland
| | - Mauno Vihinen
- Institute of Medical Technology, FI-33014 University of Tampere, Finland
- Research Unit, Tampere University Hospital, FI-33520 Tampere, Finland
|
29
|
Gulley ML, Braziel RM, Halling KC, Hsi ED, Kant JA, Nikiforova MN, Nowak JA, Ogino S, Oliveira A, Polesky HF, Silverman L, Tubbs RR, Van Deerlin VM, Vance GH, Versalovic J. Clinical laboratory reports in molecular pathology. Arch Pathol Lab Med 2007; 131:852-63. [PMID: 17550311 DOI: 10.5858/2007-131-852-clrimp] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.4] [Accepted: 02/01/2007] [Indexed: 11/06/2022]
Abstract
CONTEXT Molecular pathology is a rapidly growing area of laboratory medicine in which DNA and RNA are analyzed. The recent introduction of array technology has added another layer of complexity involving massive parallel analysis of multiple genes, transcripts, or proteins. OBJECTIVE As molecular technologies are increasingly implemented in clinical settings, it is important to bring uniformity to the way that test results are reported. DATA SOURCES The College of American Pathologists Molecular Pathology Resource Committee members summarize elements that are already common to virtually all molecular pathology reports, as set forth in the College of American Pathologists checklists used in the laboratory accreditation process. Consensus recommendations are proposed to improve report format and content, and areas of controversy are discussed. Resources are cited that promote use of proper gene nomenclature and that describe methods for reporting mutations, translocations, microsatellite instability, and other genetic alterations related to inherited disease, cancer, identity testing, microbiology, and pharmacogenetics. CONCLUSIONS These resources and recommendations provide a framework for composing patient reports to convey molecular test results and their clinical significance to members of the health care team.
Affiliation(s)
- Margaret L Gulley
- Department of Pathology, 913 Brinkhous-Bullitt Bldg, University of North Carolina, Chapel Hill, NC 27599-7525, USA.
|
30
|
Gout AM, Ravine D, Harris PC, Rossetti S, Peters D, Breuning M, Henske EP, Koizumi A, Inoue S, Shimizu Y, Thongnoppakhun W, Yenchitsomanus PT, Deltas C, Sandford R, Torra R, Turco AE, Jeffery S, Fontes M, Somlo S, Furu LM, Smulders YM, Mercier B, Ferec C, Burtey S, Pei Y, Kalaydjieva L, Bogdanova N, McCluskey M, Geon LJ, Wouters CH, Reiterova J, Stekrová J, San Millan JL, Aguiari G, Del Senno L. Analysis of published PKD1 gene sequence variants. Nat Genet 2007; 39:427-8. [PMID: 17392796 DOI: 10.1038/ng0407-427] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/09/2022]
|
31
|
Ogino S, Gulley ML, den Dunnen JT, Wilson RB. Standard mutation nomenclature in molecular diagnostics: practical and educational challenges. J Mol Diagn 2007; 9:1-6. [PMID: 17251329 PMCID: PMC1867422 DOI: 10.2353/jmoldx.2007.060081] [Citation(s) in RCA: 128] [Impact Index Per Article: 7.5] [Indexed: 11/20/2022] Open
Abstract
To translate basic research findings into clinical practice, it is essential that information about mutations and variations in the human genome is communicated easily and unequivocally. Unfortunately, there has been much confusion regarding the description of genetic sequence variants. This is largely because research articles that first report novel sequence variants do not often use standard nomenclature, and the final genomic sequence is compiled over many separate entries. In this article, we discuss issues crucial to clear communication, using examples of genes that are commonly assayed in clinical laboratories. Although molecular diagnostics is a dynamic field, this should not inhibit the need for and movement toward consensus nomenclature for accurate reporting among laboratories. Our aim is to alert laboratory scientists and other health care professionals to the important issues and provide a foundation for further discussions that will ultimately lead to solutions.
Affiliation(s)
- Shuji Ogino
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, 75 Francis St., Boston, MA 02115, USA.
|
32
|
Brown N. Moving forwards from the human genome project. J Paediatr Child Health 2007; 43:92. [PMID: 17207066 DOI: 10.1111/j.1440-1754.2007.01012.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
33
|
van Baal S, Kaimakis P, Phommarinh M, Koumbi D, Cuppens H, Riccardino F, Macek M, Scriver CR, Patrinos GP. FINDbase: a relational database recording frequencies of genetic defects leading to inherited disorders worldwide. Nucleic Acids Res 2006; 35:D690-5. [PMID: 17135191 PMCID: PMC1747180 DOI: 10.1093/nar/gkl934] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.8] [Indexed: 11/12/2022] Open
Abstract
Frequency of INherited Disorders database (FINDbase) (http://www.findbase.org) is a relational database, derived from the ETHNOS software, recording frequencies of causative mutations leading to inherited disorders worldwide. Database records include the population and ethnic group, the disorder name and the related gene, accompanied by links to any corresponding locus-specific mutation database, to the respective Online Mendelian Inheritance in Man entries and the mutation together with its frequency in that population. The initial information is derived from the published literature, locus-specific databases and genetic disease consortia. FINDbase offers a user-friendly query interface, providing instant access to the list and frequencies of the different mutations. Query outputs can be either in a table or graphical format, accompanied by reference(s) on the data source. Registered users from three different groups, namely administrator, national coordinator and curator, are responsible for database curation and/or data entry/correction online via a password-protected interface. Database access is free of charge and there are no registration requirements for data querying. FINDbase provides a simple, web-based system for population-based mutation data collection and retrieval and can serve not only as a valuable online tool for molecular genetic testing of inherited disorders but also as a non-profit model for sustainable database funding, in the form of a 'database-journal'.
Affiliation(s)
- Sjozef van Baal
  - Erasmus MC, MGC-Department of Cell Biology and Genetics, Rotterdam, The Netherlands
- Polynikis Kaimakis
  - Erasmus MC, MGC-Department of Cell Biology and Genetics, Rotterdam, The Netherlands
- Manyphong Phommarinh
  - Montreal Children's Hospital Research Institute, McGill University, Montreal, Canada
- Daphne Koumbi
  - Fox Chase Cancer Center, Human Genetics Division, Philadelphia, PA, USA
- Harry Cuppens
  - Centre for Human Genetics, Katholic University of Leuven, Campus Gasthuisberg, Leuven, Belgium
- Francesca Riccardino
  - Dipartimento di Genetica, Biologia, Biochimica, Università di Torino, Torino, Italy
- Milan Macek
  - Department of Molecular Genetics, Institute of Biology and Medical Genetics–National Cystic Fibrosis Centre, University Hospital Motol and Second School of Medicine of Charles University, Prague, Czech Republic
- Charles R. Scriver
  - Montreal Children's Hospital Research Institute, McGill University, Montreal, Canada
- George P. Patrinos
  - Erasmus MC, MGC-Department of Cell Biology and Genetics, Rotterdam, The Netherlands
  - Asclepion Genetics, Lausanne, Switzerland
  - To whom correspondence should be addressed. Tel: +31 10 408 7454; Fax: +31 10 408 9468
|
34
|
Soussi T, Rubio-Nevado JM, Ishioka C. MUT-TP53: a versatile matrix for TP53 mutation verification and publication. Hum Mutat 2006; 27:1151-4. [PMID: 16941637 DOI: 10.1002/humu.20395] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
|
35
|
McDonald R, Scott Winters R, Ankuda CK, Murphy JA, Rogers AE, Pereira F, Greenblatt MS, White PS. An automated procedure to identify biomedical articles that contain cancer-associated gene variants. Hum Mutat 2006; 27:957-64. [PMID: 16865690 DOI: 10.1002/humu.20363] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022]
Abstract
The proliferation of biomedical literature makes it increasingly difficult for researchers to find and manage relevant information. However, identifying research articles containing mutation data, a requisite first step in integrating large and complex mutation data sets, is currently tedious, time-consuming and imprecise. More effective mechanisms for identifying articles containing mutation information would be beneficial both for the curation of mutation databases and for individual researchers. We developed an automated method that uses information extraction, classifier, and relevance ranking techniques to determine the likelihood of MEDLINE abstracts containing information regarding genomic variation data suitable for inclusion in mutation databases. We targeted the CDKN2A (p16) gene and the procedure for document identification currently used by CDKN2A Database curators as a measure of feasibility. A set of abstracts was manually identified from a MEDLINE search as potentially containing specific CDKN2A mutation events. A subset of these abstracts was used as a training set for a maximum entropy classifier to identify text features distinguishing "relevant" from "not relevant" abstracts. Each document was represented as a set of indicative word, word pair, and entity tagger-derived genomic variation features. When applied to a test set of 200 candidate abstracts, the classifier predicted 88 articles as being relevant; of these, 29 of 32 manuscripts in which manual curation found CDKN2A sequence variants were positively predicted. Thus, the set of potentially useful articles that a manual curator would have to review was reduced by 56%, maintaining 91% recall (sensitivity) and more than doubling precision (positive predictive value). Subsequent expansion of the training set to 494 articles yielded similar precision and recall rates, and comparison of the original and expanded trials demonstrated that the average precision improved with the larger data set. 
Our results show that automated systems can effectively identify article subsets relevant to a given task and may prove to be powerful tools for the broader research community. This procedure can be readily adapted to any or all genes, organisms, or sets of documents.
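The figures quoted in this abstract can be re-derived from the four counts it reports. The sketch below is only an arithmetic check, not the authors' classifier; the function and key names are illustrative assumptions.

```python
def screening_metrics(total, predicted_relevant, truly_relevant, true_positives):
    """Summarize a document-screening run from four raw counts.

    All names here are illustrative; only the counts come from the abstract.
    """
    recall = true_positives / truly_relevant             # share of relevant papers found
    precision = true_positives / predicted_relevant      # share of flagged papers that are relevant
    baseline_precision = truly_relevant / total          # precision if a curator read everything
    workload_reduction = 1 - predicted_relevant / total  # share of abstracts no longer reviewed
    return {
        "recall": recall,
        "precision": precision,
        "baseline_precision": baseline_precision,
        "workload_reduction": workload_reduction,
    }


# Counts from the abstract: 200 candidates, 88 flagged, 32 truly relevant, 29 of those flagged.
metrics = screening_metrics(200, 88, 32, 29)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, recall is 29/32, the review set shrinks from 200 to 88 abstracts, and precision rises from 32/200 to 29/88, which matches the stated 91% recall, 56% reduction, and more-than-doubled precision.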
Affiliation(s)
- Ryan McDonald
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA
|
36
|
Abstract
Genetic databases are gradually assuming an increasing importance in all areas of health care. The national and ethnic mutation databases (NEMDBs) are continuously updated mutation depositories, recording extensive information over the described genetic heterogeneity of an ethnic group or population. Together with the central and locus-specific databases, those resources not only enhance awareness of the various genetic disorders but also facilitate the provision of genetic services and provide useful insights into the genographic history of human populations. Fifteen independent NEMDBs devoted to the documentation of the extant genetic heterogeneity in various population groups within 57 different countries were assessed; 13 of the NEMDBs were fully functional. The contents of the 13 fully functional NEMDBs were thoroughly analyzed for the presence or absence of 39 criteria, pertaining to database general information, operating platform, data source and submission, and querying capacity. This study provides a strong case for uniformity of data to make the NEMDBs content maximally useful. In this direction, a hypothetical content for the ideal NEMDB is derived, which is currently being incorporated in an upgraded version of the ETHNOS NEMDB development and curation software, as well as a community structure that would enhance the chances of mutation frequency capture and documentation in human populations. The ultimate goal is that interested parties and granting bodies will assist in achieving the vision of a comprehensive resource that collects and displays all population-specific genetic information discovered.
Affiliation(s)
- George P Patrinos
- Erasmus University Medical Center, Faculty of Medicine and Health Sciences, MGC-Department of Cell Biology and Genetics, Rotterdam, The Netherlands.
|
37
|
Abstract
Information theory-based software tools have been useful in interpreting noncoding sequence variation within functional sequence elements such as splice sites. Individual information analysis detects activated cryptic splice sites and associated splicing regulatory sites and is capable of distinguishing null from partially functional alleles. We present a server (https://splice.cmh.edu) designed to analyze splicing mutations in binding sites in either human genes, genome-mapped mRNAs, user-defined sequences, or dbSNP entries. Standard HUGO-approved gene symbols and HGVS-approved systematic mutation nomenclature (or dbSNP format) are entered via a web portal. After verifying the accuracy of input variant(s), the surrounding interval is retrieved from the human genome or user-supplied reference sequence. The server then computes the information contents (Ri) of all potential constitutive and/or regulatory splice sites in both the reference and variant sequences. Changes in information content are color-coded, tabulated, and visualized as sequence walkers, which display the binding sites with the reference sequence. The software was validated by analyzing approximately 1,300 mutations from Human Mutation as well as eight mapped SNPs from dbSNP designated as splice site variants. All of the splicing mutations and variants affected splice site strength or activated cryptic splice sites. The server also detected several missense mutations that were unexpectedly predicted to have concomitant effects on splicing or appeared to activate cryptic splicing.
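As a rough illustration of the quantity the server compares between reference and variant sequences, the sketch below scores a site with a simplified individual-information formula, summing 2 + log2 f(b, l) over positions. It is a sketch under stated assumptions and not the server's implementation: the small-sample correction is omitted, and the three-position frequency matrix and both sequences are hypothetical toy values.

```python
import math

# Hypothetical base frequencies at three positions of a toy binding site (assumption).
TOY_FREQUENCIES = [
    {"A": 0.10, "C": 0.10, "G": 0.70, "T": 0.10},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.60, "C": 0.10, "G": 0.20, "T": 0.10},
]


def weight_matrix(frequencies):
    """Per-position weights in bits: 2 + log2(f); sampling correction omitted."""
    return [{base: 2.0 + math.log2(p) for base, p in column.items()}
            for column in frequencies]


def individual_information(sequence, weights):
    """Sum the weight of the observed base at each position of the site."""
    if len(sequence) != len(weights):
        raise ValueError("sequence length must match the matrix width")
    return sum(column[base] for base, column in zip(sequence, weights))


weights = weight_matrix(TOY_FREQUENCIES)
ri_reference = individual_information("GTA", weights)  # hypothetical reference site
ri_variant = individual_information("GCA", weights)    # same site with a T>C change
delta_ri = ri_variant - ri_reference                   # negative: the variant weakens the site
print(f"Ri(ref)={ri_reference:.2f} bits, Ri(var)={ri_variant:.2f} bits, change={delta_ri:.2f} bits")
```

A negative change suggests a weakened site; scanning the same calculation across a window of neighbouring positions is one way a strengthened cryptic site could be flagged.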
Affiliation(s)
- Vijay K Nalla
- Laboratory of Human Molecular Genetics, Children's Mercy Hospital and Clinics, University of Missouri-Kansas City, Kansas City, Missouri, USA
|
38
|
Kalmár L, Hegedüs T, Farkas H, Nagy M, Tordai A. HAEdb: a novel interactive, locus-specific mutation database for the C1 inhibitor gene. Hum Mutat 2005; 25:1-5. [PMID: 15580551 DOI: 10.1002/humu.20112] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.2] [Indexed: 11/08/2022]
Abstract
Hereditary angioneurotic edema (HAE) is an autosomal dominant disorder characterized by episodic local subcutaneous and submucosal edema and is caused by the deficiency of the activated C1 esterase inhibitor protein (C1-INH or C1INH; approved gene symbol SERPING1). Published C1-INH mutations are represented in large universal databases (e.g., OMIM, HGMD), but these databases update their data rather infrequently, they are not interactive, and they do not allow searches according to different criteria. The HAEdb, a C1-INH gene mutation database (http://hae.biomembrane.hu) was created to contribute to the following expectations: 1) help the comprehensive collection of information on genetic alterations of the C1-INH gene; 2) create a database in which data can be searched and compared according to several flexible criteria; and 3) provide additional help in new mutation identification. The website uses MySQL, an open-source, multithreaded, relational database management system. The user-friendly graphical interface was written in the PHP web programming language. The website consists of two main parts, the freely browsable search function, and the password-protected data deposition function. Mutations of the C1-INH gene are divided in two parts: gross mutations involving DNA fragments >1 kb, and micro mutations encompassing all non-gross mutations. Several attributes (e.g., affected exon, molecular consequence, family history) are collected for each mutation in a standardized form. This database may facilitate future comprehensive analyses of C1-INH mutations and also provide regular help for molecular diagnostic testing of HAE patients in different centers.
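The gross/micro split described above reduces to a single size threshold. The sketch below shows one way such a standardized record could be modelled; the class, field names, and example values are hypothetical and do not reflect the actual HAEdb MySQL schema.

```python
from dataclasses import dataclass
from typing import Optional

GROSS_THRESHOLD_BP = 1000  # "gross" means fragments larger than 1 kb, per the abstract


@dataclass
class MutationRecord:
    """Hypothetical standardized record; field names are assumptions."""
    description: str
    affected_length_bp: int
    affected_exon: Optional[str] = None
    molecular_consequence: Optional[str] = None
    family_history: Optional[bool] = None

    @property
    def category(self) -> str:
        # Everything that is not gross counts as a micro mutation.
        return "gross" if self.affected_length_bp > GROSS_THRESHOLD_BP else "micro"


# Illustrative entries only, not real database content.
large_deletion = MutationRecord("example multi-exon deletion", 2500, affected_exon="4-6")
point_change = MutationRecord("example single-base substitution", 1, affected_exon="8")
print(large_deletion.category, point_change.category)
```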
Affiliation(s)
- Lajos Kalmár
- Laboratory of Molecular Genetics, National Medical Center, Institute of Hematology and Immunology, Budapest, Hungary
|
39
|
Soussi T, Ishioka C, Claustres M, Béroud C. Locus-specific mutation databases: pitfalls and good practice based on the p53 experience. Nat Rev Cancer 2006; 6:83-90. [PMID: 16397528 DOI: 10.1038/nrc1783] [Citation(s) in RCA: 115] [Impact Index Per Article: 6.4] [Indexed: 02/04/2023]
Abstract
Between 50,000 and 60,000 mutations have been described in various genes that are associated with a wide variety of diseases. Reporting, storing and analysing these data is an important challenge as such data provide invaluable information for both clinical medicine and basic science. Locus-specific databases have been developed to exploit this huge volume of data. The p53 mutation database is a paradigm, as it constitutes the largest collection of somatic mutations (22,000). However, there are several biases in this database that can lead to serious erroneous interpretations. We describe several rules for mutation database management that could benefit the entire scientific community.
Affiliation(s)
- Thierry Soussi
- Université P.M. Curie, 4 place Jussieu, 75005 Paris, France.
|
41
|
Béroud C, Hamroun D, Collod-Béroud G, Boileau C, Soussi T, Claustres M. UMD (Universal Mutation Database): 2005 update. Hum Mutat 2005; 26:184-91. [PMID: 16086365 DOI: 10.1002/humu.20210] [Citation(s) in RCA: 91] [Impact Index Per Article: 4.8] [Indexed: 11/10/2022]
Abstract
With the completion of the Human Genome Project, our vision of human genetic diseases has changed. The cloning of new disease-causing genes can now be performed in silico, and thousands of mutations are being identified in diagnostic and research laboratories yearly. Knowledge about these mutations and their association with clinical and biological data is essential for clinicians, geneticists, and researchers. To collect and analyze these data, we developed a generic software called Universal Mutation Databases (UMD) to create locus-specific databases. Here we report the new release (September 2004) of this freely available tool (www.umd.be), which allows the creation of LSDBs for virtually any gene and includes a large set of new analysis tools. We have implemented new features to integrate noncoding sequences, clinical data, pictures, monoclonal antibodies, and polymorphic markers (SNPs). Today the UMD retains all specifically designed tools to analyze mutations at the molecular level, as well as new sets of routines to search for genotype-phenotype correlations. We also created specific tools for infrequent mutations such as gross deletions and duplications, and deep intronic mutations. A large set of dedicated tools are now available for intronic mutations, including methods to calculate the consensus values (CVs) of potential splice sites and to search for exonic splicing enhancer (ESE) motifs. In addition, we have created specific routines to help researchers design new therapeutic strategies, such as exon skipping, aminoglycoside read-through of stop codons, or monoclonal antibody selection and epitope scanning for gene therapy.
|
42
|
Patrinos GP, van Baal S, Petersen MB, Papadakis MN. Hellenic National Mutation Database: a prototype database for mutations leading to inherited disorders in the Hellenic population. Hum Mutat 2005; 25:327-33. [PMID: 15776445 DOI: 10.1002/humu.20157] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.2] [Indexed: 11/08/2022]
Abstract
The exponential discovery rate of new genomic alterations leading to inherited disorders, as well as the need for comparative studies of different populations' mutation frequencies, necessitates recording their population-wide spectrum in online mutation databases. We report the construction of the Hellenic National Mutation database (http://www.goldenhelix.org/hellenic), a prototype database derived from a multicenter academic initiative, aiming to provide high quality and up-to-date information on the underlying genetic heterogeneity of inherited disorders found in the Hellenic population. Database records include informative summaries of the various genetic disorders studied in the Hellenic population, focused in particular on their incidence in Greece, a comprehensive reference list, and a well-structured query interface, which provides easy access to the list of the different mutations responsible for the inherited disorders in the Hellenic population. Also, extensive links to the respective Online Mendelian Inheritance in Man (OMIM) entries and, when available, to the locus-specific databases are provided, so that the user can retrieve the maximum amount of information from a single website. Furthermore, the Hellenic National Mutation database design allows easy data entry and curation. Creation of the Hellenic National Mutation database will significantly facilitate molecular diagnosis of inherited disorders in Greece and will motivate further investigation of yet unknown genetic diseases in the Hellenic population.
Affiliation(s)
- George P Patrinos
- Erasmus University Medical Center, Faculty of Medicine and Health Sciences, MGC-Department of Cell Biology and Genetics, Rotterdam, The Netherlands.
|
43
|
Fokkema IFAC, den Dunnen JT, Taschner PEM. LOVD: Easy creation of a locus-specific sequence variation database using an “LSDB-in-a-box” approach. Hum Mutat 2005; 26:63-8. [PMID: 15977173 DOI: 10.1002/humu.20201] [Citation(s) in RCA: 191] [Impact Index Per Article: 10.1] [Indexed: 11/07/2022]
Abstract
The completion of the human genome project has initiated, as well as provided the basis for, the collection and study of all sequence variation between individuals. Direct access to up-to-date information on sequence variation is currently provided most efficiently through web-based, gene-centered, locus-specific databases (LSDBs). We have developed the Leiden Open (source) Variation Database (LOVD) software approaching the "LSDB-in-a-Box" idea for the easy creation and maintenance of a fully web-based gene sequence variation database. LOVD is platform-independent and uses PHP and MySQL open source software only. The basic gene-centered and modular design of the database follows the recommendations of the Human Genome Variation Society (HGVS) and focuses on the collection and display of DNA sequence variations. With minimal effort, the LOVD platform is extendable with clinical data. The open set-up should both facilitate and promote functional extension with scripts written by the community. The LOVD software is freely available from the Leiden Muscular Dystrophy pages (www.DMD.nl/LOVD/). To promote the use of LOVD, we currently offer curators the possibility to set up an LSDB on our Leiden server.
Affiliation(s)
- Ivo F A C Fokkema
- Center of Human and Clinical Genetics, Leiden University Medical Center, Leiden, the Netherlands
|
44
|
Syvänen AC, Taylor GR. Approaches for analyzing human mutations and nucleotide sequence variation: a report from the Seventh International Mutation Detection meeting, 2003. Hum Mutat 2004; 23:401-5. [PMID: 15108269 DOI: 10.1002/humu.20031] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 12/17/2022]
Abstract
The Seventh International Symposium on Mutations in the Human Genome, Mutation Detection 2003, was held during 2-6 July 2003 in Palm Cove near Cairns, Australia. The meeting was organized under the auspices of the Human Genome Organisation (HUGO) as a satellite meeting of the International World Congress of Genetics, held in Melbourne the following week. Meeting participants reported on advances in mutation detection technologies, including advances in high-throughput detection systems for SNP genotyping applicable to the international haplotype mapping project (HapMap); and bioinformatics tools, including databases for handling and processing growing amounts of genome variation data. This meeting report summarizes the presentations and cites related articles from the special issue of Human Mutation (Volume 23#5, May 2004; available online at www.wiley.com/humanmutation).
|