1
|
Murine allele and transgene symbols: ensuring unique, concise, and informative nomenclature. Mamm Genome 2021; 33:108-119. [PMID: 34389871 PMCID: PMC8913455 DOI: 10.1007/s00335-021-09902-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2021] [Accepted: 08/03/2021] [Indexed: 11/15/2022]
Abstract
In addition to naturally occurring sequence variation and spontaneous mutations, a wide array of technologies exists for modifying the mouse genome. Standardized nomenclature, including allele, transgene, and other mutation nomenclature, as well as persistent unique identifiers (PUIDs) are critical for effective scientific communication, comparison of results, and integration of data into knowledgebases such as Mouse Genome Informatics (MGI), Alliance of Genome Resources, and International Mouse Strain Resource (IMSR). As well as being the authoritative source for mouse gene, allele, and strain nomenclature, MGI integrates published and unpublished genomic, phenotypic, and expression data while linking to other online resources for a complete view of the mouse as a valuable model organism. The International Committee on Standardized Genetic Nomenclature for Mice has developed allele nomenclature rules and guidelines that take into account the number of genes impacted, the method of allele generation, and the nature of the sequence alteration. To capture details that cannot be included in allele symbols, MGI has further developed allele-to-gene relationships using Sequence Ontology (SO) definitions for mutations that provide links between alleles and the genes affected. MGI is also using Human Genome Variation Society (HGVS) variant nomenclature for variants associated with alleles, which will enhance searching for mutations and improve cross-species comparison. With the ability to assign unique and informative symbols as well as to link alleles with more than one gene, allele and transgene nomenclature rules and guidelines provide an unambiguous way to represent alterations in the mouse genome and facilitate data integration among multiple resources such as the Alliance of Genome Resources and International Mouse Strain Resource.
|
2
|
Lefter M, Vis JK, Vermaat M, den Dunnen JT, Taschner PEM, Laros JFJ. Mutalyzer 2: next generation HGVS nomenclature checker. Bioinformatics 2021; 37:2811-2817. [PMID: 33538839 PMCID: PMC8479679 DOI: 10.1093/bioinformatics/btab051] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 07/03/2020] [Revised: 12/02/2020] [Accepted: 01/22/2021] [Indexed: 02/02/2023]
Abstract
MOTIVATION Unambiguous variant descriptions are of utmost importance in clinical genetic diagnostics, scientific literature and genetic databases. The Human Genome Variation Society (HGVS) publishes a comprehensive set of guidelines on how variants should be correctly and unambiguously described. We present the implementation of the Mutalyzer 2 tool suite, designed to automatically apply the HGVS guidelines so users do not have to deal with the HGVS intricacies explicitly to check and correct their variant descriptions. RESULTS Mutalyzer is profusely used by the community, having processed over 133 million descriptions since its launch. Over a five year period, Mutalyzer reported a correct input in ∼50% of cases. In 41% of the cases either a syntactic or semantic error was identified and for ∼7% of cases, Mutalyzer was able to automatically correct the description. AVAILABILITY AND IMPLEMENTATION Mutalyzer is an Open Source project under the GNU Affero General Public License. The source code is available on GitHub (https://github.com/mutalyzer/mutalyzer) and a running instance is available at: https://mutalyzer.nl.
Affiliation(s)
- Mihai Lefter
- Department of Human Genetics, Leiden University Medical Center (LUMC), Leiden, The Netherlands. To whom correspondence should be addressed.
| | - Jonathan K Vis
- Department of Human Genetics, Leiden University Medical Center (LUMC), Leiden, The Netherlands; Department of Clinical Genetics, Leiden University Medical Center (LUMC), Leiden, The Netherlands
| | - Martijn Vermaat
- Department of Human Genetics, Leiden University Medical Center (LUMC), Leiden, The Netherlands
| | - Johan T den Dunnen
- Department of Human Genetics, Leiden University Medical Center (LUMC), Leiden, The Netherlands; Department of Clinical Genetics, Leiden University Medical Center (LUMC), Leiden, The Netherlands
| | - Peter E M Taschner
- Department of Human Genetics, Leiden University Medical Center (LUMC), Leiden, The Netherlands; Generade Centre of Expertise Genomics and Leiden Centre for Applied Bioscience, University of Applied Sciences Leiden, Leiden, The Netherlands
| | - Jeroen F J Laros
- Department of Human Genetics, Leiden University Medical Center (LUMC), Leiden, The Netherlands; Department of Clinical Genetics, Leiden University Medical Center (LUMC), Leiden, The Netherlands; National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
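As a rough illustration of the check-and-correct behaviour described in the Mutalyzer 2 abstract, the sketch below sorts a variant description into correct, automatically corrected, or erroneous. It is not Mutalyzer's code or API: the single-pattern grammar (simple substitutions only), the `check_description` helper, and the whitespace and letter-case repair are assumptions made for illustration, and real HGVS syntax is far broader.

```python
import re

# Assumed toy grammar: accession ":" coordinate type "." position REF ">" ALT.
# Only simple substitutions are covered; real HGVS allows much more.
SUBSTITUTION = re.compile(r"[A-Z]{2}_\d+\.\d+:[cgmn]\.\d+[ACGT]>[ACGT]")

# Shape of a description whose bases may only need upper-casing.
_REPAIRABLE = re.compile(r"(.*\.\d+)([acgtACGT])>([acgtACGT])")


def check_description(text):
    """Return (status, description); status is 'correct', 'corrected' or 'error'."""
    if SUBSTITUTION.fullmatch(text):
        return "correct", text

    # Hypothetical repair step: drop whitespace, then upper-case the two bases.
    cleaned = re.sub(r"\s+", "", text)
    match = _REPAIRABLE.fullmatch(cleaned)
    if match:
        prefix, old, new = match.groups()
        cleaned = f"{prefix}{old.upper()}>{new.upper()}"

    if SUBSTITUTION.fullmatch(cleaned):
        return "corrected", cleaned
    return "error", text


if __name__ == "__main__":
    for description in ("NM_003002.4:c.274G>T",
                        "NM_003002.4: c.274g>t",
                        "NM_003002.4:c.274G>"):
        print(description, "->", check_description(description))
```

The three-way outcome mirrors the categories reported in the abstract (correct input, automatic correction, syntactic or semantic error), although a regular expression can only catch syntactic problems; semantic checks require the reference sequence.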
|
3
|
Abstract
Autism Spectrum Disorder is a developmental disorder that affects children from a very young age and is characterized by persistent deficits in social, communicational, and behavioral abilities. Since there is no cure for autism, domain experts focus on aiding these children through specific intervention plans that are aimed towards the development of the deficient areas. Using socially assistive robots that interact in a social manner with children in autism interventions, efforts are being made towards alleviating the autistic behavior of children and enhancing their social behavior. However, implementing robots in autism interventions could lead to harmful situations concerning safety. In this paper, an architecture for safe child–robot interactions in autism interventions is proposed. First, a taxonomy of child–robot interactions in autism interventions is presented, explaining its complete framework. Next, the interaction is modelled according to this taxonomy where an interaction case is employed in order for the structure of the interaction to be defined. Based on that, the safety architecture is proposed that will be integrated into the robot’s controller. Focus is placed on detecting possible distracting elements that could influence the performance of the child, affecting their psychological or physical safety. Lastly, the interaction between child and robot is created in a simulated environment through dialogue inputs and outputs, and the code of the architecture is tested, where a virtual robot performs the appropriate actions.
|
4
|
de Randamie R, Martos-Moreno GÁ, Lumbreras C, Chueca M, Donnay S, Luque M, Regojo RM, Mendiola M, Hardisson D, Argente J, Moreno JC. Frequent and Rare HABP2 Variants Are Not Associated with Increased Susceptibility to Familial Nonmedullary Thyroid Carcinoma in the Spanish Population. Horm Res Paediatr 2018; 89:397-407. [PMID: 29895015 DOI: 10.1159/000487395] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/27/2017] [Accepted: 01/25/2018] [Indexed: 11/19/2022]
Abstract
BACKGROUND/AIMS A genomic HABP2 variant was proposed to be responsible for familial nonmedullary thyroid carcinoma (FNMTC). However, its involvement has been questioned in subsequent studies. We aimed to identify genetic HABP2 mutations in a series of FNMTC patients and investigate their involvement in the disease. METHODS HABP2 was sequenced from 6 index patients. Presence of the variants was investigated in all members of one family. Somatic BRAF and RAS "hotspot" mutations were investigated by the Idylla™ BRAF Mutation Test and/or Sanger sequencing. RESULTS Two HABP2 variants (p.E393Q and p.G534E) were identified in the index patient from one family with papillary thyroid carcinoma (PTC) (follicular variant). The prevalence of p.E393Q in Spanish control alleles was 0.5% and that of p.G534E was 5.1%. However, neither change cosegregated with the phenotype in 3 affected members and 5 healthy members of the kindred. Interestingly, all 3 members affected by PTC harbored the p.V600E somatic mutation in BRAF. CONCLUSIONS The variant p.G534E is prevalent in the Spanish population (5.1%), whereas p.E393Q is rare (< 1%), and neither cosegregated with the FNMTC phenotype. The presence of the noninheritable p.V600E BRAF mutation in this family supports Knudson's "double-hit" hypothesis for cancer development and suggests the involvement of more than 1 gene in the clinical expression of FNMTC.
Affiliation(s)
- Rajdee de Randamie
- Thyroid Molecular Laboratory, Institute for Medical and Molecular Genetics (INGEMM), IdiPAZ, La Paz University Hospital, Universidad Autónoma de Madrid, Madrid, Spain
| | - Gabriel Ángel Martos-Moreno
- Department of Pediatrics and Pediatric Endocrinology, Hospital Infantil Universitario Niño Jesús, Universidad Autónoma de Madrid, CIBER de fisiopatología de la obesidad y nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain
| | - César Lumbreras
- Thyroid Molecular Laboratory, Institute for Medical and Molecular Genetics (INGEMM), IdiPAZ, La Paz University Hospital, Universidad Autónoma de Madrid, Madrid, Spain
| | - Maria Chueca
- Pediatric Endocrinology, Virgen del Camino University Hospital, Pamplona, Spain
| | - Sergio Donnay
- Department of Endocrinology and Nutrition, Hospital Universitario Fundación Alcorcón, Madrid, Spain
| | - Manuel Luque
- Department of Endocrinology and Nutrition, Hospital Universitario Ramón y Cajal, Madrid, Spain
| | | | - Marta Mendiola
- Molecular Pathology and Therapeutic Targets Group, Hospital Universitario La Paz, IdiPAZ, Madrid, Spain; Molecular Pathology Diagnostic Unit, Hospital Universitario La Paz, INGEMM, IdiPAZ, Madrid, Spain
| | - David Hardisson
- Pathology Department, La Paz University Hospital, Madrid, Spain; Molecular Pathology and Therapeutic Targets Group, Hospital Universitario La Paz, IdiPAZ, Madrid, Spain; Molecular Pathology Diagnostic Unit, Hospital Universitario La Paz, INGEMM, IdiPAZ, Madrid, Spain
| | - Jesús Argente
- Department of Pediatrics and Pediatric Endocrinology, Hospital Infantil Universitario Niño Jesús, Universidad Autónoma de Madrid, CIBER de fisiopatología de la obesidad y nutrición (CIBEROBN), Instituto de Salud Carlos III, Madrid, Spain
| | - José C Moreno
- Thyroid Molecular Laboratory, Institute for Medical and Molecular Genetics (INGEMM), IdiPAZ, La Paz University Hospital, Universidad Autónoma de Madrid, Madrid, Spain
|
5
|
Natarajan P, Gold NB, Bick AG, McLaughlin H, Kraft P, Rehm HL, Peloso GM, Wilson JG, Correa A, Seidman JG, Seidman CE, Kathiresan S, Green RC. Aggregate penetrance of genomic variants for actionable disorders in European and African Americans. Sci Transl Med 2017; 8:364ra151. [PMID: 27831900 DOI: 10.1126/scitranslmed.aag2367] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 06/23/2016] [Accepted: 09/30/2016] [Indexed: 12/21/2022]
Abstract
In populations that have not been selected for family history of disease, it is unclear how commonly pathogenic variants (PVs) in disease-associated genes for rare Mendelian conditions are found and how often they are associated with clinical features of these conditions. We conducted independent, prospective analyses of participants in two community-based epidemiological studies to test the hypothesis that persons carrying PVs in any of 56 genes that lead to 24 dominantly inherited, actionable conditions are more likely to exhibit the clinical features of the corresponding diseases than those without PVs. Among 462 European American Framingham Heart Study (FHS) and 3223 African-American Jackson Heart Study (JHS) participants who were exome-sequenced, we identified and classified 642 and 4429 unique variants, respectively, in these 56 genes while blinded to clinical data. In the same participants, we ascertained related clinical features from the participants' clinical history of cancer and most recent echocardiograms, electrocardiograms, and lipid measurements, without knowledge of variant classification. PVs were found in 5 FHS (1.1%) and 31 JHS (1.0%) participants. Carriers of PVs were more likely than expected, on the basis of incidence in noncarriers, to have related clinical features in both FHS (80.0% versus 12.4%) and JHS (26.9% versus 5.4%), yielding standardized incidence ratios of 6.4 (95% confidence interval [CI], 1.7 to 16.5; P = 7 × 10⁻⁴) in FHS and 4.7 (95% CI, 1.9 to 9.7; P = 3 × 10⁻⁴) in JHS. Individuals unselected for family history who carry PVs in 56 genes for actionable conditions have an increased aggregated risk of developing clinical features associated with the corresponding diseases.
Affiliation(s)
- Pradeep Natarajan
- Center for Human Genetic Research, Cardiovascular Research Center, and Cardiology Division, Department of Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Harvard Medical School, Boston, MA 02115, USA; Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - Nina B Gold
- Harvard Medical School, Boston, MA 02115, USA; Boston Children's Hospital, Boston, MA 02115, USA
| | - Alexander G Bick
- Harvard Medical School, Boston, MA 02115, USA; Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA 02142, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Heather McLaughlin
- Harvard Medical School, Boston, MA 02115, USA; Department of Pathology, Brigham and Women's Hospital, Boston, MA 02115, USA; Partners HealthCare Personalized Medicine, Boston, MA 02115, USA
| | - Peter Kraft
- Departments of Epidemiology and Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Heidi L Rehm
- Harvard Medical School, Boston, MA 02115, USA; Department of Pathology, Brigham and Women's Hospital, Boston, MA 02115, USA; Partners HealthCare Personalized Medicine, Boston, MA 02115, USA
| | - Gina M Peloso
- Harvard Medical School, Boston, MA 02115, USA; Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - James G Wilson
- Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, MS 39216, USA
| | - Adolfo Correa
- Departments of Pediatrics and Medicine, University of Mississippi Medical Center, Jackson, MS 39216, USA
| | - Jonathan G Seidman
- Harvard Medical School, Boston, MA 02115, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Christine E Seidman
- Harvard Medical School, Boston, MA 02115, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Howard Hughes Medical Institute, Harvard Medical School, Boston, MA 02115, USA
| | - Sekar Kathiresan
- Center for Human Genetic Research, Cardiovascular Research Center, and Cardiology Division, Department of Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Harvard Medical School, Boston, MA 02115, USA; Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - Robert C Green
- Harvard Medical School, Boston, MA 02115, USA; Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA 02142, USA; Partners HealthCare Personalized Medicine, Boston, MA 02115, USA; Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA
|
6
|
Thomas P, Rocktäschel T, Hakenberg J, Lichtblau Y, Leser U. SETH detects and normalizes genetic variants in text. Bioinformatics 2016; 32:2883-5. [PMID: 27256315 DOI: 10.1093/bioinformatics/btw234] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 12/28/2015] [Accepted: 04/18/2016] [Indexed: 11/14/2022]
Abstract
Descriptions of genetic variations and their effects are spread widely across the biomedical literature. However, finding all mentions of a specific variation, or all mentions of variations in a specific gene, is difficult because such variations are described in many different ways. Here, we describe SETH, a tool for the recognition of variations from text and their subsequent normalization to dbSNP or UniProt. SETH achieves high precision and recall on several evaluation corpora of PubMed abstracts. It is freely available and encompasses stand-alone scripts for isolated application and evaluation as well as thorough documentation for integration into other applications. AVAILABILITY AND IMPLEMENTATION SETH is released under the Apache 2.0 license and can be downloaded from http://rockt.github.io/SETH/. CONTACT thomas@informatik.hu-berlin.de or leser@informatik.hu-berlin.de.
Affiliation(s)
- Philippe Thomas
- Language Technology Lab, DFKI Berlin, Germany; Knowledge Management in Bioinformatics, Institute for Computer Science, Humboldt-Universität Zu Berlin, Unter Den Linden 6, Berlin 10099, Germany
| | | | - Jörg Hakenberg
- Illumina, Inc, 451 El Camino Real, Santa Clara, CA 95050, USA
| | - Yvonne Lichtblau
- Knowledge Management in Bioinformatics, Institute for Computer Science, Humboldt-Universität Zu Berlin, Unter Den Linden 6, Berlin 10099, Germany
| | - Ulf Leser
- Knowledge Management in Bioinformatics, Institute for Computer Science, Humboldt-Universität Zu Berlin, Unter Den Linden 6, Berlin 10099, Germany
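To make the recognition step in the SETH abstract more concrete, here is a minimal sketch of pulling variant mentions out of free text with regular expressions. It is not SETH's grammar or code: the three patterns, the `find_mentions` helper, and the sample sentence are invented for illustration, and the normalization to dbSNP or UniProt that SETH performs is not attempted.

```python
import re

AMINO = "ACDEFGHIKLMNPQRSTVWY"  # one-letter amino-acid codes

# Three illustrative patterns; real literature uses many more notations.
PATTERNS = {
    # e.g. p.E393Q
    "protein_substitution": re.compile(rf"\bp\.[{AMINO}]\d+[{AMINO}]\b"),
    # e.g. c.271C>T
    "coding_substitution": re.compile(r"\bc\.\d+[ACGT]>[ACGT]\b"),
    # e.g. V600E; the lookbehind avoids re-matching the tail of "p.E393Q"
    "bare_substitution": re.compile(rf"(?<![.\w])[{AMINO}]\d+[{AMINO}]\b"),
}


def find_mentions(text):
    """Return (kind, mention) pairs in order of appearance in the text."""
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), kind, match.group()))
    hits.sort()
    return [(kind, mention) for _, kind, mention in hits]


if __name__ == "__main__":
    sample = "Carriers of p.E393Q also had V600E; c.271C>T was seen once."
    for kind, mention in find_mentions(sample):
        print(f"{kind:22s} {mention}")
```

Even this toy version shows why recognition is hard: the same protein change can appear with or without the `p.` prefix, so each notation needs its own pattern and overlaps between patterns must be handled explicitly.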
|
7
|
Human genotype–phenotype databases: aims, challenges and opportunities. Nat Rev Genet 2015; 16:702-15. [DOI: 10.1038/nrg3932] [Citation(s) in RCA: 77] [Impact Index Per Article: 8.6] [Indexed: 12/18/2022]
|
8
|
Hart RK, Rico R, Hare E, Garcia J, Westbrook J, Fusaro VA. A Python package for parsing, validating, mapping and formatting sequence variants using HGVS nomenclature. Bioinformatics 2014; 31:268-70. [PMID: 25273102 PMCID: PMC4287946 DOI: 10.1093/bioinformatics/btu630] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Indexed: 11/13/2022]
Abstract
Biological sequence variants are commonly represented in scientific literature, clinical reports and databases of variation using the mutation nomenclature guidelines endorsed by the Human Genome Variation Society (HGVS). Despite the widespread use of the standard, no freely available, comprehensive programming libraries exist. Here we report an open-source and easy-to-use Python library that facilitates the parsing, manipulation, formatting and validation of variants according to the HGVS specification. The current implementation focuses on the subset of the HGVS recommendations that precisely describe sequence-level variation relevant to the application of high-throughput sequencing to clinical diagnostics. AVAILABILITY AND IMPLEMENTATION The package is released under the Apache 2.0 open-source license. Source code, documentation and issue tracking are available at http://bitbucket.org/hgvs/hgvs/. Python packages are available at PyPI (https://pypi.python.org/pypi/hgvs). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Reece K Hart
- Invitae Inc., San Francisco, CA 94107 and 23andMe Inc., Mountain View, CA 94043, USA
| | - Rudolph Rico
- Invitae Inc., San Francisco, CA 94107 and 23andMe Inc., Mountain View, CA 94043, USA
| | - Emily Hare
- Invitae Inc., San Francisco, CA 94107 and 23andMe Inc., Mountain View, CA 94043, USA
| | - John Garcia
- Invitae Inc., San Francisco, CA 94107 and 23andMe Inc., Mountain View, CA 94043, USA
| | - Jody Westbrook
- Invitae Inc., San Francisco, CA 94107 and 23andMe Inc., Mountain View, CA 94043, USA
| | - Vincent A Fusaro
- Invitae Inc., San Francisco, CA 94107 and 23andMe Inc., Mountain View, CA 94043, USA
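The parse, validate and format cycle named in the Hart et al. abstract can be sketched in a few lines of standard-library Python. This is deliberately not the `hgvs` package or its API: the `Substitution` class, the `parse_substitution` function, and the substitution-only pattern are assumptions for illustration, and the accession string is used purely as sample input.

```python
import re
from dataclasses import dataclass

# Assumed toy grammar covering simple substitutions only.
_PATTERN = re.compile(
    r"(?P<accession>[A-Z]{2}_\d+\.\d+):(?P<kind>[cgmn])\."
    r"(?P<position>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])"
)


@dataclass(frozen=True)
class Substitution:
    """Structured form of a simple HGVS-style substitution."""
    accession: str
    kind: str       # coordinate type: c, g, m or n
    position: int
    ref: str
    alt: str

    def format(self):
        """Serialize back to the textual description."""
        return f"{self.accession}:{self.kind}.{self.position}{self.ref}>{self.alt}"


def parse_substitution(text):
    """Parse text into a Substitution, raising ValueError when it is invalid."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a simple substitution: {text!r}")
    if match["ref"] == match["alt"]:
        # Minimal validation: a substitution must change the base.
        raise ValueError(f"reference and alternate base are identical: {text!r}")
    return Substitution(match["accession"], match["kind"],
                        int(match["position"]), match["ref"], match["alt"])


if __name__ == "__main__":
    variant = parse_substitution("NM_001637.3:c.1582G>A")
    print(variant)
    print(variant.format())
```

Keeping the parsed variant as a structured object rather than a string is the design point: position and alleles can be manipulated or validated individually, and `format()` guarantees a consistent textual form on output.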
|
9
|
Choquet R, Maaroufi M, de Carrara A, Messiaen C, Luigi E, Landais P. A methodology for a minimum data set for rare diseases to support national centers of excellence for healthcare and research. J Am Med Inform Assoc 2014; 22:76-85. [PMID: 25038198 DOI: 10.1136/amiajnl-2014-002794] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Indexed: 12/17/2022]
Abstract
BACKGROUND Although rare disease patients make up approximately 6-8% of all patients in Europe, it is often difficult to find the necessary expertise for diagnosis and care and the patient numbers needed for rare disease research. The second French National Plan for Rare Diseases highlighted the necessity for better care coordination and epidemiology for rare diseases. A clinical data standard for normalization and exchange of rare disease patient data was proposed. The original methodology used to build the French national minimum data set (F-MDS-RD) common to the 131 expert rare disease centers is presented. METHODS To encourage consensus at a national level for homogeneous data collection at the point of care for rare disease patients, we first identified four national expert groups. We reviewed the scientific literature for rare disease common data elements (CDEs) in order to build the first version of the F-MDS-RD. The French rare disease expert centers validated the data elements (DEs). The resulting F-MDS-RD was reviewed and approved by the National Plan Strategic Committee. It was then represented in an HL7 electronic format to maximize interoperability with electronic health records. RESULTS The F-MDS-RD is composed of 58 DEs in six categories: patient, family history, encounter, condition, medication, and questionnaire. It is HL7 compatible and can use various ontologies for diagnosis or sign encoding. The F-MDS-RD was aligned with other CDE initiatives for rare diseases, thus facilitating potential interconnections between rare disease registries. CONCLUSIONS The French F-MDS-RD was defined through national consensus. It can foster better care coordination and facilitate determining rare disease patients' eligibility for research studies, trials, or cohorts. Since other countries will need to develop their own standards for rare disease data collection, they might benefit from the methods presented here.
Affiliation(s)
- Rémy Choquet
- BNDMR, Assistance Publique Hôpitaux de Paris, Hôpital Necker Enfants Malades, Paris, France; INSERM, U1142, LIMICS, Paris, France
| | - Meriem Maaroufi
- BNDMR, Assistance Publique Hôpitaux de Paris, Hôpital Necker Enfants Malades, Paris, France; INSERM, U1142, LIMICS, Paris, France
| | - Albane de Carrara
- BNDMR, Assistance Publique Hôpitaux de Paris, Hôpital Necker Enfants Malades, Paris, France
| | - Claude Messiaen
- BNDMR, Assistance Publique Hôpitaux de Paris, Hôpital Necker Enfants Malades, Paris, France
| | - Emmanuel Luigi
- Direction Générale de l'Offre de Soins, Ministère de la Santé et de la Solidarité, Paris, France
| | - Paul Landais
- BNDMR, Assistance Publique Hôpitaux de Paris, Hôpital Necker Enfants Malades, Paris, France; Faculty of Medicine, EA2415, Clinical Research University Institute, Montpellier 1 University and BESPIM, Nîmes University Hospital, France
|
10
|
Wasmann RA, Wassink-Ruiter JSK, Sundin OH, Morales E, Verheij JBGM, Pott JWR. Novel membrane frizzled-related protein gene mutation as cause of posterior microphthalmia resulting in high hyperopia with macular folds. Acta Ophthalmol 2014; 92:276-81. [PMID: 23742260 DOI: 10.1111/aos.12105] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 01/31/2023]
Abstract
PURPOSE We present a genetic and clinical analysis of two sisters, 3 and 4 years of age, with nanophthalmos and macular folds. METHODS Ophthalmological examination, general paediatric examination and molecular genetic analysis of the MFRP gene were performed in both affected siblings. RESULTS Clinical analysis showed high hyperopia (+11 D and +12 D), short axial lengths (15 mm) and the presence of macular folds and optic nerve head drusen. Autofluorescence of the retina was generally normal with subtle macular abnormalities. Sequence analysis showed compound heterozygosity for severe MFRP mutations in both sisters: a previously reported p.Asn167fs (c.498dupC) and a novel stop codon mutation p.Gln91X (c.271C>T). CONCLUSION These are the youngest nanophthalmos patients in the literature identified with severe loss of MFRP function, showing already the known structural abnormalities for this disease. Adult patients affected by homozygous or compound heterozygous MFRP mutations generally show signs of retinal dystrophy, with ERG disturbances and RPE abnormalities on autofluorescence imaging. ERG examination could not be performed in these children, but extensive RPE abnormalities were not seen at this young age.
Affiliation(s)
- Rosemarie A Wasmann
- Department of Ophthalmology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands; Department of Clinical Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands; Department of Biomedical Sciences Center of Excellence for Neuroscience, Foster School of Medicine, Texas Tech Health Sciences Center, El Paso, Texas, USA
|
11
|
Jimeno Yepes A, Verspoor K. Mutation extraction tools can be combined for robust recognition of genetic variants in the literature. F1000Res 2014; 3:18. [PMID: 25285203 PMCID: PMC4176422 DOI: 10.12688/f1000research.3-18.v2] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Accepted: 05/27/2014] [Indexed: 11/20/2022]
Abstract
As the cost of genomic sequencing continues to fall, the amount of data being collected and studied for the purpose of understanding the genetic basis of disease is increasing dramatically. Much of the source information relevant to such efforts is available only from unstructured sources such as the scientific literature, and significant resources are expended in manually curating and structuring the information in the literature. As such, there have been a number of systems developed to target automatic extraction of mutations and other genetic variation from the literature using text mining tools. We have performed a broad survey of the existing publicly available tools for extraction of genetic variants from the scientific literature. We consider not just one tool but a number of different tools, individually and in combination, and apply the tools in two scenarios. First, they are compared in an intrinsic evaluation context, where the tools are tested for their ability to identify specific mentions of genetic variants in a corpus of manually annotated papers, the Variome corpus. Second, they are compared in an extrinsic evaluation context based on our previous study of text mining support for curation of the COSMIC and InSiGHT databases. Our results demonstrate that no single tool covers the full range of genetic variants mentioned in the literature. Rather, several tools have complementary coverage and can be used together effectively. In the intrinsic evaluation on the Variome corpus, the combined performance is above 0.95 in F-measure, while in the extrinsic evaluation the combined recall performance is above 0.71 for COSMIC and above 0.62 for InSiGHT, a substantial improvement over the performance of any individual tool. Based on the analysis of these results, we suggest several directions for the improvement of text mining tools for genetic variant extraction from the literature.
Affiliation(s)
- Antonio Jimeno Yepes
- National ICT Australia, Victoria Research Laboratory, Melbourne, Australia; Department of Computing and Information Systems, The University of Melbourne, Melbourne, Australia
| | - Karin Verspoor
- National ICT Australia, Victoria Research Laboratory, Melbourne, Australia; Department of Computing and Information Systems, The University of Melbourne, Melbourne, Australia
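The finding that tools with complementary coverage work better together can be illustrated by taking the union of two tools' outputs and scoring each against a gold standard with precision, recall and F-measure. The sets below are invented toy data, not results from the study, and the `score` helper and the simple union strategy are assumptions; the paper's actual combination procedure may differ.

```python
def score(predicted, gold):
    """Return (precision, recall, f_measure) for a predicted set against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Invented toy data: variant mentions annotated by curators.
GOLD = {"p.E393Q", "p.G534E", "c.271C>T", "V600E"}

# Hypothetical tool outputs with complementary coverage.
TOOL_A = {"p.E393Q", "p.G534E"}            # only finds prefixed protein changes
TOOL_B = {"c.271C>T", "V600E", "A12C"}     # finds the rest, plus one false positive

COMBINED = TOOL_A | TOOL_B                 # simple union of both outputs

if __name__ == "__main__":
    for name, output in (("tool A", TOOL_A), ("tool B", TOOL_B), ("combined", COMBINED)):
        precision, recall, f_measure = score(output, GOLD)
        print(f"{name:9s} P={precision:.2f} R={recall:.2f} F={f_measure:.2f}")
```

A union raises recall at the possible cost of precision, because every tool's false positives are inherited; whether the F-measure improves therefore depends on how little the tools' errors overlap compared with their correct hits.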
|
12
|
Byrne M, Fokkema IF, Lancaster O, Adamusiak T, Ahonen-Bishopp A, Atlan D, Béroud C, Cornell M, Dalgleish R, Devereau A, Patrinos GP, Swertz MA, Taschner PE, Thorisson GA, Vihinen M, Brookes AJ, Muilu J. VarioML framework for comprehensive variation data representation and exchange. BMC Bioinformatics 2012; 13:254. [PMID: 23031277 PMCID: PMC3507772 DOI: 10.1186/1471-2105-13-254] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 05/14/2012] [Accepted: 09/23/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND Sharing of data about variation and the associated phenotypes is a critical need, yet variant information can be arbitrarily complex, making a single standard vocabulary elusive and re-formatting difficult. Complex standards have proven too time-consuming to implement. RESULTS The GEN2PHEN project addressed these difficulties by developing a comprehensive data model for capturing biomedical observations, Observ-OM, and building the VarioML format around it. VarioML pairs a simplified open specification for describing variants, with a toolkit for adapting the specification into one's own research workflow. Straightforward variant data can be captured, federated, and exchanged with no overhead; more complex data can be described, without loss of compatibility. The open specification enables push-button submission to gene variant databases (LSDBs) e.g., the Leiden Open Variation Database, using the Cafe Variome data publishing service, while VarioML bidirectionally transforms data between XML and web-application code formats, opening up new possibilities for open source web applications building on shared data. A Java implementation toolkit makes VarioML easily integrated into biomedical applications. VarioML is designed primarily for LSDB data submission and transfer scenarios, but can also be used as a standard variation data format for JSON and XML document databases and user interface components. CONCLUSIONS VarioML is a set of tools and practices improving the availability, quality, and comprehensibility of human variation information. It enables researchers, diagnostic laboratories, and clinics to share that information with ease, clarity, and without ambiguity.
Affiliation(s)
- Myles Byrne
- Institute for Molecular Medicine Finland-FIMM, University of Helsinki, Helsinki, Finland.
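The bidirectional transformation between XML and web-application formats mentioned in the VarioML abstract can be pictured with a generic round trip between an XML tree and a JSON-serializable dictionary. The element and attribute names in the sample (`variant`, `name`, `scheme`, `gene`) are hypothetical and are not taken from the VarioML schema, and the VarioML toolkit itself is written in Java; this Python sketch only shows the general idea.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_dict(element):
    """Convert an XML element (recursively) into a JSON-serializable dict."""
    return {
        "tag": element.tag,
        "attributes": dict(element.attrib),
        "text": (element.text or "").strip(),
        "children": [xml_to_dict(child) for child in element],
    }


def dict_to_xml(node):
    """Rebuild an XML element from the dict produced by xml_to_dict."""
    element = ET.Element(node["tag"], node["attributes"])
    if node["text"]:
        element.text = node["text"]
    for child in node["children"]:
        element.append(dict_to_xml(child))
    return element


# Hypothetical record; the names are illustrative, not VarioML's.
SAMPLE = '<variant gene="MFRP"><name scheme="HGVS">c.271C&gt;T</name></variant>'

if __name__ == "__main__":
    as_dict = xml_to_dict(ET.fromstring(SAMPLE))
    as_json = json.dumps(as_dict, indent=2)      # form suited to web applications
    print(as_json)
    rebuilt = dict_to_xml(json.loads(as_json))   # and back to XML
    print(ET.tostring(rebuilt, encoding="unicode"))
```

Carrying the tag, attributes, text and children explicitly keeps the mapping lossless in both directions for simple records like this one, at the cost of a more verbose JSON form than a hand-designed schema would give.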
|