For: Claustres M, Horaitis O, Vanevski M, Cotton RGH. Time for a unified system of mutation description and reporting: a review of locus-specific mutation databases. Genome Res 2002;12:680-8. [PMID: 11997335 DOI: 10.1101/gr.217702] [Citation(s) in RCA: 104] [Impact Index Per Article: 4.7] [Indexed: 11/24/2022]

Cited by Other Article(s)

Claustres M, Thèze C, des Georges M, Baux D, Girodon E, Bienvenu T, Audrezet MP, Dugueperoux I, Férec C, Lalau G, Pagin A, Kitzis A, Thoreau V, Gaston V, Bieth E, Malinge MC, Reboul MP, Fergelot P, Lemonnier L, Mekki C, Fanen P, Bergougnoux A, Sasorith S, Raynal C, Bareil C. CFTR-France, a national relational patient database for sharing genetic and phenotypic data associated with rare CFTR variants. Hum Mutat 2017;38:1297-1315. [PMID: 28603918 DOI: 10.1002/humu.23276] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Received: 04/04/2017] [Revised: 05/31/2017] [Accepted: 06/04/2017] [Indexed: 11/09/2022]

Affiliation(s)

Mireille Claustres: Laboratoire de Génétique Moléculaire, Centre Hospitalier Universitaire et Université de Montpellier, Montpellier, France
Corinne Thèze: Laboratoire de Génétique Moléculaire, Centre Hospitalier Universitaire et Université de Montpellier, Montpellier, France
Marie des Georges: Laboratoire de Génétique Moléculaire, Centre Hospitalier Universitaire et Université de Montpellier, Montpellier, France
David Baux: Laboratoire de Génétique Moléculaire, Centre Hospitalier Universitaire et Université de Montpellier, Montpellier, France
Emmanuelle Girodon: Service de Génétique et Biologie Moléculaires, Groupe Hospitalier Cochin-Broca-Hotel Dieu, Paris, France
Thierry Bienvenu: Service de Génétique et Biologie Moléculaires, Groupe Hospitalier Cochin-Broca-Hotel Dieu, Paris, France
Marie-Pierre Audrezet: Laboratoire de Génétique Moléculaire et d'Histocompatibilité, Centre Hospitalier Régional Universitaire, Brest, France
Ingrid Dugueperoux: Laboratoire de Génétique Moléculaire et d'Histocompatibilité, Centre Hospitalier Régional Universitaire, Brest, France
Claude Férec: Laboratoire de Génétique Moléculaire et d'Histocompatibilité, Centre Hospitalier Régional Universitaire, Brest, France
Guy Lalau: Centre de Biologie Pathologie Génétique, Centre Hospitalier Régional Universitaire, Lille, France
Adrien Pagin: Centre de Biologie Pathologie Génétique, Centre Hospitalier Régional Universitaire, Lille, France
Alain Kitzis: Département de Génétique, Centre Hospitalier Universitaire, Poitiers, France
Vincent Thoreau: Département de Génétique, Centre Hospitalier Universitaire, Poitiers, France
Véronique Gaston: Service de Génétique Médicale, Centre Hospitalier Universitaire, Toulouse, France
Eric Bieth: Service de Génétique Médicale, Centre Hospitalier Universitaire, Toulouse, France
Marie-Claire Malinge: Département de Biochimie Génétique, Institut de Biologie en Santé, Centre Hospitalier Universitaire, Angers, France
Marie-Pierre Reboul: Laboratoire de Génétique Moléculaire, Centre Hospitalier Régional Universitaire, Bordeaux, France
Patricia Fergelot: Laboratoire Maladies Rares, Génétique et Métabolisme, Bordeaux, France
Lydie Lemonnier: Registre français de la mucoviscidose, Vaincre la Mucoviscidose, Paris, France
Chadia Mekki: Laboratoire de Génétique, Hôpital Henri Mondor, Créteil, France
Pascale Fanen: Laboratoire de Génétique, Hôpital Henri Mondor, Créteil, France
Anne Bergougnoux: Laboratoire de Génétique Moléculaire, Centre Hospitalier Universitaire et Université de Montpellier, Montpellier, France
Souphatta Sasorith: Laboratoire de Génétique Moléculaire, Centre Hospitalier Universitaire et Université de Montpellier, Montpellier, France
Caroline Raynal: Laboratoire de Génétique Moléculaire, Centre Hospitalier Universitaire et Université de Montpellier, Montpellier, France
Corinne Bareil: Laboratoire de Génétique Moléculaire, Centre Hospitalier Universitaire et Université de Montpellier, Montpellier, France

Belhassan K, Ouldim K, Sefiani AA. Genetics and genomic medicine in Morocco: the present hope can make the future bright. Mol Genet Genomic Med 2016;4:588-598. [PMID: 27896281 PMCID: PMC5118203 DOI: 10.1002/mgg3.255] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 01/05/2023] Open

Béroud C, Letovsky SI, Braastad CD, Caputo SM, Beaudoux O, Bignon YJ, Bressac-De Paillerets B, Bronner M, Buell CM, Collod-Béroud G, Coulet F, Derive N, Divincenzo C, Elzinga CD, Garrec C, Houdayer C, Karbassi I, Lizard S, Love A, Muller D, Nagan N, Nery CR, Rai G, Revillion F, Salgado D, Sévenet N, Sinilnikova O, Sobol H, Stoppa-Lyonnet D, Toulas C, Trautman E, Vaur D, Vilquin P, Weymouth KS, Willis A, Eisenberg M, Strom CM. BRCA Share: A Collection of Clinical BRCA Gene Variants. Hum Mutat 2016;37:1318-1328. [DOI: 10.1002/humu.23113] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Received: 07/07/2016] [Accepted: 09/02/2016] [Indexed: 12/12/2022]

Affiliation(s)

Christophe Béroud: Aix Marseille Univ; INSERM, GMGF Marseille France APHM; Hôpital TIMONE Enfants; Laboratoire de Génétique Moléculaire; Marseille France
Stanley I. Letovsky: Laboratory Corporation of America; Westborough Massachusetts
Corey D. Braastad: Quest Diagnostics; Marlborough Massachusetts
Sandrine M. Caputo: Service de Génétique; Department de Biologie des Tumeurs; Institut Curie; Paris France
Olivia Beaudoux: CHU et Institut Jean Godinot; Reims France
Yves Jean Bignon: Centre Jean Perrin; Clermont-Ferrand France
Brigitte Bressac-De Paillerets: Institut Gustave Roussy; Villejuif France
Myriam Bronner: CHU de Nancy-Brabois; Vandoeuvre-lés-Nancy France
Crystal M. Buell: Quest Diagnostics; Marlborough Massachusetts
Gwenaëlle Collod-Béroud: Aix Marseille Univ; INSERM, GMGF Marseille France
Florence Coulet: Groupe hospitalier Pitié-Salpêtrière, Assistance Publique-Hôpitaux de Paris, Laboratoire d'Oncogénétique et Angiogénétique moléculaire; Université Pierre et Marie Curie; Paris France
Nicolas Derive: Service de Génétique; Department de Biologie des Tumeurs; Institut Curie; Paris France
Christina Divincenzo: Quest Diagnostics; Marlborough Massachusetts
Christopher D. Elzinga: Quest Diagnostics; Marlborough Massachusetts
Céline Garrec: CHU; Institut de Biologie; Hôtel Dieu Nantes France
Claude Houdayer: Service de Génétique; Department de Biologie des Tumeurs; Institut Curie; Paris France Université Paris Descartes; Paris France
Izabela Karbassi: Quest Diagnostics; Marlborough Massachusetts
Sarab Lizard: CHU de Dijon; Hôpital d'Enfants; Service de Génétique Médicale Dijon France
Angela Love: Quest Diagnostics; Marlborough Massachusetts
Danièle Muller: Centre Paul Strauss; Strasbourg France
Narasimhan Nagan: Laboratory Corporation of America; Westborough Massachusetts
Camille R. Nery: Quest Diagnostics; San Juan Capistrano California
Ghadi Rai: Aix Marseille Univ; INSERM, GMGF Marseille France
Françoise Revillion: Centre Oscar Lambret; Unité d'Oncologie Moléculaire Humaine; Lille France
David Salgado: Aix Marseille Univ; INSERM, GMGF Marseille France
Nicolas Sévenet: Institut Bergonié; Bordeaux France
Olga Sinilnikova: Hospices Civils de Lyon and Centre Léon Bérard; Lyon France
Hagay Sobol: Institut Paoli-Calmettes; Marseille France
Dominique Stoppa-Lyonnet: Service de Génétique; Department de Biologie des Tumeurs; Institut Curie; Paris France Université Paris Descartes; Paris France
Christine Toulas: Institut Claudius Régaud; Toulouse France
Edwin Trautman: Laboratory Corporation of America; Westborough Massachusetts
Dominique Vaur: Laboratoire de biologie et de génétique du cancer; CLCC François Baclesse; INSERM 1079 Centre Normand de Génomique et de Médecine Personnalisée; Caen France
Paul Vilquin: Laboratoire de Biologie Cellulaire et Hormonale (CHU Arnaud de Villeneuve); Montpellier France
Katelyn S. Weymouth: Laboratory Corporation of America; Westborough Massachusetts
Alecia Willis: Laboratory Corporation of America; Research Triangle Park North Carolina
Marcia Eisenberg: Laboratory Corporation of America; Research Triangle Park North Carolina
Charles M Strom: Quest Diagnostics; San Juan Capistrano California

Verspoor KM, Heo GE, Kang KY, Song M. Establishing a baseline for literature mining human genetic variants and their relationships to disease cohorts. BMC Med Inform Decis Mak 2016;16 Suppl 1:68. [PMID: 27454860 PMCID: PMC4959367 DOI: 10.1186/s12911-016-0294-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

The Variome corpus, a small collection of published articles about inherited colorectal cancer, includes annotations of 11 entity types and 13 relation types related to the curation of the relationship between genetic variation and disease. Due to the richness of these annotations, the corpus provides a good testbed for evaluation of biomedical literature information extraction systems.

METHODS

In this paper, we focus on assessing performance on extracting the relations in the corpus, using gold standard entities as a starting point, to establish a baseline for extraction of relations important for extraction of genetic variant information from the literature. We test the application of the Public Knowledge Discovery Engine for Java (PKDE4J) system, a natural language processing system designed for information extraction of entities and relations in text, on the relation extraction task using this corpus.

RESULTS

For the relations which are attested at least 100 times in the Variome corpus, we realise a performance ranging from 0.78-0.84 Precision-weighted F-score, depending on the relation. We find that the PKDE4J system adapted straightforwardly to the range of relation types represented in the corpus; some extensions to the original methodology were required to adapt to the multi-relational classification context. The results are competitive with state-of-the-art relation extraction performance on more heavily studied corpora, although the analysis shows that the Recall of a co-occurrence baseline outweighs the benefit of improved Precision for many relations, indicating the value of simple semantic constraints on relations.

CONCLUSIONS

This work represents the first attempt to apply relation extraction methods to the Variome corpus. The results demonstrate that automated methods have good potential to structure the information expressed in the published literature related to genetic variants, connecting mutations to genes, diseases, and patient cohorts. Further development of such approaches will facilitate more efficient biocuration of genetic variant information into structured databases, leveraging the knowledge embedded in the vast publication literature.
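The evaluation terms used in this abstract (Precision, Recall, a Precision-weighted F-score, and a co-occurrence baseline) can be illustrated with a minimal sketch. It is not the PKDE4J system or the authors' evaluation code. The entity names, the toy gold set, and the choice of beta = 0.5 as the precision weighting are all assumptions made for illustration; the abstract does not state the weighting used.

```python
from itertools import combinations


def precision_recall_fbeta(predicted, gold, beta=1.0):
    """Return (precision, recall, F-beta) for two sets of relation pairs.

    beta < 1 weights precision more heavily than recall.
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


def cooccurrence_baseline(sentences):
    """Predict a relation for every pair of entities sharing a sentence."""
    predicted = set()
    for entities in sentences:
        for a, b in combinations(sorted(entities), 2):
            predicted.add(frozenset((a, b)))
    return predicted


# Hypothetical toy data: each set holds the entities found in one sentence.
sentences = [
    {"gene_A", "variant_1", "disease_X"},
    {"gene_B", "disease_Y"},
]
# Hypothetical gold-standard relations (unordered pairs).
gold = {
    frozenset(("variant_1", "disease_X")),
    frozenset(("gene_B", "disease_Y")),
}

predicted = cooccurrence_baseline(sentences)
# beta=0.5 is an assumed precision weighting, not a value from the paper.
precision, recall, f_half = precision_recall_fbeta(predicted, gold, beta=0.5)
print(precision, recall, round(f_half, 3))
```

On this toy input the baseline recovers every gold relation but also proposes unrelated pairs, which is the high-Recall, low-Precision trade-off the abstract attributes to co-occurrence.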

Singhal A, Simmons M, Lu Z. Text mining for precision medicine: automating disease-mutation relationship extraction from biomedical literature. J Am Med Inform Assoc 2016;23:766-72. [PMID: 27121612 DOI: 10.1093/jamia/ocw041] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 09/15/2015] [Accepted: 02/19/2016] [Indexed: 11/14/2022] Open

Dalgleish R. LSDBs and How They Have Evolved. Hum Mutat 2016;37:532-9. [DOI: 10.1002/humu.22979] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/23/2015] [Accepted: 02/18/2016] [Indexed: 01/10/2023]

Savige J, Dalgleish R, Cotton RG, den Dunnen JT, Macrae F, Povey S. The Human Variome Project: ensuring the quality of DNA variant databases in inherited renal disease. Pediatr Nephrol 2015;30:1893-901. [PMID: 25384529 DOI: 10.1007/s00467-014-2994-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/17/2014] [Revised: 10/09/2014] [Accepted: 10/15/2014] [Indexed: 02/02/2023]

Aziz N, Zhao Q, Bry L, Driscoll DK, Funke B, Gibson JS, Grody WW, Hegde MR, Hoeltge GA, Leonard DGB, Merker JD, Nagarajan R, Palicki LA, Robetorye RS, Schrijver I, Weck KE, Voelkerding KV. College of American Pathologists' Laboratory Standards for Next-Generation Sequencing Clinical Tests. Arch Pathol Lab Med 2015;139:481-93. [DOI: 10.5858/arpa.2014-0250-cp] [Citation(s) in RCA: 265] [Impact Index Per Article: 29.4] [Indexed: 11/06/2022]

New functional and structural insights from updated mutational databases for complement factor H, Factor I, membrane cofactor protein and C3. Biosci Rep 2014;34:BSR20140117. [PMID: 25188723 PMCID: PMC4206863 DOI: 10.1042/bsr20140117] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Indexed: 02/03/2023] Open

Kountouris P, Lederer CW, Fanis P, Feleki X, Old J, Kleanthous M. IthaGenes: an interactive database for haemoglobin variations and epidemiology. PLoS One 2014;9:e103020. [PMID: 25058394 PMCID: PMC4109966 DOI: 10.1371/journal.pone.0103020] [Citation(s) in RCA: 168] [Impact Index Per Article: 16.8] [Received: 12/03/2013] [Accepted: 06/27/2014] [Indexed: 02/07/2023] Open

Savige J, Dagher H, Povey S. Mutation databases for inherited renal disease: are they complete, accurate, clinically relevant, and freely available? Hum Mutat 2014;35:791-3. [PMID: 24826923 DOI: 10.1002/humu.22588] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 06/20/2013] [Accepted: 04/09/2014] [Indexed: 12/22/2022]

Soussi T, Leroy B, Taschner PEM. Recommendations for analyzing and reporting TP53 gene variants in the high-throughput sequencing era. Hum Mutat 2014;35:766-78. [PMID: 24729566 DOI: 10.1002/humu.22561] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 12/10/2013] [Accepted: 04/02/2014] [Indexed: 12/27/2022]

Horaitis O, Cotton RG. Human Mutation Databases. Curr Protoc Hum Genet 2014;Chapter 7:Unit 7.11. [DOI: 10.1002/0471142905.hg0711s44] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/08/2022]

Soussi T. Locus-Specific Databases in Cancer: What Future in a Post-Genomic Era? The TP53 LSDB paradigm. Hum Mutat 2014;35:643-53. [DOI: 10.1002/humu.22518] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 11/13/2013] [Accepted: 01/16/2014] [Indexed: 11/08/2022]

Jimeno Yepes A, Verspoor K. Literature mining of genetic variants for curation: quantifying the importance of supplementary material. Database (Oxford) 2014;2014:bau003. [PMID: 24520105 PMCID: PMC3920087 DOI: 10.1093/database/bau003] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 11/14/2022]

Abstract

A major focus of modern biological research is the understanding of how genomic variation relates to disease. Although there are significant ongoing efforts to capture this understanding in curated resources, much of the information remains locked in unstructured sources, in particular, the scientific literature. Thus, there have been several text mining systems developed to target extraction of mutations and other genetic variation from the literature. We have performed the first study of the use of text mining for the recovery of genetic variants curated directly from the literature. We consider two curated databases, COSMIC (Catalogue Of Somatic Mutations In Cancer) and InSiGHT (International Society for Gastro-intestinal Hereditary Tumours), that contain explicit links to the source literature for each included mutation. Our analysis shows that the recall of the mutations catalogued in the databases using a text mining tool is very low, despite the well-established good performance of the tool and even when the full text of the associated article is available for processing. We demonstrate that this discrepancy can be explained by considering the supplementary material linked to the published articles, not previously considered by text mining tools. Although it is anecdotally known that supplementary material contains 'all of the information', and some researchers have speculated about the role of supplementary material (Schenck et al. Extraction of genetic mutations associated with cancer from public literature. J Health Med Inform 2012;S2:2.), our analysis substantiates the significant extent to which this material is critical. Our results highlight the need for literature mining tools to consider not only the narrative content of a publication but also the full set of material related to a publication.

Jimeno Yepes A, Verspoor K. Mutation extraction tools can be combined for robust recognition of genetic variants in the literature. F1000Res 2014;3:18. [PMID: 25285203 PMCID: PMC4176422 DOI: 10.12688/f1000research.3-18.v2] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Accepted: 05/27/2014] [Indexed: 11/20/2022] Open

Abstract

As the cost of genomic sequencing continues to fall, the amount of data being collected and studied for the purpose of understanding the genetic basis of disease is increasing dramatically. Much of the source information relevant to such efforts is available only from unstructured sources such as the scientific literature, and significant resources are expended in manually curating and structuring the information in the literature. As such, there have been a number of systems developed to target automatic extraction of mutations and other genetic variation from the literature using text mining tools. We have performed a broad survey of the existing publicly available tools for extraction of genetic variants from the scientific literature. We consider not just one tool but a number of different tools, individually and in combination, and apply the tools in two scenarios. First, they are compared in an intrinsic evaluation context, where the tools are tested for their ability to identify specific mentions of genetic variants in a corpus of manually annotated papers, the Variome corpus. Second, they are compared in an extrinsic evaluation context based on our previous study of text mining support for curation of the COSMIC and InSiGHT databases. Our results demonstrate that no single tool covers the full range of genetic variants mentioned in the literature. Rather, several tools have complementary coverage and can be used together effectively. In the intrinsic evaluation on the Variome corpus, the combined performance is above 0.95 in F-measure, while in the extrinsic evaluation the combined recall performance is above 0.71 for COSMIC and above 0.62 for InSiGHT, a substantial improvement over the performance of any individual tool. Based on the analysis of these results, we suggest several directions for the improvement of text mining tools for genetic variant extraction from the literature.
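The idea that tools with complementary coverage can be used together can be sketched as a simple union of their outputs, scored by recall against a curated set. This is an illustrative sketch only: the tool names, variant identifiers, and the union strategy are hypothetical and are not taken from the study, which may combine tools differently.

```python
def recall(found, gold):
    """Fraction of gold-standard variants that were found."""
    return len(found & gold) / len(gold) if gold else 0.0


# Hypothetical curated variants for one article.
gold_variants = {"v1", "v2", "v3", "v4"}

# Hypothetical outputs of two extraction tools on the same article.
tool_outputs = {
    "tool_a": {"v1", "v2"},
    "tool_b": {"v2", "v3", "v9"},  # "v9" is a false positive
}

# Recall of each tool on its own.
per_tool_recall = {
    name: recall(found, gold_variants) for name, found in tool_outputs.items()
}

# Combine tools by taking the union of everything they report.
combined = set().union(*tool_outputs.values())
combined_recall = recall(combined, gold_variants)

print(per_tool_recall, combined_recall)
```

A union can only raise recall, but it also accumulates each tool's false positives, so precision should be checked separately when combining tools this way.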

Plazzer JP, Macrae F. DNA variant databases: current state and future directions. Methods Mol Biol 2014;1168:263-73. [PMID: 24870141 DOI: 10.1007/978-1-4939-0847-9_15] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/06/2023]

Peterson TA, Doughty E, Kann MG. Towards precision medicine: advances in computational approaches for the analysis of human variants. J Mol Biol 2013;425:4047-63. [PMID: 23962656 PMCID: PMC3807015 DOI: 10.1016/j.jmb.2013.08.008] [Citation(s) in RCA: 106] [Impact Index Per Article: 9.6] [Received: 06/04/2013] [Revised: 08/07/2013] [Accepted: 08/08/2013] [Indexed: 12/26/2022]

The Moroccan Genetic Disease Database (MGDD): a database for DNA variations related to inherited disorders and disease susceptibility. Eur J Hum Genet 2013;22:322-6. [PMID: 23860041 DOI: 10.1038/ejhg.2013.151] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 02/15/2013] [Revised: 05/28/2013] [Accepted: 06/11/2013] [Indexed: 11/09/2022] Open

Rallapalli PM, Kemball-Cook G, Tuddenham EG, Gomez K, Perkins SJ. An interactive mutation database for human coagulation factor IX provides novel insights into the phenotypes and genetics of hemophilia B. J Thromb Haemost 2013;11:1329-40. [PMID: 23617593 DOI: 10.1111/jth.12276] [Citation(s) in RCA: 112] [Impact Index Per Article: 10.2] [Received: 10/29/2012] [Accepted: 04/18/2013] [Indexed: 11/27/2022]

Al-Numair NS, Martin ACR. The SAAP pipeline and database: tools to analyze the impact and predict the pathogenicity of mutations. BMC Genomics 2013;14 Suppl 3:S4. [PMID: 23819919 PMCID: PMC3665582 DOI: 10.1186/1471-2164-14-s3-s4] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Indexed: 11/10/2022] Open

Verspoor K, Jimeno Yepes A, Cavedon L, McIntosh T, Herten-Crabb A, Thomas Z, Plazzer JP. Annotating the biomedical literature for the human variome. Database (Oxford) 2013;2013:bat019. [PMID: 23584833 PMCID: PMC3676157 DOI: 10.1093/database/bat019] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Indexed: 11/16/2022]

Georgitsi M, Patrinos GP. Genetic databases in pharmacogenomics: the Frequency of Inherited Disorders Database (FINDbase). Methods Mol Biol 2013;1015:321-336. [PMID: 23824866 DOI: 10.1007/978-1-62703-435-7_21] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/02/2023]

Abel O, Powell JF, Andersen PM, Al-Chalabi A. ALSoD: A user-friendly online bioinformatics tool for amyotrophic lateral sclerosis genetics. Hum Mutat 2012;33:1345-51. [PMID: 22753137 DOI: 10.1002/humu.22157] [Citation(s) in RCA: 216] [Impact Index Per Article: 18.0] [Received: 01/31/2012] [Accepted: 06/19/2012] [Indexed: 12/11/2022]

Humbertclaude V, Hamroun D, Bezzou K, Bérard C, Boespflug-Tanguy O, Bommelaer C, Campana-Salort E, Cances C, Chabrol B, Commare MC, Cuisset JM, de Lattre C, Desnuelle C, Echenne B, Halbert C, Jonquet O, Labarre-Vila A, N'Guyen-Morel MA, Pages M, Pepin JL, Petitjean T, Pouget J, Ollagnon-Roman E, Richelme C, Rivier F, Sacconi S, Tiffreau V, Vuillerot C, Picot MC, Claustres M, Béroud C, Tuffery-Giraud S. Motor and respiratory heterogeneity in Duchenne patients: implication for clinical trials. Eur J Paediatr Neurol 2012;16:149-60. [PMID: 21920787 DOI: 10.1016/j.ejpn.2011.07.001] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.2] [Received: 02/09/2011] [Revised: 07/13/2011] [Accepted: 07/17/2011] [Indexed: 01/06/2023]

Li Z, Liu X, Wen J, Xu Y, Zhao X, Li X, Liu L, Zhang X. DRUMS: a human disease related unique gene mutation search engine. Hum Mutat 2012;32:E2259-65. [PMID: 21913285 DOI: 10.1002/humu.21556] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/14/2023]

Vihinen M, den Dunnen JT, Dalgleish R, Cotton RGH. Guidelines for establishing locus specific databases. Hum Mutat 2011;33:298-305. [PMID: 22052659 DOI: 10.1002/humu.21646] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 07/25/2011] [Accepted: 10/25/2011] [Indexed: 11/06/2022]

Patnaik SK, Helmberg W, Blumenfeld OO. BGMUT: NCBI dbRBC database of allelic variations of genes encoding antigens of blood group systems. Nucleic Acids Res 2011;40:D1023-9. [PMID: 22084196 PMCID: PMC3245102 DOI: 10.1093/nar/gkr958] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.8] [Indexed: 02/07/2023] Open

Celli J, Dalgleish R, Vihinen M, Taschner PEM, den Dunnen JT. Curating gene variant databases (LSDBs): toward a universal standard. Hum Mutat 2011;33:291-7. [PMID: 21990126 DOI: 10.1002/humu.21626] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 06/22/2011] [Accepted: 09/21/2011] [Indexed: 01/27/2023]

Zatkova A, Sedlackova T, Radvansky J, Polakova H, Nemethova M, Aquaron R, Dursun I, Usher JL, Kadasi L. Identification of 11 Novel Homogentisate 1,2 Dioxygenase Variants in Alkaptonuria Patients and Establishment of a Novel LOVD-Based HGD Mutation Database. JIMD Rep 2011;4:55-65. [PMID: 23430897 DOI: 10.1007/8904_2011_68] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 03/22/2011] [Revised: 06/01/2011] [Accepted: 06/07/2011] [Indexed: 12/05/2022] Open

Lill CM, Abel O, Bertram L, Al-Chalabi A. Keeping up with genetic discoveries in amyotrophic lateral sclerosis: the ALSoD and ALSGene databases. Amyotroph Lateral Scler 2011;12:238-49. [PMID: 21702733 DOI: 10.3109/17482968.2011.584629] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.1] [Indexed: 12/13/2022]

TP53 mutations in human cancer: database reassessment and prospects for the next decade. Adv Cancer Res 2011;110:107-39. [PMID: 21704230 DOI: 10.1016/b978-0-12-386469-7.00005-0] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.5] [Indexed: 01/08/2023]

Cotton RGH. Rare disease registries and mutation/variation databases. Hum Mutat 2011;32:1073-4. [DOI: 10.1002/humu.21596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]

Webb EA, Smith TD, Cotton RGH. Difficulties in finding DNA mutations and associated phenotypic data in web resources using simple, uncomplicated search terms, and a suggested solution. Hum Genomics 2011;5:141-55. [PMID: 21504866 PMCID: PMC3500169 DOI: 10.1186/1479-7364-5-3-141] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/29/2022] Open

van Baal S, Zlotogora J, Lagoumintzis G, Gkantouna V, Tzimas I, Poulas K, Tsakalidis A, Romeo G, Patrinos GP. ETHNOS: A versatile electronic tool for the development and curation of national genetic databases. Hum Genomics 2011;4:361-8. [PMID: 20650823 PMCID: PMC3500166 DOI: 10.1186/1479-7364-4-5-361] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 05/26/2023] Open

Fokkema IFAC, Taschner PEM, Schaafsma GCP, Celli J, Laros JFJ, den Dunnen JT. LOVD v.2.0: the next generation in gene variant databases. Hum Mutat 2011;32:557-63. [PMID: 21520333 DOI: 10.1002/humu.21438] [Citation(s) in RCA: 733] [Impact Index Per Article: 56.4] [Received: 10/12/2010] [Accepted: 12/14/2010] [Indexed: 01/14/2023]

Mitropoulou C, Webb AJ, Mitropoulos K, Brookes AJ, Patrinos GP. Locus-specific database domain and data content analysis: evolution and content maturation toward clinical use. Hum Mutat 2011;31:1109-16. [PMID: 20672379 DOI: 10.1002/humu.21332] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Indexed: 11/08/2022]

Doughty E, Kertesz-Farkas A, Bodenreider O, Thompson G, Adadey A, Peterson T, Kann MG. Toward an automatic method for extracting cancer- and other disease-related point mutations from the biomedical literature. Bioinformatics 2010;27:408-15. [PMID: 21138947 DOI: 10.1093/bioinformatics/btq667] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.5] [Indexed: 11/12/2022]

Abstract

MOTIVATION

A major goal of biomedical research in personalized medicine is to find relationships between mutations and their corresponding disease phenotypes. However, most of the disease-related mutational data are currently buried in the biomedical literature in textual form and lack the necessary structure to allow easy retrieval and visualization. We introduce a high-throughput computational method for the identification of relevant disease mutations in PubMed abstracts applied to prostate (PCa) and breast cancer (BCa) mutations.

RESULTS

We developed the extractor of mutations (EMU) tool to identify mutations and their associated genes. We benchmarked EMU against MutationFinder, a tool that extracts point mutations from text. Our results show that both methods achieve comparable performance on two manually curated datasets. We also benchmarked EMU's performance for extracting the complete mutational information and phenotype. Remarkably, we show that one of the steps in our approach, a filter based on sequence analysis, increases the precision for that task from 0.34 to 0.59 (PCa) and from 0.39 to 0.61 (BCa). We also show that this high-throughput approach can be extended to other diseases.

DISCUSSION

Our method improves the current status of disease-mutation databases by significantly increasing the number of annotated mutations. We found 51 and 128 mutations manually verified to be related to PCa and BCa, respectively, that are not currently annotated for these cancer types in the OMIM or Swiss-Prot databases. EMU's retrieval performance represents a 2-fold improvement in the number of annotated mutations for PCa and BCa. We further show that our method can benefit from full-text analysis once there is an increase in Open Access availability of full-text articles.

AVAILABILITY

Freely available at: http://bioinf.umbc.edu/EMU/ftp.

Corrales I, Ramírez L, Ayats J, Altisent C, Parra R, Vidal F. Integration of molecular and clinical data of 40 unrelated von Willebrand Disease families in a Spanish locus-specific mutation database: first release including 58 mutations. Haematologica 2010;95:1982-4. [PMID: 20801902 DOI: 10.3324/haematol.2010.028977] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Küntzer J, Eggle D, Klostermann S, Burtscher H. Human variation databases. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2010;2010:baq015. [PMID: 20639550 PMCID: PMC2911800 DOI: 10.1093/database/baq015] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]

Bareil C, Thèze C, Béroud C, Hamroun D, Guittard C, René C, Paulet D, Georges MD, Claustres M. UMD-CFTR: A database dedicated to CF and CFTR-related disorders. Hum Mutat 2010;31:1011-9. [DOI: 10.1002/humu.21316] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]

Cotton RGH, Macrae FA. Reducing the burden of inherited disease: the Human Variome Project. Med J Aust 2010;192:628-9. [DOI: 10.5694/j.1326-5377.2010.tb03658.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]

Robinson PN, Mundlos S. The human phenotype ontology. Clin Genet 2010;77:525-34. [PMID: 20412080 DOI: 10.1111/j.1399-0004.2010.01436.x] [Citation(s) in RCA: 197] [Impact Index Per Article: 14.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]

[Genetic mutation databases: stakes and perspectives for orphan genetic diseases]. PATHOLOGIE-BIOLOGIE 2009;58:387-95. [PMID: 19954899 DOI: 10.1016/j.patbio.2009.09.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/07/2009] [Accepted: 09/14/2009] [Indexed: 12/30/2022]

Wei MH, Blake PW, Shevchenko J, Toro JR. The folliculin mutation database: an online database of mutations associated with Birt-Hogg-Dubé syndrome. Hum Mutat 2009;30:E880-90. [PMID: 19562744 DOI: 10.1002/humu.21075] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Izarzugaza JMG, Baresic A, McMillan LEM, Yeats C, Clegg AB, Orengo CA, Martin ACR, Valencia A. An integrated approach to the interpretation of single amino acid polymorphisms within the framework of CATH and Gene3D. BMC Bioinformatics 2009;10 Suppl 8:S5. [PMID: 19758469 PMCID: PMC2745587 DOI: 10.1186/1471-2105-10-s8-s5] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023]

Abstract

BACKGROUND

The phenotypic effects of sequence variations in protein-coding regions come about primarily via their effects on the resulting structures, for example by disrupting active sites or affecting structural stability. To better understand the mechanisms behind known mutant phenotypes, and to predict the effects of novel variations, biologists need tools to gauge the impacts of DNA mutations in terms of their structural manifestation. Although many mutations occur within domains whose structure has been solved, many more occur within genes whose protein products have not been structurally characterized.

RESULTS

Here we present 3DSim (3D Structural Implication of Mutations), a database and web application facilitating the localization and visualization of single amino acid polymorphisms (SAAPs) mapped to protein structures even where the structure of the protein of interest is unknown. The server displays information on 6514 point mutations, 4865 of them known to be associated with disease. These polymorphisms are drawn from SAAPdb, which aggregates data from various sources including dbSNP and several pathogenic mutation databases. While the SAAPdb interface displays mutations on known structures, 3DSim projects mutations onto known sequence domains in Gene3D. This resource contains sequences annotated with domains predicted to belong to structural families in the CATH database. Mappings between domain sequences in Gene3D and known structures in CATH are obtained using a MUSCLE alignment. 1210 three-dimensional structures corresponding to CATH structural domains are currently included in 3DSim; these domains are distributed across 396 CATH superfamilies, and provide a comprehensive overview of the distribution of mutations in structural space.

CONCLUSION

The server is publicly available at http://3DSim.bioinfo.cnio.es/. In addition, the database containing the mapping between SAAPdb, Gene3D and CATH is available on request and most of the functionality is available through programmatic web service access.

Zaimidou S, van Baal S, Smith TD, Mitropoulos K, Ljujic M, Radojkovic D, Cotton RG, Patrinos GP. A1ATVar: a relational database of human SERPINA1 gene variants leading to alpha1-antitrypsin deficiency and application of the VariVis software. Hum Mutat 2009;30:308-13. [PMID: 19021233 DOI: 10.1002/humu.20857] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

Hurst JM, McMillan LE, Porter CT, Allen J, Fakorede A, Martin AC. The SAAPdb web resource: A large-scale structural analysis of mutant proteins. Hum Mutat 2009;30:616-24. [DOI: 10.1002/humu.20898] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023]

Reeves GA, Talavera D, Thornton JM. Genome and proteome annotation: organization, interpretation and integration. J R Soc Interface 2009;6:129-47. [PMID: 19019817 DOI: 10.1098/rsif.2008.0341] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

Greenblatt MS, Brody LC, Foulkes WD, Genuardi M, Hofstra RMW, Olivier M, Plon SE, Sijmons RH, Sinilnikova O, Spurdle AB. Locus-specific databases and recommendations to strengthen their contribution to the classification of variants in cancer susceptibility genes. Hum Mutat 2008;29:1273-81. [PMID: 18951438 PMCID: PMC3446852 DOI: 10.1002/humu.20889] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]