1
Barnes MR. SNP and mutation data on the web - hidden treasures for uncovering. Comp Funct Genomics 2010; 3:67-74. [PMID: 18628874 PMCID: PMC2447234 DOI: 10.1002/cfg.131] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 11/01/2001] [Accepted: 11/21/2001] [Indexed: 11/10/2022] Open
Abstract
SNP data has grown exponentially over the last two years, and SNP database evolution has matched this growth: the initial development of several independent SNP databases has given way to one central SNP database, dbSNP. Other SNP databases have instead evolved to complement this central database by providing gene-specific focus and an increased level of curation and analysis on subsets of data derived from the central data set. By contrast, human mutation data, which has been collected over many years, is still stored in disparate sources, although moves are afoot to establish a similar central database. These developments are timely: human mutation and polymorphism data hold complementary keys to a better understanding of how genes function and malfunction in disease. The impending availability of a complete human genome presents us with an ideal framework to integrate both these forms of data; as our understanding of the mechanisms of disease increases, the full genomic context of variation may become increasingly significant.
Affiliation(s)
- Michael R Barnes
- Genetic Bioinformatics, GlaxoSmithKline Pharmaceuticals, New Frontiers Science Park (North), Third Avenue, Harlow, Essex CM19 5AW, UK.
2
Jiménez-Lozano N, Segura J, Macías JR, Vega J, Carazo JM. aGEM: an integrative system for analyzing spatial-temporal gene-expression information. Bioinformatics 2009; 25:2566-72. [PMID: 19592395 PMCID: PMC2752607 DOI: 10.1093/bioinformatics/btp422] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/14/2022]
Abstract
Motivation: The work presented here describes the ‘anatomical Gene-Expression Mapping (aGEM)’ Platform, a development conceived to integrate phenotypic information with the spatial and temporal distributions of genes expressed in the mouse. The aGEM Platform has been built by extending the Distributed Annotation System (DAS) protocol, which was originally designed to share genome annotations over the WWW. DAS is a client-server system in which a single client integrates information from multiple distributed servers. Results: The aGEM Platform provides information to answer three main questions. (i) Which genes are expressed in a given mouse anatomical component? (ii) In which mouse anatomical structures are a given gene or set of genes expressed? And (iii) is there any correlation among these findings? Currently, this Platform includes several well-known mouse resources (EMAGE, GXD and GENSAT), hosting gene-expression data mostly obtained from in situ techniques together with a broad set of image-derived annotations. Availability: The Platform is optimized for Firefox 3.0 and it is accessed through a friendly and intuitive display: http://agem.cnb.csic.es Contact: natalia@cnb.csic.es Supplementary information: Supplementary data are available at http://bioweb.cnb.csic.es/VisualOmics/aGEM/home.html and http://bioweb.cnb.csic.es/VisualOmics/index_VO.html and Bioinformatics online.
Affiliation(s)
- Natalia Jiménez-Lozano
- GN7 of the National Institute for Bioinformatics and Biocomputing Unit of the National Centre for Biotechnology, Darwin 3, Campus de Cantoblanco, 28049 Madrid, Spain.
3
Stenson PD, Mort M, Ball EV, Howells K, Phillips AD, Thomas NST, Cooper DN. The Human Gene Mutation Database: 2008 update. Genome Med 2009; 1:13. [PMID: 19348700 PMCID: PMC2651586 DOI: 10.1186/gm13] [Citation(s) in RCA: 633] [Impact Index Per Article: 42.2] [Indexed: 12/11/2022] Open
Abstract
The Human Gene Mutation Database (HGMD®) is a comprehensive core collection of germline mutations in nuclear genes that underlie or are associated with human inherited disease. Here, we summarize the history of the database and its current resources. By December 2008, the database contained over 85,000 different lesions detected in 3,253 different genes, with new entries currently accumulating at a rate exceeding 9,000 per annum. Although originally established for the scientific study of mutational mechanisms in human genes, HGMD has since acquired a much broader utility for researchers, physicians, clinicians and genetic counselors as well as for companies specializing in biopharmaceuticals, bioinformatics and personalized genomics. HGMD was first made publicly available in April 1996, and a collaboration was initiated in 2006 between HGMD and BIOBASE GmbH. This cooperative agreement covers the exclusive worldwide marketing of the most up-to-date (subscription) version of HGMD, HGMD Professional, to academic, clinical and commercial users.
Affiliation(s)
- Peter D Stenson
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Matthew Mort
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Edward V Ball
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Katy Howells
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Andrew D Phillips
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Nick ST Thomas
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- David N Cooper
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
4
Cooper DN, Stenson PD, Chuzhanova NA. The Human Gene Mutation Database (HGMD) and its exploitation in the study of mutational mechanisms. Curr Protoc Bioinformatics 2008; Chapter 1:Unit 1.13. [PMID: 18428754 DOI: 10.1002/0471250953.bi0113s12] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.8] [Indexed: 01/27/2023]
Abstract
The Human Gene Mutation Database (HGMD) constitutes a comprehensive core collection of data on germ-line mutations in nuclear genes underlying or associated with human inherited disease (http://www.hgmd.org). Data cataloged include single base-pair substitutions in coding, regulatory, and splicing-relevant regions, microdeletions and microinsertions, indels, and triplet repeat expansions, as well as gross gene deletions, insertions, duplications, and complex rearrangements. Each mutation is entered into HGMD only once, in order to avoid confusion between recurrent and identical-by-descent lesions. By June 2005, the database contained in excess of 53,000 different lesions detected in 2029 different nuclear genes, with new entries currently accumulating at a rate in excess of 5000 per annum. HGMD includes cDNA reference sequences, now provided for more than 90% of the listed genes, splice junction data, disease-associated and functional polymorphisms, and links to data present in publicly available online locus-specific mutation databases.
5
Abstract
It is known that cancers are caused by accumulated mutations in various genes and consequent functional alterations of proteins that are important for maintenance of normal cellular functions. The changes in nucleotide sequences and expression patterns of cancer-related genes are being extensively studied to better understand the mechanisms of tumorigenesis and to develop methods for DNA/protein [corrected] diagnosis and drug discovery. At present, a number of computer databases for molecular information on cancer-related genes are available publicly through the internet. These databases deal with familial cancer and sporadic cancer at the levels of germline mutation or somatic mutation, genomic or chromosomal abnormalities, and changes in the expression levels of relevant genes. Previously, we constructed a human gene mutation database named MutationView (http://mutview.dmb.med.keio.ac.jp/) and have accumulated mutation data for approximately 300 genes that are involved mainly in monogenic diseases. Forty-two genes are cancer-related and therefore a separate cancer database named KMcancerDB was constructed. MutationView/KMcancerDB utilizes a graphic display function for both queries and search results much more often than other existing databases, making the system quite user friendly. MutationView/KMcancerDB provides a highly sophisticated search function for all genes through a single internet URL. In the present paper, we briefly review various useful databases for cancer-related genes, and describe MutationView/KMcancerDB in more detail.
Affiliation(s)
- Nobuyoshi Shimizu
- Department of Molecular Biology, Keio University School of Medicine, 35 Shinanomachi, Shinjuku-ku, Tokyo, Japan.
6
Macías JR, Jiménez-Lozano N, Carazo JM. Integrating electron microscopy information into existing Distributed Annotation Systems. J Struct Biol 2007; 158:205-13. [PMID: 17400476 DOI: 10.1016/j.jsb.2007.02.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 05/19/2006] [Revised: 12/19/2006] [Accepted: 02/13/2007] [Indexed: 10/23/2022]
Abstract
The increase of daily released bioinformatic data has generated new ways of organising and disseminating information. Specifically, in the field of sequence data, many efforts have been made not only to store information in databases, but also to annotate it and then share these annotations through a standard XML (eXtensible Markup Language) protocol and appropriate integration clients. This is the context in which the Distributed Annotation System (DAS) has emerged in genomics. Additionally, initiatives in the field of structural data, such as the extension of DAS to atomic resolution data, which generated the SPICE client, have also occurred. This paper presents 3D-EM DAS, a further extension of the DAS protocol that allows sharing annotations about hybrid models. This annotation system has been built on the basis of the EMDB, which stores Three-dimensional Electron Microscopy (3D-EM) volumes, PDB, which houses atomic coordinates, and UniProt (for protein sequences) databases. In this way, annotations for sequences, atomic coordinates, and 3D-EM volumes are collected and displayed through a single graphical visualization client. Thus, users have an integrated view of all the annotations together with the whole macromolecule (3D-EM map coming from EMDB), the atomic resolution structures fitted into it (coordinates coming from PDB) and the sequences corresponding to each of the structures (from UniProt).
Affiliation(s)
- J R Macías
- Unidad de Biocomputación, Centro Nacional de Biotecnología-CSIC, Campus de Cantoblanco UAM, c/ Darwin 3, 28049 Madrid, Spain.
7
Sharma VK, Sharma A, Kumar N, Khandelwal M, Mandapati KK, Horn-Saban S, Strichman-Almashanu L, Lancet D, Brahmachari SK, Ramachandran S. Expoldb: expression linked polymorphism database with inbuilt tools for analysis of expression and simple repeats. BMC Genomics 2006; 7:258. [PMID: 17038195 PMCID: PMC1618849 DOI: 10.1186/1471-2164-7-258] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/06/2006] [Accepted: 10/13/2006] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND Quantitative variation in gene expression has been proposed to underlie phenotypic variation among human individuals. A facilitating step towards understanding the basis for gene expression variability is associating genome-wide transcription patterns with potential cis modifiers of gene expression. DESCRIPTION EXPOLDB, a novel database, addresses this need by providing information on the variability of gene expression levels across individuals, as well as the presence and features of potentially polymorphic (TG/CA)n repeats. EXPOLDB thus enables associating transcription levels with the presence and length of (TG/CA)n repeats. One of the unique features of this database is the display of expression data for 5 pairs of monozygotic twins, which allows identification of genes whose variability in expression is influenced by non-genetic factors including environment. In addition to queries by gene name, EXPOLDB allows for queries by a pathway name. Users can also upload their list of HGNC (HUGO (The Human Genome Organisation) Gene Nomenclature Committee) symbols for interrogating expression patterns. The online application 'SimRep' can be used to find simple repeats in a given nucleotide sequence. To help illustrate primary applications, case examples of housekeeping genes and the RUNX gene family, as well as one example of glycolytic pathway genes, are provided. CONCLUSION The uniqueness of EXPOLDB is in facilitating the association of genome-wide transcription variations with the presence and type of polymorphic repeats while offering the feature for identifying genes whose expression variability is influenced by non-genetic factors including environment. In addition, the database allows comprehensive querying including functional information on biochemical pathways of the human genes. EXPOLDB can be accessed at http://expoldb.igib.res.in/expol.
Affiliation(s)
- Vineet K Sharma
- G.N. Ramachandran Knowledge Centre for Genome Informatics, Institute of Genomics and Integrative Biology, Mall Road, Delhi 110 007, India
- Anu Sharma
- Functional Genomics Unit, Institute of Genomics and Integrative Biology, Mall Road, Delhi 110 007, India
- Naveen Kumar
- G.N. Ramachandran Knowledge Centre for Genome Informatics, Institute of Genomics and Integrative Biology, Mall Road, Delhi 110 007, India
- Mamta Khandelwal
- G.N. Ramachandran Knowledge Centre for Genome Informatics, Institute of Genomics and Integrative Biology, Mall Road, Delhi 110 007, India
- Kiran Kumar Mandapati
- Functional Genomics Unit, Institute of Genomics and Integrative Biology, Mall Road, Delhi 110 007, India
- Shirley Horn-Saban
- Microarray facility, Department of Biological Services, Weizmann Institute of Science, Rehovot 76100, Israel
- Liora Strichman-Almashanu
- Department of Molecular Genetics and Crown Human Genome Center, Weizmann Institute of Science, Rehovot 76100, Israel
- Doron Lancet
- Department of Molecular Genetics and Crown Human Genome Center, Weizmann Institute of Science, Rehovot 76100, Israel
- Samir K Brahmachari
- G.N. Ramachandran Knowledge Centre for Genome Informatics, Institute of Genomics and Integrative Biology, Mall Road, Delhi 110 007, India
- Srinivasan Ramachandran
- G.N. Ramachandran Knowledge Centre for Genome Informatics, Institute of Genomics and Integrative Biology, Mall Road, Delhi 110 007, India
8
Alesci S, Manoli I, Michopoulos VJ, Brouwers FM, Le H, Gold PW, Blackman MR, Rennert OM, Su YA, Chrousos GP. Development of a human mitochondria-focused cDNA microarray (hMitChip) and validation in skeletal muscle cells: implications for pharmaco- and mitogenomics. The Pharmacogenomics Journal 2006; 6:333-42. [PMID: 16534508 DOI: 10.1038/sj.tpj.6500377] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 01/19/2023]
Abstract
Mitochondrial research has influenced our understanding of human evolution, physiology and pathophysiology. Mitochondria, intracellular organelles widely known as 'energy factories' of the cell, also play fundamental roles in intermediary metabolism, steroid hormone and heme biosyntheses, calcium signaling, generation of radical oxygen species, and apoptosis. Mitochondria possess a distinct DNA (mitochondrial DNA); yet, the vast majority of mitochondrial proteins are encoded by the nuclear DNA. Mitochondria-related genetic defects have been described in a variety of mostly rare, often fatal, primary mitochondrial disorders; furthermore, they are increasingly reported in association with many common morbid conditions, such as cancer, obesity, diabetes and neurodegenerative disorders, although their role remains unclear. This study describes the creation of a human mitochondria-focused cDNA microarray (hMitChip) and its validation in human skeletal muscle cells treated with glucocorticoids. We suggest that hMitChip is a reliable and novel tool that will prove useful for systematically studying the contribution of mitochondrial genomics to human health and disease.
Affiliation(s)
- S Alesci
- Clinical Neuroendocrinology Branch, NIMH, NIH, Bethesda, MD 20892-1284, USA.
9
Cuticchia AJ, Kulkarni RD, Parris WE, Cooley PC, Hall RD, Silk GW. Inconsistencies between human genetic cytolocations and those derived using genomic sequence. Cytogenet Genome Res 2005; 112:1-5. [PMID: 16276083 DOI: 10.1159/000087506] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/17/2005] [Accepted: 06/20/2005] [Indexed: 11/19/2022] Open
Abstract
One result of the publishing of the human genome sequence is the ability to define objects through their position on the consensus sequence. While this has simplified the process of creating order maps for genes on a chromosome, it has created discrepancies between the published cytolocations of human genes, as presented through genetic references, and those locations derived computationally from the genomic sequence. For the 6,830 records with HUGO gene symbols shared between the online version of Mendelian Inheritance in Man and Ensembl, 18% of the records have a discrepancy of at least one cytogenetic band between the datasets. Discordance between data sets at this frequency would have a significant impact on the utility of datasets created by the amalgamation of numerous biological databases.
Affiliation(s)
- A J Cuticchia
- Research Triangle Institute, Research Triangle Park, NC 27709, USA.
10
Zhang J, Aizawa M, Amari S, Iwasawa Y, Nakano T, Nakata K. Development of KiBank, a database supporting structure-based drug design. Comput Biol Chem 2005; 28:401-7. [PMID: 15556481 DOI: 10.1016/j.compbiolchem.2004.09.003] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Received: 09/01/2004] [Revised: 09/13/2004] [Accepted: 09/15/2004] [Indexed: 11/29/2022]
Abstract
KiBank is a database of inhibition constant (Ki) values with 3D structures of target proteins and chemicals. Ki values were accumulated from peer-reviewed literature searched via PubMed. The 3D structure files of target proteins were originally from Protein Data Bank (PDB), while the 2D structure files of the chemicals were collected together with the Ki values and then converted into 3D ones. In KiBank, the chemical and protein 3D structures with hydrogen atoms were optimized by energy minimization and stored in MDL MOL and PDB format, respectively. KiBank is designed to support structure-based drug design. It provides structure files of proteins and chemicals ready for use in virtual screening through automated docking methods, while the Ki values can be applied for tests of docking/scoring combinations, program parameter settings, and calibration of empirical scoring functions. Additionally, the chemical structures and corresponding Ki values in KiBank are useful for lead optimization based on quantitative structure-activity relationship (QSAR) techniques. KiBank is updated on a daily basis and is freely available at . As of August 2004, KiBank contains 8000 Ki values, over 6000 chemicals and 166 proteins covering the subtypes of receptors and enzymes.
Affiliation(s)
- Junwei Zhang
- Collaborative Research Center of Frontier Simulation Software for Industrial Science, Institute of Industrial Science, University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, Japan.
11
Aizawa M, Onodera K, Zhang J, Amari S, Iwasawa Y, Nakano T, Nakata K. KiBank: A Database for Computer-Aided Drug Design Based on Protein-Chemical Interaction Analysis. Yakugaku Zasshi 2004; 124:613-9. [PMID: 15340183 DOI: 10.1248/yakushi.124.613] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Abstract
KiBank is a database for computer-aided drug design and consists of binding affinities and chemical and target protein structures. Each chemical or protein structure with hydrogen atoms added was optimized by energy minimization and stored in PDB or MDL MOL file format, so that the structural data can be directly used for in silico binding studies. To describe the extent of inhibition, the inhibition constant (Ki) value is used to simplify comparisons of strengths among chemical-protein bindings. As of April 2004, KiBank contained 142 proteins, over 5000 chemicals, and over 6000 binding affinity values that were published in peer-reviewed journals. The binding affinity values are currently mostly for membrane and nuclear receptors but will soon be expanded to other drug targets. KiBank is updated daily and can be accessed on the Web at http://kibank.iis.u-tokyo.ac.jp/ at no charge.
Affiliation(s)
- Masahiro Aizawa
- Collaborative Research Center of Frontier Simulation Software for Industrial Science, Institute of Industrial Science, University of Tokyo, Japan.
12
Stenson PD, Ball EV, Mort M, Phillips AD, Shiel JA, Thomas NST, Abeysinghe S, Krawczak M, Cooper DN. Human Gene Mutation Database (HGMD): 2003 update. Hum Mutat 2003; 21:577-81. [PMID: 12754702 DOI: 10.1002/humu.10212] [Citation(s) in RCA: 1257] [Impact Index Per Article: 59.9] [Indexed: 12/26/2022]
Abstract
The Human Gene Mutation Database (HGMD) constitutes a comprehensive core collection of data on germ-line mutations in nuclear genes underlying or associated with human inherited disease (www.hgmd.org). Data catalogued includes: single base-pair substitutions in coding, regulatory and splicing-relevant regions; micro-deletions and micro-insertions; indels; triplet repeat expansions as well as gross deletions; insertions; duplications; and complex rearrangements. Each mutation is entered into HGMD only once in order to avoid confusion between recurrent and identical-by-descent lesions. By March 2003, the database contained in excess of 39,415 different lesions detected in 1,516 different nuclear genes, with new entries currently accumulating at a rate exceeding 5,000 per annum. Since its inception, HGMD has been expanded to include cDNA reference sequences for more than 87% of listed genes, splice junction sequences, disease-associated and functional polymorphisms, as well as links to data present in publicly available online locus-specific mutation databases. Although HGMD has recently entered into a licensing agreement with Celera Genomics (Rockville, MD), mutation data will continue to be made freely available via the Internet.
Affiliation(s)
- Peter D Stenson
- Institute of Medical Genetics, University of Wales College of Medicine, Heath Park, Cardiff, UK
13
Abstract
The tumor suppressor gene TP53 (p53) is the most extensively studied gene involved in human cancers. More than 1,400 publications have reported mutations of this gene in 150 cancer types for a total of 14,971 mutations. To exploit this huge bulk of data, specific analytic tools were highly warranted. We therefore developed a locus-specific database software called UMD-p53. This database compiles all somatic and germline mutations as well as polymorphisms of the TP53 gene which have been reported in the published literature since 1989, or unpublished data submitted to the database curators. The database is available at www.umd.necker.fr or at http://p53.curie.fr/. In this paper, we describe recent developments of the UMD-p53 database. These developments include new fields and routines. For example, the analysis of putative acceptor or donor splice sites is now automated and gives new insight for the causal role of "silent mutations." Other routines have also been created such as the prescreening module, the UV module, and the cancer distribution module. These new improvements will help users not only for molecular epidemiology and pharmacogenetic studies but also for patient-based studies. To achieve these purposes, we have designed a procedure to check and validate data in order to reach the highest quality data.
Affiliation(s)
- Christophe Béroud
- Laboratoire de Génétique Moléculaire, CHU de Montpellier, Institut Universitaire de Recherche Clinique, Montpellier Cedex, France.
14
Tolle R. Information technology tools for efficient SNP studies. American Journal of Pharmacogenomics: Genomics-Related Research in Drug Development and Clinical Practice 2002; 1:303-14. [PMID: 12083962 DOI: 10.2165/00129785-200101040-00007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 12/31/2022]
Abstract
We are currently facing a new era of studies involving single nucleotide polymorphisms (SNPs). This increased attention is stimulated by interest in individual differences in disease susceptibility as well as individual responses to drug treatment and the falling cost of genotyping. This review is a guide to the numerous public data repositories and Information Technology (IT) tools that may aid planning, preparation, running and analysis of studies involving SNPs. I will also highlight areas where researchers will have to resort to home-made IT solutions. Unfortunately, both information and IT tools are scattered throughout the internet and a lack of data exchange conventions can hamper the efficient use of these existing resources. This can lead to situations where the planning, preparation and analysis of a SNP study can actually cost more than the actual genotyping. We propose that only a customizable backbone IT infrastructure for SNP studies can help reduce costs associated with SNP data handling and tool launching.
Affiliation(s)
- R Tolle
- LION bioscience AG, Heidelberg, Germany.
15
Abstract
Pharmacogenomics requires the integration and analysis of genomic, molecular, cellular, and clinical data, and it thus offers a remarkable set of challenges to biomedical informatics. These include infrastructural challenges such as the creation of data models and databases for storing these data, the integration of these data with external databases, the extraction of information from natural language text, and the protection of databases with sensitive information. There are also scientific challenges in creating tools to support gene expression analysis, three-dimensional structural analysis, and comparative genomic analysis. In this review, we summarize the current uses of informatics within pharmacogenomics and show how the technical challenges that remain for biomedical informatics are typical of those that will be confronted in the postgenomic era.
Affiliation(s)
- Russ B Altman
- Stanford Medical Informatics, Stanford, California 94305-5479, USA.
16
Abstract
HOWDY is an integrated database system for accessing and analyzing human genomic information (http://www-alis.tokyo.jst.go.jp/HOWDY/). HOWDY stores information about relationships between genetic objects and the data extracted from a number of databases. HOWDY consists of an Internet accessible user interface that allows thorough searching of the human genomic databases using the gene symbols and their aliases. It also permits flexible editing of the sequence data. The database can be searched using simple words and the search can be restricted to a specific cytogenetic location. Linear maps displaying markers and genes on contig sequences are available, from which an object can be chosen. Any search starting point identifies all the information matching the query. HOWDY provides a convenient search environment of human genomic data for scientists unsure which database is most appropriate for their search.
Affiliation(s)
- Mika Hirakawa
- Bioinformatics Division, Advanced Databases Department, Japan Science and Technology Corporation (JST), 5-3 Yonban-cho, Chiyoda-ku, Tokyo 102-0081, Japan.
17
Dowell RD, Jokerst RM, Day A, Eddy SR, Stein L. The distributed annotation system. BMC Bioinformatics 2001; 2:7. [PMID: 11667947 PMCID: PMC58584 DOI: 10.1186/1471-2105-2-7] [Citation(s) in RCA: 319] [Impact Index Per Article: 13.9] [Received: 08/10/2001] [Accepted: 10/10/2001] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Currently, most genome annotation is curated by centralized groups with limited resources. Efforts to share annotations transparently among multiple groups have not yet been satisfactory. RESULTS Here we introduce a concept called the Distributed Annotation System (DAS). DAS allows sequence annotations to be decentralized among multiple third-party annotators and integrated on an as-needed basis by client-side software. The communication between client and servers in DAS is defined by the DAS XML specification. Annotations are displayed in layers, one per server. Any client or server adhering to the DAS XML specification can participate in the system; we describe a simple prototype client and server example. CONCLUSIONS The DAS specification is being used experimentally by Ensembl, WormBase, and the Berkeley Drosophila Genome Project. Continued success will depend on the readiness of the research community to adopt DAS and provide annotations. All components are freely available from the project website http://www.biodas.org/.
Affiliation(s)
- Robin D Dowell
- Howard Hughes Medical Institute and Department of Genetics, Washington University, St. Louis, MO 63110 USA
- Rodney M Jokerst
- Howard Hughes Medical Institute and Department of Genetics, Washington University, St. Louis, MO 63110 USA
- Allen Day
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724 USA
- Sean R Eddy
- Howard Hughes Medical Institute and Department of Genetics, Washington University, St. Louis, MO 63110 USA
- Lincoln Stein
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724 USA
18
19
Abstract
The Internet has been a key component in the coordination of the diverse group of scientists involved in the Human Genome Project. Nowhere has this contribution been more critical than in the maintenance and exchange of information about genetic variation and mutation. Whereas the majority of DNA sequence is generated and stored by a relatively few sites, a far greater number of researchers investigate the variations in that sequence from sites scattered worldwide. It falls to central databases to utilize the Internet to assemble data from these sites and make them available to the greater human genomic community.
Affiliation(s)
- C J Porter
- Johns Hopkins University, Baltimore, Maryland, USA
|
20
|
Abstract
As the human genome sequencing project nears completion, there has been a vast increase in the rate at which disease and nondisease associated variant sequences are being sought and detected. This has heightened the need for software with which to accumulate allelic variant (mutation) data, and with which to make the data accessible to the scientific community. Many ad hoc solutions have been developed by those interested in specific genes and diseases, and the creation of central databases which hold data for all genes has provided an alternative repository for some of the locus data. Despite this, few specialised software tools exist for researchers to create their own locus-specific allelic variant databases. This article describes methods available to potential curators, including software systems developed with the sole purpose of generating locus-specific mutation databases. In particular, the authors' own software, MuStaR™, is described. MuStaR™ allows curators to maintain a database on a laptop computer if desired, while being able to export the data to an automatically generated Website which will run on any CGI-compliant Web server. Searching the database and the submission of new mutations are made possible through fill-in Web forms. A number of other software tools which may be of use to curators are also described.
Affiliation(s)
- A F Brown
- MRC Human Genetics Unit, Edinburgh, UK.
|
21
|
Affiliation(s)
- R G Cotton
- Mutation Research Centre, Fitzroy, Melbourne, Australia
|
22
|
Scriver CR, Nowacki PM, Lehväslaiho H. Guidelines and recommendations for content, structure, and deployment of mutation databases: II. Journey in progress. Hum Mutat 1999; 15:13-5. [PMID: 10612816 DOI: 10.1002/(sici)1098-1004(200001)15:1<13::aid-humu5>3.0.co;2-y] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The HUGO Mutation Database Initiative has produced guidelines and recommendations addressing uniform nomenclature of (human) genes and alleles, and computing standards to permit a moderate level of built-in redundancy, searchable interfaces, and compatibility between the comprehensive (genomic) and locus-specific types of databases. The participating community (developers and users) have been moving the project along rapidly, as described here.
Affiliation(s)
- C R Scriver
- DeBelle Laboratory, McGill University-Montreal Children's Hospital Research Institute, Montreal, Canada.
|
23
|
Abstract
Online Mendelian Inheritance In Man (OMIM) is a public database of bibliographic information about human genes and genetic disorders. Begun by Dr. Victor McKusick as the authoritative reference Mendelian Inheritance in Man, it is now distributed electronically by the National Center for Biotechnology Information (NCBI). Material in OMIM is derived from the biomedical literature and is written by Dr. McKusick and his colleagues at Johns Hopkins University and elsewhere. Each OMIM entry has a full text summary of a genetic phenotype and/or gene and has copious links to other genetic resources such as DNA and protein sequence, PubMed references, mutation databases, approved gene nomenclature, and more. In addition, NCBI's neighboring feature allows users to identify related articles from PubMed selected on the basis of key words in the OMIM entry. Through its many features, OMIM is increasingly becoming a major gateway for clinicians, students, and basic researchers to the ever-growing literature and resources of human genetics.
Affiliation(s)
- A Hamosh
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21287, USA
|
24
|
Krawczak M, Ball EV, Fenton I, Stenson PD, Abeysinghe S, Thomas N, Cooper DN. Human gene mutation database-a biomedical information and research resource. Hum Mutat 1999; 15:45-51. [PMID: 10612821 DOI: 10.1002/(sici)1098-1004(200001)15:1<45::aid-humu10>3.0.co;2-t] [Citation(s) in RCA: 199] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Although 20 years have elapsed since the first single basepair substitution underlying an inherited disease in humans was characterised at the DNA level, the initiative has only recently been taken to establish central database resources for pathological genetic variants. Disease-associated gene lesions are currently collected and publicised by the Human Gene Mutation Database (HGMD) in Cardiff, locus-specific mutation databases, and to some extent also by the Genome Database (GDB) and Online Mendelian Inheritance in Man (OMIM). To date, HGMD represents the only comprehensive and publicly available database of gene lesions underlying human inherited disease. By July 1999, HGMD contained over 18,000 different mutations from some 900 human genes, the majority being single basepair substitutions. In addition to its potential as an information resource for clinicians and genetic counsellors, HGMD has allowed molecular geneticists to address a variety of biological questions through meta-analysis of the collated data. HGMD also promises to assist research workers in optimising mutation search strategies for a given gene. A questionnaire sent out to, and answered by, the editors of 20 key journals revealed that human genetics journals are increasingly reluctant to publish mutation reports. Electronic data submission and publication facilities are therefore urgently required. The World Wide Web (WWW) provides an excellent medium within which to combine the centralised management of basic mutation data, including rigorous quality control, with the possibility of publishing additional mutation-related information. In response to these needs, HGMD has both instituted a collaboration with Springer-Verlag GmbH, Heidelberg, to potentiate free online submission and electronic publication of human gene mutation data and developed links with the curators of locus-specific mutation databases.
Affiliation(s)
- M Krawczak
- Institute of Medical Genetics, University of Wales College of Medicine, Heath Park, UK.
|
25
|
Abstract
The KMeyeDB (http://mutview.dmb.med.keio.ac.jp) has been developed as a database of mutations in human eye disorder genes using software called MutationView, which provides graphical presentations of various data analyses. Here, we present several display windows from the KMeyeDB for data analysis of mutations in the RB-1 gene, which is responsible for the pathogenesis of retinoblastoma, a malignant tumor in the retina.
Affiliation(s)
- S Minoshima
- Center for Genomic Medicine, Department of Molecular Biology, Keio University School of Medicine, Tokyo, Japan
|