1451
Blake JA, Eppig JT, Kadin JA, Richardson JE, Smith CL, Bult CJ. Mouse Genome Database (MGD)-2017: community knowledge resource for the laboratory mouse. Nucleic Acids Res 2016; 45:D723-D729. [PMID: 27899570 PMCID: PMC5210536 DOI: 10.1093/nar/gkw1040] [Citation(s) in RCA: 199] [Impact Index Per Article: 22.1] [Received: 09/22/2016] [Accepted: 10/28/2016] [Indexed: 11/30/2022]
Abstract
The Mouse Genome Database (MGD: http://www.informatics.jax.org) is the primary community data resource for the laboratory mouse. It provides a highly integrated and highly curated system offering a comprehensive view of current knowledge about mouse genes, genetic markers and genomic features as well as the associations of those features with sequence, phenotypes, functional and comparative information, and their relationships to human diseases. MGD continues to enhance access to these data, to extend the scope of data content and visualizations, and to provide infrastructure and user support that ensures effective and efficient use of MGD in the advancement of scientific knowledge. Here, we report on recent enhancements made to the resource and new features.
Affiliation(s)
- Judith A Blake
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Janan T Eppig
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- James A Kadin
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Cynthia L Smith
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Carol J Bult
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
1452
Aken BL, Achuthan P, Akanni W, Amode MR, Bernsdorff F, Bhai J, Billis K, Carvalho-Silva D, Cummins C, Clapham P, Gil L, Girón CG, Gordon L, Hourlier T, Hunt SE, Janacek SH, Juettemann T, Keenan S, Laird MR, Lavidas I, Maurel T, McLaren W, Moore B, Murphy DN, Nag R, Newman V, Nuhn M, Ong CK, Parker A, Patricio M, Riat HS, Sheppard D, Sparrow H, Taylor K, Thormann A, Vullo A, Walts B, Wilder SP, Zadissa A, Kostadima M, Martin FJ, Muffato M, Perry E, Ruffier M, Staines DM, Trevanion SJ, Cunningham F, Yates A, Zerbino DR, Flicek P. Ensembl 2017. Nucleic Acids Res 2016; 45:D635-D642. [PMID: 27899575 PMCID: PMC5210575 DOI: 10.1093/nar/gkw1104] [Citation(s) in RCA: 416] [Impact Index Per Article: 46.2] [Received: 10/08/2016] [Revised: 10/25/2016] [Accepted: 10/28/2016] [Indexed: 12/12/2022]
Abstract
Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access methods ensure uniform data analysis and distribution for all supported species. Together, these provide a comprehensive solution for large-scale and targeted genomics applications alike. Among many other developments over the past year, we have improved our resources for gene regulation and comparative genomics, and added CRISPR/Cas9 target sites. We released new browser functionality and tools, including improved filtering and prioritization of genome variation, Manhattan plot visualization for linkage disequilibrium and eQTL data, and an ontology search for phenotypes, traits and disease. We have also enhanced data discovery and access with a track hub registry and a selection of new REST end points. All Ensembl data are freely released to the scientific community and our source code is available via the open source Apache 2.0 license.
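As a rough illustration of the programmatic access mentioned in this abstract, the sketch below builds a request URL for the public Ensembl REST service and, optionally, fetches the JSON record. It assumes the base URL `https://rest.ensembl.org` and the `lookup/symbol/:species/:symbol` endpoint; the abstract does not say which endpoints were newly added, so this endpoint is only an example, and the helper names `lookup_url` and `fetch_gene` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the public Ensembl REST server.
ENSEMBL_REST = "https://rest.ensembl.org"


def lookup_url(species: str, symbol: str) -> str:
    """Build the URL for a gene-symbol lookup (hypothetical helper)."""
    path = f"/lookup/symbol/{quote(species)}/{quote(symbol)}"
    # The content type is requested through a query parameter.
    return f"{ENSEMBL_REST}{path}?content-type=application/json"


def fetch_gene(species: str, symbol: str, timeout: float = 10.0) -> dict:
    """Fetch the lookup record as a dict; needs network access."""
    with urlopen(lookup_url(species, symbol), timeout=timeout) as response:
        return json.load(response)
```

Calling `lookup_url("homo_sapiens", "BRCA2")` only constructs the URL string; `fetch_gene` performs the actual request and may raise an error if the service is unreachable or the symbol is unknown.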
Affiliation(s)
- Bronwen L Aken
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Premanand Achuthan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Wasiu Akanni
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- M Ridwan Amode
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Friederike Bernsdorff
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jyothish Bhai
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Konstantinos Billis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Denise Carvalho-Silva
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Carla Cummins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Peter Clapham
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Laurent Gil
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Carlos García Girón
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Leo Gordon
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Thibaut Hourlier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sophie H Janacek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Thomas Juettemann
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephen Keenan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Matthew R Laird
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ilias Lavidas
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Thomas Maurel
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- William McLaren
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Benjamin Moore
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel N Murphy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rishi Nag
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Victoria Newman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Michael Nuhn
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Chuang Kee Ong
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anne Parker
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Mateus Patricio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Harpreet Singh Riat
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel Sheppard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Helen Sparrow
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Kieron Taylor
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anja Thormann
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alessandro Vullo
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Brandon Walts
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Steven P Wilder
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Amonida Zadissa
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Myrto Kostadima
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Fergal J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Matthieu Muffato
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Emily Perry
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Magali Ruffier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel M Staines
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephen J Trevanion
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Fiona Cunningham
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Andrew Yates
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel R Zerbino
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
1453
Rappaport N, Twik M, Plaschkes I, Nudel R, Iny Stein T, Levitt J, Gershoni M, Morrey CP, Safran M, Lancet D. MalaCards: an amalgamated human disease compendium with diverse clinical and genetic annotation and structured search. Nucleic Acids Res 2016; 45:D877-D887. [PMID: 27899610 PMCID: PMC5210521 DOI: 10.1093/nar/gkw1012] [Citation(s) in RCA: 375] [Impact Index Per Article: 41.7] [Received: 08/17/2016] [Revised: 10/14/2016] [Accepted: 10/29/2016] [Indexed: 12/13/2022]
Abstract
The MalaCards human disease database (http://www.malacards.org/) is an integrated compendium of annotated diseases mined from 68 data sources. MalaCards has a web card for each of ∼20 000 disease entries, in six global categories. It portrays a broad array of annotation topics in 15 sections, including Summaries, Symptoms, Anatomical Context, Drugs, Genetic Tests, Variations and Publications. The Aliases and Classifications section reflects an algorithm for disease name integration across often-conflicting sources, providing effective annotation consolidation. A central feature is a balanced Genes section, with scores reflecting the strength of disease-gene associations. This is accompanied by other gene-related disease information such as pathways, mouse phenotypes and GO-terms, stemming from MalaCards’ affiliation with the GeneCards Suite of databases. MalaCards’ capacity to inter-link information from complementary sources, along with its elaborate search function, relational database infrastructure and convenient data dumps, allows it to tackle its rich disease annotation landscape, and facilitates systems analyses and genome sequence interpretation. MalaCards adopts a ‘flat’ disease-card approach, but each card is mapped to popular hierarchical ontologies (e.g. International Classification of Diseases, Human Phenotype Ontology and Unified Medical Language System) and also contains information about multi-level relations among diseases, thereby providing an optimal tool for disease representation and scrutiny.
Affiliation(s)
- Noa Rappaport
- Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Michal Twik
- Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Inbar Plaschkes
- Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Ron Nudel
- Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Tsippi Iny Stein
- Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Jacob Levitt
- Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Moran Gershoni
- Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- C Paul Morrey
- Department of Information Systems and Technology, Utah Valley University, Orem, UT 84058, USA
- Marilyn Safran
- Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Doron Lancet
- Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
1454
Trujillano D, Oprea GE, Schmitz Y, Bertoli-Avella AM, Abou Jamra R, Rolfs A. A comprehensive global genotype-phenotype database for rare diseases. Mol Genet Genomic Med 2016; 5:66-75. [PMID: 28116331 PMCID: PMC5241210 DOI: 10.1002/mgg3.262] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 06/28/2016] [Revised: 10/03/2016] [Accepted: 11/01/2016] [Indexed: 12/20/2022]
Abstract
Background: The ability to discover genetic variants in a patient runs far ahead of the ability to interpret them. Databases with accurate descriptions of the causal relationship between the variants and the phenotype are valuable since these are critical tools in clinical genetic diagnostics. Here, we introduce a comprehensive and global genotype–phenotype database focusing on rare diseases.
Methods: This database (CentoMD®) is a browser‐based tool that enables access to a comprehensive, independently curated system utilizing stringent high‐quality criteria and a quickly growing repository of genetic and human phenotype ontology (HPO)‐based clinical information. Its main goals are to aid the evaluation of genetic variants, to enhance the validity of the genetic analytical workflow, to increase the quality of genetic diagnoses, and to improve evaluation of treatment options for patients with hereditary diseases. The database software correlates clinical information from consented patients and probands of different geographical backgrounds with a large dataset of genetic variants and, when available, biomarker information. An automated follow‐up tool is incorporated that informs all users whenever a variant classification has changed. These unique features, fully embedded in a CLIA/CAP‐accredited quality management system, allow appropriate data quality and enhanced patient safety.
Results: More than 100,000 genetically screened individuals are documented in the database, resulting in more than 470 million variant detections. Approximately 57% of the clinically relevant and uncertain variants in the database are novel. Notably, 3% of the genetic variants identified and previously reported in the literature as being associated with a particular rare disease were reclassified, based on internal evidence, as clinically irrelevant.
Conclusions: The database offers a comprehensive summary of the clinical validity and causality of detected gene variants with their associated phenotypes, and is a valuable tool for identifying new disease genes through the correlation of novel genetic variants with specific, well‐defined phenotypes.
Affiliation(s)
- Rami Abou Jamra
- Centogene AG, Rostock, Germany; Institute of Human Genetics, University of Leipzig Hospitals and Clinics, Leipzig, Germany
- Arndt Rolfs
- Centogene AG, Rostock, Germany; Albrecht-Kossel-Institute for Neuroregeneration, Medical University Rostock, Rostock, Germany
1455
Gomez-Cabrero D, Menche J, Vargas C, Cano I, Maier D, Barabási AL, Tegnér J, Roca J. From comorbidities of chronic obstructive pulmonary disease to identification of shared molecular mechanisms by data integration. BMC Bioinformatics 2016; 17:441. [PMID: 28185567 PMCID: PMC5133493 DOI: 10.1186/s12859-016-1291-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 12/23/2022]
Abstract
Background: Deep mining of healthcare data has provided maps of comorbidity relationships between diseases. In parallel, integrative multi-omics investigations have generated high-resolution molecular maps of putative relevance for understanding disease initiation and progression. Yet, it is unclear how to advance an observation of comorbidity relations (one disease to others) to a molecular understanding of the driver processes and associated biomarkers.
Results: Since chronic obstructive pulmonary disease (COPD) has emerged as a central hub in temporal comorbidity networks, we developed a systematic integrative data-driven framework to identify shared disease-associated genes and pathways, as a proxy for the underlying generative mechanisms inducing comorbidity. We integrated records from approximately 13 M patients from the Medicare database with disease-gene maps that we derived from several resources including a semantic-derived knowledge-base. Using rank-based statistics we not only recovered known comorbidities but also discovered a novel association between COPD and digestive diseases. Furthermore, our analysis provides the first set of COPD co-morbidity candidate biomarkers, including IL15, TNF and JUP, and characterizes their association to aging and life-style conditions, such as smoking and physical activity.
Conclusions: The developed framework provides novel insights into COPD and especially COPD co-morbidity associated mechanisms. The methodology could be used to discover and decipher the molecular underpinning of other comorbidity relationships and, furthermore, allow the identification of candidate co-morbidity biomarkers.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1291-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- David Gomez-Cabrero
- Department of Medicine, Karolinska Institutet, Unit of Computational Medicine, Stockholm, 171 77, Sweden; Karolinska Institutet, Center for Molecular Medicine, Stockholm, 171 77, Sweden; Department of Medicine, Unit of Clinical Epidemiology, Karolinska University Hospital, Solna, L8, 17176, Sweden; Science for Life Laboratory, Solna, 17121, Sweden; Mucosal and Salivary Biology Division, King's College London Dental Institute, London, UK
- Jörg Menche
- Center for Complex Networks Research and Department of Physics, Northeastern University, Boston, MA, USA; Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Network Science, Central European University, Budapest, Hungary
- Claudia Vargas
- Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clinic de Barcelona, Universitat de Barcelona, Barcelona, Spain; Center for Biomedical Network Research in Respiratory Diseases (CIBERES), Madrid, Spain
- Isaac Cano
- Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clinic de Barcelona, Universitat de Barcelona, Barcelona, Spain; Center for Biomedical Network Research in Respiratory Diseases (CIBERES), Madrid, Spain
- Albert-László Barabási
- Center for Complex Networks Research and Department of Physics, Northeastern University, Boston, MA, USA; Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA; Center for Network Science, Central European University, Budapest, Hungary; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Jesper Tegnér
- Department of Medicine, Karolinska Institutet, Unit of Computational Medicine, Stockholm, 171 77, Sweden; Karolinska Institutet, Center for Molecular Medicine, Stockholm, 171 77, Sweden; Department of Medicine, Unit of Clinical Epidemiology, Karolinska University Hospital, Solna, L8, 17176, Sweden; Science for Life Laboratory, Solna, 17121, Sweden
- Josep Roca
- Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clinic de Barcelona, Universitat de Barcelona, Barcelona, Spain; Center for Biomedical Network Research in Respiratory Diseases (CIBERES), Madrid, Spain
1456
Abstract
RNA-binding proteins play a variety of roles in cellular physiology. Some regulate mRNA processing, mRNA abundance, and translation efficiency. Some fight off invader RNA through small RNA-driven silencing pathways. Others sense foreign sequences in the form of double-stranded RNA and activate the innate immune response. Yet others, for example cytoplasmic aconitase, act as bi-functional proteins, processing metabolites in one conformation and regulating metabolic gene expression in another. Not all are involved in gene regulation. Some play structural roles, for example, connecting the translational machinery to the endoplasmic reticulum outer membrane. Despite their pervasive role and relative importance, it has remained difficult to identify new RNA-binding proteins in a systematic, unbiased way. A recent body of literature from several independent labs has defined robust, easily adaptable protocols for mRNA interactome discovery. In this review, I summarize the methods and review some of the intriguing findings from their application to a wide variety of biological systems.
Affiliation(s)
- Sean P Ryder
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA, 01605, USA
1457
Ordulu Z, Kammin T, Brand H, Pillalamarri V, Redin CE, Collins RL, Blumenthal I, Hanscom C, Pereira S, Bradley I, Crandall BF, Gerrol P, Hayden MA, Hussain N, Kanengisser-Pines B, Kantarci S, Levy B, Macera MJ, Quintero-Rivera F, Spiegel E, Stevens B, Ulm JE, Warburton D, Wilkins-Haug LE, Yachelevich N, Gusella JF, Talkowski ME, Morton CC. Structural Chromosomal Rearrangements Require Nucleotide-Level Resolution: Lessons from Next-Generation Sequencing in Prenatal Diagnosis. Am J Hum Genet 2016; 99:1015-1033. [PMID: 27745839 PMCID: PMC5097935 DOI: 10.1016/j.ajhg.2016.08.022] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 06/19/2016] [Accepted: 08/26/2016] [Indexed: 12/27/2022]
Abstract
In this exciting era of "next-gen cytogenetics," integrating genomic sequencing into the prenatal diagnostic setting is possible within an actionable time frame and can provide precise delineation of balanced chromosomal rearrangements at the nucleotide level. Given the increased risk of congenital abnormalities in newborns with de novo balanced chromosomal rearrangements, comprehensive interpretation of breakpoints could substantially improve prediction of phenotypic outcomes and support perinatal medical care. Herein, we present and evaluate sequencing results of balanced chromosomal rearrangements in ten prenatal subjects with respect to the location of regulatory chromatin domains (topologically associated domains [TADs]). The genomic material from all subjects was interpreted to be "normal" by microarray analyses, and their rearrangements would not have been detected by cell-free DNA (cfDNA) screening. The findings of our systematic approach correlate with phenotypes of both pregnancies with untoward outcomes (5/10) and with healthy newborns (3/10). Two pregnancies, one with a chromosomal aberration predicted to be of unknown clinical significance and another one predicted to be likely benign, were terminated prior to phenotype-genotype correlation (2/10). We demonstrate that the clinical interpretation of structural rearrangements should not be limited to interruption, deletion, or duplication of specific genes and should also incorporate regulatory domains of the human genome with critical ramifications for the control of gene expression. As detailed in this study, our molecular approach to both detecting and interpreting the breakpoints of structural rearrangements yields unparalleled information in comparison to other commonly used first-tier diagnostic methods, such as non-invasive cfDNA screening and microarray analysis, to provide improved genetic counseling for phenotypic outcome in the prenatal setting.
Affiliation(s)
- Zehra Ordulu
- Department of Obstetrics, Gynecology, and Reproductive Biology, Brigham and Women's Hospital, Boston, MA 02115, USA; Harvard Medical School, Boston, MA 02115, USA
| | - Tammy Kammin
- Department of Obstetrics, Gynecology, and Reproductive Biology, Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Harrison Brand
- Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA 02142, USA
| | - Vamsee Pillalamarri
- Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Claire E Redin
- Harvard Medical School, Boston, MA 02115, USA; Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA 02142, USA
| | - Ryan L Collins
- Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Ian Blumenthal
- Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Carrie Hanscom
- Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Shahrin Pereira
- Department of Obstetrics, Gynecology, and Reproductive Biology, Brigham and Women's Hospital, Boston, MA 02115, USA
| | - India Bradley
- Department of Psychiatry, Prenatal Diagnosis Center, David Geffen School of Medicine, University of California, Los Angeles, Medical Plaza, Los Angeles, CA 90095, USA
| | - Barbara F Crandall
- Department of Psychiatry, Prenatal Diagnosis Center, David Geffen School of Medicine, University of California, Los Angeles, Medical Plaza, Los Angeles, CA 90095, USA
| | - Pamela Gerrol
- Department of Obstetrics, Gynecology, and Reproductive Biology, Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Mark A Hayden
- Department of Obstetrics, Gynecology, and Reproductive Biology, Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Naveed Hussain
- Department of Pediatrics, Connecticut Children's Medical Center, University of Connecticut, Farmington, CT 06030, USA
| | | | - Sibel Kantarci
- Department of Pathology and Laboratory Medicine, UCLA Clinical Genomics Center, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Brynn Levy
- Department of Pathology and Cell Biology, College of Physicians and Surgeons, Columbia University, New York, NY 10032, USA
| | - Michael J Macera
- New York Presbyterian Hospital, Columbia University Medical Center, New York, NY 10032, USA
| | - Fabiola Quintero-Rivera
- Department of Pathology and Laboratory Medicine, UCLA Clinical Genomics Center, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Erica Spiegel
- Department of Maternal Fetal Medicine, Columbia University Medical Center, New York, NY 10032, USA
| | - Blair Stevens
- Department of Obstetrics, Gynecology, and Reproductive Sciences, University of Texas Medical School at Houston, Houston, TX 77030, USA
| | - Janet E Ulm
- Regional Obstetrical Consultants, Chattanooga, TN 37403, USA
| | - Dorothy Warburton
- Department of Genetics and Development, Columbia University, New York, NY 10032, USA; Department of Pediatrics, Columbia University, New York, NY 10032, USA
| | - Louise E Wilkins-Haug
- Department of Obstetrics, Gynecology, and Reproductive Biology, Brigham and Women's Hospital, Boston, MA 02115, USA; Harvard Medical School, Boston, MA 02115, USA
| | - Naomi Yachelevich
- Department of Pediatrics, Clinical Genetics Services, New York University School of Medicine, New York, NY 10003, USA
| | - James F Gusella
- Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA 02142, USA; Department of Genetics, Harvard Medical School, Boson, MA 02115, USA
| | - Michael E Talkowski
- Harvard Medical School, Boston, MA 02115, USA; Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA 02142, USA; Departments of Psychiatry and Pathology, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Cynthia C Morton
- Department of Obstetrics, Gynecology, and Reproductive Biology, Brigham and Women's Hospital, Boston, MA 02115, USA; Harvard Medical School, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA 02142, USA; Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Division of Evolution and Genomic Science, School of Biological Sciences, University of Manchester, Manchester Academic Health Science Center, Manchester 03101, UK.
|
1458
|
'IRDiRC Recognized Resources': a new mechanism to support scientists to conduct efficient, high-quality research for rare diseases. Eur J Hum Genet 2016; 25:162-165. [PMID: 27782107 PMCID: PMC5255942 DOI: 10.1038/ejhg.2016.137] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 08/31/2016] [Accepted: 09/06/2016] [Indexed: 01/25/2023] Open
Abstract
The International Rare Diseases Research Consortium (IRDiRC) has created a quality label, 'IRDiRC Recognized Resources', formerly known as 'IRDiRC Recommended'. It is a peer-reviewed quality indicator process established based on the IRDiRC Policies and Guidelines to designate resources (ie, standards, guidelines, tools, and platforms) designed to accelerate the pace of discoveries and translation into clinical applications for the rare disease (RD) research community. In its first year of implementation, 13 resources successfully applied for this designation, each focused on key areas essential to IRDiRC objectives and to the field of RD research more broadly. These included data sharing for discovery, knowledge organisation and ontologies, networking patient registries, and therapeutic development. 'IRDiRC Recognized Resources' is a mechanism intended to give community-approved contributions to RD research higher visibility, and to encourage researchers to adopt recognised standards, guidelines, tools, and platforms that facilitate research advances guided by the principles of interoperability and sharing.
|
1459
|
Piñero J, Bravo À, Queralt-Rosinach N, Gutiérrez-Sacristán A, Deu-Pons J, Centeno E, García-García J, Sanz F, Furlong LI. DisGeNET: a comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Res 2016; 45:D833-D839. [PMID: 27924018 PMCID: PMC5210640 DOI: 10.1093/nar/gkw943] [Citation(s) in RCA: 1587] [Impact Index Per Article: 176.3] [Received: 08/11/2016] [Revised: 09/29/2016] [Accepted: 10/18/2016] [Indexed: 12/12/2022] Open
Abstract
Information about the genetic basis of human diseases lies at the heart of precision medicine and drug discovery. However, to realize its full potential to support these goals, several problems, such as fragmentation, heterogeneity, availability and different conceptualization of the data, must be overcome. To provide the community with a resource free of these hurdles, we have developed DisGeNET (http://www.disgenet.org), one of the largest available collections of genes and variants involved in human diseases. DisGeNET integrates data from expert curated repositories, GWAS catalogues, animal models and the scientific literature. DisGeNET data are homogeneously annotated with controlled vocabularies and community-driven ontologies. Additionally, several original metrics are provided to assist the prioritization of genotype-phenotype relationships. The information is accessible through a web interface, a Cytoscape App, an RDF SPARQL endpoint, scripts in several programming languages and an R package. DisGeNET is a versatile platform that can be used for different research purposes including the investigation of the molecular underpinnings of specific human diseases and their comorbidities, the analysis of the properties of disease genes, the generation of hypotheses on drug therapeutic action and drug adverse effects, the validation of computationally predicted disease genes and the evaluation of the performance of text-mining methods.
Affiliation(s)
- Janet Piñero
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
| | - Àlex Bravo
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
| | - Núria Queralt-Rosinach
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
| | - Alba Gutiérrez-Sacristán
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
| | - Jordi Deu-Pons
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
| | - Emilio Centeno
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
| | - Javier García-García
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
| | - Ferran Sanz
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
| | - Laura I Furlong
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
|
1460
|
Szklarczyk D, Morris JH, Cook H, Kuhn M, Wyder S, Simonovic M, Santos A, Doncheva NT, Roth A, Bork P, Jensen LJ, von Mering C. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res 2016; 45:D362-D368. [PMID: 27924014 PMCID: PMC5210637 DOI: 10.1093/nar/gkw937] [Citation(s) in RCA: 4944] [Impact Index Per Article: 549.3] [Received: 09/15/2016] [Accepted: 10/06/2016] [Indexed: 02/06/2023] Open
Abstract
A system-wide understanding of cellular function requires knowledge of all functional interactions between the expressed proteins. The STRING database aims to collect and integrate this information, by consolidating known and predicted protein–protein association data for a large number of organisms. The associations in STRING include direct (physical) interactions, as well as indirect (functional) interactions, as long as both are specific and biologically meaningful. Apart from collecting and reassessing available experimental data on protein–protein interactions, and importing known pathways and protein complexes from curated databases, interaction predictions are derived from the following sources: (i) systematic co-expression analysis, (ii) detection of shared selective signals across genomes, (iii) automated text-mining of the scientific literature and (iv) computational transfer of interaction knowledge between organisms based on gene orthology. In the latest version 10.5 of STRING, the biggest changes are concerned with data dissemination: the web frontend has been completely redesigned to reduce dependency on outdated browser technologies, and the database can now also be queried from inside the popular Cytoscape software framework. Further improvements include automated background analysis of user inputs for functional enrichments, and streamlined download options. The STRING resource is available online, at http://string-db.org/.
Affiliation(s)
- Damian Szklarczyk
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
| | - John H Morris
- Resource on Biocomputing, Visualization, and Informatics, University of California, San Francisco, CA 94158-2517, USA
| | - Helen Cook
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
| | - Michael Kuhn
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Stefan Wyder
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
| | - Milan Simonovic
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
| | - Alberto Santos
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
| | - Nadezhda T Doncheva
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
| | - Alexander Roth
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
| | - Peer Bork
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany; Molecular Medicine Partnership Unit, University of Heidelberg and European Molecular Biology Laboratory, 69117 Heidelberg, Germany; Max Delbrück Centre for Molecular Medicine, 13125 Berlin, Germany; Department of Bioinformatics, Biocenter, University of Würzburg, 97074 Würzburg, Germany
| | - Lars J Jensen
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
| | - Christian von Mering
- Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, 8057 Zurich, Switzerland
|
1461
|
Challenges and disparities in the application of personalized genomic medicine to populations with African ancestry. Nat Commun 2016; 7:12521. [PMID: 27725664 PMCID: PMC5062569 DOI: 10.1038/ncomms12521] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Received: 12/17/2015] [Accepted: 07/12/2016] [Indexed: 12/16/2022] Open
Abstract
To characterize the extent and impact of ancestry-related biases in precision genomic medicine, we use 642 whole-genome sequences from the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) project to evaluate typical filters and databases. We find significant correlations between estimated African ancestry proportions and the number of variants per individual in all variant classification sets but one. The source of these correlations is highlighted in more detail by looking at the interaction between filtering criteria and the ClinVar and Human Gene Mutation databases. ClinVar's correlation, representing African ancestry-related bias, has changed over time amidst monthly updates, with the most extreme switch happening between March and April of 2014 (r=0.733 to r=-0.683). We identify 68 SNPs as the major drivers of this change in correlation. As long as ancestry-related bias when using these clinical databases is minimally recognized, the genetics community will face challenges with implementation, interpretation and cost-effectiveness when treating minority populations.
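The correlations reported in this abstract (for example, r=0.733) relate each individual's estimated African ancestry proportion to that individual's variant count. The following is a minimal sketch of a Pearson correlation coefficient, offered only as an illustration. It assumes the statistic is Pearson's r, which the abstract does not state explicitly; the function name `pearson_r` and the toy numbers are hypothetical and are not CAAPA data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered cross-products and squares.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Invented toy values: ancestry proportion per individual and the number
    # of variants that pass a hypothetical filter for that individual.
    ancestry = [0.55, 0.70, 0.82, 0.91]
    variant_counts = [410, 455, 520, 560]
    print(round(pearson_r(ancestry, variant_counts), 3))
```

A value near +1 or -1 indicates that the number of retained variants tracks ancestry proportion closely, which is the kind of bias the abstract describes.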
|
1462
|
Hu B, Yang YCT, Huang Y, Zhu Y, Lu ZJ. POSTAR: a platform for exploring post-transcriptional regulation coordinated by RNA-binding proteins. Nucleic Acids Res 2016; 45:D104-D114. [PMID: 28053162 PMCID: PMC5210617 DOI: 10.1093/nar/gkw888] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Received: 07/08/2016] [Revised: 09/23/2016] [Accepted: 09/27/2016] [Indexed: 01/01/2023] Open
Abstract
We present POSTAR (http://POSTAR.ncrnalab.org), a resource of POST-trAnscriptional Regulation coordinated by RNA-binding proteins (RBPs). Precise characterization of post-transcriptional regulatory maps has accelerated dramatically in the past few years. Based on new studies and resources, POSTAR supplies the largest collection of experimentally probed (∼23 million) and computationally predicted (approximately 117 million) RBP binding sites in the human and mouse transcriptomes. POSTAR annotates every transcript and its RBP binding sites using extensive information regarding various molecular regulatory events (e.g., splicing, editing, and modification), RNA secondary structures, disease-associated variants, and gene expression and function. Moreover, POSTAR provides a friendly, multi-mode, integrated search interface, which helps users to connect multiple RBP binding sites with post-transcriptional regulatory events, phenotypes, and diseases. Based on our platform, we were able to obtain novel insights into post-transcriptional regulation, such as the putative association between CPSF6 binding, RNA structural domains, and Li-Fraumeni syndrome SNPs. In summary, POSTAR represents an early effort to systematically annotate post-transcriptional regulatory maps and explore the putative roles of RBPs in human diseases.
Affiliation(s)
- Boqin Hu
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Yu-Cheng T Yang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China; Department of Statistics, University of California Los Angeles, Los Angeles, CA 90095-1554, USA
| | - Yiming Huang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Yumin Zhu
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
| | - Zhi John Lu
- MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, Center for Plant Biology and Tsinghua-Peking Joint Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China
|
1463
|
Shim H, Kim JH, Kim CY, Hwang S, Kim H, Yang S, Lee JE, Lee I. Function-driven discovery of disease genes in zebrafish using an integrated genomics big data resource. Nucleic Acids Res 2016; 44:9611-9623. [PMID: 27903883 PMCID: PMC5175370 DOI: 10.1093/nar/gkw897] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 07/12/2016] [Revised: 09/23/2016] [Accepted: 09/29/2016] [Indexed: 12/16/2022] Open
Abstract
Whole exome sequencing (WES) accelerates disease gene discovery using rare genetic variants, but further statistical and functional evidence is required to avoid false-discovery. To complement variant-driven disease gene discovery, here we present function-driven disease gene discovery in zebrafish (Danio rerio), a promising human disease model owing to its high anatomical and genomic similarity to humans. To facilitate zebrafish-based function-driven disease gene discovery, we developed a genome-scale co-functional network of zebrafish genes, DanioNet (www.inetbio.org/danionet), which was constructed by Bayesian integration of genomics big data. Rigorous statistical assessment confirmed the high prediction capacity of DanioNet for a wide variety of human diseases. To demonstrate the feasibility of the function-driven disease gene discovery using DanioNet, we predicted genes for ciliopathies and performed experimental validation for eight candidate genes. We also validated the existence of heterozygous rare variants in the candidate genes of individuals with ciliopathies yet not in controls derived from the UK10K consortium, suggesting that these variants are potentially involved in enhancing the risk of ciliopathies. These results showed that an integrated genomics big data for a model animal of diseases can expand our opportunity for harnessing WES data in disease gene discovery.
Affiliation(s)
- Hongseok Shim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Ji Hyun Kim
- Department of Health Sciences & Technology, SAIHST, Sungkyunkwan University, Seoul 06351, Korea
| | - Chan Yeong Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Sohyun Hwang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Hyojin Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Sunmo Yang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Ji Eun Lee
- Department of Health Sciences & Technology, SAIHST, Sungkyunkwan University, Seoul 06351, Korea; Samsung Genome Institute, Samsung Medical Center, Seoul 06351, Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
|
1464
|
Cook DE, Zdraljevic S, Roberts JP, Andersen EC. CeNDR, the Caenorhabditis elegans natural diversity resource. Nucleic Acids Res 2016; 45:D650-D657. [PMID: 27701074 PMCID: PMC5210618 DOI: 10.1093/nar/gkw893] [Citation(s) in RCA: 192] [Impact Index Per Article: 21.3] [Received: 08/15/2016] [Revised: 09/20/2016] [Accepted: 09/26/2016] [Indexed: 01/06/2023] Open
Abstract
Studies in model organisms have yielded considerable insights into the etiology of disease and our understanding of evolutionary processes. Caenorhabditis elegans is among the most powerful model organisms used to understand biology. However, C. elegans is not used as extensively as other model organisms to investigate how natural variation shapes traits, especially through the use of genome-wide association (GWA) analyses. Here, we introduce a new platform, the C. elegans Natural Diversity Resource (CeNDR) to enable statistical genetics and genomics studies of C. elegans and to connect the results to human disease. CeNDR provides the research community with wild strains, genome-wide sequence and variant data for every strain, and a GWA mapping portal for studying natural variation in C. elegans. Additionally, researchers outside of the C. elegans community can benefit from public mappings and integrated tools for comparative analyses. CeNDR uses several databases that are continually updated through the addition of new strains, sequencing data, and association mapping results. The CeNDR data are accessible through a freely available web portal located at http://www.elegansvariation.org or through an application programming interface.
Affiliation(s)
- Daniel E Cook
- Interdisciplinary Biological Sciences Program, Northwestern University, Evanston, IL 60208, USA; Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
| | - Stefan Zdraljevic
- Interdisciplinary Biological Sciences Program, Northwestern University, Evanston, IL 60208, USA; Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
| | - Joshua P Roberts
- Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
| | - Erik C Andersen
- Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
|
1465
|
To Unveil the Molecular Mechanisms of Qi and Blood through Systems Biology-Based Investigation into Si-Jun-Zi-Tang and Si-Wu-Tang formulae. Sci Rep 2016; 6:34328. [PMID: 27677604 PMCID: PMC5039637 DOI: 10.1038/srep34328] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 06/10/2016] [Accepted: 09/12/2016] [Indexed: 11/24/2022] Open
Abstract
Traditional Chinese Medicine (TCM) is increasingly applied in clinical practice worldwide, but theories such as Qi-Blood remain abstract. Qi deficiency and blood deficiency, which are treated by Si-Jun-Zi-Tang (SJZT) and Si-Wu-Tang (SWT) respectively, have characteristic clinical manifestations. Here, we analyzed targets of the ingredients in SJZT and SWT to unveil potential biological mechanisms between Qi deficiency and blood deficiency through biomedical approaches. First, ingredients in SWT and SJZT were retrieved from the TCMID database. The genes targeted by these ingredients were chosen from STITCH. After enrichment analysis by Gene Ontology (GO) and DAVID, enriched GO terms with a p-value less than 0.01 were collected and interpreted through DAVID and KEGG. Then a visualized network was constructed with ClueGO. Finally, a total of 243 genes targeted by 195 ingredients of the SWT formula and 209 genes targeted by 61 ingredients of SJZT were obtained. Six metabolism pathways and two environmental information processing pathways enriched by targets were correlated with 2 or more herbs in the SWT and SJZT formulae, respectively.
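The enrichment step in this abstract keeps GO terms whose p-value falls below 0.01. As a rough illustration of how such a p-value can arise, the sketch below computes a one-sided hypergeometric tail probability. This is an assumption made for illustration: it is the plain over-representation test, not DAVID's own modified statistic, and the function name `enrichment_p_value`, its parameters, and the example numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(hits, list_size, annotated, background):
    """P(X >= hits) for a hypergeometric draw.

    hits       -- genes in the target list annotated with the term
    list_size  -- total genes in the target list
    annotated  -- genes in the background annotated with the term
    background -- total genes in the background
    """
    if not (0 <= annotated <= background and 0 <= list_size <= background):
        raise ValueError("inconsistent set sizes")
    if not (0 <= hits <= min(annotated, list_size)):
        raise ValueError("hits cannot exceed list_size or annotated")
    total = comb(background, list_size)
    upper = min(annotated, list_size)
    # Sum the probability of observing `hits` or more annotated genes.
    tail = sum(
        comb(annotated, i) * comb(background - annotated, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Invented numbers: 12 of 243 target genes carry a term that annotates
    # 150 of 20000 background genes.
    p = enrichment_p_value(hits=12, list_size=243, annotated=150, background=20000)
    print(p < 0.01)
```

In practice a multiple-testing correction is usually applied across all tested terms; the abstract does not say whether one was used here.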
|
1466
|
Salgado D, Bellgard MI, Desvignes JP, Béroud C. How to Identify Pathogenic Mutations among All Those Variations: Variant Annotation and Filtration in the Genome Sequencing Era. Hum Mutat 2016; 37:1272-1282. [DOI: 10.1002/humu.23110] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 06/23/2016] [Revised: 08/24/2016] [Accepted: 08/31/2016] [Indexed: 01/01/2023]
Affiliation(s)
- David Salgado
- Aix Marseille University; INSERM; GMGF; Marseille France
| | - Matthew I. Bellgard
- Centre for Comparative Genomics; Murdoch University; Perth Western Australia Australia
- Western Australian Neuroscience Research Institute; Perth Western Australia Australia
| | | | - Christophe Béroud
- Aix Marseille University; INSERM; GMGF; Marseille France
- APHM, Hôpital TIMONE Enfants; Laboratoire de Génétique Moléculaire; Marseille France
|
1467
|
Davis AP, Grondin CJ, Johnson RJ, Sciaky D, King BL, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ. The Comparative Toxicogenomics Database: update 2017. Nucleic Acids Res 2016; 45:D972-D978. [PMID: 27651457 PMCID: PMC5210612 DOI: 10.1093/nar/gkw838] [Citation(s) in RCA: 386] [Impact Index Per Article: 42.9] [Received: 08/09/2016] [Accepted: 09/09/2016] [Indexed: 12/19/2022] Open
Abstract
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) provides information about interactions between chemicals and gene products, and their relationships to diseases. Core CTD content (chemical-gene, chemical-disease and gene-disease interactions manually curated from the literature) is integrated internally as well as with select external datasets to generate expanded networks and predict novel associations. Today, core CTD includes more than 30.5 million toxicogenomic connections relating chemicals/drugs, genes/proteins, diseases, taxa, Gene Ontology (GO) annotations, pathways, and gene interaction modules. In this update, we report a 33% increase in our core data content since 2015, describe our new exposure module (that harmonizes exposure science information with core toxicogenomic data) and introduce a novel dataset of GO-disease inferences (that identify common molecular underpinnings for seemingly unrelated pathologies). These advancements centralize and contextualize real-world chemical exposures with molecular pathways to help scientists generate testable hypotheses in an effort to understand the etiology and mechanisms underlying environmentally influenced diseases.
Affiliation(s)
- Allan Peter Davis
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Cynthia J Grondin
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Robin J Johnson
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Daniela Sciaky
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Benjamin L King
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
| | - Roy McMorran
- Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, ME 04672, USA
| | - Jolene Wiegers
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Thomas C Wiegers
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Carolyn J Mattingly
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA; Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695, USA
|
1468
|
Li P, Nie Y, Yu J. Fusing literature and full network data improves disease similarity computation. BMC Bioinformatics 2016; 17:326. [PMID: 27578323 PMCID: PMC5006367 DOI: 10.1186/s12859-016-1205-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 03/12/2016] [Accepted: 08/24/2016] [Indexed: 01/01/2023] Open
Abstract
Background: Identifying relatedness among diseases could help deepen understanding of the underlying pathogenic mechanisms of diseases, and facilitate drug repositioning projects. A number of methods for computing disease similarity have been developed; however, none of them were designed to utilize information from the entire protein interaction network, using instead only those interactions involving disease-causing genes. Most previously published methods required gene-disease association data; unfortunately, many diseases still have very few or no associated genes, which impeded broad adoption of those methods. In this study, we propose a new method (MedNetSim) for computing disease similarity by integrating medical literature and the protein interaction network. MedNetSim consists of a network-based method (NetSim), which employs the entire protein interaction network, and a MEDLINE-based method (MedSim), which computes disease similarity by mining the biomedical literature. Results: Among function-based methods, NetSim achieved the best performance. Its average AUC (area under the receiver operating characteristic curve) reached 95.2 %. MedSim, whose performance was even comparable to some function-based methods, acquired the highest average AUC among all semantic-based methods. Integration of MedSim and NetSim (MedNetSim) further improved the average AUC to 96.4 %. We further studied the effectiveness of different data sources. It was found that the quality of protein interaction data was more important than its volume. In contrast, a higher volume of gene-disease association data was more beneficial, even with lower reliability. Utilizing a higher volume of disease-related gene data further improved the average AUC of MedNetSim and NetSim to 97.5 % and 96.7 %, respectively. Conclusions: Integrating biomedical literature and the protein interaction network can be an effective way to compute disease similarity. Lacking sufficient disease-related gene data, literature-based methods such as MedSim can be a great addition to function-based algorithms. It may be beneficial to steer more resources toward studying gene-disease associations and improving the quality of protein interaction data. Disease similarities can be computed using the proposed methods at http://www.digintelli.com:8000/. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1205-4) contains supplementary material, which is available to authorized users.
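The abstract above reports performance as average AUC. As a reminder of what that number measures, here is a minimal sketch of AUC computed through its pairwise-ranking interpretation: the probability that a randomly chosen positive pair scores higher than a randomly chosen negative pair, with ties counted as half. The function name `roc_auc` and the toy scores are hypothetical; this is not the authors' evaluation code, and the abstract does not describe how their benchmark pairs were labelled.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores -- predicted similarity scores
    labels -- truthy for known-similar (positive) pairs, falsy otherwise
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    positives = [s for s, l in zip(scores, labels) if l]
    negatives = [s for s, l in zip(scores, labels) if not l]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    # Count positive-over-negative wins; a tie contributes half a win.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Invented similarity scores for four disease pairs; the first two
    # pairs are treated as truly similar.
    print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

The quadratic loop is fine for small benchmarks; for large ones a rank-based formulation gives the same value in O(n log n).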
Affiliation(s)
- Ping Li
- State Key Laboratory of Biochemical Engineering, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Yaling Nie
- State Key Laboratory of Biochemical Engineering, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Jingkai Yu
- State Key Laboratory of Biochemical Engineering, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, 100190, China
|
1469
|
Bacchelli C, Williams HJ. Opportunities and technical challenges in next-generation sequencing for diagnosis of rare pediatric diseases. Expert Rev Mol Diagn 2016; 16:1073-1082. [PMID: 27560481 DOI: 10.1080/14737159.2016.1222906] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 12/30/2022]
Abstract
Introduction: Rare pediatric diseases are clinically severe with high rates of mortality and morbidity. This paper outlines how next-generation sequencing (NGS) can be used to greatly advance identification of the underlying genetic causes. Areas covered: This manuscript is a blend of evidence obtained from literature searches from PubMed and rare disease related websites, laboratory experience and the authors' opinions. The paper covers the current state of the field and identifies where the challenges lie and how they are being overcome, using up-to-date references. Expert commentary: The field of NGS is still relatively new but it has already transformed the field of rare disease research. Technological advances in instrumentation, computational hardware and software have resulted in the identification of many causative genes, but as sequencing moves into population-scale initiatives, standardisation and data sharing are going to be of paramount importance to ensure we derive the maximum benefit for patients.
Affiliation(s)
- Chiara Bacchelli
- a Head of Experimental & Personalised Medicine Section , Genetics and Genomic Medicine Programme, UCL GOS Institute of Child Health , London , England
| | - Hywel J Williams
- GOSgene, Genetics and Genomic Medicine Programme, UCL GOS Institute of Child Health, London, England
| |
|
1470
|
Candidate SNP Markers of Chronopathologies Are Predicted by a Significant Change in the Affinity of TATA-Binding Protein for Human Gene Promoters. BioMed Research International 2016; 2016:8642703. [PMID: 27635400 PMCID: PMC5011241 DOI: 10.1155/2016/8642703] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 03/04/2016] [Revised: 06/25/2016] [Accepted: 06/28/2016] [Indexed: 01/14/2023]
Abstract
Variations in the human genome (e.g., single nucleotide polymorphisms, SNPs) may be associated with hereditary diseases, their complications, comorbidities, and drug responses. Using the Web service SNP_TATA_Comparator presented in our previous paper, we analyzed the immediate surroundings of known SNP markers of diseases and identified several candidate SNP markers that can significantly change the affinity of TATA-binding protein for human gene promoters, with circadian consequences. For example, rs572527200 may be related to asthma, where symptoms are circadian (worse at night), and rs367732974 may be associated with heart attacks that are characterized by a circadian preference (early morning). By the same method, we analyzed the 90 bp proximal promoter region of each protein-coding transcript of each human gene of the circadian clock core. This analysis yielded 53 candidate SNP markers, such as rs181985043 (susceptibility to acute Q fever in male patients), rs192518038 (higher risk of a heart attack in patients with diabetes), and rs374778785 (emphysema and lung cancer in smokers). If they are properly validated according to clinical standards, these candidate SNP markers may turn out to be useful for physicians (to select optimal treatment for each patient) and for the general population (to choose a lifestyle preventing possible circadian complications of diseases).
|
1471
|
Pers TH. Gene set analysis for interpreting genetic studies. Hum Mol Genet 2016; 25:R133-R140. [PMID: 27511725 DOI: 10.1093/hmg/ddw249] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 05/31/2016] [Accepted: 07/18/2016] [Indexed: 02/03/2023] Open
Abstract
Interpretation of genome-wide association study (GWAS) results is lagging behind the discovery of new genetic associations. Consequently, there is an urgent need for data-driven methods for interpreting genetic association studies. Gene set analysis (GSA) can identify aetiologic pathways and functional annotations and may hence point towards novel biological insights. However, despite the growing availability of GSA tools, the sizeable number of variants identified for a vast number of complex traits, and many irrefutably trait-associated gene sets, the gap between discovery and interpretation remains. More efficient interpretation requires more complete and consistent gene set representations of biological pathways, phenotypes and functional annotations. In this review, I examine different types of gene sets, discuss how inconsistencies in gene set definitions impact GSA, describe how GSA has helped to elucidate biology and outline potential future directions.
Affiliation(s)
- Tune H Pers
- Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark; Novo Nordisk Foundation Centre for Basic Metabolic Research, Section of Metabolic Genetics, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| |
|
1472
|
Rost B, Radivojac P, Bromberg Y. Protein function in precision medicine: deep understanding with machine learning. FEBS Lett 2016; 590:2327-41. [PMID: 27423136 PMCID: PMC5937700 DOI: 10.1002/1873-3468.12307] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 06/27/2016] [Revised: 07/12/2016] [Accepted: 07/12/2016] [Indexed: 12/21/2022]
Abstract
Precision medicine and personalized health efforts propose leveraging complex molecular, medical and family history, along with other types of personal data toward better life. We argue that this ambitious objective will require advanced and specialized machine learning solutions. Simply skimming some low-hanging results off the data wealth might have limited potential. Instead, we need to better understand all parts of the system to define medically relevant causes and effects: how do particular sequence variants affect particular proteins and pathways? How do these effects, in turn, cause the health or disease-related phenotype? Toward this end, deeper understanding will not simply diffuse from deeper machine learning, but from more explicit focus on understanding protein function, context-specific protein interaction networks, and impact of variation on both.
Affiliation(s)
- Burkhard Rost
- Department of Informatics and Bioinformatics, Institute for Advanced Studies, Technical University of Munich, Garching, Germany
| | - Predrag Radivojac
- School of Informatics and Computing, Indiana University, Bloomington, IN, USA
| | - Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, USA
| |
|
1473
|
Mih N, Brunk E, Bordbar A, Palsson BO. A Multi-scale Computational Platform to Mechanistically Assess the Effect of Genetic Variation on Drug Responses in Human Erythrocyte Metabolism. PLoS Comput Biol 2016; 12:e1005039. [PMID: 27467583 PMCID: PMC4965186 DOI: 10.1371/journal.pcbi.1005039] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 01/22/2016] [Accepted: 06/27/2016] [Indexed: 12/31/2022] Open
Abstract
Progress in systems medicine brings promise to addressing patient heterogeneity and individualized therapies. Recently, genome-scale models of metabolism have been shown to provide insight into the mechanistic link between drug therapies and systems-level off-target effects while being expanded to explicitly include the three-dimensional structure of proteins. The integration of these molecular-level details, such as the physical, structural, and dynamical properties of proteins, notably expands the computational description of biochemical network-level properties and the possibility of understanding and predicting whole cell phenotypes. In this study, we present a multi-scale modeling framework that describes biological processes which range in scale from atomistic details to an entire metabolic network. Using this approach, we can understand how genetic variation, which impacts the structure and reactivity of a protein, influences both native and drug-induced metabolic states. As a proof-of-concept, we study three enzymes (catechol-O-methyltransferase, glucose-6-phosphate dehydrogenase, and glyceraldehyde-3-phosphate dehydrogenase) and their respective genetic variants which have clinically relevant associations. Using all-atom molecular dynamics simulations enables the sampling of long timescale conformational dynamics of the proteins (and their mutant variants) in complex with their respective native metabolites or drug molecules. We find that changes in a protein's structure due to a mutation influence the protein's binding affinity to metabolites and/or drug molecules, and inflict large-scale changes in metabolism.
Affiliation(s)
- Nathan Mih
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, California, United States of America
| | - Elizabeth Brunk
- Department of Bioengineering, University of California, San Diego, La Jolla, California, United States of America
| | - Aarash Bordbar
- Department of Bioengineering, University of California, San Diego, La Jolla, California, United States of America
| | - Bernhard O. Palsson
- Department of Bioengineering, University of California, San Diego, La Jolla, California, United States of America
- Department of Pediatrics, University of California, San Diego, La Jolla, California, United States of America
| |
|
1474
|
Systematic reanalysis of clinical exome data yields additional diagnoses: implications for providers. Genet Med 2016; 19:209-214. [DOI: 10.1038/gim.2016.88] [Citation(s) in RCA: 225] [Impact Index Per Article: 25.0] [Received: 02/29/2016] [Accepted: 05/16/2016] [Indexed: 12/14/2022] Open
|
1475
|
Caracausi M, Piovesan A, Vitale L, Pelleri MC. Integrated Transcriptome Map Highlights Structural and Functional Aspects of the Normal Human Heart. J Cell Physiol 2016; 232:759-770. [PMID: 27345625 DOI: 10.1002/jcp.25471] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 05/16/2016] [Accepted: 06/24/2016] [Indexed: 12/22/2022]
Abstract
A systematic meta-analysis of the available gene expression profiling datasets for the whole normal human heart generated a quantitative transcriptome reference map of this organ. Transcriptome Mapper (TRAM) software integrated 32 gene expression profile datasets from different sources, returning a reference expression value for each of the 43,360 known, mapped transcripts assayed by any of the experimental platforms used. Main findings include the visualization at the gene and chromosomal levels of the classical description of the basic histology and physiology of the heart, the identification of suitable housekeeping reference genes, the analysis of the stoichiometry of gene products, and a focus on chromosome 21 genes, which are present in one excess copy in Down syndrome subjects, who present cardiovascular defects in 30-40% of cases. Independent in vitro validation showed an excellent correlation coefficient (r = 0.98) with the in silico data. Remarkably, the heart/non-cardiac tissue expression ratio may also be used to anticipate whether the effects of mutations are likely to affect the heart. The quantitative reference global portrait of gene expression in the whole normal human heart illustrates the structural and functional aspects of the whole organ and is a general model to understand the mechanisms underlying heart pathophysiology. J. Cell. Physiol. 232: 759-770, 2017. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Maria Caracausi
- Unit of Histology, Embryology and Applied Biology, Department of Experimental, Diagnostic and Specialty Medicine (DIMES), University of Bologna, Bologna, Italy
| | - Allison Piovesan
- Unit of Histology, Embryology and Applied Biology, Department of Experimental, Diagnostic and Specialty Medicine (DIMES), University of Bologna, Bologna, Italy
| | - Lorenza Vitale
- Unit of Histology, Embryology and Applied Biology, Department of Experimental, Diagnostic and Specialty Medicine (DIMES), University of Bologna, Bologna, Italy
| | - Maria Chiara Pelleri
- Unit of Histology, Embryology and Applied Biology, Department of Experimental, Diagnostic and Specialty Medicine (DIMES), University of Bologna, Bologna, Italy
| |
|
1476
|
Data and programs in support of network analysis of genes and their association with diseases. Data Brief 2016; 8:1036-9. [PMID: 27508260 PMCID: PMC4969244 DOI: 10.1016/j.dib.2016.07.022] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/02/2016] [Revised: 07/06/2016] [Accepted: 07/13/2016] [Indexed: 01/06/2023] Open
Abstract
The network-based approaches that were employed in order to depict the relationships between human genetic diseases and their associated genes are described. Towards this direction, monopartite disease-disease and gene-gene networks were constructed from bipartite gene-disease association networks. The latter were created by collecting and integrating data from three diverse resources, each one with different content, covering from rare monogenic disorders to common complex diseases. Moreover, topological and clustering graph analyses were performed. The methodology and the programs presented in this article are related to the research article entitled “Network analysis of genes and their association with diseases” [1].
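The projection step mentioned in this abstract (monopartite disease-disease and gene-gene networks built from a bipartite gene-disease network) can be illustrated with a minimal Python sketch. This is not the authors' program: the function name `project_bipartite`, the shared-neighbour edge weighting, and the toy gene and disease labels are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def project_bipartite(associations):
    """Project (gene, disease) pairs onto two monopartite networks.

    Returns (disease_net, gene_net). Each maps a sorted pair of nodes to
    the number of neighbours they share in the bipartite graph
    (an assumed weighting; the source does not specify one).
    """
    genes_by_disease = defaultdict(set)
    diseases_by_gene = defaultdict(set)
    for gene, disease in associations:
        genes_by_disease[disease].add(gene)
        diseases_by_gene[gene].add(disease)

    def project(neighbours):
        edges = {}
        for a, b in combinations(sorted(neighbours), 2):
            shared = len(neighbours[a] & neighbours[b])
            if shared:  # link two nodes only if they share a neighbour
                edges[(a, b)] = shared
        return edges

    return project(genes_by_disease), project(diseases_by_gene)


# Hypothetical toy associations, not taken from the cited data set.
toy_associations = [
    ("G1", "D1"), ("G1", "D2"), ("G2", "D2"), ("G2", "D3"), ("G3", "D3"),
]
disease_net, gene_net = project_bipartite(toy_associations)
print(disease_net)
print(gene_net)
```

In this toy input, two diseases are linked when they share at least one gene, and two genes are linked when they share at least one disease; topological or clustering analyses would then run on the projected networks.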
|
1477
|
Sobreira NL, Valle D. Lessons learned from the search for genes responsible for rare Mendelian disorders. Mol Genet Genomic Med 2016; 4:371-5. [PMID: 27468413 PMCID: PMC4947856 DOI: 10.1002/mgg3.233] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022] Open
Affiliation(s)
- Nara L Sobreira
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205; Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
| | - David Valle
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205; Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
| |
|
1478
|
Brosens E, Burns AJ, Brooks AS, Matera I, Borrego S, Ceccherini I, Tam PK, García-Barceló MM, Thapar N, Benninga MA, Hofstra RMW, Alves MM. Genetics of enteric neuropathies. Dev Biol 2016; 417:198-208. [PMID: 27426273 DOI: 10.1016/j.ydbio.2016.07.008] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 06/08/2016] [Revised: 07/13/2016] [Accepted: 07/13/2016] [Indexed: 12/23/2022]
Abstract
Abnormal development or disturbed functioning of the enteric nervous system (ENS), the intrinsic innervation of the gastrointestinal tract, is associated with the development of neuropathic gastrointestinal motility disorders. Here, we review the underlying molecular basis of these disorders and hypothesize that many of them have a common defective biological mechanism. Genetic burden and environmental components affecting this common mechanism are ultimately responsible for disease severity and symptom heterogeneity. We believe that they act together as the fulcrum in a seesaw balanced with harmful and protective factors, and are responsible for a continuum of symptoms ranging from neuronal hyperplasia to absence of neurons.
Affiliation(s)
- Erwin Brosens
- Department of Clinical Genetics, Erasmus University Medical Centre - Sophia Children's Hospital, Rotterdam, The Netherlands.
| | - Alan J Burns
- Department of Clinical Genetics, Erasmus University Medical Centre - Sophia Children's Hospital, Rotterdam, The Netherlands; Stem Cells and Regenerative Medicine, Birth Defects Research Centre, UCL Institute of Child Health, London, UK
| | - Alice S Brooks
- Department of Clinical Genetics, Erasmus University Medical Centre - Sophia Children's Hospital, Rotterdam, The Netherlands
| | - Ivana Matera
- UOC Medical Genetics, Istituto Giannina Gaslini, Genova, Italy
| | - Salud Borrego
- Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), Seville, Spain; Centre for Biomedical Network Research on Rare Diseases (CIBERER), Seville, Spain
| | | | - Paul K Tam
- Division of Paediatric Surgery, Department of Surgery, Li Ka Shing Faculty of Medicine of the University of Hong Kong, Hong Kong, China
| | - Maria-Mercè García-Barceló
- State Key Laboratory of Brain and Cognitive Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China; Centre for Reproduction, Development, and Growth, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Nikhil Thapar
- Stem Cells and Regenerative Medicine, Birth Defects Research Centre, UCL Institute of Child Health, London, UK
| | - Marc A Benninga
- Pediatric Gastroenterology, Emma Children's Hospital/Academic Medical Center, Amsterdam, The Netherlands
| | - Robert M W Hofstra
- Department of Clinical Genetics, Erasmus University Medical Centre - Sophia Children's Hospital, Rotterdam, The Netherlands; Stem Cells and Regenerative Medicine, Birth Defects Research Centre, UCL Institute of Child Health, London, UK
| | - Maria M Alves
- Department of Clinical Genetics, Erasmus University Medical Centre - Sophia Children's Hospital, Rotterdam, The Netherlands
| |
|
1479
|
Breschi A, Djebali S, Gillis J, Pervouchine DD, Dobin A, Davis CA, Gingeras TR, Guigó R. Gene-specific patterns of expression variation across organs and species. Genome Biol 2016; 17:151. [PMID: 27391956 PMCID: PMC4937605 DOI: 10.1186/s13059-016-1008-y] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Received: 04/12/2016] [Accepted: 06/14/2016] [Indexed: 11/10/2022] Open
Abstract
Background: A comparison of transcriptional profiles derived from different tissues in a given species or among different species assumes that commonalities reflect evolutionarily conserved programs and that differences reflect species or tissue responses to environmental conditions or developmental program staging. Apparently conflicting results have been published regarding whether organ-specific transcriptional patterns dominate over species-specific patterns, or vice versa, making it unclear to what extent the biology of a given organism can be extrapolated to another. These studies have in common that they treat the transcriptomes monolithically, implicitly ignoring that each gene is likely to have a specific pattern of transcriptional variation across organs and species. Results: We use linear models to quantify this pattern. We find a continuum in the spectrum of expression variation: the expression of some genes varies considerably across species and little across organs, and simply reflects evolutionary distance. At the other extreme are genes whose expression varies considerably across organs and little across species; these genes are much more likely to be associated with diseases than are genes whose expression varies predominantly across species. Conclusions: Whether transcriptomes, when considered globally, cluster preferentially according to one component or the other may not be a property of the transcriptomes, but rather a consequence of the dominant behavior of a subset of genes. Therefore, the values of the components of the variance of expression for each gene could become a useful resource when planning, interpreting, and extrapolating experimental data from mouse to humans.
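One way to picture the per-gene variance components described in this abstract is a two-way additive decomposition of a single gene's expression over a balanced organ-by-species grid. The abstract does not state the exact model, so the sketch below is an assumption: the function name `variance_fractions`, the additive model without interaction, and the toy expression values are hypothetical.

```python
def variance_fractions(expr):
    """Split one gene's expression variance into organ, species and residual parts.

    expr maps (organ, species) -> expression value and must cover every
    organ/species combination exactly once (balanced design, assumed here).
    Returns the fraction of the total sum of squares for each component.
    """
    organs = sorted({organ for organ, _ in expr})
    species = sorted({sp for _, sp in expr})
    grand = sum(expr.values()) / len(expr)

    organ_mean = {o: sum(expr[(o, s)] for s in species) / len(species) for o in organs}
    species_mean = {s: sum(expr[(o, s)] for o in organs) / len(organs) for s in species}

    ss_total = sum((v - grand) ** 2 for v in expr.values())
    ss_organ = len(species) * sum((organ_mean[o] - grand) ** 2 for o in organs)
    ss_species = len(organs) * sum((species_mean[s] - grand) ** 2 for s in species)
    ss_residual = max(ss_total - ss_organ - ss_species, 0.0)  # guard against rounding

    if ss_total == 0:
        return {"organ": 0.0, "species": 0.0, "residual": 0.0}
    return {
        "organ": ss_organ / ss_total,
        "species": ss_species / ss_total,
        "residual": ss_residual / ss_total,
    }


# Hypothetical gene whose expression differs between organs but not species.
organ_driven = {
    ("brain", "human"): 10.0, ("brain", "mouse"): 10.0,
    ("liver", "human"): 2.0, ("liver", "mouse"): 2.0,
}
fractions = variance_fractions(organ_driven)
print(fractions)
```

Under this assumed model, a gene with a large organ fraction and a small species fraction sits at the organ-driven end of the continuum the abstract describes, and the reverse pattern sits at the species-driven end.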
Affiliation(s)
- Alessandra Breschi
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Sarah Djebali
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- GenPhySE, Université de Toulouse, INRA, INPT, INP-ENVT, Castanet Tolosan, France
| | - Jesse Gillis
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11742, USA
| | - Dmitri D Pervouchine
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Alex Dobin
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11742, USA
| | - Carrie A Davis
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11742, USA
| | | | - Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain.
- Universitat Pompeu Fabra (UPF), Barcelona, Spain.
| |
|
1480
|
Rouillard AD, Gundersen GW, Fernandez NF, Wang Z, Monteiro CD, McDermott MG, Ma'ayan A. The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database (Oxford) 2016; 2016:baw100. [PMID: 27374120 PMCID: PMC4930834 DOI: 10.1093/database/baw100] [Citation(s) in RCA: 962] [Impact Index Per Article: 106.9] [Received: 02/13/2016] [Revised: 05/15/2016] [Accepted: 05/31/2016] [Indexed: 12/18/2022]
Abstract
Genomics, epigenomics, transcriptomics, proteomics and metabolomics efforts rapidly generate a plethora of data on the activity and levels of biomolecules within mammalian cells. At the same time, curation projects that organize knowledge from the biomedical literature into online databases are expanding. Hence, there is a wealth of information about genes, proteins and their associations, with an urgent need for data integration to achieve better knowledge extraction and data reuse. For this purpose, we developed the Harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins from over 70 major online resources. We extracted, abstracted and organized data into ∼72 million functional associations between genes/proteins and their attributes. Such attributes could be physical relationships with other biomolecules, expression in cell lines and tissues, genetic associations with knockout mouse or human phenotypes, or changes in expression after drug treatment. We stored these associations in a relational database along with rich metadata for the genes/proteins, their attributes and the original resources. The freely available Harmonizome web portal provides a graphical user interface, a web service and a mobile app for querying, browsing and downloading all of the collected data. To demonstrate the utility of the Harmonizome, we computed and visualized gene-gene and attribute-attribute similarity networks, and through unsupervised clustering, identified many unexpected relationships by combining pairs of datasets such as the association between kinase perturbations and disease signatures. We also applied supervised machine learning methods to predict novel substrates for kinases, endogenous ligands for G-protein coupled receptors, mouse phenotypes for knockout genes, and classified unannotated transmembrane proteins for likelihood of being ion channels. 
The Harmonizome is a comprehensive resource of knowledge about genes and proteins, and as such, it enables researchers to discover novel relationships between biological entities, as well as form novel data-driven hypotheses for experimental validation. Database URL: http://amp.pharm.mssm.edu/Harmonizome.
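The gene-gene similarity networks mentioned in this abstract can be sketched from gene-attribute associations. The abstract does not say which similarity measure was used, so the Jaccard index below is an assumption, as are the function names, the 0.5 threshold, and the toy gene and attribute labels. This does not use the Harmonizome web service or its API.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: |intersection| / |union| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def gene_similarity_network(attributes_by_gene, threshold=0.5):
    """Link gene pairs whose attribute sets reach the similarity threshold.

    attributes_by_gene maps a gene label to the set of attributes it is
    associated with. Returns {(gene_a, gene_b): similarity}.
    """
    edges = {}
    for g1, g2 in combinations(sorted(attributes_by_gene), 2):
        score = jaccard(attributes_by_gene[g1], attributes_by_gene[g2])
        if score >= threshold:
            edges[(g1, g2)] = score
    return edges


# Hypothetical gene-attribute associations for illustration only.
toy_attributes = {
    "GENE_A": {"kinase", "membrane", "liver"},
    "GENE_B": {"kinase", "membrane", "brain"},
    "GENE_C": {"nucleus"},
}
network = gene_similarity_network(toy_attributes)
print(network)
```

Swapping the roles of genes and attributes in the input would give an attribute-attribute network in the same way.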
Affiliation(s)
- Andrew D Rouillard
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Gregory W Gundersen
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Nicolas F Fernandez
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Zichen Wang
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Caroline D Monteiro
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Michael G McDermott
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Avi Ma'ayan
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| |
|
1481
|
The Qatar genome: a population-specific tool for precision medicine in the Middle East. Hum Genome Var 2016; 3:16016. [PMID: 27408750 PMCID: PMC4927697 DOI: 10.1038/hgv.2016.16] [Citation(s) in RCA: 77] [Impact Index Per Article: 8.6] [Received: 08/26/2015] [Revised: 03/09/2016] [Accepted: 04/11/2016] [Indexed: 12/30/2022] Open
Abstract
Reaching the full potential of precision medicine depends on the quality of personalized genome interpretation. In order to facilitate precision medicine in regions of the Middle East and North Africa (MENA), a population-specific genome for the indigenous Arab population of Qatar (QTRG) was constructed by incorporating allele frequency data from sequencing of 1,161 Qataris, representing 0.4% of the population. A total of 20.9 million single nucleotide polymorphisms (SNPs) and 3.1 million indels were observed in Qatar, including an average of 1.79% novel variants per individual genome. Replacement of the GRCh37 standard reference with QTRG in a best practices genome analysis workflow resulted in an average of 7× deeper coverage depth (an improvement of 23%) and 756,671 fewer variants on average, a reduction of 16% that is attributed to common Qatari alleles being present in QTRG. The benefit for using QTRG varies across ancestries, a factor that should be taken into consideration when selecting an appropriate reference for analysis.
|
1482
|
Aken BL, Ayling S, Barrell D, Clarke L, Curwen V, Fairley S, Fernandez Banet J, Billis K, García Girón C, Hourlier T, Howe K, Kähäri A, Kokocinski F, Martin FJ, Murphy DN, Nag R, Ruffier M, Schuster M, Tang YA, Vogel JH, White S, Zadissa A, Flicek P, Searle SMJ. The Ensembl gene annotation system. Database (Oxford) 2016; 2016:baw093. [PMID: 27337980 PMCID: PMC4919035 DOI: 10.1093/database/baw093] [Citation(s) in RCA: 742] [Impact Index Per Article: 82.4] [Received: 01/11/2016] [Accepted: 05/09/2016] [Indexed: 12/12/2022]
Abstract
The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects. Furthermore, it generates the automatic alignment-based annotation for the human and mouse GENCODE gene sets. The system is based on the alignment of biological sequences, including cDNAs, proteins and RNA-seq reads, to the target genome in order to construct candidate transcript models. Careful assessment and filtering of these candidate transcripts ultimately leads to the final gene set, which is made available on the Ensembl website. Here, we describe the annotation process in detail. Database URL: http://www.ensembl.org/index.html.
Affiliation(s)
- Bronwen L Aken
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Sarah Ayling
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Present addresses: The Genome Analysis Centre, Norwich Research Park, Norwich NR4 7UH, UK
| | - Daniel Barrell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Eagle Genomics Ltd, Babraham Research Campus, Cambridge CB22 3AT, UK
| | - Laura Clarke
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Valery Curwen
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Susan Fairley
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Julio Fernandez Banet
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Pfizer Inc, 10646 Science Center Dr, San Diego, CA 92121, USA
| | - Konstantinos Billis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Carlos García Girón
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Thibaut Hourlier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Kevin Howe
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Andreas Kähäri
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Institutionen för cell-och molekylärbiologi, Uppsala University, Husargatan 3, Uppsala 752 37, Sweden
| | - Felix Kokocinski
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Fergal J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Daniel N Murphy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Rishi Nag
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Magali Ruffier
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Michael Schuster
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna a-1090, Austria
| | - Y Amy Tang
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jan-Hinnerk Vogel
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; Genentech Inc, 1 DNA Way, South San Francisco, CA 94080, USA
| | - Simon White
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; The Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Amonida Zadissa
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Stephen M J Searle
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| |
|
1483
|
Cheng L, Zhang S, Hu Y. BLAT2DOLite: An Online System for Identifying Significant Relationships between Genetic Sequences and Diseases. PLoS One 2016; 11:e0157274. [PMID: 27315278 PMCID: PMC4912091 DOI: 10.1371/journal.pone.0157274] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/04/2016] [Accepted: 05/26/2016] [Indexed: 11/18/2022] Open
Abstract
The significantly related diseases of sequences could play an important role in understanding the functions of these sequences. In this paper, we introduced BLAT2DOLite, an online system for annotating human genes and diseases and identifying the significant relationships between sequences and diseases. Currently, BLAT2DOLite integrates Entrez Gene database and Disease Ontology Lite (DOLite), which contain loci of gene and relationships between genes and diseases. It utilizes hypergeometric test to calculate P-values between genes and diseases of DOLite. The system can be accessed from: http://123.59.132.21:8080/BLAT2DOLite. The corresponding web service is described in: http://123.59.132.21:8080/BLAT2DOLite/BLAT2DOLiteIDMappingPort?wsdl.
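As an illustration of the kind of test the abstract mentions, the following is a minimal sketch of a one-sided hypergeometric enrichment P-value for a gene set against a single disease term. It is not BLAT2DOLite's code: the function name, the parameter names, and the choice of an upper-tail test are assumptions made for this example.

```python
from math import comb


def hypergeom_enrichment_pvalue(population: int, annotated: int,
                                sample: int, overlap: int) -> float:
    """Return P(X >= overlap) for X ~ Hypergeometric(population, annotated, sample).

    population: total number of genes considered (hypothetical background)
    annotated:  genes in the background associated with the disease term
    sample:     genes in the query set (e.g. genes hit by the input sequences)
    overlap:    query genes that are also associated with the disease term
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie in [0, population]")
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(max(overlap, 0), upper + 1)
    )
    return tail / total


# Toy numbers, not taken from the paper: 10 genes, 4 annotated, 3 sampled, 2 shared.
p_value = hypergeom_enrichment_pvalue(10, 4, 3, 2)
```

A small P-value here suggests the query set shares more genes with the disease term than chance alone would explain; in practice the values would also need correction for testing many terms.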
Affiliation(s)
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, PR China
- * E-mail: (LC); (YH)
| | - Shuo Zhang
- School of Management, Harbin University of Commerce, Harbin 150028, PR China
| | - Yang Hu
- School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, PR China
- * E-mail: (LC); (YH)
| |
|
1484
|
Mutational patterns in oncogenes and tumour suppressors. Biochem Soc Trans 2016; 44:925-31. [DOI: 10.1042/bst20160001] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 01/07/2016] [Indexed: 12/24/2022]
Abstract
All cancers depend upon mutations in critical genes, which confer a selective advantage to the tumour cell. Knowledge of these mutations is crucial to understanding the biology of cancer initiation and progression, and to the development of targeted therapeutic strategies. The key to understanding the contribution of a disease-associated mutation to the development and progression of cancer, comes from an understanding of the consequences of that mutation on the function of the affected protein, and the impact on the pathways in which that protein is involved. In this paper we examine the mutation patterns observed in oncogenes and tumour suppressors, and discuss different approaches that have been developed to identify driver mutations within cancers that contribute to the disease progress. We also discuss the MOKCa database where we have developed an automatic pipeline that structurally and functionally annotates all proteins from the human proteome that are mutated in cancer.
|
1485
|
Ostrow SL, Hershberg R. The Somatic Nature of Cancer Allows It to Affect Highly Constrained Genes. Genome Biol Evol 2016; 8:1614-20. [PMID: 27190005 PMCID: PMC4898816 DOI: 10.1093/gbe/evw110] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/31/2022] Open
Abstract
Cancer is special among genetic disorders in two major ways: first, cancer is a disease of the most basic of cellular functions, such as cell proliferation, differentiation, and the maintenance of genomic integrity. Second, in contrast to most genetic disorders that are mediated by germline (hereditary) mutations, cancer is largely a somatic disease. Here we show that these two traits are not detached and that it is the somatic nature of cancer that allows it to affect the most basic of cellular functions. We begin by demonstrating that cancer genes are both more functionally central (as measured by their patterns of expression and protein interaction) and more evolutionarily constrained than non-cancer genetic disease genes. We then compare genes that are only modified somatically in cancer (hereinafter referred to as “somatic cancer genes”) to those that can also be modified in a hereditary manner, contributing to cancer development (hereinafter referred to as “hereditary cancer genes”). We show that both somatic and hereditary cancer genes are much more functionally central than genes contributing to non-cancer genetic disorders. At the same time, hereditary cancer genes are only as constrained as non-cancer hereditary disease genes, while somatic cancer genes tend to be much more constrained in evolution. Thus, it appears that it is the somatic nature of cancer that allows it to modify the most constrained genes and, therefore, affect the most basic of cellular functions.
Affiliation(s)
- Sheli L Ostrow
- Rachel & Menachem Mendelovitch Evolutionary Processes of Mutation & Natural Selection Research Laboratory, Department of Genetics and Developmental Biology, The Ruth and Bruce Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
| | - Ruth Hershberg
- Rachel & Menachem Mendelovitch Evolutionary Processes of Mutation & Natural Selection Research Laboratory, Department of Genetics and Developmental Biology, The Ruth and Bruce Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
| |
|
1486
|
Kontou PI, Pavlopoulou A, Dimou NL, Pavlopoulos GA, Bagos PG. Network analysis of genes and their association with diseases. Gene 2016; 590:68-78. [PMID: 27265032 DOI: 10.1016/j.gene.2016.05.044] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 07/31/2015] [Revised: 05/20/2016] [Accepted: 05/30/2016] [Indexed: 12/21/2022]
Abstract
A plethora of network-based approaches within the Systems Biology universe have been applied, to date, to investigate the underlying molecular mechanisms of various human diseases. In the present study, we perform a bipartite, topological and clustering graph analysis in order to gain a better understanding of the relationships between human genetic diseases and the relationships between the genes that are implicated in them. For this purpose, disease-disease and gene-gene networks were constructed from combined gene-disease association networks. The latter were created by collecting and integrating data from three diverse resources, each with different content ranging from rare monogenic disorders to common complex diseases. This data pluralism enabled us to uncover important associations between diseases with unrelated phenotypic manifestations but with common genetic origin. For our analysis, the topological attributes and the functional implications of the individual networks were taken into account and are briefly discussed. We believe that some observations of this study could advance our understanding regarding the etiology of a disease with distinct pathological manifestations, and simultaneously provide the springboard for the development of preventive and therapeutic strategies and its underlying genetic mechanisms.
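The construction of disease-disease and gene-gene networks from gene-disease associations can be sketched as a bipartite projection. This is a hedged illustration, not the authors' pipeline: the function name, the toy identifiers, and the choice to weight each edge by the number of shared partners are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def project_bipartite(associations):
    """Project (gene, disease) pairs onto two one-mode networks.

    Returns (disease_disease, gene_gene), each a dict mapping a sorted
    node pair to the number of partners the two nodes share.
    """
    diseases_by_gene = defaultdict(set)
    genes_by_disease = defaultdict(set)
    for gene, disease in associations:
        diseases_by_gene[gene].add(disease)
        genes_by_disease[disease].add(gene)

    def count_shared(groups):
        # Every pair of nodes that appears in the same group gains one
        # unit of edge weight for that shared partner.
        weights = defaultdict(int)
        for members in groups.values():
            for a, b in combinations(sorted(members), 2):
                weights[(a, b)] += 1
        return dict(weights)

    disease_disease = count_shared(diseases_by_gene)  # diseases linked by shared genes
    gene_gene = count_shared(genes_by_disease)        # genes linked by shared diseases
    return disease_disease, gene_gene


# Hypothetical toy associations, not data from the study.
toy = [("G1", "D1"), ("G1", "D2"), ("G2", "D1"), ("G2", "D2"), ("G3", "D3")]
disease_net, gene_net = project_bipartite(toy)
```

Weighting by shared partners is only one option; a study could equally use unweighted edges or a normalized similarity, and the abstract does not say which was used.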
Affiliation(s)
- Panagiota I Kontou
- Department of Computer Science and Biomedical Informatics, University of Thessaly, Greece
| | - Athanasia Pavlopoulou
- Department of Computer Science and Biomedical Informatics, University of Thessaly, Greece
| | - Niki L Dimou
- Department of Computer Science and Biomedical Informatics, University of Thessaly, Greece
| | - Georgios A Pavlopoulos
- Lawrence Berkeley Lab, Joint Genome Institute, United States Department of Energy, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
| | - Pantelis G Bagos
- Department of Computer Science and Biomedical Informatics, University of Thessaly, Greece.
| |
|
1487
|
Liu ZP. Identifying network-based biomarkers of complex diseases from high-throughput data. Biomark Med 2016; 10:633-50. [PMID: 26786840 DOI: 10.2217/bmm-2015-0035] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Indexed: 12/16/2022] Open
Abstract
In this work, we review the main available computational methods of identifying biomarkers of complex diseases from high-throughput data. The emerging omics techniques provide powerful alternatives to measure thousands of molecules in cells in parallel manners. The generated genomic, transcriptomic, proteomic, metabolomic and phenomic data provide comprehensive molecular and cellular information for detecting critical signals served as biomarkers by classifying disease phenotypic states. Networks are often employed to organize these profiles in the identification of biomarkers to deal with complex diseases in diagnosis, prognosis and therapy as well as mechanism deciphering from systematic perspectives. Here, we summarize some representative network-based bioinformatics methods in order to highlight the importance of computational strategies in biomarker discovery.
Affiliation(s)
- Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science & Engineering, Shandong University, Jinan, Shandong 250061, China
| |
|
1488
|
Bendl J, Musil M, Štourač J, Zendulka J, Damborský J, Brezovský J. PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions. PLoS Comput Biol 2016; 12:e1004962. [PMID: 27224906 PMCID: PMC4880439 DOI: 10.1371/journal.pcbi.1004962] [Citation(s) in RCA: 140] [Impact Index Per Article: 15.6] [Received: 12/16/2015] [Accepted: 05/05/2016] [Indexed: 12/20/2022] Open
Abstract
An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools’ predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of variations. 
To enable comprehensive evaluation of variants, the predictions are complemented with annotations from eight databases. The web server is freely available to the community at http://loschmidt.chemi.muni.cz/predictsnp2.
Affiliation(s)
- Jaroslav Bendl
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, Brno, Czech Republic
- Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Brno, Czech Republic
| | - Miloš Musil
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, Brno, Czech Republic
- Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic
| | - Jan Štourač
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Brno, Czech Republic
| | - Jaroslav Zendulka
- Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic
| | - Jiří Damborský
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Brno, Czech Republic
- * E-mail: (JD); (JBr)
| | - Jan Brezovský
- Loschmidt Laboratories, Department of Experimental Biology and Research Centre for Toxic Compounds in the Environment RECETOX, Masaryk University, Brno, Czech Republic
- International Clinical Research Center, St. Anne’s University Hospital Brno, Brno, Czech Republic
- * E-mail: (JD); (JBr)
| |
|
1489
|
Shim JE, Lee I. Weighted mutual information analysis substantially improves domain-based functional network models. Bioinformatics 2016; 32:2824-30. [PMID: 27207946 PMCID: PMC5018372 DOI: 10.1093/bioinformatics/btw320] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 01/12/2016] [Accepted: 05/16/2016] [Indexed: 11/30/2022] Open
Abstract
Motivation: Functional protein–protein interaction (PPI) networks elucidate molecular pathways underlying complex phenotypes, including those of human diseases. Extrapolation of domain–domain interactions (DDIs) from known PPIs is a major domain-based method for inferring functional PPI networks. However, the protein domain is a functional unit of the protein. Therefore, we should be able to effectively infer functional interactions between proteins based on the co-occurrence of domains. Results: Here, we present a method for inferring accurate functional PPIs based on the similarity of domain composition between proteins by weighted mutual information (MI) that assigned different weights to the domains based on their genome-wide frequencies. Weighted MI outperforms other domain-based network inference methods and is highly predictive for pathways as well as phenotypes. A genome-scale human functional network determined by our method reveals numerous communities that are significantly associated with known pathways and diseases. Domain-based functional networks may, therefore, have potential applications in mapping domain-to-pathway or domain-to-phenotype associations. Availability and Implementation: Source code for calculating weighted mutual information based on the domain profile matrix is available from www.netbiolab.org/w/WMI. Contact: Insuklee@yonsei.ac.kr Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jung Eun Shim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| |
|
1490
|
Shimoyama M, Smith JR, De Pons J, Tutaj M, Khampang P, Hong W, Erbe CB, Ehrlich GD, Bakaletz LO, Kerschner JE. The Chinchilla Research Resource Database: resource for an otolaryngology disease model. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2016; 2016:baw073. [PMID: 27173523 PMCID: PMC4865329 DOI: 10.1093/database/baw073] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 02/02/2016] [Accepted: 04/18/2016] [Indexed: 12/04/2022]
Abstract
The long-tailed chinchilla (Chinchilla lanigera) is an established animal model for diseases of the inner and middle ear, among others. In particular, chinchilla is commonly used to study diseases involving viral and bacterial pathogens and polymicrobial infections of the upper respiratory tract and the ear, such as otitis media. The value of the chinchilla as a model for human diseases prompted the sequencing of its genome in 2012 and the more recent development of the Chinchilla Research Resource Database (http://crrd.mcw.edu) to provide investigators with easy access to relevant datasets and software tools to enhance their research. The Chinchilla Research Resource Database contains a complete catalog of genes for chinchilla and, for comparative purposes, human. Chinchilla genes can be viewed in the context of their genomic scaffold positions using the JBrowse genome browser. In contrast to the corresponding records at NCBI, individual gene reports at CRRD include functional annotations for Disease, Gene Ontology (GO) Biological Process, GO Molecular Function, GO Cellular Component and Pathway assigned to chinchilla genes based on annotations from the corresponding human orthologs. Data can be retrieved via keyword and gene-specific searches. Lists of genes with similar functional attributes can be assembled by leveraging the hierarchical structure of the Disease, GO and Pathway vocabularies through the Ontology Search and Browser tool. Such lists can then be further analyzed for commonalities using the Gene Annotator (GA) Tool. All data in the Chinchilla Research Resource Database is freely accessible and downloadable via the CRRD FTP site or using the download functions available in the search and analysis tools. The Chinchilla Research Resource Database is a rich resource for researchers using, or considering the use of, chinchilla as a model for human disease. Database URL: http://crrd.mcw.edu
Affiliation(s)
- Mary Shimoyama
- Rat Genome Database, Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Jennifer R Smith
- Rat Genome Database, Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Jeff De Pons
- Rat Genome Database, Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Marek Tutaj
- Rat Genome Database, Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Pawjai Khampang
- Department of Otolaryngology and Communication Sciences, Medical College of Wisconsin, Children's Hospital of Wisconsin, Milwaukee, WI, USA
| | - Wenzhou Hong
- Department of Otolaryngology and Communication Sciences, Medical College of Wisconsin, Children's Hospital of Wisconsin, Milwaukee, WI, USA
| | - Christy B Erbe
- Department of Otolaryngology and Communication Sciences, Medical College of Wisconsin, Children's Hospital of Wisconsin, Milwaukee, WI, USA
| | - Garth D Ehrlich
- Department of Microbiology and Immunology Department of Otolaryngology-Head and Neck Surgery, Center for Genomic Sciences and Center for Advanced Microbial Processing, Institute of Molecular Medicine and Infectious Diseases, Drexel University College of Medicine, Philadelphia, PA, USA
| | - Lauren O Bakaletz
- Center for Microbial Pathogenesis, the Research Institute at Nationwide Children's Hospital and the Ohio State University College of Medicine, Columbus, OH, USA
| | - Joseph E Kerschner
- Department of Otolaryngology and Communication Sciences, Medical College of Wisconsin, Children's Hospital of Wisconsin, Milwaukee, WI, USA; Division of Pediatric Otolaryngology, Medical College of Wisconsin, Children's Hospital of Wisconsin, Milwaukee, WI, USA
| |
|
1491
|
Robinson PN, Mungall CJ, Haendel M. Capturing phenotypes for precision medicine. Cold Spring Harb Mol Case Stud 2016; 1:a000372. [PMID: 27148566 PMCID: PMC4850887 DOI: 10.1101/mcs.a000372] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Indexed: 12/15/2022] Open
Abstract
Deep phenotyping followed by integrated computational analysis of genotype and phenotype is becoming ever more important for many areas of genomic diagnostics and translational research. The overwhelming majority of clinical descriptions in the medical literature are available only as natural language text, meaning that searching, analysis, and integration of medically relevant information in databases such as PubMed is challenging. The new journal Cold Spring Harbor Molecular Case Studies will require authors to select Human Phenotype Ontology terms for research papers that will be displayed alongside the manuscript, thereby providing a foundation for ontology-based indexing and searching of articles that contain descriptions of phenotypic abnormalities, an important step toward improving the ability of researchers and clinicians to get biomedical information that is critical for clinical care or translational research.
Affiliation(s)
- Peter N Robinson
- Institute for Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany; Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany; Berlin Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, 13353 Berlin, Germany; Institute for Bioinformatics, Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
| | | | - Melissa Haendel
- Oregon Health and Science University, Portland, Oregon 97239, USA
| |
|
1492
|
Gress A, Ramensky V, Büch J, Keller A, Kalinina OV. StructMAn: annotation of single-nucleotide polymorphisms in the structural context. Nucleic Acids Res 2016; 44:W463-8. [PMID: 27150811 PMCID: PMC4987916 DOI: 10.1093/nar/gkw364] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 02/15/2016] [Accepted: 04/22/2016] [Indexed: 01/08/2023] Open
Abstract
Next-generation sequencing technologies produce unprecedented amounts of data on the genetic sequence of individual organisms. These sequences carry a substantial amount of variation that may or may not be related to a phenotype. The phenotypically important part of this variation often comes in the form of protein-sequence altering (non-synonymous) single nucleotide variants (nsSNVs). Here we present StructMAn, a Web-based tool for annotation of human and non-human nsSNVs in the structural context. StructMAn analyzes the spatial location of the amino acid residue corresponding to nsSNVs in the three-dimensional (3D) protein structure relative to other proteins, nucleic acids and low molecular-weight ligands. We make use of all experimentally available 3D structures of query proteins, and also, unlike other tools in the field, of structures of proteins with detectable sequence identity to them. This allows us to provide a structural context for around 20% of all nsSNVs in a typical human sequencing sample, for up to 60% of nsSNVs in genes related to human diseases and for around 35% of nsSNVs in a typical bacterial sample. Each nsSNV can be visualized and inspected by the user in the corresponding 3D structure of a protein or protein complex. The StructMAn server is available at http://structman.mpi-inf.mpg.de.
Affiliation(s)
- Alexander Gress
- Department for Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Campus E1 4, 66123 Saarbrücken, Germany; Graduate School of Computer Science, Saarland University, Campus E1 3, 66123 Saarbrücken, Germany
| | - Vasily Ramensky
- Center for Neurobehavioral Genetics, University of California, Los Angeles, 695 Charles E. Young Drive South, Los Angeles, CA 90095, USA
| | - Joachim Büch
- Department for Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Campus E1 4, 66123 Saarbrücken, Germany
| | - Andreas Keller
- Chair for Medical Bioinformatics, Saarland University, Campus E2 2, 66123 Saarbrücken, Germany
| | - Olga V Kalinina
- Department for Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Campus E1 4, 66123 Saarbrücken, Germany
| |
|
1493
|
Vece TJ, Watkin LB, Nicholas S, Canter D, Braun MC, Guillerman RP, Eldin KW, Bertolet G, McKinley S, de Guzman M, Forbes L, Chinn I, Orange JS. Copa Syndrome: a Novel Autosomal Dominant Immune Dysregulatory Disease. J Clin Immunol 2016; 36:377-387. [PMID: 27048656 PMCID: PMC4842120 DOI: 10.1007/s10875-016-0271-8] [Citation(s) in RCA: 131] [Impact Index Per Article: 14.6] [Received: 12/23/2015] [Accepted: 03/11/2016] [Indexed: 11/30/2022]
Abstract
Inherently defective immunity typically results in either ineffective host defense, immune regulation, or both. As a category of primary immunodeficiency diseases, those that impair immune regulation can lead to autoimmunity and/or autoinflammation. In this review we focus on one of the most recently discovered primary immunodeficiencies that leads to immune dysregulation: "Copa syndrome". Copa syndrome is named for the gene mutated in the disease, which encodes the alpha subunit of the coatomer complex-I that, in aggregate, is devoted to transiting molecular cargo from the Golgi complex to the endoplasmic reticulum (ER). Copa syndrome is autosomal dominant with variable expressivity and results from mutations affecting a narrow amino acid stretch in the COPA gene-encoding COPα protein. Patients with these mutations typically develop arthritis and interstitial lung disease with pulmonary hemorrhage representing a striking feature. Immunologically Copa syndrome is associated with autoantibody development, increased Th17 cells and pro-inflammatory cytokine expression including IL-1β and IL-6. Insights have also been gained into the underlying mechanism of Copa syndrome, which include excessive ER stress owing to the impaired return of proteins from the Golgi, and presumably resulting aberrant cellular autophagy. As such it represents a novel cellular disorder of intracellular trafficking associated with a specific clinical presentation and phenotype.
Affiliation(s)
- Timothy J. Vece
- Department of Pediatrics, Baylor College of Medicine, Houston, TX
| | - Levi B. Watkin
- Department of Pediatrics, Baylor College of Medicine, Houston, TX
- Texas Children’s Hospital Center for Human ImmunoBiology, Houston, TX
| | - Sarah Nicholas
- Department of Pediatrics, Baylor College of Medicine, Houston, TX
- Texas Children’s Hospital Center for Human ImmunoBiology, Houston, TX
| | - Debra Canter
- Texas Children’s Hospital Center for Human ImmunoBiology, Houston, TX
| | - Michael C. Braun
- Department of Pediatrics, Baylor College of Medicine, Houston, TX
| | | | - Karen W. Eldin
- Department of Pathology, Baylor College of Medicine, Houston, TX
| | - Grant Bertolet
- Department of Pediatrics, Baylor College of Medicine, Houston, TX
- Texas Children’s Hospital Center for Human ImmunoBiology, Houston, TX
| | - Scott McKinley
- Department of Pediatrics, Baylor College of Medicine, Houston, TX
| | - Marietta de Guzman
- Department of Pediatrics, Baylor College of Medicine, Houston, TX
- Texas Children’s Hospital Center for Human ImmunoBiology, Houston, TX
| | - Lisa Forbes
- Department of Pediatrics, Baylor College of Medicine, Houston, TX
- Texas Children’s Hospital Center for Human ImmunoBiology, Houston, TX
| | - Ivan Chinn
- Department of Pediatrics, Baylor College of Medicine, Houston, TX
- Texas Children’s Hospital Center for Human ImmunoBiology, Houston, TX
| | - Jordan S. Orange
- Department of Pediatrics, Baylor College of Medicine, Houston, TX
- Texas Children’s Hospital Center for Human ImmunoBiology, Houston, TX
| |
|
1494
|
A Clinician's perspective on clinical exome sequencing. Hum Genet 2016; 135:643-54. [PMID: 27126233 DOI: 10.1007/s00439-016-1662-x] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 02/09/2016] [Accepted: 03/23/2016] [Indexed: 12/22/2022]
Abstract
Clinical exome sequencing has clearly improved our ability as clinicians to identify the cause of a wide variety of disorders. Prior to exome sequencing, a majority of patients with apparent syndromes never received a specific molecular genetic diagnosis despite extensive diagnostic odysseys. Even for those receiving an answer to the question of what caused their disorder, the diagnostic odyssey often spanned years to decades. Determining the particular genetic cause in an individual patient can be challenging due to inherent phenotypic and genetic heterogeneity of disease, technical limitations of testing or both. Blended phenotypes, due to multiple monogenic disorders in the same patient, are true dilemmas for traditional genetic evaluations, but are increasingly being diagnosed through clinical exome sequencing. New sequencing technologies have increased the proportion of patients receiving molecular diagnoses, while significantly shortening the time scale, providing multiple benefits for the health-care team, patient and family.
|
1495
|
Reimand J, Arak T, Adler P, Kolberg L, Reisberg S, Peterson H, Vilo J. g:Profiler-a web server for functional interpretation of gene lists (2016 update). Nucleic Acids Res 2016; 44:W83-9. [PMID: 27098042 PMCID: PMC4987867 DOI: 10.1093/nar/gkw199] [Citation(s) in RCA: 919] [Impact Index Per Article: 102.1] [Received: 02/01/2016] [Accepted: 03/13/2016] [Indexed: 12/13/2022] Open
Abstract
Functional enrichment analysis is a key step in interpreting gene lists discovered in diverse high-throughput experiments. g:Profiler studies flat and ranked gene lists and finds statistically significant Gene Ontology terms, pathways and other gene function related terms. Translation of hundreds of gene identifiers is another core feature of g:Profiler. Since its first publication in 2007, our web server has become a popular tool of choice among basic and translational researchers. Timeliness is a major advantage of g:Profiler as genome and pathway information is synchronized with the Ensembl database in quarterly updates. g:Profiler supports 213 species including mammals and other vertebrates, plants, insects and fungi. The 2016 update of g:Profiler introduces several novel features. We have added further functional datasets to interpret gene lists, including transcription factor binding site predictions, Mendelian disease annotations, information about protein expression and complexes and gene mappings of human genetic polymorphisms. Besides the interactive web interface, g:Profiler can be accessed in computational pipelines using our R package, Python interface and BioJS component. g:Profiler is freely available at http://biit.cs.ut.ee/gprofiler/.
Affiliation(s)
- Jüri Reimand
- Ontario Institute for Cancer Research, 661 University Avenue, Toronto, ON M5G 0A3, Canada
- Department of Medical Biophysics, University of Toronto, 101 College Street, Toronto, ON M5G 1L7, Canada
| | - Tambet Arak
- Institute of Computer Science, University of Tartu, Liivi 2, 50409 Tartu, Estonia
| | - Priit Adler
- Institute of Computer Science, University of Tartu, Liivi 2, 50409 Tartu, Estonia
| | - Liis Kolberg
- Institute of Computer Science, University of Tartu, Liivi 2, 50409 Tartu, Estonia
| | - Sulev Reisberg
- Institute of Computer Science, University of Tartu, Liivi 2, 50409 Tartu, Estonia
| | - Hedi Peterson
- Institute of Computer Science, University of Tartu, Liivi 2, 50409 Tartu, Estonia
| | - Jaak Vilo
- Institute of Computer Science, University of Tartu, Liivi 2, 50409 Tartu, Estonia
| |
|
1496
|
A flexible method for estimating the fraction of fitness influencing mutations from large sequencing data sets. Genome Res 2016; 26:834-43. [PMID: 27197222 PMCID: PMC4889975 DOI: 10.1101/gr.203059.115] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 12/07/2015] [Accepted: 04/14/2016] [Indexed: 01/07/2023]
Abstract
A continuing challenge in the analysis of massively large sequencing data sets is quantifying and interpreting non-neutrally evolving mutations. Here, we describe a flexible and robust approach based on the site frequency spectrum to estimate the fraction of deleterious and adaptive variants from large-scale sequencing data sets. We applied our method to approximately 1 million single nucleotide variants (SNVs) identified in high-coverage exome sequences of 6515 individuals. We estimate that the fraction of deleterious nonsynonymous SNVs is higher than previously reported; quantify the effects of genomic context, codon bias, chromatin accessibility, and number of protein-protein interactions on deleterious protein-coding SNVs; and identify pathways and networks that have likely been influenced by positive selection. Furthermore, we show that the fraction of deleterious nonsynonymous SNVs is significantly higher for Mendelian versus complex disease loci and in exons harboring dominant versus recessive Mendelian mutations. In summary, as genome-scale sequencing data accumulate in progressively larger sample sizes, our method will enable increasingly high-resolution inferences into the characteristics and determinants of non-neutral variation.
|
1497
|
Wang W, Wang C, Dawson DB, Thorland EC, Lundquist PA, Eckloff BW, Wu Y, Baheti S, Evans JM, Scherer SS, Dyck PJ, Klein CJ. Target-enrichment sequencing and copy number evaluation in inherited polyneuropathy. Neurology 2016; 86:1762-71. [PMID: 27164712 DOI: 10.1212/wnl.0000000000002659] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Received: 10/04/2015] [Accepted: 01/05/2016] [Indexed: 11/15/2022] Open
Abstract
OBJECTIVE: To assess the efficiency of target-enrichment next-generation sequencing (NGS) with copy number assessment in inherited neuropathy diagnosis. METHODS: A 197-gene polyneuropathy panel was designed to assess for mutations in 93 patients with inherited or idiopathic neuropathy without known genetic cause. We applied our novel copy number variation algorithm to NGS data and validated the identified copy number mutations using CytoScan (Affymetrix). Cost and efficacy of this targeted NGS approach were compared with earlier evaluations. RESULTS: Average coverage depth was ∼760× (median = 600, 99.4% > 100×). Among 93 patients, 18 mutations were identified in 17 cases (18%), including 3 copy number mutations: 2 PMP22 duplications and 1 MPZ duplication. The 2 patients with PMP22 duplication presented with bulbar and respiratory involvement and had absent extremity nerve conductions, leading to axonal diagnosis. Average onset age of these 17 patients was 25 years (2-61 years), vs 45 years for those without genetic discovery. Among those with onset age less than 40 years, the diagnostic yield of the targeted NGS approach is high (27%) and the cost savings are significant (∼20%). However, cost savings for patients with late onset age and without family history are not demonstrated. CONCLUSIONS: Incorporating copy number analysis in the target-enrichment NGS approach improved the efficiency of mutation discovery for chronic, inherited, progressive length-dependent polyneuropathy diagnosis. The new technology is facilitating a simplified genetic diagnostic algorithm utilizing targeted NGS, clinical phenotypes, age at onset, and family history to improve diagnosis efficiency. Our findings prompt a need for updating the current practice parameters and payer guidelines.
Affiliation(s)
- Wei Wang, Chen Wang, D Brian Dawson, Erik C Thorland, Patrick A Lundquist, Bruce W Eckloff, Yanhong Wu, Saurabh Baheti, Jared M Evans, Steven S Scherer, Peter J Dyck, Christopher J Klein
- From the Departments of Neurology, Peripheral Nerve Division (W.W., P.J.D., C.J.K.), Department of Health Science Research (C.W., S.B., J.M.E.), Laboratory Medicine and Pathology (D.B.D., E.C.T., P.A.L., Y.W., C.J.K.), Medical Genome Facility (B.W.E., Y.W.), and Medical Genetics (C.J.K., D.B.D.), Mayo Clinic, Rochester, MN; Department of Neurology (W.W.), China-Japan Friendship Hospital, Beijing, China; and Department of Neurology (S.S.S.), Perelman School of Medicine, University of Pennsylvania, Philadelphia.
| |
|
1498
|
Leslie E, Liu H, Carlson J, Shaffer J, Feingold E, Wehby G, Laurie C, Jain D, Laurie C, Doheny K, McHenry T, Resick J, Sanchez C, Jacobs J, Emanuele B, Vieira A, Neiswanger K, Standley J, Czeizel A, Deleyiannis F, Christensen K, Munger R, Lie R, Wilcox A, Romitti P, Field L, Padilla C, Cutiongco-de la Paz E, Lidral A, Valencia-Ramirez L, Lopez-Palacio A, Valencia D, Arcos-Burgos M, Castilla E, Mereb J, Poletta F, Orioli I, Carvalho F, Hecht J, Blanton S, Buxó C, Butali A, Mossey P, Adeyemo W, James O, Braimah R, Aregbesola B, Eshete M, Deribew M, Koruyucu M, Seymen F, Ma L, de Salamanca J, Weinberg S, Moreno L, Cornell R, Murray J, Marazita M. A Genome-wide Association Study of Nonsyndromic Cleft Palate Identifies an Etiologic Missense Variant in GRHL3. Am J Hum Genet 2016; 98:744-54. [PMID: 27018472 DOI: 10.1016/j.ajhg.2016.02.014] [Citation(s) in RCA: 126] [Impact Index Per Article: 14.0] [Received: 12/02/2015] [Accepted: 02/17/2016] [Indexed: 10/22/2022] Open
Abstract
Cleft palate (CP) is a common birth defect occurring in 1 in 2,500 live births. Approximately half of infants with CP have a syndromic form, exhibiting other physical and cognitive disabilities. The other half have nonsyndromic CP, and to date, few genes associated with risk for nonsyndromic CP have been characterized. To identify such risk factors, we performed a genome-wide association study of this disorder. We discovered a genome-wide significant association with a missense variant in GRHL3 (p.Thr454Met [c.1361C>T]; rs41268753; p = 4.08 × 10⁻⁹) and replicated the result in an independent sample of case and control subjects. In both the discovery and replication samples, rs41268753 conferred increased risk for CP (OR = 8.3, 95% CI 4.1-16.8; OR = 2.16, 95% CI 1.43-3.27, respectively). In luciferase transactivation assays, p.Thr454Met had about one-third of the activity of wild-type GRHL3, and in zebrafish embryos, perturbed periderm development. We conclude that this mutation is an etiologic variant for nonsyndromic CP and is one of few functional variants identified to date for nonsyndromic orofacial clefting. This finding advances our understanding of the genetic basis of craniofacial development and might ultimately lead to improvements in recurrence risk prediction, treatment, and prognosis.
|
1499
|
Ponomarenko MP, Arkova O, Rasskazov D, Ponomarenko P, Savinkova L, Kolchanov N. Candidate SNP Markers of Gender-Biased Autoimmune Complications of Monogenic Diseases Are Predicted by a Significant Change in the Affinity of TATA-Binding Protein for Human Gene Promoters. Front Immunol 2016; 7:130. [PMID: 27092142 PMCID: PMC4819121 DOI: 10.3389/fimmu.2016.00130] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 09/25/2015] [Accepted: 03/21/2016] [Indexed: 12/17/2022] Open
Abstract
Some variations of the human genome [for example, single nucleotide polymorphisms (SNPs)] are markers of hereditary diseases and drug responses. Analyzing them can help to improve treatment. Computer-based analysis of millions of SNPs in the 1000 Genomes project makes a search for SNP markers more targeted. Here, we combined two computer-based approaches: DNA sequence analysis and keyword search in databases. In the binding sites for TATA-binding protein (TBP) in human gene promoters, we found candidate SNP markers of gender-biased autoimmune diseases, including rs1143627 [cachexia in rheumatoid arthritis (double prevalence among women)]; rs11557611 [demyelinating diseases (three times more prevalent among young white women than among non-white individuals)]; rs17231520 and rs569033466 [both: atherosclerosis comorbid with related diseases (double prevalence among women)]; rs563763767 [Hughes syndrome-related thrombosis (lethal during pregnancy)]; rs2814778 [autoimmune diseases (excluding multiple sclerosis and rheumatoid arthritis) underlying hypergammaglobulinemia in women]; rs72661131 and rs562962093 (both: preterm delivery in pregnant diabetic women); and rs35518301, rs34166473, rs34500389, rs33981098, rs33980857, rs397509430, rs34598529, rs33931746, rs281864525, and rs63750953 (all: autoimmune diseases underlying hypergammaglobulinemia in women). Validation of these predicted candidate SNP markers using clinical standards may advance personalized medicine.
Affiliation(s)
- Mikhail P. Ponomarenko
- Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russia
- Novosibirsk State University, Novosibirsk, Russia
| | - Olga Arkova
- Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russia
| | - Dmitry Rasskazov
- Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russia
| | | | - Ludmila Savinkova
- Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russia
| | - Nikolay Kolchanov
- Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russia
- Novosibirsk State University, Novosibirsk, Russia
| |
|
1500
|
Wang YZ, Qiu SC. Prediction of key genes in ovarian cancer treated with decitabine based on network strategy. Oncol Rep 2016; 35:3548-58. [PMID: 27035425 DOI: 10.3892/or.2016.4697] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/17/2015] [Accepted: 01/26/2016] [Indexed: 11/06/2022] Open
Abstract
The objective of the present study was to predict key genes in ovarian cancer before and after treatment with decitabine utilizing a network approach and to reveal the molecular mechanism. Pathogenic networks of ovarian cancer before and after treatment were identified based on known pathogenic genes (seed genes) and differentially expressed genes (DEGs) detected by the Significance Analysis of Microarrays (SAM) method. A weight was assigned to each gene in the pathogenic network and then candidate genes were evaluated. Topological properties (degree, betweenness, closeness and stress) of candidate genes were analyzed to identify pathogenic genes with greater confidence. Pathway enrichment analysis for candidate and seed genes was conducted. Validation of candidate gene expression in ovarian cancer was performed by reverse transcriptase-polymerase chain reaction (RT-PCR) assays. There were 73 nodes and 147 interactions in the pathogenic network before treatment, compared with 47 nodes and 66 interactions after treatment. A total of 32 candidate genes were identified in the before-treatment group of ovarian cancer, of which 16 were also candidate genes after treatment and the others were silenced. We obtained 5 key genes (PIK3R2, CCNB1, IL2, IL1B and CDC6) for decitabine treatment that were validated by RT-PCR. In conclusion, we successfully identified 5 key genes (PIK3R2, CCNB1, IL2, IL1B and CDC6) and validated them, which provides insight into the molecular mechanisms of decitabine treatment and may be potential pathogenic biomarkers for the therapy of ovarian cancer.
Affiliation(s)
- Yu-Zhen Wang
- Department of Pharmacy, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang 310016, P.R. China
| | - Sheng-Chun Qiu
- Department of Nursing, Zhejiang Provincial People's Hospital, Xiacheng, Hangzhou, Zhejiang 310014, P.R. China
| |
|