151
Rappaport N, Fishilevich S, Nudel R, Twik M, Belinky F, Plaschkes I, Stein TI, Cohen D, Oz-Levi D, Safran M, Lancet D. Rational confederation of genes and diseases: NGS interpretation via GeneCards, MalaCards and VarElect. Biomed Eng Online 2017; 16:72. [PMID: 28830434 PMCID: PMC5568599 DOI: 10.1186/s12938-017-0359-2] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Indexed: 12/24/2022] Open
Abstract
Background A key challenge in the realm of human disease research is next generation sequencing (NGS) interpretation, whereby identified filtered variant-harboring genes are associated with a patient’s disease phenotypes. This necessitates bioinformatics tools linked to comprehensive knowledgebases. The GeneCards suite databases, which include GeneCards (human genes), MalaCards (human diseases) and PathCards (human pathways) together with additional tools, are presented with the focus on MalaCards utility for NGS interpretation as well as for large scale bioinformatic analyses. Results VarElect, our NGS interpretation tool, leverages the broad information in the GeneCards suite databases. MalaCards algorithms unify disease-related terms and annotations from 69 sources. Further, MalaCards defines hierarchical relatedness—aliases, disease families, a related diseases network, categories and ontological classifications. GeneCards and MalaCards delineate and share a multi-tiered, scored gene-disease network, with stringency levels, including the definition of elite status—high quality gene-disease pairs, coming from manually curated trustworthy sources, that includes 4500 genes for 8000 diseases. This unique resource is key to NGS interpretation by VarElect. VarElect, a comprehensive search tool that helps infer both direct and indirect links between genes and user-supplied disease/phenotype terms, is robustly strengthened by the information found in MalaCards. The indirect mode benefits from GeneCards’ diverse gene-to-gene relationships, including SuperPaths—integrated biological pathways from 12 information sources. We are currently adding an important information layer in the form of “disease SuperPaths”, generated from the gene-disease matrix by an algorithm similar to that previously employed for biological pathway unification. This allows the discovery of novel gene-disease and disease–disease relationships. 
The advent of whole genome sequencing necessitates capacities to go beyond protein coding genes. GeneCards is highly useful in this respect, as it also addresses 101,976 non-protein-coding RNA genes. In a more recent development, we are currently adding an inclusive map of regulatory elements and their inferred target genes, generated by integration from 4 resources. Conclusions MalaCards provides a rich big-data scaffold for in silico biomedical discovery within the gene-disease universe. VarElect, which depends significantly on both GeneCards and MalaCards power, is a potent tool for supporting the interpretation of wet-lab experiments, notably NGS analyses of disease. The GeneCards suite has thus transcended its 2-decade role in biomedical research, maturing into a key player in clinical investigation.
Affiliation(s)
- Noa Rappaport
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel; Institute for Systems Biology, Seattle, WA, USA
- Simon Fishilevich
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Ron Nudel
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Michal Twik
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Frida Belinky
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel; National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, USA
- Inbar Plaschkes
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Tsippi Iny Stein
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Dana Cohen
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Danit Oz-Levi
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Marilyn Safran
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Doron Lancet
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
152
Identifying novel genes and biological processes relevant to the development of cancer therapy-induced mucositis: An informative gene network analysis. PLoS One 2017; 12:e0180396. [PMID: 28678827 PMCID: PMC5498049 DOI: 10.1371/journal.pone.0180396] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 10/13/2016] [Accepted: 05/30/2017] [Indexed: 12/20/2022] Open
Abstract
Mucositis is a complex, dose-limiting toxicity of chemotherapy or radiotherapy that leads to painful mouth ulcers, difficulty eating or swallowing, gastrointestinal distress, and reduced quality of life for patients with cancer. Mucositis is most common for those undergoing high-dose chemotherapy and hematopoietic stem cell transplantation and for those being treated for malignancies of the head and neck. Treatment and management of mucositis remain challenging. It is expected that multiple genes are involved in the formation, severity, and persistence of mucositis. We used Ingenuity Pathway Analysis (IPA), a novel network-based approach that integrates complex intracellular and intercellular interactions involved in diseases, to systematically explore the molecular complexity of mucositis. As a first step, we searched the literature to identify genes that harbor or are close to the genetic variants significantly associated with mucositis. Our literature review identified 27 candidate genes, of which ERCC1, XRCC1, and MTHFR were the most frequently studied for mucositis. On the basis of this 27-gene list, we used IPA to generate gene networks for mucositis. The most biologically significant novel molecules identified through IPA analyses included TP53, CTNNB1, MYC, RB1, P38 MAPK, and EP300. Additionally, uracil degradation II (reductive) and thymine degradation pathways (p = 1.06E-08) were most significant. Finally, utilizing 66 SNPs within the 8 most connected IPA-derived candidate molecules, we conducted a genetic association study for oral mucositis in head and neck cancer patients who were treated using chemotherapy and/or radiation therapy (186 patients with oral mucositis vs. 699 patients without oral mucositis). The top ranked gene identified through this association analysis was RB1 (rs2227311, p-value = 0.034, odds ratio = 0.67).
In conclusion, gene network analysis identified novel molecules and biological processes, including pathways related to inflammation and oxidative stress, that are relevant to mucositis development, thus providing the basis for future studies to improve the management and treatment of mucositis in patients with cancer.
153
Martin KR, Zhou W, Bowman MJ, Shih J, Au KS, Dittenhafer-Reed KE, Sisson KA, Koeman J, Weisenberger DJ, Cottingham SL, DeRoos ST, Devinsky O, Winn ME, Cherniack AD, Shen H, Northrup H, Krueger DA, MacKeigan JP. The genomic landscape of tuberous sclerosis complex. Nat Commun 2017. [PMID: 28643795 PMCID: PMC5481739 DOI: 10.1038/ncomms15816] [Citation(s) in RCA: 130] [Impact Index Per Article: 18.6] [Indexed: 12/15/2022] Open
Abstract
Tuberous sclerosis complex (TSC) is a rare genetic disease causing multisystem growth of benign tumours and other hamartomatous lesions, which leads to diverse and debilitating clinical symptoms. Patients are born with TSC1 or TSC2 mutations, and somatic inactivation of wild-type alleles drives MTOR activation; however, second hits to TSC1/TSC2 are not always observed. Here, we present the genomic landscape of TSC hamartomas. We determine that TSC lesions contain a low somatic mutational burden relative to carcinomas, a subset feature large-scale chromosomal aberrations, and highly conserved molecular signatures for each type exist. Analysis of the molecular signatures coupled with computational approaches reveals unique aspects of cellular heterogeneity and cell origin. Using immune data sets, we identify significant neuroinflammation in TSC-associated brain tumours. Taken together, this molecular catalogue of TSC serves as a resource into the origin of these hamartomas and provides a framework that unifies genomic and transcriptomic dimensions for complex tumours.
Affiliation(s)
- Katie R Martin
  - Center for Cancer and Cell Biology, Van Andel Research Institute, 333 Bostwick Avenue NE, Grand Rapids, Michigan 49503, USA
- Wanding Zhou
  - Center for Epigenetics, Van Andel Research Institute, 333 Bostwick Avenue NE, Grand Rapids, Michigan 49503, USA
- Megan J Bowman
  - Bioinformatics and Biostatistics Core, Van Andel Research Institute, 333 Bostwick Avenue NE, Grand Rapids, Michigan 49503, USA
- Juliann Shih
  - Cancer Program, Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, Massachusetts 02142, USA
- Kit Sing Au
  - Department of Pediatrics, University of Texas Health Science Center at Houston-McGovern Medical School, 6431 Fannin, Houston, Texas 77030, USA
- Kristin E Dittenhafer-Reed
  - Center for Cancer and Cell Biology, Van Andel Research Institute, 333 Bostwick Avenue NE, Grand Rapids, Michigan 49503, USA
- Kellie A Sisson
  - Center for Cancer and Cell Biology, Van Andel Research Institute, 333 Bostwick Avenue NE, Grand Rapids, Michigan 49503, USA
- Julie Koeman
  - Cytogenetics and Pathology Core, Van Andel Research Institute, 333 Bostwick Avenue NE, Grand Rapids, Michigan 49503, USA
- Daniel J Weisenberger
  - Norris Comprehensive Cancer Center, University of Southern California, 1450 Biggy Street, Los Angeles, California 90033, USA
- Sandra L Cottingham
  - Department of Pathology, Spectrum Health System, 100 Michigan Street NE, Grand Rapids, Michigan 49503, USA
- Steven T DeRoos
  - Division of Pediatric Neurology, Helen DeVos Children's Hospital, Spectrum Health System, 100 Michigan Street NE, Grand Rapids, Michigan 49503, USA
- Orrin Devinsky
  - Department of Neurology, New York University School of Medicine, 223 E 34 Street, New York, New York 10016, USA
- Mary E Winn
  - Bioinformatics and Biostatistics Core, Van Andel Research Institute, 333 Bostwick Avenue NE, Grand Rapids, Michigan 49503, USA
- Andrew D Cherniack
  - Cancer Program, Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, Massachusetts 02142, USA
- Hui Shen
  - Center for Epigenetics, Van Andel Research Institute, 333 Bostwick Avenue NE, Grand Rapids, Michigan 49503, USA
- Hope Northrup
  - Department of Pediatrics, University of Texas Health Science Center at Houston-McGovern Medical School, 6431 Fannin, Houston, Texas 77030, USA
- Darcy A Krueger
  - Division of Neurology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, Ohio 45229, USA
- Jeffrey P MacKeigan
  - Center for Cancer and Cell Biology, Van Andel Research Institute, 333 Bostwick Avenue NE, Grand Rapids, Michigan 49503, USA; College of Human Medicine, Michigan State University, 220 Trowbridge Road, East Lansing, Michigan 48824, USA
154
Gabrielli AP, Manzardo AM, Butler MG. Exploring genetic susceptibility to obesity through genome functional pathway analysis. Obesity (Silver Spring) 2017; 25:1136-1143. [PMID: 28474384 PMCID: PMC5444946 DOI: 10.1002/oby.21847] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 02/16/2017] [Revised: 03/16/2017] [Accepted: 03/21/2017] [Indexed: 12/24/2022]
Abstract
OBJECTIVE Obesity has been reaching epidemic levels in recent decades, with a growing body of research identifying predisposing genetic components. To explore the relationship of genetic factors contributing to obesity, an analytical computer-based gene-profiling approach utilizing an updated list of clinically relevant and known obesity-related genes was undertaken. METHODS An updated list of 494 genes reportedly associated with obesity was compiled, and the GeneAnalytics profiling software was utilized to interrogate genomic databases from GeneCards® to cross-reference obesity gene sets against tissues and cells, diseases, genetic pathways, gene ontology (GO)-biological processes and GO-molecular functions, phenotypes, and compounds. RESULTS Obesity-related fields identified by GeneAnalytics algorithms included 8 diseases, 46 pathways, 62 biological processes, 22 molecular functions, 148 phenotypes, and 286 compounds impacting adipogenesis, signal transduction by G-protein coupled receptors, and lipid metabolism involving insulin-related genes (IGF1, INS, IRS1). GO-biological processes identified feeding behavior, cholesterol metabolic process, and glucose and cholesterol homeostasis pathways, while GO-molecular processes pertained to receptor binding, affecting glucose homeostasis, body weight, and circulating insulin and triglyceride levels. CONCLUSIONS The gene-profiling model suggests that pathogenesis of obesity relates to the coordination of biological responses to glucose and intracellular lipids possibly through a disruption of biochemical cascades and cellular signaling arising from affected receptors.
Affiliation(s)
- Alexander P Gabrielli
  - Departments of Psychiatry and Behavioral Sciences and Pediatrics, University of Kansas Medical Center, Kansas City, Kansas, USA
- Ann M Manzardo
  - Departments of Psychiatry and Behavioral Sciences and Pediatrics, University of Kansas Medical Center, Kansas City, Kansas, USA
- Merlin G Butler
  - Departments of Psychiatry and Behavioral Sciences and Pediatrics, University of Kansas Medical Center, Kansas City, Kansas, USA
155
Gowtham YK, Saski CA, Harcum SW. Low glucose concentrations within typical industrial operating conditions have minimal effect on the transcriptome of recombinant CHO cells. Biotechnol Prog 2017; 33:771-785. [PMID: 28371311 DOI: 10.1002/btpr.2462] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/10/2016] [Revised: 01/07/2017] [Indexed: 12/16/2022]
Abstract
Typically, mammalian cell culture medium contains high glucose concentrations that are analogous to diabetic levels in humans, suggesting that mammalian cells are cultivated in excessive glucose. Using RNA-Seq, this study characterized the Chinese hamster ovary (CHO) cell transcriptome under two glucose concentrations to assess the genetic effects associated with metabolic pathways, in addition to other global responses. The initial extracellular glucose concentrations used represented high (30 mM) and low (10 mM) glucose conditions, where at the time the transcriptomes were compared, the glucose concentrations were approximately 24 and 4.4 mM for the mid-exponential cultures, where 4.4 mM represents a common target concentration in the biopharmaceutical industry for controlled fed-batch cultures. A recombinant CHO cell line producing a monoclonal antibody was used, such that the impact on glycosylation genes could be evaluated. Relatively few genes were identified as being significantly different (FDR ≤ 0.01) between the high and low glucose conditions, for example, only 575 genes, and only 40 of these genes had 2-fold or greater differences. Gene expression differences for glycolysis, TCA cycle, and glycosylation-related reactions were minimal and unlikely to have biological significance. This transcriptome study indicates that low glucose concentrations in the culture medium are unlikely to cause any biologically significant or detrimental changes to CHO cells at the transcriptome level. Furthermore, it is well-known that maintaining low glucose concentrations in fed-batch cultures can reduce lactate production, which in turn improves process outcomes. Taken together, the transcriptome data supports the continued development of low glucose-based processes to control lactate. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 33:771-785, 2017.
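The two-stage filter described in this abstract (an FDR cutoff, then a fold-change cutoff) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout (`gene`, `fdr`, `fold_change`), the function name, and the toy values are all assumptions, and the fold change is assumed to be a positive high/low expression ratio.

```python
import math

def significant_genes(results, fdr_cutoff=0.01, min_fold=2.0):
    """Return (genes passing the FDR cutoff, subset changing >= min_fold in either direction).

    `results` is a list of dicts with hypothetical keys 'gene', 'fdr' and
    'fold_change' (a positive ratio between the two conditions).
    """
    passed = [r for r in results if r["fdr"] <= fdr_cutoff]
    threshold = math.log2(min_fold)
    # abs(log2) treats a 2-fold increase and a 2-fold decrease symmetrically.
    large = [r for r in passed if abs(math.log2(r["fold_change"])) >= threshold]
    return passed, large

# Toy records for illustration only; not data from the study.
toy = [
    {"gene": "gene_a", "fdr": 0.001, "fold_change": 2.5},
    {"gene": "gene_b", "fdr": 0.005, "fold_change": 1.2},
    {"gene": "gene_c", "fdr": 0.2, "fold_change": 4.0},
    {"gene": "gene_d", "fdr": 0.01, "fold_change": 0.5},
]
passed, large = significant_genes(toy)
print([r["gene"] for r in passed])  # ['gene_a', 'gene_b', 'gene_d']
print([r["gene"] for r in large])   # ['gene_a', 'gene_d']
```

Working on the log2 scale keeps down-regulated genes (ratio 0.5) in the same bin as up-regulated ones (ratio 2), which is the usual reading of "2-fold or greater differences".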
Affiliation(s)
- Christopher A Saski
  - Inst. of Translational Genomics, Clemson University, Clemson, SC, 29634; Dept. of Genetics and Biochemistry, Clemson University, Clemson, SC, 29634
- Sarah W Harcum
  - Dept. of Bioengineering, Clemson University, Clemson, SC, 29634
156
Malakar P, Shilo A, Mogilevsky A, Stein I, Pikarsky E, Nevo Y, Benyamini H, Elgavish S, Zong X, Prasanth KV, Karni R. Long Noncoding RNA MALAT1 Promotes Hepatocellular Carcinoma Development by SRSF1 Upregulation and mTOR Activation. Cancer Res 2017; 77:1155-1167. [PMID: 27993818 PMCID: PMC5334181 DOI: 10.1158/0008-5472.can-16-1508] [Citation(s) in RCA: 235] [Impact Index Per Article: 33.6] [Received: 06/01/2016] [Revised: 11/26/2016] [Accepted: 12/06/2016] [Indexed: 12/30/2022]
Abstract
Several long noncoding RNAs (lncRNA) are abrogated in cancer but their precise contributions to oncogenesis are still emerging. Here we report that the lncRNA MALAT1 is upregulated in hepatocellular carcinoma and acts as a proto-oncogene through Wnt pathway activation and induction of the oncogenic splicing factor SRSF1. Induction of SRSF1 by MALAT1 modulates SRSF1 splicing targets, enhancing the production of antiapoptotic splicing isoforms and activating the mTOR pathway by modulating the alternative splicing of S6K1. Inhibition of SRSF1 expression or mTOR activity abolishes the oncogenic properties of MALAT1, suggesting that SRSF1 induction and mTOR activation are essential for MALAT1-induced transformation. Our results reveal a mechanism by which lncRNA MALAT1 acts as a proto-oncogene in hepatocellular carcinoma, modulating oncogenic alternative splicing through SRSF1 upregulation. Cancer Res; 77(5); 1155-67. ©2016 AACR.
Affiliation(s)
- Pushkar Malakar
  - Department of Biochemistry and Molecular Biology, Hebrew University-Hadassah Medical School, Ein Karem, Jerusalem, Israel
- Asaf Shilo
  - Department of Biochemistry and Molecular Biology, Hebrew University-Hadassah Medical School, Ein Karem, Jerusalem, Israel
- Adi Mogilevsky
  - Department of Biochemistry and Molecular Biology, Hebrew University-Hadassah Medical School, Ein Karem, Jerusalem, Israel
- Ilan Stein
  - Department of Immunology and Cancer Research, Hebrew University-Hadassah Medical School, Ein Karem, Jerusalem, Israel
- Eli Pikarsky
  - Department of Immunology and Cancer Research, Hebrew University-Hadassah Medical School, Ein Karem, Jerusalem, Israel
- Yuval Nevo
  - Bioinformatics unit, the Institute for Medical Research Israel-Canada, Hebrew University-Hadassah Medical School, Ein Karem, Jerusalem, Israel
- Hadar Benyamini
  - Bioinformatics unit, the Institute for Medical Research Israel-Canada, Hebrew University-Hadassah Medical School, Ein Karem, Jerusalem, Israel
- Sharona Elgavish
  - Bioinformatics unit, the Institute for Medical Research Israel-Canada, Hebrew University-Hadassah Medical School, Ein Karem, Jerusalem, Israel
- Xinying Zong
  - Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Kannanganattu V Prasanth
  - Department of Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Rotem Karni
  - Department of Biochemistry and Molecular Biology, Hebrew University-Hadassah Medical School, Ein Karem, Jerusalem, Israel
157
Khanzada NS, Butler MG, Manzardo AM. GeneAnalytics Pathway Analysis and Genetic Overlap among Autism Spectrum Disorder, Bipolar Disorder and Schizophrenia. Int J Mol Sci 2017; 18:ijms18030527. [PMID: 28264500 PMCID: PMC5372543 DOI: 10.3390/ijms18030527] [Citation(s) in RCA: 58] [Impact Index Per Article: 8.3] [Received: 01/18/2017] [Revised: 02/15/2017] [Accepted: 02/23/2017] [Indexed: 12/18/2022] Open
Abstract
Bipolar disorder (BPD) and schizophrenia (SCH) show similar neuropsychiatric behavioral disturbances, including impaired social interaction and communication, seen in autism spectrum disorder (ASD) with multiple overlapping genetic and environmental influences implicated in risk and course of illness. GeneAnalytics software was used for pathway analysis and genetic profiling to characterize common susceptibility genes obtained from published lists for ASD (792 genes), BPD (290 genes) and SCH (560 genes). Rank scores were derived from the number and nature of overlapping genes, gene-disease association, tissue specificity and gene functions subdivided into categories (e.g., diseases, tissues or functional pathways). Twenty-three genes were common to all three disorders and mapped to nine biological Superpathways including Circadian entrainment (10 genes, score = 37.0), Amphetamine addiction (five genes, score = 24.2), and Sudden infant death syndrome (six genes, score = 24.1). Brain tissues included the medulla oblongata (11 genes, score = 2.1), thalamus (10 genes, score = 2.0) and hypothalamus (nine genes, score = 2.0) with six common genes (BDNF, DRD2, CHRNA7, HTR2A, SLC6A3, and TPH2). Overlapping genes impacted dopamine and serotonin homeostasis and signal transduction pathways, impacting mood, behavior and physical activity level. Converging effects on pathways governing circadian rhythms support a core etiological relationship between neuropsychiatric illnesses and sleep disruption with hypoxia and central brain stem dysfunction.
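The first step reported here, finding genes common to all three published lists, amounts to a set intersection. The sketch below is only an illustration of that step: the function name is hypothetical, the three shared symbols are taken from the abstract, and the remaining entries are placeholders rather than real list members. GeneAnalytics scoring itself is not reproduced.

```python
def common_genes(*gene_lists):
    """Return the sorted gene symbols present in every list (case-insensitive)."""
    sets = [{g.strip().upper() for g in genes} for genes in gene_lists]
    return sorted(set.intersection(*sets)) if sets else []

# Toy lists for illustration only. BDNF, DRD2 and HTR2A are named in the
# abstract as shared genes; the *_ONLY entries are placeholders.
asd = ["BDNF", "DRD2", "HTR2A", "ASD_ONLY_1"]
bpd = ["BDNF", "DRD2", "HTR2A", "BPD_ONLY_1"]
sch = ["bdnf", "DRD2", "HTR2A", "SCH_ONLY_1"]
print(common_genes(asd, bpd, sch))  # ['BDNF', 'DRD2', 'HTR2A']
```

Normalizing case and whitespace before intersecting guards against the same symbol being written differently across source lists.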
Affiliation(s)
- Naveen S Khanzada
  - Department of Psychiatry and Behavioral Sciences, University of Kansas Medical Center, Kansas City, KS 66160, USA.
- Merlin G Butler
  - Department of Psychiatry and Behavioral Sciences, University of Kansas Medical Center, Kansas City, KS 66160, USA.
  - Department of Pediatrics, University of Kansas Medical Center, Kansas City, KS 66160, USA.
- Ann M Manzardo
  - Department of Psychiatry and Behavioral Sciences, University of Kansas Medical Center, Kansas City, KS 66160, USA.
158
Gershoni M, Pietrokovski S. The landscape of sex-differential transcriptome and its consequent selection in human adults. BMC Biol 2017; 15:7. [PMID: 28173793 PMCID: PMC5297171 DOI: 10.1186/s12915-017-0352-z] [Citation(s) in RCA: 150] [Impact Index Per Article: 21.4] [Received: 10/24/2016] [Accepted: 01/19/2017] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND The prevalence of several human morbid phenotypes is sometimes much higher than intuitively expected. This can directly arise from the presence of two sexes, male and female, in one species. Men and women have almost identical genomes but are distinctly dimorphic, with dissimilar disease susceptibilities. Sexually dimorphic traits mainly result from differential expression of genes present in both sexes. Such genes can be subject to different, and even opposing, selection constraints in the two sexes. This can impact human evolution by differential selection on mutations with dissimilar effects on the two sexes. RESULTS We comprehensively mapped human sex-differential genetic architecture across 53 tissues. Analyzing available RNA-sequencing data from 544 adults revealed thousands of genes differentially expressed in the reproductive tracts and tissues common to both sexes. Sex-differential genes are related to various biological systems, and suggest new insights into the pathophysiology of diverse human diseases. We also identified a significant association between sex-specific gene transcription and reduced selection efficiency and accumulation of deleterious mutations, which might affect the prevalence of different traits and diseases. Interestingly, many of the sex-specific genes that also undergo reduced selection efficiency are essential for successful reproduction in men or women. This seeming paradox might partially explain the high incidence of human infertility. CONCLUSIONS This work provides a comprehensive overview of the sex-differential transcriptome and its importance to human evolution and human physiology in health and in disease.
Affiliation(s)
- Moran Gershoni
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Shmuel Pietrokovski
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
159
Rappaport N, Twik M, Plaschkes I, Nudel R, Iny Stein T, Levitt J, Gershoni M, Morrey CP, Safran M, Lancet D. MalaCards: an amalgamated human disease compendium with diverse clinical and genetic annotation and structured search. Nucleic Acids Res 2016; 45:D877-D887. [PMID: 27899610 PMCID: PMC5210521 DOI: 10.1093/nar/gkw1012] [Citation(s) in RCA: 352] [Impact Index Per Article: 44.0] [Received: 08/17/2016] [Revised: 10/14/2016] [Accepted: 10/29/2016] [Indexed: 12/13/2022] Open
Abstract
The MalaCards human disease database (http://www.malacards.org/) is an integrated compendium of annotated diseases mined from 68 data sources. MalaCards has a web card for each of ∼20 000 disease entries, in six global categories. It portrays a broad array of annotation topics in 15 sections, including Summaries, Symptoms, Anatomical Context, Drugs, Genetic Tests, Variations and Publications. The Aliases and Classifications section reflects an algorithm for disease name integration across often-conflicting sources, providing effective annotation consolidation. A central feature is a balanced Genes section, with scores reflecting the strength of disease-gene associations. This is accompanied by other gene-related disease information such as pathways, mouse phenotypes and GO-terms, stemming from MalaCards’ affiliation with the GeneCards Suite of databases. MalaCards’ capacity to inter-link information from complementary sources, along with its elaborate search function, relational database infrastructure and convenient data dumps, allows it to tackle its rich disease annotation landscape, and facilitates systems analyses and genome sequence interpretation. MalaCards adopts a ‘flat’ disease-card approach, but each card is mapped to popular hierarchical ontologies (e.g. International Classification of Diseases, Human Phenotype Ontology and Unified Medical Language System) and also contains information about multi-level relations among diseases, thereby providing an optimal tool for disease representation and scrutiny.
Affiliation(s)
- Noa Rappaport
  - Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Michal Twik
  - Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Inbar Plaschkes
  - Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Ron Nudel
  - Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Tsippi Iny Stein
  - Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Jacob Levitt
  - Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Moran Gershoni
  - Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- C Paul Morrey
  - Department of Information Systems and Technology, Utah Valley University, Orem, UT 84058, USA
- Marilyn Safran
  - Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
- Doron Lancet
  - Department of Molecular Genetics, the Weizmann Institute of Science, Rehovot, 76100, Israel
160
Mirsafian H, Ripen AM, Manaharan T, Mohamad SB, Merican AF. Toward a Reference Gene Catalog of Human Primary Monocytes. OMICS 2016; 20:627-634. [DOI: 10.1089/omi.2016.0124] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022]
Affiliation(s)
- Hoda Mirsafian
  - Faculty of Science, Institute of Biological Sciences, University of Malaya, Kuala Lumpur, Malaysia
- Adiratna Mat Ripen
  - Allergy and Immunology Research Centre, Institute for Medical Research, Jalan Pahang, Kuala Lumpur, Malaysia
- Thamilvaani Manaharan
  - Centre of Research for Computational Sciences and Informatics in Biology, Bioindustry, Environment, Agriculture and Healthcare (CRYSTAL), University of Malaya, Kuala Lumpur, Malaysia
- Saharuddin Bin Mohamad
  - Faculty of Science, Institute of Biological Sciences, University of Malaya, Kuala Lumpur, Malaysia
  - Centre of Research for Computational Sciences and Informatics in Biology, Bioindustry, Environment, Agriculture and Healthcare (CRYSTAL), University of Malaya, Kuala Lumpur, Malaysia
- Amir Feisal Merican
  - Faculty of Science, Institute of Biological Sciences, University of Malaya, Kuala Lumpur, Malaysia
  - Centre of Research for Computational Sciences and Informatics in Biology, Bioindustry, Environment, Agriculture and Healthcare (CRYSTAL), University of Malaya, Kuala Lumpur, Malaysia
161
Karthik D, Stelzer G, Gershanov S, Baranes D, Salmon-Divon M. Elucidating tissue specific genes using the Benford distribution. BMC Genomics 2016; 17:595. [PMID: 27506195 PMCID: PMC4979126 DOI: 10.1186/s12864-016-2921-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/03/2016] [Accepted: 07/07/2016] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND The RNA-seq technique is applied for the investigation of transcriptional behaviour. The reduction in sequencing costs has led to an unprecedented trove of gene expression data from diverse biological systems. Subsequently, principles from other disciplines such as the Benford law, which can be properly judged only in data-rich systems, can now be examined on this high-throughput transcriptomic information. The Benford law states that in many count-rich datasets the distribution of the first significant digit is not uniform but rather logarithmic. RESULTS All tested digital gene expression datasets showed a Benford-like distribution when observing an entire gene set. This phenomenon was conserved in development and does not demonstrate tissue specificity. However, when obedience to the Benford law is calculated for individual expressed genes across thousands of cells, genes that best and least adhere to the Benford law are enriched with tissue specific or cell maintenance descriptors, respectively. Surprisingly, a positive correlation was found between the obedience a gene exhibits to the Benford law and its expression level, despite the former being calculated solely according to first digit frequency while totally ignoring the expression value itself. Nevertheless, genes with low expression that exhibit Benford behavior demonstrate tissue specific associations. These observations were extended to predict the likelihood of tissue specificity based on Benford behaviour in a supervised learning approach. CONCLUSIONS These results demonstrate the applicability and potential predictability of the Benford law for gleaning biological insight from simple count data.
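The logarithmic first-digit distribution mentioned in this abstract is P(d) = log10(1 + 1/d) for d = 1..9. The sketch below computes it and scores how far a set of counts departs from it. The abstract does not say how "obedience" was measured, so the sum of absolute differences used here is an assumption chosen for simplicity, and the function names are hypothetical.

```python
import math
from collections import Counter

def benford_expected():
    """Expected first-digit probabilities under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(n):
    """First significant digit of a positive integer count."""
    return int(str(n)[0])

def benford_deviation(counts):
    """Sum of absolute differences between observed and expected digit frequencies.

    Zero counts are skipped because they have no first significant digit.
    Lower values mean closer agreement with Benford's law.
    """
    digits = [first_digit(c) for c in counts if c > 0]
    if not digits:
        raise ValueError("need at least one positive count")
    observed = Counter(digits)
    total = len(digits)
    expected = benford_expected()
    return sum(abs(observed.get(d, 0) / total - expected[d]) for d in range(1, 10))

print(round(benford_expected()[1], 3))  # 0.301

# Powers of two are a classic Benford-like sequence; the digits 1..9 taken
# once each are uniform and should score noticeably worse.
print(benford_deviation([2 ** k for k in range(200)]) < benford_deviation(list(range(1, 10))))  # True
```

Applied per gene, `counts` would be that gene's read counts across cells; applied per dataset, it would be the counts of all genes in one sample.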
Affiliation(s)
- Deepak Karthik
  - Department of Molecular Biology, Ariel University, Ariel, 40700, Israel
- Gil Stelzer
  - Department of Molecular Biology, Ariel University, Ariel, 40700, Israel
- Sivan Gershanov
  - Department of Molecular Biology, Ariel University, Ariel, 40700, Israel
- Danny Baranes
  - Department of Molecular Biology, Ariel University, Ariel, 40700, Israel
- Mali Salmon-Divon
  - Department of Molecular Biology, Ariel University, Ariel, 40700, Israel
|
162
|
Stelzer G, Rosen N, Plaschkes I, Zimmerman S, Twik M, Fishilevich S, Stein TI, Nudel R, Lieder I, Mazor Y, Kaplan S, Dahary D, Warshawsky D, Guan-Golan Y, Kohn A, Rappaport N, Safran M, Lancet D. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses. Curr Protoc Bioinformatics 2016; 54:1.30.1-1.30.33. [PMID: 27322403 DOI: 10.1002/cpbi.5] [Citation(s) in RCA: 2318] [Impact Index Per Article: 289.8] [Indexed: 02/06/2023]
Abstract
GeneCards, the human gene compendium, enables researchers to effectively navigate and inter-relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure facilitating faster data updates, better-targeted data queries, and friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene-disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression. VarElect, another suite member, is a phenotype prioritizer for next-generation sequencing, leveraging the GeneCards and MalaCards knowledgebase. It automatically infers direct and indirect scored associations between hundreds or even thousands of variant-containing genes and disease phenotype terms. VarElect's capabilities, either independently or within TGex, our comprehensive variant analysis pipeline, help prepare for the challenge of clinical projects that involve thousands of exome/genome NGS analyses. © 2016 by John Wiley & Sons, Inc.
Affiliation(s)
- Gil Stelzer
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel; these authors contributed equally to the paper
- Naomi Rosen
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel; these authors contributed equally to the paper
- Inbar Plaschkes
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel; LifeMap Sciences Ltd, Tel Aviv, Israel
- Shahar Zimmerman
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Michal Twik
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Simon Fishilevich
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Tsippi Iny Stein
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Ron Nudel
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Dvir Dahary
  - LifeMap Sciences Ltd, Tel Aviv, Israel; Toldot Genetics Ltd, Hod Hasharon, Israel
- Asher Kohn
  - LifeMap Sciences Inc, Marshfield, Massachusetts
- Noa Rappaport
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Marilyn Safran
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Doron Lancet
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel; corresponding author
|
163
|
Fishilevich S, Zimmerman S, Kohn A, Iny Stein T, Olender T, Kolker E, Safran M, Lancet D. Genic insights from integrated human proteomics in GeneCards. Database: The Journal of Biological Databases and Curation 2016; 2016:baw030. [PMID: 27048349 PMCID: PMC4820835 DOI: 10.1093/database/baw030] [Citation(s) in RCA: 113] [Impact Index Per Article: 14.1] [Received: 10/29/2015] [Accepted: 02/23/2016] [Indexed: 11/15/2022]
Abstract
GeneCards is a one-stop shop for searchable human gene annotations (http://www.genecards.org/). Data are automatically mined from ∼120 sources and presented in an integrated web card for every human gene. We report the application of recent advances in proteomics to enhance gene annotation and classification in GeneCards. First, we constructed the Human Integrated Protein Expression Database (HIPED), a unified database of protein abundance in human tissues, based on the publicly available mass spectrometry (MS)-based proteomics sources ProteomicsDB, Multi-Omics Profiling Expression Database, Protein Abundance Across Organisms and The MaxQuant DataBase. The integrated database, residing within GeneCards, compares favourably with its individual sources, covering nearly 90% of human protein-coding genes. For gene annotation and comparisons, we first defined a protein expression vector for each gene, based on normalized abundances in 69 normal human tissues. This vector is portrayed in the GeneCards expression section as a bar graph, allowing visual inspection and comparison. These data are juxtaposed with transcriptome bar graphs. Using the protein expression vectors, we further defined a pairwise metric that helps assess expression-based pairwise proximity. This new metric for finding functional partners complements eight others, including sharing of pathways, gene ontology (GO) terms and domains, implemented in the GeneCards Suite. In parallel, we calculated proteome-based differential expression, highlighting a subset of tissues that overexpress a gene and subserving gene classification. This textual annotation allows users of VarElect, the suite’s next-generation phenotyper, to more effectively discover causative disease variants. Finally, we define the protein–RNA expression ratio and correlation as yet another attribute of every gene in each tissue, adding further annotative information.
The results constitute a significant enhancement of several GeneCards sections and help promote and organize the genome-wide structural and functional knowledge of the human proteome. Database URL: http://www.genecards.org/
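The idea of comparing genes through per-tissue expression vectors can be sketched as follows. The abstract does not specify the pairwise metric or the exact form of the protein–RNA ratio, so this sketch substitutes Pearson correlation for proximity and a log10 ratio with a pseudocount for the ratio; both choices, the function names and the example values are assumptions for illustration only, not the GeneCards implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (norm_x * norm_y)


def protein_rna_log_ratio(protein, rna, pseudocount=1.0):
    """Per-tissue log10(protein / RNA); the pseudocount avoids division by zero."""
    return [
        math.log10((p + pseudocount) / (r + pseudocount))
        for p, r in zip(protein, rna)
    ]


if __name__ == "__main__":
    # Hypothetical normalized abundances over four tissues (not real data).
    gene_a_protein = [5.0, 0.0, 12.0, 3.0]
    gene_b_protein = [4.0, 1.0, 10.0, 2.0]
    gene_a_rna = [8.0, 2.0, 9.0, 1.0]
    print(pearson(gene_a_protein, gene_b_protein))    # proximity stand-in
    print(pearson(gene_a_protein, gene_a_rna))        # protein-RNA correlation
    print(protein_rna_log_ratio(gene_a_protein, gene_a_rna))
```

In the setting the abstract describes, each vector would hold 69 entries, one per normal human tissue.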
Affiliation(s)
- Simon Fishilevich
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Shahar Zimmerman
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Asher Kohn
  - LifeMap Sciences Ltd., Tel Aviv 69710, Israel
- Tsippi Iny Stein
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Tsviya Olender
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Eugene Kolker
  - CDO Analytics, Seattle Children's Hospital, Seattle, WA 98101, USA; Bioinformatics and High-Throughput Analysis Laboratory, Seattle Children's Research Institute, Seattle, WA 98101, USA; Data-Enabled Life Sciences Alliance (DELSA), Seattle, Washington, 98101, USA; Departments of Biomedical Informatics and Medical Education and Pediatrics, University of Washington School of Medicine, Seattle, WA 98109, USA; Department of Chemistry and Chemical Biology, Northeastern University College of Science, Boston, MA 02115, USA
- Marilyn Safran
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Doron Lancet
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 7610001, Israel
|