201
Rouillard AD, Wang Z, Ma’ayan A. Publisher’s Note: Abstraction for data integration: Fusing mammalian molecular, cellular and phenotype big datasets for better knowledge extraction. Comput Biol Chem 2015; 58:104-19. [PMID: 26101093 PMCID: PMC4675694 DOI: 10.1016/j.compbiolchem.2015.06.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/09/2015] [Revised: 06/04/2015] [Accepted: 06/05/2015] [Indexed: 12/27/2022]
Abstract
With advances in genomics, transcriptomics, metabolomics and proteomics, and more expansive electronic clinical record monitoring, as well as advances in computation, we have entered the Big Data era in biomedical research. Data gathering is growing rapidly while only a small fraction of this data is converted to useful knowledge or reused in future studies. To improve this, an important concept that is often overlooked is data abstraction. To fuse and reuse biomedical datasets from diverse resources, data abstraction is frequently required. Here we summarize some of the major Big Data biomedical research resources for genomics, proteomics and phenotype data, collected from mammalian cells, tissues and organisms. We then suggest simple data abstraction methods for fusing this diverse but related data. Finally, we demonstrate examples of the potential utility of such data integration efforts, while warning about the inherent biases that exist within such data.
Affiliation(s)
- Andrew D. Rouillard
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029
- BD2K-LINCS Data Coordination and Integration Center
- Illuminating the Druggable Genome Knowledge Management Center
- Zichen Wang
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029
- BD2K-LINCS Data Coordination and Integration Center
- Illuminating the Druggable Genome Knowledge Management Center
- Avi Ma’ayan
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place Box 1215, New York, NY 10029
- BD2K-LINCS Data Coordination and Integration Center
- Illuminating the Druggable Genome Knowledge Management Center
202
Abstract
Deleterious or 'disease-associated' mutations are mutations that lead to disease with high phenotype penetrance: they are inherited in a simple Mendelian manner, or, in the case of cancer, accumulate in somatic cells leading directly to disease. However, in some cases, the substituted amino acid that results in disease is the wild-type native residue in the functionally equivalent protein in another species. Such examples are known as 'compensated pathogenic deviations' (CPDs) because, somewhere in the second species, there must be compensatory mutations that allow the protein to function normally despite having a residue which would cause disease in the first species. Depending on the nature of the mutations, compensation can occur in the same protein, or in a different protein with which it interacts. In principle, compensation can be achieved by a single mutation (most probably structurally close to the CPD), or by the cumulative effect of several mutations. Although it is clear that these effects occur in proteins, compensatory mutations are also important in RNA, potentially having an impact on disease. As a much simpler molecule, RNA provides an interesting model for understanding mechanisms of compensatory effects, both by looking at naturally occurring RNA molecules and as a means of computational simulation. This review surveys the rather limited literature that has explored these effects. Understanding the nature of CPDs is important in understanding traversal along fitness landscape valleys in evolution. It could also have applications in treating diseases that result from such mutations.
203
Korla PK, Cheng J, Huang CH, Tsai JJP, Liu YH, Kurubanjerdjit N, Hsieh WT, Chen HY, Ng KL. FARE-CAFE: a database of functional and regulatory elements of cancer-associated fusion events. Database: The Journal of Biological Databases and Curation 2015; 2015:bav086. [PMID: 26384373 PMCID: PMC4684693 DOI: 10.1093/database/bav086] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 04/29/2015] [Accepted: 08/18/2015] [Indexed: 01/08/2023]
Abstract
Chromosomal translocation (CT) is of enormous clinical interest because this disorder is associated with various major solid tumors and leukemia. A tumor-specific fusion gene event may occur when a translocation joins two separate genes. Currently, various CT databases provide information about fusion genes and their genomic elements. However, no database of the roles of fusion genes, in terms of essential functional and regulatory elements in oncogenesis, is available. FARE-CAFE is a unique combination of CTs, fusion proteins, protein domains, domain–domain interactions, protein–protein interactions, transcription factors and microRNAs, with subsequent experimental information, which cannot be found in any other CT database. Genomic DNA information, including, for example, manually collected exact locations of the first and second break points, sequences and karyotypes of fusion genes, is included. FARE-CAFE will substantially facilitate the cancer biologist’s mission of elucidating the pathogenesis of various types of cancer. This database will ultimately help to develop ‘novel’ therapeutic approaches. Database URL: http://ppi.bioinfo.asia.edu.tw/FARE-CAFE
Affiliation(s)
- Praveen Kumar Korla
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan
- Jack Cheng
- Graduate Institute of Integrated Medicine, College of Chinese Medicine, China Medical University, Taichung 40402, Taiwan
- Chien-Hung Huang
- Department of Computer Science and Information Engineering, National Formosa University, Yunlin 632, Taiwan
- Jeffrey J P Tsai
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan
- Yu-Hsuan Liu
- Department of Computer Science and Information Engineering, National Formosa University, Yunlin 632, Taiwan
- Wen-Tsong Hsieh
- Department of Pharmacology, China Medical University, Taichung 40402, Taiwan
- Huey-Yi Chen
- Department of Obstetrics and Gynecology, China Medical University Hospital, China Medical University, Taichung 40402, Taiwan
- Ka-Lok Ng
- Department of Bioinformatics and Medical Engineering, Asia University, Taichung 41354, Taiwan, Department of Medical Research, China Medical University Hospital, China Medical University, Taichung 40402, Taiwan
204
Thomas AL, Davis SM, Dierick HA. Of Fighting Flies, Mice, and Men: Are Some of the Molecular and Neuronal Mechanisms of Aggression Universal in the Animal Kingdom? PLoS Genet 2015; 11:e1005416. [PMID: 26312756 PMCID: PMC4551476 DOI: 10.1371/journal.pgen.1005416] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Indexed: 01/17/2023]
Abstract
Aggressive behavior is widespread in the animal kingdom, but the degree of molecular conservation between distantly related species is still unclear. Recent reports suggest that at least some of the molecular mechanisms underlying this complex behavior in flies show remarkable similarities with such mechanisms in mice and even humans. Surprisingly, some aspects of neuronal control of aggression also show remarkable similarity between these distantly related species. We will review these recent findings, address the evolutionary implications, and discuss the potential impact for our understanding of human diseases characterized by excessive aggression.
Affiliation(s)
- Amanda L. Thomas
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Shaun M. Davis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Herman A. Dierick
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Pathology and Immunology, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Neuroscience, Baylor College of Medicine, Houston, Texas, United States of America
- Program in Developmental Biology, Baylor College of Medicine, Houston, Texas, United States of America
205
Han SK, Kim I, Hwang J, Kim S. Network Modules of the Cross-Species Genotype-Phenotype Map Reflect the Clinical Severity of Human Diseases. PLoS One 2015; 10:e0136300. [PMID: 26301634 PMCID: PMC4547739 DOI: 10.1371/journal.pone.0136300] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 05/14/2015] [Accepted: 08/02/2015] [Indexed: 01/09/2023]
Abstract
Recent advances in genome sequencing techniques have improved our understanding of the genotype-phenotype relationship between genetic variants and human diseases. However, genetic variations uncovered from patient populations do not provide enough information to understand the mechanisms underlying the progression and clinical severity of human diseases. Moreover, building a high-resolution genotype-phenotype map is difficult due to the diverse genetic backgrounds of the human population. We built a cross-species genotype-phenotype map to explain the clinical severity of human genetic diseases. We developed a data-integrative framework to investigate network modules composed of human diseases mapped with gene essentiality measured from a model organism. Essential and nonessential genes connect diseases of different types which form clusters in the human disease network. In a large patient population study, we found that disease classes enriched with essential genes tended to show a higher mortality rate than disease classes enriched with nonessential genes. Moreover, high disease mortality rates are explained by the multiple comorbid relationships and the high pleiotropy of disease genes found in the essential gene-enriched diseases. Our results reveal that the genotype-phenotype map of a model organism can facilitate the identification of human disease-gene associations and predict human disease progression.
Affiliation(s)
- Seong Kyu Han
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790–784, Korea
- Inhae Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790–784, Korea
- Jihye Hwang
- Department of IT Convergence and Engineering, Pohang University of Science and Technology, Pohang, 790–784, Korea
- Sanguk Kim
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, 790–784, Korea
206
Wang B, Gao L, Zhang Q, Li A, Deng Y, Guo X. Diversified Control Paths: A Significant Way Disease Genes Perturb the Human Regulatory Network. PLoS One 2015; 10:e0135491. [PMID: 26284649 PMCID: PMC4540569 DOI: 10.1371/journal.pone.0135491] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 02/10/2015] [Accepted: 07/23/2015] [Indexed: 11/18/2022]
Abstract
Background: The complexity of biological systems motivates the use of their underlying networks to gain a deeper understanding of disease etiology, with human diseases viewed as perturbations of the dynamic properties of those networks. Control theory, which deals with dynamic systems, has been used successfully to capture systems-level knowledge from large amounts of quantitative biological interactions. From the perspective of system control, however, the ways in which multiple genetic factors jointly perturb a disease phenotype remain unclear. Results: In this work, we combine tools from control theory and network science to address the diversified control paths in complex networks. The ways in which disease genes perturb biological systems are then identified and quantified by the control paths in a human regulatory network. Furthermore, as an application, we prioritize candidate genes using control path analysis, with gene ontology annotation used to define similarities. We use leave-one-out cross-validation to evaluate the ability to find gene-disease relationships. The results show performance comparable to that of previous sophisticated approaches, especially in directed systems. Conclusions: Our results inspire a deeper understanding of the molecular mechanisms that drive pathological processes. Diversified control paths offer a basis for integrated intervention techniques, which will ultimately lead to the development of novel therapeutic strategies.
Affiliation(s)
- Bingbo Wang
- School of Computer Science and Technology, Xidian University, Xi'an, People’s Republic of China
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an, People’s Republic of China
- Qingfang Zhang
- School of Computer Science and Technology, Xidian University, Xi'an, People’s Republic of China
- Aimin Li
- School of Computer Science and Technology, Xi’an University of Technology, Xi'an, People’s Republic of China
- Yue Deng
- School of Computer Science and Technology, Xidian University, Xi'an, People’s Republic of China
- Institute of Software Engineering, Xidian University, Xi'an, People’s Republic of China
- Xingli Guo
- School of Computer Science and Technology, Xidian University, Xi'an, People’s Republic of China
207
James-Zorn C, Ponferrada VG, Burns KA, Fortriede JD, Lotay VS, Liu Y, Karpinka JB, Karimi K, Zorn AM, Vize PD. Xenbase: Core features, data acquisition, and data processing. Genesis 2015; 53:486-97. [PMID: 26150211 PMCID: PMC4545734 DOI: 10.1002/dvg.22873] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 05/08/2015] [Revised: 06/15/2015] [Accepted: 06/22/2015] [Indexed: 01/18/2023]
Abstract
Xenbase, the Xenopus model organism database (www.xenbase.org), is a cloud-based, web-accessible resource that integrates the diverse genomic and biological data from Xenopus research. Xenopus frogs are one of the major vertebrate animal models used for biomedical research, and Xenbase is the central repository for the enormous amount of data generated using this model tetrapod. The goal of Xenbase is to accelerate discovery by enabling investigators to make novel connections between molecular pathways in Xenopus and human disease. Our relational database and user-friendly interface make these data easy to query and allow investigators to quickly interrogate and link different data types in ways that would otherwise be difficult, time consuming, or impossible. Xenbase also enhances the value of these data through high-quality gene expression curation and data integration, by providing bioinformatics tools optimized for Xenopus experiments, and by linking Xenopus data to other model organisms and to human data. Xenbase draws in data via pipelines that download data, parse the content, and save them into appropriate files and database tables. Furthermore, Xenbase makes these data accessible to the broader biomedical community by continually providing annotated data updates to organizations such as NCBI, UniProtKB, and Ensembl. Here, we describe our bioinformatics, genome-browsing tools, data acquisition and sharing, our community-submitted and literature curation pipelines, text-mining support, gene page features, and the curation of gene nomenclature and gene models.
Affiliation(s)
- Christina James-Zorn
- Division of Developmental Biology, Cincinnati Children’s Hospital, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Virgilio G. Ponferrada
- Division of Developmental Biology, Cincinnati Children’s Hospital, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Kevin A. Burns
- Division of Developmental Biology, Cincinnati Children’s Hospital, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Joshua D. Fortriede
- Division of Developmental Biology, Cincinnati Children’s Hospital, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Vaneet S. Lotay
- Departments of Biological Science and Computer Science, University of Calgary, 2500 University Drive NW, Calgary, AB T2N1N4, Canada
- Yu Liu
- Departments of Biological Science and Computer Science, University of Calgary, 2500 University Drive NW, Calgary, AB T2N1N4, Canada
- J. Brad Karpinka
- Departments of Biological Science and Computer Science, University of Calgary, 2500 University Drive NW, Calgary, AB T2N1N4, Canada
- Kamran Karimi
- Departments of Biological Science and Computer Science, University of Calgary, 2500 University Drive NW, Calgary, AB T2N1N4, Canada
- Aaron M. Zorn
- Division of Developmental Biology, Cincinnati Children’s Hospital, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Peter D. Vize
- Departments of Biological Science and Computer Science, University of Calgary, 2500 University Drive NW, Calgary, AB T2N1N4, Canada
208
Boellner S, Becker KF. Recent progress in protein profiling of clinical tissues for next-generation molecular diagnostics. Expert Rev Mol Diagn 2015. [DOI: 10.1586/14737159.2015.1070098] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 01/08/2023]
209
Zanzoni A, Chapple CE, Brun C. Relationships between predicted moonlighting proteins, human diseases, and comorbidities from a network perspective. Front Physiol 2015; 6:171. [PMID: 26157390 PMCID: PMC4477069 DOI: 10.3389/fphys.2015.00171] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 11/12/2014] [Accepted: 05/20/2015] [Indexed: 12/26/2022]
Abstract
Moonlighting proteins are a subset of multifunctional proteins characterized by their multiple, independent, and unrelated biological functions. We recently set up a large-scale identification of moonlighting proteins using a protein-protein interaction (PPI) network approach. We established that 3% of the current human interactome is composed of predicted moonlighting proteins. We found that disease-related genes are over-represented among those candidates. Here, by comparing moonlighting candidates to non-candidates as groups, we further show that (i) they are significantly involved in more than one disease, (ii) they contribute to complex rather than monogenic diseases, (iii) the diseases in which they are involved are phenotypically different according to their annotations, and finally, (iv) they are enriched for disease pairs showing statistically significant comorbidity patterns based on Medicare records. Altogether, our results suggest that some observed comorbidities between phenotypically different diseases could be due to a shared protein involved in unrelated biological processes.
Affiliation(s)
- Andreas Zanzoni
- INSERM, UMR_S1090 TAGC Marseille, France; Aix-Marseille Université, UMR_S1090, TAGC Marseille, France
- Charles E Chapple
- INSERM, UMR_S1090 TAGC Marseille, France; Aix-Marseille Université, UMR_S1090, TAGC Marseille, France
- Christine Brun
- INSERM, UMR_S1090 TAGC Marseille, France; Aix-Marseille Université, UMR_S1090, TAGC Marseille, France; Centre National de la Recherche Scientifique Marseille, France
210
Assembly of a comprehensive regulatory network for the mammalian circadian clock: a bioinformatics approach. PLoS One 2015; 10:e0126283. [PMID: 25945798 PMCID: PMC4422523 DOI: 10.1371/journal.pone.0126283] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 01/19/2015] [Accepted: 03/31/2015] [Indexed: 12/12/2022]
Abstract
By regulating the timing of cellular processes, the circadian clock provides a way to adapt physiology and behaviour to the geophysical time. In mammals, a light-entrainable master clock located in the suprachiasmatic nucleus (SCN) controls peripheral clocks that are present in virtually every body cell. Defective circadian timing is associated with several pathologies such as cancer and metabolic and sleep disorders. To better understand the circadian regulation of cellular processes, we developed a bioinformatics pipeline encompassing the analysis of high-throughput data sets and the exploitation of published knowledge by text-mining. We identified 118 novel potential clock-regulated genes and integrated them into an existing high-quality circadian network, generating the most comprehensive network of circadian regulated genes (NCRG) to date. To validate particular elements in our network, we assessed publicly available ChIP-seq data for BMAL1, REV-ERBα/β and RORα/γ proteins and found strong evidence for circadian regulation of Elavl1, Nme1, Dhx6, Med1 and Rbbp7, all of which are involved in the regulation of tumourigenesis. Furthermore, we identified Ncl and Ddx6 as targets of RORγ and REV-ERBα/β, respectively. Most interestingly, these genes were also reported to be involved in miRNA regulation; in particular, NCL regulates several miRNAs, all involved in cancer aggressiveness. Thus, NCL represents a novel potential link via which the circadian clock, and specifically RORγ, regulates the expression of miRNAs, with particular consequences in breast cancer progression. Our findings bring us one step forward towards a mechanistic understanding of mammalian circadian regulation, and provide further evidence of the influence of circadian deregulation in cancer.
211
PGMD: a comprehensive manually curated pharmacogenomic database. The Pharmacogenomics Journal 2015; 16:124-8. [PMID: 25939485 PMCID: PMC4819767 DOI: 10.1038/tpj.2015.32] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 09/11/2014] [Revised: 02/21/2015] [Accepted: 03/02/2015] [Indexed: 01/15/2023]
Abstract
The PharmacoGenomic Mutation Database (PGMD) is a comprehensive manually curated pharmacogenomics database. Two major sources of PGMD data are peer-reviewed literature and Food and Drug Administration (FDA) and European Medicines Agency (EMA) drug labels. PGMD curators capture information on exact genomic location and sequence changes, on resulting phenotype, drugs administered, patient population, study design, disease context, statistical significance and other properties of reported pharmacogenomic variants. Variants are annotated into functional categories on the basis of their influence on pharmacokinetics, pharmacodynamics, efficacy or clinical outcome. The current release of PGMD includes over 117 000 unique pharmacogenomic observations, covering all 24 disease superclasses and nearly 1400 drugs. Over 2800 genes have associated pharmacogenomic variants, including genes in proximity to intergenic variants. PGMD is optimized for use in annotating next-generation sequencing data by providing genomic coordinates for all covered variants, including Single Nucleotide Polymorphisms (SNPs), insertions, deletions, haplotypes, diplotypes, Variable Number Tandem Repeats (VNTR), copy number variations and structural variations.
212
Gopinath K, Jayakumararaj R, Karthikeyan M. DAPD: A Knowledgebase for Diabetes Associated Proteins. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2015; 12:604-610. [PMID: 26357271 DOI: 10.1109/tcbb.2014.2359442] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 06/05/2023]
Abstract
Recent advancements in genomics and proteomics provide a solid foundation for understanding the pathogenesis of diabetes. Proteomics of diabetes-associated pathways helps to identify the most potent targets for the management of diabetes. The relevant datasets are scattered across various prominent sources, so selecting a therapeutic target for the clinical management of diabetes takes considerable time. Moreover, additional information about target proteins is needed for validation. This lacuna may be resolved by linking diabetes-associated genes, pathways and proteins, which would provide a strong base for treating diabetes and planning management strategies. Thus, a web resource, the "Diabetes Associated Proteins Database (DAPD)", has been developed using PHP and MySQL to link diabetes-associated genes, pathways and proteins. The current version of DAPD has been built with proteins associated with different types of diabetes. In addition, DAPD has been linked to external sources to provide access to more participatory proteins and their pathway networks. DAPD is expected to reduce the time needed for target selection and to pave the way for the discovery of novel anti-diabetic leads using computational drug design for diabetes management. DAPD is openly accessible at the following URL: www.mkarthikeyan.bioinfoau.org/dapd.
213
Le DH. A novel method for identifying disease associated protein complexes based on functional similarity protein complex networks. Algorithms Mol Biol 2015; 10:14. [PMID: 25969691 PMCID: PMC4427953 DOI: 10.1186/s13015-015-0044-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 06/30/2014] [Accepted: 04/01/2015] [Indexed: 12/21/2022]
Abstract
Background: Protein complexes formed by non-covalent interactions among proteins play important roles in cellular functions. Computational and purification methods have been used to identify many protein complexes and their cellular functions. However, their roles in causing disease are not yet well understood. Only a few studies have addressed the identification of disease-associated protein complexes, and they mostly utilize complicated heterogeneous networks constructed from an out-of-date database of phenotype similarity networks collected from the literature. In addition, they apply only to diseases for which tissue-specific data exist. Methods: In this study, we propose a method to identify novel disease-protein complex associations. First, we introduce a framework to construct functional similarity protein complex networks in which two protein complexes are functionally connected by shared protein elements, by shared annotating GO terms, or by protein interactions between elements in each protein complex. Second, we propose a simple but effective neighborhood-based algorithm, which yields a local similarity measure, to rank disease candidate protein complexes. Results: Comparing the predictive performance of our proposed algorithm with that of two state-of-the-art network propagation algorithms, including one we used in our previous study, we found that it performed statistically significantly better than these two algorithms for all the constructed functional similarity protein complex networks. In addition, it ran about 32 times faster than these two algorithms. Moreover, our proposed method always achieved high performance in terms of AUC values irrespective of the way the functional similarity protein complex networks were constructed and of the algorithms used. The performance of our method was also higher than that reported for some existing methods based on complicated heterogeneous networks.
Finally, we also tested our method with prostate cancer and selected the top 100 highly ranked candidate protein complexes. Interestingly, 69 of them were supported by evidence, since at least one of their protein elements is known to be associated with prostate cancer. Conclusions: Our proposed method, including the framework to construct functional similarity protein complex networks and the neighborhood-based algorithm on these networks, could be used for identification of novel disease-protein complex associations.
214
Medina DL, Di Paola S, Peluso I, Armani A, De Stefani D, Venditti R, Montefusco S, Scotto-Rosato A, Prezioso C, Forrester A, Settembre C, Wang W, Gao Q, Xu H, Sandri M, Rizzuto R, De Matteis MA, Ballabio A. Lysosomal calcium signalling regulates autophagy through calcineurin and TFEB. Nat Cell Biol 2015; 17:288-99. [PMID: 25720963 PMCID: PMC4801004 DOI: 10.1038/ncb3114] [Citation(s) in RCA: 969] [Impact Index Per Article: 107.7] [Received: 09/29/2014] [Accepted: 01/16/2015] [Indexed: 12/17/2022]
Abstract
The view of the lysosome as the terminal end of cellular catabolic pathways has been challenged by recent studies showing a central role of this organelle in the control of cell function. Here we show that a lysosomal Ca2+ signaling mechanism controls the activities of the phosphatase calcineurin and of its substrate TFEB, a master transcriptional regulator of lysosomal biogenesis and autophagy. Lysosomal Ca2+ release via mucolipin 1 (MCOLN1) activates calcineurin, which binds and de-phosphorylates TFEB, thus promoting its nuclear translocation. Genetic and pharmacological inhibition of calcineurin suppressed TFEB activity during starvation and physical exercise, while calcineurin overexpression and constitutive activation had the opposite effect. Induction of autophagy and lysosomal biogenesis via TFEB required MCOLN1-mediated calcineurin activation, linking lysosomal calcium signaling to both calcineurin regulation and autophagy induction. Thus, the lysosome reveals itself as a hub for the signaling pathways that regulate cellular homeostasis.
215
Piñero J, Queralt-Rosinach N, Bravo À, Deu-Pons J, Bauer-Mehren A, Baron M, Sanz F, Furlong LI. DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. Database: The Journal of Biological Databases and Curation 2015; 2015:bav028. [PMID: 25877637 PMCID: PMC4397996 DOI: 10.1093/database/bav028] [Citation(s) in RCA: 630] [Impact Index Per Article: 70.0] [Received: 10/17/2014] [Accepted: 03/09/2015] [Indexed: 11/25/2022]
Abstract
DisGeNET is a comprehensive discovery platform designed to address a variety of questions concerning the genetic underpinning of human diseases. DisGeNET contains over 380 000 associations between >16 000 genes and 13 000 diseases, which makes it one of the largest repositories currently available of its kind. DisGeNET integrates expert-curated databases with text-mined data, covers information on Mendelian and complex diseases, and includes data from animal disease models. It features a score based on the supporting evidence to prioritize gene-disease associations. It is an open access resource available through a web interface, a Cytoscape plugin and as a Semantic Web resource. The web interface supports user-friendly data exploration and navigation. DisGeNET data can also be analysed via the DisGeNET Cytoscape plugin, and enriched with the annotations of other plugins of this popular network analysis software suite. Finally, the information contained in DisGeNET can be expanded and complemented using Semantic Web technologies and linked to a variety of resources already present in the Linked Data cloud. Hence, DisGeNET offers one of the most comprehensive collections of human gene-disease associations and a valuable set of tools for investigating the molecular mechanisms underlying diseases of genetic origin, designed to fulfill the needs of different user profiles, including bioinformaticians, biologists and health-care practitioners. Database URL: http://www.disgenet.org/
Affiliation(s)
- Janet Piñero
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
| | - Núria Queralt-Rosinach
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
| | - Àlex Bravo
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
| | - Jordi Deu-Pons
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
| | - Anna Bauer-Mehren
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
| | - Martin Baron
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
| | - Ferran Sanz
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
| | - Laura I Furlong
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
|
216
|
Genome-wide methylation study on depression: differential methylation and variable methylation in monozygotic twins. Transl Psychiatry 2015; 5:e557. [PMID: 25918994 PMCID: PMC4462612 DOI: 10.1038/tp.2015.49] [Citation(s) in RCA: 91] [Impact Index Per Article: 10.1] [Received: 12/01/2014] [Revised: 03/02/2015] [Accepted: 03/09/2015] [Indexed: 12/13/2022] Open
Abstract
Depressive disorders have been shown to be highly influenced by environmental pathogenic factors, some of which are believed to exert stress on human brain functioning via epigenetic modifications. Previous genome-wide methylomic studies on depression have suggested that, along with differential DNA methylation, affected co-twins of monozygotic (MZ) pairs have increased DNA methylation variability, probably in line with theories of epigenetic stochasticity. Nevertheless, the potential biological roots of this variability remain largely unexplored. The current study aimed to evaluate whether DNA methylation differences within MZ twin pairs were related to differences in their psychopathological status. Data from the Illumina Infinium HumanMethylation450 Beadchip was used to evaluate peripheral blood DNA methylation of 34 twins (17 MZ pairs). Two analytical strategies were used to identify (a) differentially methylated probes (DMPs) and (b) variably methylated probes (VMPs). Most DMPs were located in genes previously related to neuropsychiatric phenotypes. Remarkably, one of these DMPs (cg01122889) was located in the WDR26 gene, the DNA sequence of which has been implicated in major depressive disorder from genome-wide association studies. Expression of WDR26 has also been proposed as a biomarker of depression in human blood. Complementarily, VMPs were located in genes such as CACNA1C, IGF2 and the p38 MAP kinase MAPK11, showing enrichment for biological processes such as glucocorticoid signaling. These results expand on previous research to indicate that both differential methylation and differential variability have a role in the etiology and clinical manifestation of depression, and provide clues on specific genomic loci of potential interest in the epigenetics of depression.
|
217
|
Royer-Bertrand B, Rivolta C. Whole genome sequencing as a means to assess pathogenic mutations in medical genetics and cancer. Cell Mol Life Sci 2015; 72:1463-71. [PMID: 25548800 PMCID: PMC11113357 DOI: 10.1007/s00018-014-1807-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 06/29/2014] [Revised: 12/12/2014] [Accepted: 12/15/2014] [Indexed: 12/17/2022]
Abstract
The past decade has seen the emergence of next-generation sequencing (NGS) technologies, which have revolutionized the field of human molecular genetics. With NGS, significant portions of the human genome can now be assessed by direct sequence analysis, highlighting normal and pathological variants of our DNA. Recent advances have also allowed the sequencing of complete genomes, by a method referred to as whole genome sequencing (WGS). In this work, we review the use of WGS in medical genetics, with specific emphasis on the benefits and the disadvantages of this technique for detecting genomic alterations leading to Mendelian human diseases and to cancer.
Affiliation(s)
- Beryl Royer-Bertrand
- Department of Medical Genetics, University of Lausanne, Rue Du Bugnon 27, 1005 Lausanne, Switzerland
| | - Carlo Rivolta
- Department of Medical Genetics, University of Lausanne, Rue Du Bugnon 27, 1005 Lausanne, Switzerland
|
218
|
Onco-proteogenomics: cancer proteomics joins forces with genomics. Nat Methods 2015; 11:1107-13. [PMID: 25357240 DOI: 10.1038/nmeth.3138] [Citation(s) in RCA: 106] [Impact Index Per Article: 11.8] [Received: 10/09/2013] [Accepted: 06/26/2014] [Indexed: 12/21/2022]
Abstract
The complexities of tumor genomes are rapidly being uncovered, but how they are regulated into functional proteomes remains poorly understood. Standard proteomics workflows use databases of known proteins, but these databases do not capture the uniqueness of the cancer transcriptome, with its point mutations, unusual splice variants and gene fusions. Onco-proteogenomics integrates mass spectrometry-generated data with genomic information to identify tumor-specific peptides. Linking tumor-derived DNA, RNA and protein measurements into a central-dogma perspective has the potential to improve our understanding of cancer biology.
|
219
|
Apalasamy YD, Mohamed Z. Obesity and genomics: role of technology in unraveling the complex genetic architecture of obesity. Hum Genet 2015; 134:361-74. [PMID: 25687726 DOI: 10.1007/s00439-015-1533-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 10/09/2014] [Accepted: 02/02/2015] [Indexed: 01/15/2023]
Abstract
Obesity is a complex and multifactorial disease that occurs as a result of the interaction between "obesogenic" environmental factors and genetic components. Although the genetic component of obesity is clear from heritability studies, the genetic basis remains largely elusive. Successes have been achieved in identifying the causal genes for monogenic obesity using animal models and linkage studies, but these approaches are not fruitful for polygenic obesity. The development of the genome-wide association approach has brought breakthrough discoveries of genetic variants for polygenic obesity, with tens of new susceptibility loci identified. However, the common SNPs accounted for only a proportion of heritability. The arrival of NGS technologies and the completion of the 1000 Genomes Project have brought other new methods to dissect the genetic architecture of obesity, for example, the use of exome genotyping arrays and deep sequencing of candidate loci identified from GWAS to study rare variants. In this review, we summarize and discuss the developments of these genetic approaches in human obesity.
Affiliation(s)
- Yamunah Devi Apalasamy
- Department of Pharmacology, Pharmacogenomics Laboratory, Faculty of Medicine, University of Malaya, 50603, Kuala Lumpur, Malaysia,
|
220
|
Priedigkeit N, Wolfe N, Clark NL. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks. PLoS Genet 2015; 11:e1004967. [PMID: 25679399 PMCID: PMC4334549 DOI: 10.1371/journal.pgen.1004967] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 07/22/2014] [Accepted: 12/19/2014] [Indexed: 12/27/2022] Open
Abstract
Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting “disease map” network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.
Molecular evolution has informed our understanding of gene function; however, classical methods have largely been static in their implementation, focusing on single genes. Here, we present and prove the utility of a dynamic, network-based understanding of molecular evolution to infer relationships between genes associated with human diseases. We have shown previously that groups of genes within functional niches tend to share similar evolutionary histories. Exploiting the availability of whole genomes from multiple species, these histories can be numerically scored and dynamically compared to one another using a sequence-based signature termed Evolutionary Rate Covariation (ERC). To explore potential applications, we characterized ERC amongst disease genes and found that many diseases contain significant ERC signatures between their contributing genes. We show that ERC can also prioritize “true” disease genes amongst unrelated gene candidates. Lastly, these signatures can serve as a foundation for creating instructive gene-based networks, unveiling novel relationships between diseases thought to be clinically distinct. Our hope is that this study will add to the increasing evidence that advancing our understanding of molecular evolution can be a crucial asset in large-scale gene discovery pursuits (Link to our webserver that provides intuitive ERC analysis tools: http://csb.pitt.edu/erc_analysis/).
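The idea of rates that "covary over time" can be illustrated with a small sketch. It assumes, purely for illustration, that each gene is summarized by a vector of per-branch evolutionary rates and that covariation is scored with a Pearson correlation; the abstract does not specify the scoring procedure, and the function names, gene names and rate values below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length rate vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant vector has no defined correlation")
    return sxy / sqrt(sxx * syy)


def rank_candidates(disease_rates, candidates):
    """Order candidate genes by mean correlation with known disease genes.

    disease_rates: list of rate vectors, one per known disease gene.
    candidates: dict mapping a candidate gene name to its rate vector.
    """
    scores = {
        name: sum(pearson(rates, d) for d in disease_rates) / len(disease_rates)
        for name, rates in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical per-branch rates; real values would come from sequence alignments.
known = [[0.9, 1.1, 1.3, 0.8]]
cands = {"geneA": [1.0, 1.2, 1.4, 0.9], "geneB": [1.3, 1.1, 0.9, 1.4]}
ranking = rank_candidates(known, cands)
```

In this toy input, geneA tracks the known disease gene branch by branch while geneB moves in the opposite direction, so geneA is ranked first. A prioritization of the kind described in the abstract would apply a comparable ranking to every candidate in a genomic region.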
Affiliation(s)
- Nolan Priedigkeit
- Medical Scientist Training Program, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, United States of America
- Department of Pharmacology and Chemical Biology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Nicholas Wolfe
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Nathan L. Clark
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- * E-mail:
|
221
|
Jin K, Musso G, Vlasblom J, Jessulat M, Deineko V, Negroni J, Mosca R, Malty R, Nguyen-Tran DH, Aoki H, Minic Z, Freywald T, Phanse S, Xiang Q, Freywald A, Aloy P, Zhang Z, Babu M. Yeast Mitochondrial Protein–Protein Interactions Reveal Diverse Complexes and Disease-Relevant Functional Relationships. J Proteome Res 2015; 14:1220-37. [DOI: 10.1021/pr501148q] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 12/25/2022]
Affiliation(s)
- Ke Jin
- Terrence Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Gabriel Musso
- Cardiovascular Division, Brigham and Women’s Hospital, Boston, Massachusetts 02115, United States
- Department of Medicine, Harvard Medical School, Boston, Massachusetts 02115, United States
| | - James Vlasblom
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Matthew Jessulat
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Viktor Deineko
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Jacopo Negroni
- Joint IRB−BSC Program in Computational Biology, IRB, Barcelona 08028, Spain
| | - Roberto Mosca
- Joint IRB−BSC Program in Computational Biology, IRB, Barcelona 08028, Spain
| | - Ramy Malty
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Diem-Hang Nguyen-Tran
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Hiroyuki Aoki
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Zoran Minic
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Tanya Freywald
- Cancer Research Unit, Saskatchewan Cancer Agency, Saskatoon, Saskatchewan S7N 5E5, Canada
| | - Sadhna Phanse
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
| | - Qian Xiang
- Terrence Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada
| | - Andrew Freywald
- Cancer Research Unit, Saskatchewan Cancer Agency, Saskatoon, Saskatchewan S7N 5E5, Canada
| | - Patrick Aloy
- Joint IRB−BSC Program in Computational Biology, IRB, Barcelona 08028, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona 08010, Spain
| | - Zhaolei Zhang
- Terrence Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada
| | - Mohan Babu
- Department of Biochemistry, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
|
222
|
Perkins JR, Sanak M, Canto G, Blanca M, Cornejo-García JA. Unravelling adverse reactions to NSAIDs using systems biology. Trends Pharmacol Sci 2015; 36:172-80. [PMID: 25577398 DOI: 10.1016/j.tips.2014.12.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 08/22/2014] [Revised: 12/02/2014] [Accepted: 12/05/2014] [Indexed: 12/23/2022]
Abstract
We introduce the reader to systems biology, using adverse drug reactions (ADRs), specifically hypersensitivity reactions to multiple non-steroidal anti-inflammatory drugs (NSAIDs), as a model. To disentangle the different processes that contribute to these reactions - from drug intake to the appearance of symptoms - it will be necessary to create high-throughput datasets. Just as crucial will be the use of systems biology to integrate and make sense of them. We review previous work using systems biology to study related pathologies such as asthma/allergy, and NSAID metabolism. We show examples of their application to NSAIDs-hypersensitivity using current datasets. We describe breakthroughs in high-throughput technology and speculate on their use to improve our understanding of this and other drug-induced pathologies.
Affiliation(s)
- James R Perkins
- Research Laboratory, IBIMA, Regional University Hospital of Malaga, UMA, Malaga, Spain
| | - Marek Sanak
- Division of Molecular Biology and Clinical Genetics, Department of Medicine, Jagiellonian University Medical College, Krakow, Poland
| | | | - Miguel Blanca
- Allergy Unit, IBIMA, Regional University Hospital of Malaga, UMA, Malaga, Spain.
| | - José Antonio Cornejo-García
- Research Laboratory, IBIMA, Regional University Hospital of Malaga, UMA, Malaga, Spain; Allergy Unit, IBIMA, Regional University Hospital of Malaga, UMA, Malaga, Spain
|
223
|
Aoki-Kinoshita KF, Kinjo AR, Morita M, Igarashi Y, Chen YA, Shigemoto Y, Fujisawa T, Akune Y, Katoda T, Kokubu A, Mori T, Nakao M, Kawashima S, Okamoto S, Katayama T, Ogishima S. Implementation of linked data in the life sciences at BioHackathon 2011. J Biomed Semantics 2015; 6:3. [PMID: 25973165 PMCID: PMC4429360 DOI: 10.1186/2041-1480-6-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 10/19/2013] [Accepted: 11/27/2014] [Indexed: 01/23/2023] Open
Abstract
Background: Linked Data has gained some attention recently in the life sciences as an effective way to provide and share data. As a part of the Semantic Web, data are linked so that a person or machine can explore the web of data. Resource Description Framework (RDF) is the standard means of implementing Linked Data. In the process of generating RDF data, not only are data simply linked to one another, the links themselves are characterized by ontologies, thereby allowing the types of links to be distinguished. Although there is a high labor cost to define an ontology for data providers, the merit lies in the higher level of interoperability with data analysis and visualization software. This increase in interoperability facilitates the multi-faceted retrieval of data, and the appropriate data can be quickly extracted and visualized. Such retrieval is usually performed using the SPARQL (SPARQL Protocol and RDF Query Language) query language, which is used to query RDF data stores. For the database provider, such interoperability will surely lead to an increase in the number of users.
Results: This manuscript describes the experiences and discussions shared among participants of the week-long BioHackathon 2011 who went through the development of RDF representations of their own data and developed specific RDF and SPARQL use cases. Advice on what to consider when developing RDF representations of data is provided for bioinformaticians considering making data available and interoperable.
Conclusions: Participants of the BioHackathon 2011 were able to produce RDF representations of their data and gain a better understanding of the requirements for producing such data in a period of just five days. We summarize the work accomplished with the hope that it will be useful for researchers involved in developing laboratory databases or data analysis, and those who are considering such technologies as RDF and Linked Data.
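The notion of typed links that can be queried by pattern may be easier to see in a toy sketch. The example below is not RDF tooling and not SPARQL: it stores subject-predicate-object triples as plain Python tuples and matches them against a pattern with wildcards, which is roughly what a basic SPARQL triple pattern does. All identifiers (the `ex:` names, `match`, `pathways_for_gene`) are hypothetical and chosen only for illustration.

```python
# Each triple is (subject, predicate, object); the predicate names the link type.
TRIPLES = [
    ("ex:gene1", "rdf:type", "ex:Gene"),
    ("ex:gene1", "ex:encodes", "ex:protein1"),
    ("ex:protein1", "rdf:type", "ex:Protein"),
    ("ex:protein1", "ex:participatesIn", "ex:pathway1"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def pathways_for_gene(triples, gene):
    """Follow two typed links: gene -> encoded protein -> pathway.

    Loosely comparable to the SPARQL pattern
    { gene ex:encodes ?protein . ?protein ex:participatesIn ?pathway }.
    """
    found = []
    for _, _, protein in match(triples, s=gene, p="ex:encodes"):
        for _, _, pathway in match(triples, s=protein, p="ex:participatesIn"):
            found.append(pathway)
    return found


pathways = pathways_for_gene(TRIPLES, "ex:gene1")
```

Because every link carries its own type, the same data can answer unrelated questions (all resources of a given type, everything a protein participates in) without a schema change, which is the interoperability benefit the abstract describes. A production setup would use an RDF store and a SPARQL endpoint rather than in-memory tuples.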
Affiliation(s)
- Kiyoko F Aoki-Kinoshita
- Department of Bioinformatics, Faculty of Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo, 192-8577 Japan
| | - Akira R Kinjo
- Laboratory of Protein Informatics, Laboratory of Protein Databases, and Protein Data Bank Japan, Research Center for Structural and Functional Proteomics, Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka, 565-0871 Japan
| | - Mizuki Morita
- Center for Knowledge Structuring, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan
| | - Yoshinobu Igarashi
- National Institute of Biomedical Innovation, 7-6-8 Asagi Saito, Ibaraki-City, Osaka, 567-0085 Japan
| | - Yi-An Chen
- National Institute of Biomedical Innovation, 7-6-8 Asagi Saito, Ibaraki-City, Osaka, 567-0085 Japan
| | - Yasumasa Shigemoto
- DNA Data Bank of Japan, National Institute of Genetics, Yata 1111, Mishima, Shizuoka, 411-8540 Japan
| | - Takatomo Fujisawa
- DNA Data Bank of Japan, National Institute of Genetics, Yata 1111, Mishima, Shizuoka, 411-8540 Japan
| | - Yukie Akune
- Department of Bioinformatics, Faculty of Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo, 192-8577 Japan
| | - Takeo Katoda
- Department of Bioinformatics, Faculty of Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo, 192-8577 Japan
| | - Anna Kokubu
- Department of Bioinformatics, Faculty of Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo, 192-8577 Japan
| | - Takaaki Mori
- Department of Bioinformatics, Faculty of Engineering, Soka University, 1-236 Tangi-machi, Hachioji, Tokyo, 192-8577 Japan
| | - Mitsuteru Nakao
- Next Generation Systems Core Function Unit, Eisai Product Creation Systems, Eisai Co., Ltd, Tsukuba, Ibaraki, Japan
| | - Shuichi Kawashima
- Database Center for Life Science, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa-shi, Chiba, 277-0871 Japan
| | - Shinobu Okamoto
- Database Center for Life Science, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa-shi, Chiba, 277-0871 Japan
| | - Toshiaki Katayama
- Database Center for Life Science, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa-shi, Chiba, 277-0871 Japan
| | - Soichi Ogishima
- Department of Bioclinical informatics, Tohoku Medical Megabank Organization, Tohoku University, Seiryo-cho 4-1, Aoba-ku, Sendai-shi Miyagi, 980-8575 Japan
|
224
|
Claxton LD. The history, genotoxicity, and carcinogenicity of carbon-based fuels and their emissions: Part 5. Summary, comparisons, and conclusions. MUTATION RESEARCH-REVIEWS IN MUTATION RESEARCH 2015; 763:103-47. [DOI: 10.1016/j.mrrev.2014.10.001] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 05/13/2014] [Revised: 10/04/2014] [Accepted: 10/06/2014] [Indexed: 12/19/2022]
|
225
|
Le DH, Xuan Hoai N, Kwon YK. A Comparative Study of Classification-Based Machine Learning Methods for Novel Disease Gene Prediction. ADVANCES IN INTELLIGENT SYSTEMS AND COMPUTING 2015. [DOI: 10.1007/978-3-319-11680-8_46] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022]
|
226
|
Taşan M, Musso G, Hao T, Vidal M, MacRae CA, Roth FP. Selecting causal genes from genome-wide association studies via functionally coherent subnetworks. Nat Methods 2014; 12:154-9. [PMID: 25532137 DOI: 10.1038/nmeth.3215] [Citation(s) in RCA: 75] [Impact Index Per Article: 7.5] [Received: 05/01/2014] [Accepted: 11/24/2014] [Indexed: 12/27/2022]
Abstract
Genome-wide association (GWA) studies have linked thousands of loci to human diseases, but the causal genes and variants at these loci generally remain unknown. Although investigators typically focus on genes closest to the associated polymorphisms, the causal gene is often more distal. Reliance on published work to prioritize candidates is biased toward well-characterized genes. We describe a 'prix fixe' strategy and software that uses genome-scale shared-function networks to identify sets of mutually functionally related genes spanning multiple GWA loci. Using associations from ∼100 GWA studies covering ten cancer types, our approach outperformed the common alternative strategy in ranking known cancer genes. As more GWA loci are discovered, the strategy will have increased power to elucidate the causes of human disease.
Affiliation(s)
- Murat Taşan
- 1] Donnelly Centre, University of Toronto, Toronto, Ontario, Canada. [2] Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada. [3] Department of Computer Science, University of Toronto, Toronto, Ontario, Canada. [4] Center for Cancer Systems Biology (CCSB), Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA. [5] Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
| | - Gabriel Musso
- 1] Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA. [2] Cardiovascular Division, Brigham and Women's Hospital, Boston, Massachusetts, USA
| | - Tong Hao
- 1] Center for Cancer Systems Biology (CCSB), Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA. [2] Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
| | - Marc Vidal
- 1] Center for Cancer Systems Biology (CCSB), Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA. [2] Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
| | - Calum A MacRae
- 1] Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA. [2] Cardiovascular Division, Brigham and Women's Hospital, Boston, Massachusetts, USA
| | - Frederick P Roth
- 1] Donnelly Centre, University of Toronto, Toronto, Ontario, Canada. [2] Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada. [3] Department of Computer Science, University of Toronto, Toronto, Ontario, Canada. [4] Center for Cancer Systems Biology (CCSB), Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, USA. [5] Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada. [6] Canadian Institute for Advanced Research, Toronto, Ontario, Canada
|
227
|
Dubchak I, Balasubramanian S, Wang S, Meyden C, Sulakhe D, Poliakov A, Börnigen D, Xie B, Taylor A, Ma J, Paciorkowski AR, Mirzaa GM, Dave P, Agam G, Xu J, Al-Gazali L, Mason CE, Ross ME, Maltsev N, Gilliam TC. An integrative computational approach for prioritization of genomic variants. PLoS One 2014; 9:e114903. [PMID: 25506935 PMCID: PMC4266634 DOI: 10.1371/journal.pone.0114903] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 07/01/2014] [Accepted: 11/15/2014] [Indexed: 12/27/2022] Open
Abstract
An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and in efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of the use of a distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that cause narrowing of the outlet channel and therefore lead to a reduced folate permeation rate. The described approach also enabled correct identification of several genes previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validation. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.
Affiliation(s)
- Inna Dubchak
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
- * E-mail: (ID); (NM)
| | - Sandhya Balasubramanian
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
| | - Sheng Wang
- Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
| | - Cem Meyden
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, New York, United States of America
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, New York, United States of America
- Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, New York, New York, United States of America
| | - Dinanath Sulakhe
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Computation Institute, University of Chicago/Argonne National Laboratory, Chicago, Illinois, United States of America
| | - Alexander Poliakov
- Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America
| | - Daniela Börnigen
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
| | - Bingqing Xie
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Department of Computer Science, Illinois Institute of Technology, Chicago, Illinois, United States of America
| | - Andrew Taylor
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
| | - Jianzhu Ma
- Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
| | - Alex R. Paciorkowski
- Departments of Neurology, Pediatrics, and Biomedical Genetics and Center for Neural Development and Disease, University of Rochester Medical Center, Rochester, New York, United States of America
| | - Ghayda M. Mirzaa
- Seattle Children's Research Institute and Department of Pediatrics, University of Washington, Seattle, Washington, United States of America
| | - Paul Dave
- Computation Institute, University of Chicago/Argonne National Laboratory, Chicago, Illinois, United States of America
| | - Gady Agam
- Department of Computer Science, Illinois Institute of Technology, Chicago, Illinois, United States of America
| | - Jinbo Xu
- Toyota Technological Institute at Chicago, Chicago, Illinois, United States of America
| | - Lihadh Al-Gazali
- Department of Pediatrics, Faculty of Medicine and Health Sciences, United Arab Emirates University, Al-Ain, UAE
| | - Christopher E. Mason
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, New York, United States of America
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, New York, United States of America
- Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, New York, New York, United States of America
| | - M. Elizabeth Ross
- Laboratory of Neurogenetics and Development, Weill Cornell Medical College, New York, New York, United States of America
| | - Natalia Maltsev
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Computation Institute, University of Chicago/Argonne National Laboratory, Chicago, Illinois, United States of America
- * E-mail: (ID); (NM)
| | - T. Conrad Gilliam
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Computation Institute, University of Chicago/Argonne National Laboratory, Chicago, Illinois, United States of America
|
228
|
Abstract
Background: The majority of genetic biomarkers for human cancers are defined by statistical screening of high-throughput genomics data. While a large number of genetic biomarkers have been proposed for diagnostic and prognostic applications, only a small number have been applied in the clinic. Similarly, the use of proteomics methods for the discovery of cancer biomarkers is increasing. The emerging field of proteogenomics seeks to enrich the value of genomics and proteomics approaches by studying the intersection of genomics and proteomics data. This task is challenging due to the complex nature of transcriptional and translational regulatory mechanisms and the disparities between genomic and proteomic data from the same samples. In this study, we have examined tumor antigens as potential biomarkers for breast cancer using genomics and proteomics data from previously reported laser capture microdissected ER+ tumor samples. Results: We applied proteogenomic analyses to study the genetic aberrations of 32 tumor antigens determined in the proteomic data. We found that tumor antigens that are aberrantly expressed at the genetic level and expressed at the protein level are likely involved in perturbing pathways directly linked to the hallmarks of cancer. The results of the proteogenomic analysis of the 32 tumor antigens studied here capture largely the same pathway irregularities as those elucidated from large-scale genomics screens, where several thousand genes are often found to be perturbed. Conclusion: Tumor antigens are a group of proteins recognized by the cells of the immune system. Specifically, they are recognized in tumor cells, where they are present in larger than usual amounts or are physicochemically altered to a degree at which they no longer resemble native human proteins. This proteogenomic analysis of 32 tumor antigens suggests that tumor antigens have the potential to be highly specific biomarkers for different cancers.
|
229
|
Chang D, Gao F, Slavney A, Ma L, Waldman YY, Sams AJ, Billing-Ross P, Madar A, Spritz R, Keinan A. Accounting for eXentricities: analysis of the X chromosome in GWAS reveals X-linked genes implicated in autoimmune diseases. PLoS One 2014; 9:e113684. [PMID: 25479423 PMCID: PMC4257614 DOI: 10.1371/journal.pone.0113684] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Received: 07/14/2014] [Accepted: 10/30/2014] [Indexed: 12/12/2022]
Abstract
Many complex human diseases are highly sexually dimorphic, suggesting a potential contribution of the X chromosome to disease risk. However, the X chromosome has been neglected or incorrectly analyzed in most genome-wide association studies (GWAS). We present tailored analytical methods and software that facilitate X-wide association studies (XWAS), which we further applied to reanalyze data from 16 GWAS of different autoimmune and related diseases (AID). We associated several X-linked genes with disease risk, among which (1) ARHGEF6 is associated with Crohn's disease and replicated in a study of ulcerative colitis, another inflammatory bowel disease (IBD). Indeed, ARHGEF6 interacts with a gastric bacterium that has been implicated in IBD. (2) CENPI is associated with three different AID, which is compelling in light of known associations with AID of autosomal genes encoding centromere proteins, as well as established autosomal evidence of pleiotropy between autoimmune diseases. (3) We replicated a previous association of FOXP3, a transcription factor that regulates T-cell development and function, with vitiligo; and (4) we discovered that C1GALT1C1 exhibits sex-specific effect on disease risk in both IBDs. These and other X-linked genes that we associated with AID tend to be highly expressed in tissues related to immune response, participate in major immune pathways, and display differential gene expression between males and females. Combined, the results demonstrate the importance of the X chromosome in autoimmunity, reveal the potential of extensive XWAS, even based on existing data, and provide the tools and incentive to properly include the X chromosome in future studies.
Affiliation(s)
- Diana Chang
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
- Program in Computational Biology and Medicine, Cornell University, Ithaca, New York, United States of America
| | - Feng Gao
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
| | - Andrea Slavney
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
- Graduate Field of Genetics, Genomics and Development, Cornell University, Ithaca, New York, United States of America
| | - Li Ma
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
- Department of Animal and Avian Sciences, University of Maryland, College Park, Maryland, United States of America
| | - Yedael Y. Waldman
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
| | - Aaron J. Sams
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
| | - Paul Billing-Ross
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
- Graduate Field of Genetics, Genomics and Development, Cornell University, Ithaca, New York, United States of America
| | - Aviv Madar
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
| | - Richard Spritz
- Human Medical Genetics and Genomics Program, University of Colorado School of Medicine, Aurora, Colorado, United States of America
| | - Alon Keinan
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
- Program in Computational Biology and Medicine, Cornell University, Ithaca, New York, United States of America
- Graduate Field of Genetics, Genomics and Development, Cornell University, Ithaca, New York, United States of America
|
230
|
Hierarchical closeness efficiently predicts disease genes in a directed signaling network. Comput Biol Chem 2014; 53PB:191-197. [PMID: 25462327 DOI: 10.1016/j.compbiolchem.2014.08.023] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 02/25/2014] [Revised: 08/13/2014] [Accepted: 08/25/2014] [Indexed: 11/21/2022]
Abstract
BACKGROUND Many structural centrality measures were proposed to predict putative disease genes on biological networks. Closeness is one of the best-known structural centrality measures, and its effectiveness for disease gene prediction on undirected biological networks has been frequently reported. However, it is not clear whether closeness is effective for disease gene prediction on directed biological networks such as signaling networks. RESULTS In this paper, we first show that closeness does not significantly outperform other well-known centrality measures such as Degree, Betweenness, and PageRank for disease gene prediction on a human signaling network. In addition, we observed that prediction accuracy by the closeness measure was worse than that by a reachability measure, but closeness could efficiently predict disease genes among a set of genes with the same reachability value. Based on this observation, we devised a novel structural measure, hierarchical closeness, by combining reachability and closeness such that all genes are first ranked by the degree of reachability and then the tied genes are further ranked by closeness. We discovered that hierarchical closeness outperforms other structural centrality measures in disease gene prediction. We also found that the set of highly ranked genes in terms of hierarchical closeness is clearly different from that of hub genes with high connectivity. More interestingly, these findings were consistently reproduced in a random Boolean network model. Finally, we found that genes with relatively high hierarchical closeness are significantly likely to encode proteins in the extracellular matrix and receptor proteins in a human signaling network, supporting the fact that half of all modern medicinal drugs target receptor-encoding genes. CONCLUSION Taken together, hierarchical closeness proposed in this study is a novel structural measure to efficiently predict putative disease genes in a directed signaling network.
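The two-level ranking described in this abstract (rank by reachability first, then break ties by closeness) can be illustrated with a minimal sketch. The sketch is not the authors' implementation: the toy graph, the function names, and the particular definitions used here (reachability as the number of nodes reachable along directed edges, closeness as that count divided by the summed shortest-path lengths) are assumptions chosen for illustration, and the paper's exact formulas may differ.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop counts from source to every other node reachable along directed edges."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    del dist[source]  # the source itself is not counted as reachable
    return dist


def hierarchical_ranking(graph):
    """Rank nodes by reachability, breaking ties by closeness (assumed definitions)."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    scores = {}
    for node in nodes:
        dist = bfs_distances(graph, node)
        reach = len(dist)
        closeness = reach / sum(dist.values()) if dist else 0.0
        scores[node] = (reach, closeness)
    # Higher reachability first, then higher closeness, then name for determinism.
    order = sorted(nodes, key=lambda n: (-scores[n][0], -scores[n][1], n))
    return order, scores


# Hypothetical toy signaling network (adjacency lists of directed edges).
graph = {"A": ["B"], "B": ["C"], "C": ["D"], "E": ["B", "C", "D"]}
ranking, scores = hierarchical_ranking(graph)
print(ranking)
```

In this toy network, E and A both reach three nodes, so closeness decides between them; C has a high closeness value but reaches only one node and therefore ranks below both.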
|
231
|
Podder A, Latha N. New Insights into Schizophrenia Disease Genes Interactome in the Human Brain: Emerging Targets and Therapeutic Implications in the Postgenomics Era. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2014; 18:754-66. [DOI: 10.1089/omi.2014.0082] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 12/22/2022]
Affiliation(s)
- Avijit Podder
- Bioinformatics Infrastructure Facility, Sri Venkateswara College, University of Delhi, New Delhi, India
| | - Narayanan Latha
- Bioinformatics Infrastructure Facility, Sri Venkateswara College, University of Delhi, New Delhi, India
|
232
|
Magesh R, George Priya Doss C. Computational pipeline to identify and characterize functional mutations in ornithine transcarbamylase deficiency. 3 Biotech 2014; 4:621-634. [PMID: 28324312 PMCID: PMC4235886 DOI: 10.1007/s13205-014-0216-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 02/12/2014] [Accepted: 04/01/2014] [Indexed: 11/28/2022]
Abstract
Ornithine transcarbamylase (OTC) (E.C. 2.1.3.3) is one of the enzymes of the urea cycle, a sequence of reactions that takes place in liver cells. Protein metabolism in the body produces surplus nitrogen, which is converted into urea and excreted by the kidneys; within this cycle, OTC helps convert toxic free nitrogen into urea. Ornithine transcarbamylase deficiency (OTCD: OMIM#311250) is caused by mutations in the OTC gene. To date, more than 200 mutations have been reported. Mutations in the OTC gene alter enzyme production, which impairs the ability to carry out the chemical reaction. This computational analysis was undertaken to identify deleterious nsSNPs in the OTC gene that cause OTCD, using five computational tools: SIFT, PolyPhen 2, I-Mutant 3, SNPs&Go, and PhD-SNP. The molecular basis of the OTC gene and OTCD has so far been studied only partially. Hence, in silico categorization of functional SNPs in the OTC gene can provide valuable insight for the future diagnosis and treatment of OTCD.
Affiliation(s)
- R Magesh
- Department of Biotechnology, Faculty of Biomedical Sciences, Technology and Research, Sri Ramachandra University, Chennai, 600116, India
| | - C George Priya Doss
- Medical Biotechnology Division, School of Biosciences and Technology, VIT University, Vellore, India.
|
233
|
Olsen LR, Campos B, Barnkob MS, Winther O, Brusic V, Andersen MH. Bioinformatics for cancer immunotherapy target discovery. Cancer Immunol Immunother 2014; 63:1235-49. [PMID: 25344903 PMCID: PMC11029190 DOI: 10.1007/s00262-014-1627-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 12/02/2013] [Accepted: 10/08/2014] [Indexed: 12/13/2022]
Abstract
The mechanisms of immune response to cancer have been studied extensively and great effort has been invested into harnessing the therapeutic potential of the immune system. Immunotherapies have seen significant advances in the past 20 years, but the full potential of protective and therapeutic cancer immunotherapies has yet to be fulfilled. The insufficient efficacy of existing treatments can be attributed to a number of biological and technical issues. In this review, we detail the current limitations of immunotherapy target selection and design, and review computational methods to streamline therapy target discovery in a bioinformatics analysis pipeline. We describe specialized bioinformatics tools and databases for three main bottlenecks in immunotherapy target discovery: the cataloging of potentially antigenic proteins, the identification of potential HLA binders, and the selection of epitopes and co-targets for single-epitope and multi-epitope strategies. We provide examples of application to the well-known tumor antigen HER2 and suggest bioinformatics methods to ameliorate therapy resistance and ensure efficient and lasting control of tumors.
Affiliation(s)
- Lars Rønn Olsen
- Department of Biology, Bioinformatics Centre, University of Copenhagen, Ole Maaløes Vej 5, 2200, Copenhagen, Denmark,
|
234
|
Kriventseva EV, Tegenfeldt F, Petty TJ, Waterhouse RM, Simão FA, Pozdnyakov IA, Ioannidis P, Zdobnov EM. OrthoDB v8: update of the hierarchical catalog of orthologs and the underlying free software. Nucleic Acids Res 2014; 43:D250-6. [PMID: 25428351 PMCID: PMC4383991 DOI: 10.1093/nar/gku1220] [Citation(s) in RCA: 241] [Impact Index Per Article: 24.1] [Indexed: 12/30/2022]
Abstract
Orthology, refining the concept of homology, is the cornerstone of evolutionary comparative studies. With the ever-increasing availability of genomic data, inference of orthology has become instrumental for generating hypotheses about gene functions crucial to many studies. This update of the OrthoDB hierarchical catalog of orthologs (http://www.orthodb.org) covers 3027 complete genomes, including the most comprehensive set of 87 arthropods, 61 vertebrates, 227 fungi and 2627 bacteria (sampling the most complete and representative genomes from over 11,000 available). In addition to the most extensive integration of functional annotations from UniProt, InterPro, GO, OMIM, model organism phenotypes and COG functional categories, OrthoDB uniquely provides evolutionary annotations including rates of ortholog sequence divergence, copy-number profiles, sibling groups and gene architectures. We re-designed the entirety of the OrthoDB website from the underlying technology to the user interface, enabling the user to specify species of interest and to select the relevant orthology level by the NCBI taxonomy. The text searches allow use of complex logic with various identifiers of genes, proteins, domains, ontologies or annotation keywords and phrases. Gene copy-number profiles can also be queried. This release comes with the freely available underlying ortholog clustering pipeline (http://www.orthodb.org/software).
Affiliation(s)
- Evgenia V Kriventseva
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Fredrik Tegenfeldt
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Tom J Petty
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Robert M Waterhouse
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Felipe A Simão
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Igor A Pozdnyakov
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Panagiotis Ioannidis
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
| | - Evgeny M Zdobnov
- Department of Genetic Medicine and Development, University of Geneva Medical School, rue Michel-Servet 1, 1211 Geneva, Switzerland Swiss Institute of Bioinformatics, rue Michel-Servet 1, 1211 Geneva, Switzerland
|
235
|
Chen YJ, Lu CT, Su MG, Huang KY, Ching WC, Yang HH, Liao YC, Chen YJ, Lee TY. dbSNO 2.0: a resource for exploring structural environment, functional and disease association and regulatory network of protein S-nitrosylation. Nucleic Acids Res 2014; 43:D503-11. [PMID: 25399423 PMCID: PMC4383970 DOI: 10.1093/nar/gku1176] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Indexed: 11/13/2022]
Abstract
Given the increasing number of proteins reported to be regulated by S-nitrosylation (SNO), it is considered to act, in a manner analogous to phosphorylation, as a pleiotropic regulator that elicits dual effects to regulate diverse pathophysiological processes by altering protein function, stability, and conformation in various cancers and human disorders. Because of its importance in regulating protein functions and cell signaling, dbSNO (http://dbSNO.mbc.nctu.edu.tw) has been extended as a resource for exploring the structural environment of SNO substrate sites and the regulatory networks of S-nitrosylated proteins. An increasing interest in the structural environment of PTM substrate sites motivated us to map all manually curated SNO peptides (4165 SNO sites within 2277 proteins) to PDB protein entries by sequence identity, which provides information on spatial amino acid composition, solvent-accessible surface area, spatially neighboring amino acids, and side chain orientation for 298 substrate cysteine residues. Additionally, annotations of protein molecular functions, biological processes, functional domains and human diseases are integrated to explore the functional and disease associations of the S-nitrosoproteome. In this update, users can search for a group of proteins or genes of interest, and the system reconstructs the SNO regulatory network based on information from metabolic pathways and protein-protein interactions. Most importantly, an endogenous yet pathophysiological S-nitrosoproteomic dataset from colorectal cancer patients was adopted to demonstrate that dbSNO could discover potential SNO proteins involved in the regulation of NO signaling for cancer pathways.
Affiliation(s)
- Yi-Ju Chen
- Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan
| | - Cheng-Tsung Lu
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan
| | - Min-Gang Su
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan
| | - Kai-Yao Huang
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan
| | - Wei-Chieh Ching
- Graduate Institute of Life Sciences, National Defense Medical Center, Taipei 114, Taiwan
| | - Hsiao-Hsiang Yang
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan
| | - Yen-Chen Liao
- Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan Department of Chemistry, National Taiwan University, Taipei 114, Taiwan
| | - Yu-Ju Chen
- Institute of Chemistry, Academia Sinica, Taipei 115, Taiwan Department of Chemistry, National Taiwan University, Taipei 114, Taiwan
| | - Tzong-Yi Lee
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan Innovation Center for Big Data and Digital Convergence, Yuan Ze University, Taoyuan 320, Taiwan
|
236
|
Bin Raies A, Mansour H, Incitti R, Bajic VB. DDMGD: the database of text-mined associations between genes methylated in diseases from different species. Nucleic Acids Res 2014; 43:D879-86. [PMID: 25398897 PMCID: PMC4383966 DOI: 10.1093/nar/gku1168] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 01/22/2023]
Abstract
Gathering information about associations between methylated genes and diseases is important for disease diagnosis and treatment decisions. Recent advancements in epigenetics research allow for large-scale discoveries of associations of genes methylated in diseases in different species. Searching manually for such information is not easy, as it is scattered across a large number of electronic publications and repositories. Therefore, we developed the DDMGD database (http://www.cbrc.kaust.edu.sa/ddmgd/) to provide a comprehensive repository of information related to genes methylated in diseases that can be found through text mining. DDMGD's scope is not limited to a particular group of genes, diseases or species. Using the text mining system DEMGD, which we developed earlier, together with additional post-processing, we extracted associations of genes methylated in different diseases from PubMed Central articles and PubMed abstracts. The accuracy of the extracted associations is 82%, as estimated on 2500 hand-curated entries. DDMGD provides a user-friendly interface facilitating retrieval of these associations ranked according to confidence scores. New associations can also be submitted to DDMGD. A comparison of DDMGD with several other databases focused on genes methylated in diseases shows that DDMGD is comprehensive and includes most of the recent information on genes methylated in diseases.
Affiliation(s)
- Arwa Bin Raies
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Hicham Mansour
- Bioscience Core Laboratories, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Roberto Incitti
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Vladimir B Bajic
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
|
237
|
Abstract
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov.
|
238
|
Chen J, Sun M, Shen B. Deciphering oncogenic drivers: from single genes to integrated pathways. Brief Bioinform 2014; 16:413-28. [PMID: 25378434 DOI: 10.1093/bib/bbu039] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 08/02/2014] [Accepted: 09/18/2014] [Indexed: 12/12/2022]
Abstract
Technological advances in next-generation sequencing have uncovered a wide spectrum of aberrations in cancer genomes. The extreme diversity in cancer mutations necessitates computational approaches to differentiate between the 'drivers' with vital function in cancer progression and those nonfunctional 'passengers'. Although individual driver mutations are routinely identified, mutational profiles of different tumors are highly heterogeneous. There is growing consensus that pathways rather than single genes are the primary target of mutations. Here we review extant bioinformatics approaches to identifying oncogenic drivers at different mutational levels, highlighting the strategies for discovering driver pathways and networks from cancer mutation data. These approaches will help reduce the mutation complexity, thus providing a simplified picture of cancer.
|
239
|
Pavlopoulou A, Spandidos DA, Michalopoulos I. Human cancer databases (review). Oncol Rep 2014; 33:3-18. [PMID: 25369839 PMCID: PMC4254674 DOI: 10.3892/or.2014.3579] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 10/03/2014] [Accepted: 10/31/2014] [Indexed: 12/20/2022]
Abstract
Cancer is one of the four major non‑communicable diseases (NCD), responsible for ~14.6% of all human deaths. Currently, there are >100 different known types of cancer and >500 genes involved in cancer. Ongoing research efforts have been focused on cancer etiology and therapy. As a result, there is an exponential growth of cancer‑associated data from diverse resources, such as scientific publications, genome‑wide association studies, gene expression experiments, gene‑gene or protein‑protein interaction data, enzymatic assays, epigenomics, immunomics and cytogenetics, stored in relevant repositories. These data are complex and heterogeneous, ranging from unprocessed, unstructured data in the form of raw sequences and polymorphisms to well‑annotated, structured data. Consequently, the storage, mining, retrieval and analysis of these data in an efficient and meaningful manner pose a major challenge to biomedical investigators. In the current review, we present the central, publicly accessible databases that contain data pertinent to cancer, the resources available for delivering and analyzing information from these databases, as well as databases dedicated to specific types of cancer. Examples for this wealth of cancer‑related information and bioinformatic tools have also been provided.
Affiliation(s)
- Athanasia Pavlopoulou
- Center of Systems Biology, Biomedical Research Foundation, Academy of Athens, Athens 11527, Greece
| | - Demetrios A Spandidos
- Laboratory of Clinical Virology, Medical School, University of Crete, Heraklion 71003, Crete, Greece
| | - Ioannis Michalopoulos
- Center of Systems Biology, Biomedical Research Foundation, Academy of Athens, Athens 11527, Greece
|
240
|
Brown GR, Hem V, Katz KS, Ovetsky M, Wallin C, Ermolaeva O, Tolstoy I, Tatusova T, Pruitt KD, Maglott DR, Murphy TD. Gene: a gene-centered information resource at NCBI. Nucleic Acids Res 2014; 43:D36-42. [PMID: 25355515 DOI: 10.1093/nar/gku1055] [Citation(s) in RCA: 431] [Impact Index Per Article: 43.1] [Indexed: 01/20/2023]
Abstract
The National Center for Biotechnology Information's (NCBI) Gene database (www.ncbi.nlm.nih.gov/gene) integrates gene-specific information from multiple data sources. NCBI Reference Sequence (RefSeq) genomes for viruses, prokaryotes and eukaryotes are the primary foundation for Gene records in that they form the critical association between sequence and a tracked gene upon which additional functional and descriptive content is anchored. Additional content is integrated based on the genomic location and RefSeq transcript and protein sequence data. The content of a Gene record represents the integration of curation and automated processing from RefSeq, collaborating model organism databases, consortia such as Gene Ontology, and other databases within NCBI. Records in Gene are assigned unique, tracked integers as identifiers. The content (citations, nomenclature, genomic location, gene products and their attributes, phenotypes, sequences, interactions, variation details, maps, expression, homologs, protein domains and external databases) is available via interactive browsing through NCBI's Entrez system, via NCBI's Entrez programming utilities (E-Utilities and Entrez Direct) and for bulk transfer by FTP.
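As a rough illustration of the programmatic access mentioned in this abstract, the sketch below builds an E-Utilities ESearch request against the Gene database. The helper name, the example gene symbol, and the default organism are assumptions made for illustration only; the code constructs the request URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-Utilities endpoints.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_esearch_url(symbol, organism="Homo sapiens", retmax=5):
    """Build an ESearch URL that looks up a gene symbol in the Gene database.

    The helper name and defaults are illustrative, not part of any NCBI client.
    """
    term = f"{symbol}[Gene Name] AND {organism}[Organism]"
    params = {"db": "gene", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example: FOXP3 is used here only as a sample query term.
url = gene_esearch_url("FOXP3")
print(url)
```

Fetching the resulting URL returns the matching Gene identifiers, which are the unique, tracked integers described in the abstract; those identifiers can then be passed to other E-Utilities calls to retrieve full records.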
Affiliation(s)
- Garth R Brown
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Vichet Hem
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Kenneth S Katz
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Michael Ovetsky
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Craig Wallin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Olga Ermolaeva
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Igor Tolstoy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Tatiana Tatusova
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Donna R Maglott
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
| | - Terence D Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-6510, USA
|
241
|
Moni MA, Liò P. Network-based analysis of comorbidities risk during an infection: SARS and HIV case studies. BMC Bioinformatics 2014; 15:333. [PMID: 25344230 PMCID: PMC4363349 DOI: 10.1186/1471-2105-15-333] [Citation(s) in RCA: 75] [Impact Index Per Article: 7.5] [Received: 12/30/2013] [Accepted: 09/19/2014] [Indexed: 01/02/2023]
Abstract
Background: Infections are often associated with comorbidity that increases the risk of medical conditions which can lead to further morbidity and mortality. SARS is a threat similar to the MERS virus, but comorbidity is the key aspect underlining their different impacts. One UK doctor says "I’d rather have HIV than diabetes" as life expectancy among diabetes patients is lower than that of HIV patients. However, HIV has a comorbidity impact on diabetes. Results: We present a quantitative framework to compare and explore comorbidity between diseases. Using neighbourhood-based benchmark and topological methods, we have built a comorbidity relationship network based on OMIM and our identified significant genes. Then, based on gene expression, PPI and signalling pathway data, we investigate the comorbidity association of these 2 infective pathologies with 7 other diseases (heart failure, kidney disorder, breast cancer, neurodegenerative disorders, bone diseases, Type 1 and Type 2 diabetes). Phenotypic association is measured by calculating both the Relative Risk, as the quantified measure of the comorbidity tendency of disease pairs, and the ϕ-correlation, to measure the robustness of the comorbidity associations. The differential gene expression profiling strongly suggests that the response of SARS-affected patients is mainly an innate inflammatory response and statistically dysregulates a large number of genes, pathways and PPI subnetworks in different pathologies such as chronic heart failure (21 genes), breast cancer (16 genes) and bone diseases (11 genes). HIV-1 induces comorbidity relationships with many other diseases, with particularly strong correlation with neurological, cancer, metabolic and immunological diseases. Similar comorbidity risk is observed from the clinical information. Moreover, SARS and HIV infections dysregulate 4 genes (ANXA3, GNS, HIST1H1C, RASA3) and 3 genes (HBA1, TFRC, GHITM), respectively, that affect the ageing process. It is notable that HIV and SARS similarly dysregulated 11 genes and 3 pathways. Only 4 significantly dysregulated genes are common between SARS-CoV and MERS-CoV, including NFKBIA, a key regulator of immune responsiveness implicated in susceptibility to infectious and inflammatory diseases. Conclusions: Our method presents a ripe opportunity to use data-driven approaches for advancing our current knowledge on disease mechanisms and predicting disease comorbidities in a quantitative way. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2105-15-333) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Mohammad Ali Moni
- Computer Laboratory, University of Cambridge, William Gates Building, 15 JJ Thomson Avenue, Cambridge CB3 0FD, UK.
|
242
|
Wan Juhari WK, Md Tamrin NA, Mat Daud MHR, Isa HW, Mohd Nasir N, Maran S, Abdul Rajab NS, Ahmad Amin Noordin KB, Nik Hassan NN, Tearle R, Razali R, Merican AF, Zilfalil BA. A whole genome analyses of genetic variants in two Kelantan Malay individuals. The HUGO Journal 2014; 8:4. [PMID: 27090252 PMCID: PMC4685156 DOI: 10.1186/s11568-014-0004-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/04/2014] [Accepted: 09/19/2014] [Indexed: 12/29/2022]
Abstract
Background: The sequencing of the genomes of two members of the Royal Kelantan Malay family will provide insights into Kelantan Malay whole genome sequences. The two Kelantan Malay genomes were analyzed for SNP markers associated with thalassemia and Helicobacter pylori infection. Helicobacter pylori infection was reported to have low prevalence in the north-east as compared to the west coast of Peninsular Malaysia, and beta-thalassemia is known to be one of the most common inherited genetic disorders in Malaysia. Result: By combining SNP information from the literature, GWAS studies and NCBI ClinVar, 18 unique SNPs were selected for further analysis. Of these 18 SNPs, 10 came from a previous study of Helicobacter pylori infection among Malay patients, 6 were from NCBI ClinVar and 2 were from GWAS studies. The analysis reveals that both Royal Kelantan Malay genomes shared all 10 SNPs identified by Maran (Single Nucleotide Polymorphisms (SNPs) genotypic profiling of Malay patients with and without Helicobacter pylori infection in Kelantan, 2011) and one SNP from a GWAS study. In addition, the analysis reveals that both Royal Kelantan Malay genomes shared 3 SNP markers, HBG1 (rs1061234), HBB (rs1609812) and BCL11A (rs766432), all three of which are associated with beta-thalassemia. Conclusions: Our findings suggest that the Royal Kelantan Malays carry SNPs which are associated with protection against Helicobacter pylori infection. In addition, they carry SNPs which are associated with beta-thalassemia. These findings are in line with those of other researchers who conducted studies on thalassemia and Helicobacter pylori infection in the non-royal Malay population.
Affiliation(s)
- Wan Khairunnisa Wan Juhari
- Department of Pediatrics, School of Medical Sciences, Universiti Sains Malaysia, 16150 Kubang Kerian, Kelantan, Malaysia
| | - Nur Aida Md Tamrin
- Faculty of Resource Science and Technology, Universiti Malaysia Sarawak, Sarawak, Malaysia
| | | | - Hatin Wan Isa
- Human Genome Center, School of Medical Sciences, Universiti Sains Malaysia, Kelantan, Malaysia
| | - Nurfazreen Mohd Nasir
- Human Genome Center, School of Medical Sciences, Universiti Sains Malaysia, Kelantan, Malaysia
| | - Sathiya Maran
- Human Genome Center, School of Medical Sciences, Universiti Sains Malaysia, Kelantan, Malaysia
| | - Nur Shafawati Abdul Rajab
- Human Genome Center, School of Medical Sciences, Universiti Sains Malaysia, Kelantan, Malaysia
| | | | | | - Rick Tearle
- Complete Genomics Inc, 2071 Stierlin Court, Mountain View, 94043, CA, USA
| | | | - Amir Feisal Merican
- Centre of Research for Computational Sciences and Informatics in Biology, Bioindustry, Environment, Agriculture and Healthcare (CRYSTAL), Kuala Lumpur, Malaysia; Institute of Biological Science, Faculty of Science, Universiti Malaya, Kuala Lumpur, Malaysia
| | - Bin Alwi Zilfalil
- Department of Pediatrics, School of Medical Sciences, Universiti Sains Malaysia, 16150 Kubang Kerian, Kelantan, Malaysia.
|
243
|
Computational Analysis Reveals the Association of Threonine 118 Methionine Mutation in PMP22 Resulting in CMT-1A. Adv Bioinformatics 2014; 2014:502618. [PMID: 25400662 PMCID: PMC4220619 DOI: 10.1155/2014/502618] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Received: 07/31/2014] [Revised: 09/26/2014] [Accepted: 09/26/2014] [Indexed: 12/31/2022]
Abstract
The T118M mutation in the PMP22 gene is associated with Charcot-Marie-Tooth disease type 1A (CMT1A). CMT1A is a form of Charcot-Marie-Tooth disease, the most common inherited disorder of the peripheral nervous system. Mutations in CMT-related disorders are seen to increase the stability of the protein, resulting in the diseased state. We performed SNP analysis for all the nsSNPs of the PMP22 protein and carried out molecular dynamics simulation for the T118M mutation to compare the stability of the wild-type and mutant protein structures. The T118M mutation resulted in an overall increase in the stability of the mutant protein. The superimposed structure shows marked structural variation between the wild-type and mutant protein structures.
|
244
|
Cogill SB, Wang L. Co-expression Network Analysis of Human lncRNAs and Cancer Genes. Cancer Inform 2014; 13:49-59. [PMID: 25392693 PMCID: PMC4218681 DOI: 10.4137/cin.s14070] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 06/01/2014] [Revised: 06/27/2014] [Accepted: 07/01/2014] [Indexed: 12/30/2022]
Abstract
We used gene co-expression network analysis to functionally annotate long noncoding RNAs (lncRNAs) and identify their potential cancer associations. The integrated microarray data set from our previous study was used to extract the expression profiles of 1,865 lncRNAs. Known cancer genes were compiled from the Catalogue of Somatic Mutations in Cancer and UniProt databases. Co-expression analysis identified a list of previously uncharacterized lncRNAs that showed significant correlation in expression with core cancer genes. To further annotate the lncRNAs, we performed a weighted gene co-expression network analysis, which resulted in 37 co-expression modules. Three biologically interesting modules were analyzed in depth. Two of the modules showed relatively high expression in blood and brain tissues, whereas the third module was found to be downregulated in blood cells. Hub lncRNA genes and enriched functional annotation terms were identified within the modules. The results suggest the utility of this approach as well as potential roles of uncharacterized lncRNAs in leukemia and neuroblastoma.
Affiliation(s)
- Steven B Cogill
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
| | - Liangjiang Wang
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
|
245
|
Luo Y, Riedlinger G, Szolovits P. Text mining in cancer gene and pathway prioritization. Cancer Inform 2014; 13:69-79. [PMID: 25392685 PMCID: PMC4216063 DOI: 10.4137/cin.s13874] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 04/01/2014] [Revised: 05/18/2014] [Accepted: 05/18/2014] [Indexed: 12/18/2022]
Abstract
Prioritization of cancer implicated genes has received growing attention as an effective way to reduce wet lab cost by computational analysis that ranks candidate genes according to the likelihood that experimental verifications will succeed. A multitude of gene prioritization tools have been developed, each integrating different data sources covering gene sequences, differential expressions, function annotations, gene regulations, protein domains, protein interactions, and pathways. This review places existing gene prioritization tools against the backdrop of an integrative Omic hierarchy view toward cancer and focuses on the analysis of their text mining components. We explain the relatively slow progress of text mining in gene prioritization, identify several challenges to current text mining methods, and highlight a few directions where more effective text mining algorithms may improve the overall prioritization task and where prioritizing the pathways may be more desirable than prioritizing only genes.
Affiliation(s)
- Yuan Luo
- Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Gregory Riedlinger
- Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
| | - Peter Szolovits
- Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA, USA
|
246
|
Kumar CV, Swetha RG, Ramaiah S, Anbarasu A. Tryptophan to Glycine mutation in the position 116 leads to protein aggregation and decreases the stability of the LITAF protein. J Biomol Struct Dyn 2014; 33:1695-709. [DOI: 10.1080/07391102.2014.968211] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/19/2022]
|
247
|
Begum T, Ghosh TC. Elucidating the genotype-phenotype relationships and network perturbations of human shared and specific disease genes from an evolutionary perspective. Genome Biol Evol 2014; 6:2741-53. [PMID: 25287147 PMCID: PMC4224346 DOI: 10.1093/gbe/evu220] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022]
Abstract
To date, numerous studies have attempted to determine the extent of variation in evolutionary rates between human disease and nondisease (ND) genes. In our present study, we have considered human autosomal monogenic (Mendelian) disease genes, which were classified into two groups according to the number of phenotypic defects, that is, specific disease (SPD) genes (one gene: one defect) and shared disease (SHD) genes (one gene: multiple defects). Here, we have compared the evolutionary rates of these two groups of genes, SPD genes and SHD genes, with respect to ND genes. We observed that the average evolutionary rates are slow in the SHD group, intermediate in the SPD group, and fast in the ND group. Group-to-group evolutionary rate differences remain statistically significant regardless of gene expression levels and number of defects. We demonstrated that disease genes are under strong selective constraint if they emerge through edgetic perturbation or drug-induced perturbation of the interactome network, show tissue-restricted expression, and are involved in transmembrane transport. Among all the factors, our regression analyses interestingly suggest independent effects of 1) drug-induced perturbation and 2) the interaction term of expression breadth and transmembrane transport on protein evolutionary rates. We reasoned that drug-induced network disruption is a combination of several edgetic perturbations and, thus, has a more severe effect on gene phenotypes.
Affiliation(s)
- Tina Begum
- Bioinformatics Centre, Bose Institute, Kolkata, West Bengal, India
|
248
|
Online Diagnosis System: a webserver for analysis of Sanger sequencing-based genetic testing data. Methods 2014; 69:230-6. [PMID: 25063568 DOI: 10.1016/j.ymeth.2014.07.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/27/2014] [Revised: 06/25/2014] [Accepted: 07/05/2014] [Indexed: 11/24/2022]
Abstract
Sanger sequencing is a well-established molecular technique for diagnosis of genetic diseases. In these tests, DNA sequencers produce vast amounts of data that need to be examined and annotated within a short period of time. To achieve this goal, an online bioinformatics platform that can automate the process is essential. However, to date, there is no such integrated bioinformatics platform available. To fill this gap, we developed the Online Diagnosis System (ODS), which is a freely available webserver and supports the commonly used file format of Sanger sequencing data. ODS seamlessly integrates base calling, single nucleotide variation (SNV) identification, and SNV annotation into a single platform. It also allows laboratorians to manually inspect the quality of the identified SNVs in the final report. ODS can significantly reduce the data analysis time and therefore allows Sanger sequencing-based genetic testing to be finished in a timely manner. ODS is freely available at http://sunlab.lihs.cuhk.edu.hk/ODS/.
|
249
|
Burger JD, Doughty E, Khare R, Wei CH, Mishra R, Aberdeen J, Tresner-Kirsch D, Wellner B, Kann MG, Lu Z, Hirschman L. Hybrid curation of gene-mutation relations combining automated extraction and crowdsourcing. Database: The Journal of Biological Databases and Curation 2014; 2014:bau094. [PMID: 25246425 PMCID: PMC4170591 DOI: 10.1093/database/bau094] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Indexed: 11/23/2022]
Abstract
Background: This article describes capture of biological information using a hybrid approach that combines natural language processing to extract biological entities and crowdsourcing with annotators recruited via Amazon Mechanical Turk to judge correctness of candidate biological relations. These techniques were applied to extract gene–mutation relations from biomedical abstracts with the goal of supporting production scale capture of gene–mutation–disease findings as an open source resource for personalized medicine. Results: The hybrid system could be configured to provide good performance for gene–mutation extraction (precision ∼82%; recall ∼70% against an expert-generated gold standard) at a cost of $0.76 per abstract. This demonstrates that crowd labor platforms such as Amazon Mechanical Turk can be used to recruit quality annotators, even in an application requiring subject matter expertise; aggregated Turker judgments for gene–mutation relations exceeded 90% accuracy. Over half of the precision errors were due to mismatches against the gold standard hidden from annotator view (e.g. incorrect EntrezGene identifier or incorrect mutation position extracted), or incomplete task instructions (e.g. the need to exclude nonhuman mutations). Conclusions: The hybrid curation model provides a readily scalable cost-effective approach to curation, particularly if coupled with expert human review to filter precision errors. We plan to generalize the framework and make it available as open source software. Database URL: http://www.mitre.org/publications/technical-papers/hybrid-curation-of-gene-mutation-relations-combining-automated
Affiliation(s)
- John D Burger
- The MITRE Corporation, Bedford, MA 01730, USA, Biomedical Informatics Program, Stanford University, Stanford, CA 94305, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and The University of Maryland, Baltimore County, Baltimore MD 21250, USA
| | - Emily Doughty
- The MITRE Corporation, Bedford, MA 01730, USA, Biomedical Informatics Program, Stanford University, Stanford, CA 94305, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and The University of Maryland, Baltimore County, Baltimore MD 21250, USA
| | - Ritu Khare
- The MITRE Corporation, Bedford, MA 01730, USA, Biomedical Informatics Program, Stanford University, Stanford, CA 94305, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and The University of Maryland, Baltimore County, Baltimore MD 21250, USA
| | - Chih-Hsuan Wei
- The MITRE Corporation, Bedford, MA 01730, USA, Biomedical Informatics Program, Stanford University, Stanford, CA 94305, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and The University of Maryland, Baltimore County, Baltimore MD 21250, USA
| | - Rajashree Mishra
- The MITRE Corporation, Bedford, MA 01730, USA, Biomedical Informatics Program, Stanford University, Stanford, CA 94305, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and The University of Maryland, Baltimore County, Baltimore MD 21250, USA
| | - John Aberdeen
- The MITRE Corporation, Bedford, MA 01730, USA, Biomedical Informatics Program, Stanford University, Stanford, CA 94305, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and The University of Maryland, Baltimore County, Baltimore MD 21250, USA
| | - David Tresner-Kirsch
- The MITRE Corporation, Bedford, MA 01730, USA, Biomedical Informatics Program, Stanford University, Stanford, CA 94305, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and The University of Maryland, Baltimore County, Baltimore MD 21250, USA
| | - Ben Wellner
- The MITRE Corporation, Bedford, MA 01730, USA, Biomedical Informatics Program, Stanford University, Stanford, CA 94305, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and The University of Maryland, Baltimore County, Baltimore MD 21250, USA
| | - Maricel G Kann
- The MITRE Corporation, Bedford, MA 01730, USA, Biomedical Informatics Program, Stanford University, Stanford, CA 94305, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and The University of Maryland, Baltimore County, Baltimore MD 21250, USA
| | - Zhiyong Lu
- The MITRE Corporation, Bedford, MA 01730, USA, Biomedical Informatics Program, Stanford University, Stanford, CA 94305, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and The University of Maryland, Baltimore County, Baltimore MD 21250, USA
| | - Lynette Hirschman
- The MITRE Corporation, Bedford, MA 01730, USA, Biomedical Informatics Program, Stanford University, Stanford, CA 94305, USA, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and The University of Maryland, Baltimore County, Baltimore MD 21250, USA
|
250
|
Ma'ayan A, Rouillard AD, Clark NR, Wang Z, Duan Q, Kou Y. Lean Big Data integration in systems biology and systems pharmacology. Trends Pharmacol Sci 2014; 35:450-60. [PMID: 25109570 PMCID: PMC4153537 DOI: 10.1016/j.tips.2014.07.001] [Citation(s) in RCA: 75] [Impact Index Per Article: 7.5] [Received: 05/08/2014] [Revised: 07/01/2014] [Accepted: 07/08/2014] [Indexed: 12/11/2022]
Abstract
Data sets from recent large-scale projects can be integrated into one unified puzzle that can provide new insights into how drugs and genetic perturbations applied to human cells are linked to whole-organism phenotypes. Data that report how drugs affect the phenotype of human cell lines and how drugs induce changes in gene and protein expression in human cell lines can be combined with knowledge about human disease, side effects induced by drugs, and mouse phenotypes. Such data integration efforts can be achieved through the conversion of data from the various resources into single-node-type networks, gene-set libraries, or multipartite graphs. This approach can lead us to the identification of more relationships between genes, drugs, and phenotypes as well as benchmark computational and experimental methods. Overall, this lean 'Big Data' integration strategy will bring us closer toward the goal of realizing personalized medicine.
Affiliation(s)
- Avi Ma'ayan
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA.
| | - Andrew D Rouillard
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
| | - Neil R Clark
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
| | - Zichen Wang
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
| | - Qiaonan Duan
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
| | - Yan Kou
- Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, Systems Biology Center New York (SBCNY), One Gustave L. Levy Place, Box 1215, New York, NY 10029, USA
|