1
Bajic VP, Salhi A, Lakota K, Radovanovic A, Razali R, Zivkovic L, Spremo-Potparevic B, Uludag M, Tifratene F, Motwalli O, Marchand B, Bajic VB, Gojobori T, Isenovic ER, Essack M. DES-Amyloidoses “Amyloidoses through the looking-glass”: A knowledgebase developed for exploring and linking information related to human amyloid-related diseases. PLoS One 2022; 17:e0271737. [PMID: 35877764 PMCID: PMC9312389 DOI: 10.1371/journal.pone.0271737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Accepted: 07/06/2022] [Indexed: 11/23/2022] Open
Abstract
More than 30 types of amyloids are linked to close to 50 diseases in humans, the most prominent being Alzheimer’s disease (AD). AD is a brain-related local amyloidosis, while other amyloidoses, such as AA amyloidosis, tend to be more systemic. Therefore, we need to know more about the biological entities influencing these amyloidosis processes. However, there is currently no support system developed specifically to handle this extraordinarily complex and demanding task. To acquire a systematic view of amyloidosis and how this may be relevant to the brain and other organs, we needed a means to explore "amyloid network systems" that may underlie processes that lead to an amyloid-related disease. In this regard, we developed the DES-Amyloidoses knowledgebase (KB) to obtain fast and relevant information regarding the biological network related to amyloid proteins/peptides and amyloid-related diseases. This KB contains information obtained through text and data mining of available scientific literature and other public repositories. The information compiled into the DES-Amyloidoses system based on 19 topic-specific dictionaries resulted in 796,409 associations between terms from these dictionaries. Users can explore this information through various options, including enriched concepts, enriched pairs, and semantic similarity. We show the usefulness of the KB using an example focused on inflammasome-amyloid associations. To our knowledge, this is the only KB dedicated to human amyloid-related diseases derived primarily through literature text mining and complemented by data mining that provides a novel way of exploring information relevant to amyloidoses.
Affiliation(s)
- Vladan P. Bajic
  - Institute of Nuclear Sciences “VINCA”, Laboratory for Radiobiology and Molecular Genetics, University of Belgrade, Belgrade, Republic of Serbia
  - * E-mail: (ME); (VPB)
- Adil Salhi
  - Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Katja Lakota
  - Department of Physiology, Faculty of Pharmacy, University of Belgrade, Belgrade, Serbia
- Aleksandar Radovanovic
  - Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Rozaimi Razali
  - Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Lada Zivkovic
  - Department of Physiology, Faculty of Pharmacy, University of Belgrade, Belgrade, Serbia
- Mahmut Uludag
  - Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Faroug Tifratene
  - Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Olaa Motwalli
  - Saudi Electronic University (SEU), College of Computing and Informatics, Madinah, Kingdom of Saudi Arabia
- Vladimir B. Bajic
  - Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Takashi Gojobori
  - Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
  - Biological and Environmental Sciences and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
- Esma R. Isenovic
  - Institute of Nuclear Sciences “VINCA”, Laboratory for Radiobiology and Molecular Genetics, University of Belgrade, Belgrade, Republic of Serbia
- Magbubah Essack
  - Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia
  - * E-mail: (ME); (VPB)
2
AlSaieedi A, Salhi A, Tifratene F, Raies AB, Hungler A, Uludag M, Van Neste C, Bajic VB, Gojobori T, Essack M. DES-Tcell is a knowledgebase for exploring immunology-related literature. Sci Rep 2021; 11:14344. [PMID: 34253812 PMCID: PMC8275784 DOI: 10.1038/s41598-021-93809-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2021] [Accepted: 06/24/2021] [Indexed: 12/02/2022] Open
Abstract
T-cells are a subtype of white blood cells circulating throughout the body, searching for infected and abnormal cells. They have multifaceted functions that include scanning for and directly killing cells infected with intracellular pathogens, eradicating abnormal cells, orchestrating immune response by activating and helping other immune cells, memorizing encountered pathogens, and providing long-lasting protection upon recurrent infections. However, T-cells are also involved in immune responses that result in organ transplant rejection, autoimmune diseases, and some allergic diseases. To support T-cell research, we developed the DES-Tcell knowledgebase (KB). This KB incorporates text- and data-mined information that can expedite retrieval and exploration of T-cell relevant information from the large volume of published T-cell-related research. This KB enables exploration of data through concepts from 15 topic-specific dictionaries, including immunology-related genes, mutations, pathogens, and pathways. We developed three case studies using DES-Tcell, one of which validates effective retrieval of known associations by DES-Tcell. The second and third case studies focus on concepts that are common to Graves’ disease (GD) and Hashimoto’s thyroiditis (HT). Several reports have shown that up to 20% of GD patients treated with antithyroid medication develop HT, thus suggesting a possible conversion or shift from GD to HT disease. DES-Tcell found that miR-4442 links to both GD and HT, and that miR-4442 possibly targets the autoimmune disease risk factor CD6, which provides potential new knowledge derived through the use of DES-Tcell. According to our understanding, DES-Tcell is the first KB dedicated to exploring T-cell-relevant information via literature-mining, data-mining, and topic-specific dictionaries.
Affiliation(s)
- Ahdab AlSaieedi
  - Department of Medical Laboratory Technology (MLT), Faculty of Applied Medical Sciences (FAMS), King Abdulaziz University (KAU), Jeddah, 21589-80324, Saudi Arabia
- Adil Salhi
  - Computer, Electrical, and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Faroug Tifratene
  - Computer, Electrical, and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Arwa Bin Raies
  - Computer, Electrical, and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Arnaud Hungler
  - Computer, Electrical, and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Mahmut Uludag
  - Computer, Electrical, and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Christophe Van Neste
  - Computer, Electrical, and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Vladimir B Bajic
  - Computer, Electrical, and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Takashi Gojobori
  - Computer, Electrical, and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Magbubah Essack
  - Computer, Electrical, and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
3
DES-ROD: Exploring Literature to Develop New Links between RNA Oxidation and Human Diseases. Oxidative Medicine and Cellular Longevity 2020; 2020:5904315. [PMID: 32308806 PMCID: PMC7142358 DOI: 10.1155/2020/5904315] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/09/2020] [Accepted: 02/21/2020] [Indexed: 12/27/2022]
Abstract
Normal cellular physiology and biochemical processes require undamaged RNA molecules. However, RNAs are frequently subjected to oxidative damage. Overproduction of reactive oxygen species (ROS) leads to RNA oxidation and disturbs redox (oxidation-reduction reaction) homeostasis. When oxidation damage affects RNA carrying protein-coding information, this may result in the synthesis of aberrant proteins as well as a lower efficiency of translation. Both of these, as well as imbalanced redox homeostasis, may lead to numerous human diseases. The number of studies on the effects of RNA oxidative damage in mammals is increasing year by year due to the understanding that this oxidation fundamentally leads to numerous human diseases. To enable researchers in this field to explore information relevant to RNA oxidation and its effects on human diseases, we developed DES-ROD, an online knowledgebase that contains processed information from 298,603 relevant documents that consist of PubMed abstracts and PubMed Central full-text articles. The system utilizes concepts/terms from 38 curated thematic dictionaries mapped to the analyzed documents. Researchers can explore enriched concepts, as well as enriched pairs of putatively associated concepts. In this way, one can explore mutual relationships between any combination of two concepts from the dictionaries used. Dictionaries cover a wide range of biomedical topics, such as human genes and proteins, pathways, Gene Ontology categories, mutations, noncoding RNAs, enzymes, toxins, metabolites, and diseases. This makes insights into different facets of the effects of RNA oxidation and the control of this process possible. The usefulness of the DES-ROD system is demonstrated by case studies on some known information, as well as potentially novel information involving RNA oxidation and diseases. DES-ROD is the first knowledgebase based on text and data mining that focuses on the exploration of RNA oxidation and human diseases.
4
Lin Y, Zhao X, Miao Z, Ling Z, Wei X, Pu J, Hou J, Shen B. Data-driven translational prostate cancer research: from biomarker discovery to clinical decision. J Transl Med 2020; 18:119. [PMID: 32143723 PMCID: PMC7060655 DOI: 10.1186/s12967-020-02281-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/27/2020] [Accepted: 02/26/2020] [Indexed: 02/08/2023] Open
Abstract
Prostate cancer (PCa) is a common malignant tumor with increasing incidence and high heterogeneity among males worldwide. In the era of big data and artificial intelligence, the paradigm of biomarker discovery is shifting from traditional experimental and small data-based identification toward big data-driven and systems-level screening. Complex interactions between genetic factors and environmental effects provide opportunities for systems modeling of PCa genesis and evolution. We hereby review the current research frontiers in informatics for PCa clinical translation. First, the heterogeneity and complexity in PCa development and clinical theranostics are introduced to motivate PCa systems biology studies. Then biomarkers and risk factors ranging from molecular alterations to clinical phenotype and lifestyle changes are explicated for PCa personalized management. Methodologies and applications for multi-dimensional data integration and computational modeling are discussed. Finally, future perspectives and challenges for PCa systems medicine and holistic healthcare are provided.
Affiliation(s)
- Yuxin Lin
  - Department of Urology, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- Xiaojun Zhao
  - Department of Urology, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- Zhijun Miao
  - Department of Urology, Suzhou Dushuhu Public Hospital, Suzhou, 215123, China
- Zhixin Ling
  - Department of Urology, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- Xuedong Wei
  - Department of Urology, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- Jinxian Pu
  - Department of Urology, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- Jianquan Hou
  - Department of Urology, The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- Bairong Shen
  - Institutes for Systems Genetics, West China Hospital, Sichuan University, Chengdu, 610041, China
5
Obradovic M, Essack M, Zafirovic S, Sudar‐Milovanovic E, Bajic VP, Van Neste C, Trpkovic A, Stanimirovic J, Bajic VB, Isenovic ER. Redox control of vascular biology. Biofactors 2020; 46:246-262. [PMID: 31483915 PMCID: PMC7187163 DOI: 10.1002/biof.1559] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 06/25/2019] [Accepted: 08/14/2019] [Indexed: 12/12/2022]
Abstract
Redox control is lost when the antioxidant defense system cannot remove abnormally high concentrations of signaling molecules, such as reactive oxygen species (ROS). Chronically elevated levels of ROS cause oxidative stress that may eventually lead to cancer and cardiovascular and neurodegenerative diseases. In this review, we focus on redox effects in the vascular system. We pay close attention to the subcompartments of the vascular system (endothelium, smooth muscle cell layer) and give an overview of how redox changes influence those different compartments. We also review the core aspects of redox biology, cardiovascular physiology, and pathophysiology. Moreover, the topic-specific knowledgebase DES-RedoxVasc was used to develop two case studies, one focused on endothelial cells and the other on the vascular smooth muscle cells, as a starting point to possibly extend our knowledge of redox control in vascular biology.
Affiliation(s)
- Milan Obradovic
  - Laboratory of Radiobiology and Molecular Genetics, Vinca Institute of Nuclear Sciences, University of Belgrade, Belgrade, Serbia
- Magbubah Essack
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal, Kingdom of Saudi Arabia
- Sonja Zafirovic
  - Laboratory of Radiobiology and Molecular Genetics, Vinca Institute of Nuclear Sciences, University of Belgrade, Belgrade, Serbia
- Emina Sudar‐Milovanovic
  - Laboratory of Radiobiology and Molecular Genetics, Vinca Institute of Nuclear Sciences, University of Belgrade, Belgrade, Serbia
- Vladan P. Bajic
  - Laboratory of Radiobiology and Molecular Genetics, Vinca Institute of Nuclear Sciences, University of Belgrade, Belgrade, Serbia
- Christophe Van Neste
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal, Kingdom of Saudi Arabia
- Andreja Trpkovic
  - Laboratory of Radiobiology and Molecular Genetics, Vinca Institute of Nuclear Sciences, University of Belgrade, Belgrade, Serbia
- Julijana Stanimirovic
  - Laboratory of Radiobiology and Molecular Genetics, Vinca Institute of Nuclear Sciences, University of Belgrade, Belgrade, Serbia
- Vladimir B. Bajic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal, Kingdom of Saudi Arabia
- Esma R. Isenovic
  - Laboratory of Radiobiology and Molecular Genetics, Vinca Institute of Nuclear Sciences, University of Belgrade, Belgrade, Serbia
6
Essack M, Salhi A, Stanimirovic J, Tifratene F, Bin Raies A, Hungler A, Uludag M, Van Neste C, Trpkovic A, Bajic VP, Bajic VB, Isenovic ER. Literature-Based Enrichment Insights into Redox Control of Vascular Biology. Oxidative Medicine and Cellular Longevity 2019; 2019:1769437. [PMID: 31223421 PMCID: PMC6542245 DOI: 10.1155/2019/1769437] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/07/2019] [Revised: 04/11/2019] [Accepted: 05/02/2019] [Indexed: 02/07/2023]
Abstract
In cellular physiology and signaling, reactive oxygen species (ROS) play one of the most critical roles. ROS overproduction leads to cellular oxidative stress. This may lead to an irrecoverable imbalance of redox (oxidation-reduction reaction) function that deregulates redox homeostasis, which itself could lead to several diseases including neurodegenerative disease, cardiovascular disease, and cancers. In this study, we focus on the redox effects related to vascular systems in mammals. To support research in this domain, we developed an online knowledge base, DES-RedoxVasc, which enables exploration of information contained in the biomedical scientific literature. The DES-RedoxVasc system analyzed 233399 documents consisting of PubMed abstracts and PubMed Central full-text articles related to different aspects of redox biology in vascular systems. It allows researchers to explore enriched concepts from 28 curated thematic dictionaries, as well as literature-derived potential associations of pairs of such enriched concepts, where associations themselves are statistically enriched. For example, the system allows exploration of associations of pathways, diseases, mutations, genes/proteins, miRNAs, long ncRNAs, toxins, drugs, biological processes, molecular functions, etc. that allow for insights about different aspects of redox effects and control of processes related to the vascular system. Moreover, we deliver case studies about some existing or possibly novel knowledge regarding redox of vascular biology demonstrating the usefulness of DES-RedoxVasc. DES-RedoxVasc is the first compiled knowledge base using text mining for the exploration of this topic.
Affiliation(s)
- Magbubah Essack
  - King Abdullah University of Science and Technology, Computational Bioscience Research Center, Thuwal, Saudi Arabia
- Adil Salhi
  - King Abdullah University of Science and Technology, Computational Bioscience Research Center, Thuwal, Saudi Arabia
- Julijana Stanimirovic
  - Vinca Institute, University of Belgrade, Laboratory for Molecular Endocrinology and Radiobiology, Belgrade, Serbia
- Faroug Tifratene
  - King Abdullah University of Science and Technology, Computational Bioscience Research Center, Thuwal, Saudi Arabia
- Arwa Bin Raies
  - King Abdullah University of Science and Technology, Computational Bioscience Research Center, Thuwal, Saudi Arabia
- Arnaud Hungler
  - King Abdullah University of Science and Technology, Computational Bioscience Research Center, Thuwal, Saudi Arabia
- Mahmut Uludag
  - King Abdullah University of Science and Technology, Computational Bioscience Research Center, Thuwal, Saudi Arabia
- Christophe Van Neste
  - King Abdullah University of Science and Technology, Computational Bioscience Research Center, Thuwal, Saudi Arabia
- Andreja Trpkovic
  - Vinca Institute, University of Belgrade, Laboratory for Molecular Endocrinology and Radiobiology, Belgrade, Serbia
- Vladan P. Bajic
  - Vinca Institute, University of Belgrade, Laboratory for Molecular Endocrinology and Radiobiology, Belgrade, Serbia
- Vladimir B. Bajic
  - King Abdullah University of Science and Technology, Computational Bioscience Research Center, Thuwal, Saudi Arabia
- Esma R. Isenovic
  - Vinca Institute, University of Belgrade, Laboratory for Molecular Endocrinology and Radiobiology, Belgrade, Serbia
7
Kordopati V, Salhi A, Razali R, Radovanovic A, Tifratene F, Uludag M, Li Y, Bokhari A, AlSaieedi A, Bin Raies A, Van Neste C, Essack M, Bajic VB. DES-Mutation: System for Exploring Links of Mutations and Diseases. Sci Rep 2018; 8:13359. [PMID: 30190574 PMCID: PMC6127254 DOI: 10.1038/s41598-018-31439-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 12/18/2017] [Accepted: 08/17/2018] [Indexed: 12/17/2022] Open
Abstract
During cellular division, DNA replicates, and this process is the basis for passing genetic information to the next generation. However, the DNA copy process sometimes produces a copy that is not perfect, that is, one with mutations. The collection of all such mutations in the DNA copy of an organism makes it unique and determines the organism’s phenotype. However, mutations are often the cause of diseases. Thus, it is useful to have the capability to explore links between mutations and disease. We approached this problem by analyzing a vast amount of published information linking mutations to disease states. Based on such information, we developed the DES-Mutation knowledgebase, which allows for exploration of not only mutation-disease links, but also links between mutations and concepts from 27 topic-specific dictionaries such as human genes/proteins, toxins, pathogens, etc. This allows for a more detailed insight into mutation-disease links and context. On a sample of 600 predicted and curated mutation-disease associations, our system achieves a precision of 72.83%. To demonstrate the utility of DES-Mutation, we provide case studies related to known or potentially novel information involving disease mutations. To our knowledge, this is the first mutation-disease knowledgebase dedicated to the exploration of this topic through text-mining and data-mining of different mutation types and their associations with terms from multiple thematic dictionaries.
Affiliation(s)
- Vasiliki Kordopati
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Adil Salhi
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Rozaimi Razali
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Aleksandar Radovanovic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Faroug Tifratene
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Mahmut Uludag
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Yu Li
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Ameerah Bokhari
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Ahdab AlSaieedi
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
  - King Abdulaziz University (KAU), Faculty of Applied Medical Sciences (FAMS), Department of Medical Laboratory Technology (MLT), Jeddah, 21589-80324, Saudi Arabia
- Arwa Bin Raies
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Christophe Van Neste
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
  - Ghent University, Center for Medical Genetics Ghent (CMGG), B-9000, Ghent, Belgium
- Magbubah Essack
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Vladimir B Bajic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
8
Tungekar A, Mandarthi S, Mandaviya PR, Gadekar VP, Tantry A, Kotian S, Reddy J, Prabha D, Bhat S, Sahay S, Mascarenhas R, Badkillaya RR, Nagasampige MK, Yelnadu M, Pawar H, Hebbar P, Kashyap MK. ESCC ATLAS: A population wide compendium of biomarkers for Esophageal Squamous Cell Carcinoma. Sci Rep 2018. [PMID: 30143675 DOI: 10.1038/s41598-018-30579-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022] Open
Abstract
Esophageal cancer (EC) is the eighth most aggressive malignancy and its treatment remains a challenge due to the lack of biomarkers that can facilitate early detection. EC is identified in two major histological forms, namely adenocarcinoma (EAC) and squamous cell carcinoma (ESCC), each showing differences in incidence among populations that are geographically separated. Hence the detection of potential drug targets and biomarkers demands a population-centric understanding of the molecular and cellular mechanisms of EC. To provide an adequate impetus to biomarker discovery for ESCC, which is the most prevalent esophageal cancer worldwide, here we have developed ESCC ATLAS, a manually curated database that integrates genetic, epigenetic, transcriptomic, and proteomic ESCC-related genes from the published literature. It consists of 3475 genes associated with molecular signatures such as altered transcription (2600), altered translation (560), copy number variation/structural variations (233), SNPs (102), altered DNA methylation (82), histone modifications (16), and miRNA-based regulation (261). We provide a user-friendly web interface (http://www.esccatlas.org, freely accessible for academic, non-profit users) that facilitates the exploration and the analysis of genes among different populations. We anticipate it to be a valuable resource for population-specific investigation and biomarker discovery for ESCC.
Affiliation(s)
- Asna Tungekar
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
- Sumana Mandarthi
  - Mbiomics, Manipal, Karnataka, India
  - Department of Biochemistry, Kasturba Medical College, Manipal University, Manipal, Karnataka, India
- Pooja Rajendra Mandaviya
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
- Veerendra P Gadekar
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, 1090, Vienna, Austria
- Ananthajith Tantry
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Center for Information Sciences, Manipal University, Manipal, Karnataka, India
- Sowmya Kotian
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
- Jyotshna Reddy
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
- Sushma Bhat
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
- Roshan Mascarenhas
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
  - Newcastle University Medicine Malaysia, Johor Bahru, 79200, Malaysia
- Raghavendra Rao Badkillaya
  - Mbiomics, Manipal, Karnataka, India
  - Department of Biotechnology, Alva's college, Moodubidre, Karnataka, India
- Manoj Kumar Nagasampige
  - Mbiomics, Manipal, Karnataka, India
  - Department of Biotechnology, Sikkim Manipal University, Gangtok, Sikkim, 737102, India
- Mohan Yelnadu
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Center for Information Sciences, Manipal University, Manipal, Karnataka, India
  - Infosys Technologies Ltd, Bangalore, Karnataka, India
  - Faculty of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
- Harsh Pawar
  - Faculty of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
- Prashantha Hebbar
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
- Manoj Kumar Kashyap
  - Mbiomics, Manipal, Karnataka, India
  - Faculty of Applied Sciences and Biotechnology, Shoolini University of Biotechnology and Management Sciences, Bajhol, Solan, Himachal Pradesh 173229, India
  - School of Life and Allied Health Sciences, Glocal University, Saharanpur, Uttar Pradesh, 247001, India
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, 1090, Vienna, Austria
9
Tungekar A, Mandarthi S, Mandaviya PR, Gadekar VP, Tantry A, Kotian S, Reddy J, Prabha D, Bhat S, Sahay S, Mascarenhas R, Badkillaya RR, Nagasampige MK, Yelnadu M, Pawar H, Hebbar P, Kashyap MK. ESCC ATLAS: A population wide compendium of biomarkers for Esophageal Squamous Cell Carcinoma. Sci Rep 2018; 8:12715. [PMID: 30143675 PMCID: PMC6109081 DOI: 10.1038/s41598-018-30579-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 12/05/2017] [Accepted: 08/01/2018] [Indexed: 02/07/2023] Open
Abstract
Esophageal cancer (EC) is the eighth most aggressive malignancy and its treatment remains a challenge due to the lack of biomarkers that can facilitate early detection. EC is identified in two major histological forms, namely adenocarcinoma (EAC) and squamous cell carcinoma (ESCC), each showing differences in incidence among populations that are geographically separated. Hence the detection of potential drug targets and biomarkers demands a population-centric understanding of the molecular and cellular mechanisms of EC. To provide an adequate impetus to biomarker discovery for ESCC, which is the most prevalent esophageal cancer worldwide, here we have developed ESCC ATLAS, a manually curated database that integrates genetic, epigenetic, transcriptomic, and proteomic ESCC-related genes from the published literature. It consists of 3475 genes associated with molecular signatures such as altered transcription (2600), altered translation (560), copy number variation/structural variations (233), SNPs (102), altered DNA methylation (82), histone modifications (16), and miRNA-based regulation (261). We provide a user-friendly web interface (http://www.esccatlas.org, freely accessible for academic, non-profit users) that facilitates the exploration and the analysis of genes among different populations. We anticipate it to be a valuable resource for population-specific investigation and biomarker discovery for ESCC.
Affiliation(s)
- Asna Tungekar
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
- Sumana Mandarthi
  - Mbiomics, Manipal, Karnataka, India
  - Department of Biochemistry, Kasturba Medical College, Manipal University, Manipal, Karnataka, India
- Pooja Rajendra Mandaviya
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
- Veerendra P Gadekar
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, 1090, Vienna, Austria
- Ananthajith Tantry
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Center for Information Sciences, Manipal University, Manipal, Karnataka, India
- Sowmya Kotian
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
- Jyotshna Reddy
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
- Sushma Bhat
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
- Roshan Mascarenhas
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
  - Newcastle University Medicine Malaysia, Johor Bahru, 79200, Malaysia
- Raghavendra Rao Badkillaya
  - Mbiomics, Manipal, Karnataka, India
  - Department of Biotechnology, Alva's college, Moodubidre, Karnataka, India
- Manoj Kumar Nagasampige
  - Mbiomics, Manipal, Karnataka, India
  - Department of Biotechnology, Sikkim Manipal University, Gangtok, Sikkim, 737102, India
- Mohan Yelnadu
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Center for Information Sciences, Manipal University, Manipal, Karnataka, India
  - Infosys Technologies Ltd, Bangalore, Karnataka, India
  - Faculty of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
- Harsh Pawar
  - Faculty of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
- Prashantha Hebbar
  - Mbiomics, Manipal, Karnataka, India
  - Manipal Life Sciences Center, Manipal University, Manipal, Karnataka, India
- Manoj Kumar Kashyap
  - Mbiomics, Manipal, Karnataka, India
  - Faculty of Applied Sciences and Biotechnology, Shoolini University of Biotechnology and Management Sciences, Bajhol, Solan, Himachal Pradesh 173229, India
  - School of Life and Allied Health Sciences, Glocal University, Saharanpur, Uttar Pradesh, 247001, India
  - Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, 1090, Vienna, Austria
|
10
|
Salhi A, Negrão S, Essack M, Morton MJL, Bougouffa S, Razali R, Radovanovic A, Marchand B, Kulmanov M, Hoehndorf R, Tester M, Bajic VB. DES-TOMATO: A Knowledge Exploration System Focused On Tomato Species. Sci Rep 2017; 7:5968. [PMID: 28729549 PMCID: PMC5519719 DOI: 10.1038/s41598-017-05448-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 01/16/2017] [Accepted: 05/25/2017] [Indexed: 12/29/2022] Open
Abstract
Tomato is the most economically important horticultural crop used as a model to study plant biology and particularly fruit development. Knowledge obtained from tomato research initiated improvements in tomato and, being transferable to other such economically important crops, has led to a surge of tomato-related research and published literature. We developed the DES-TOMATO knowledgebase (KB) for exploration of information related to tomato. Information exploration is enabled through terms from 26 dictionaries and combinations of these terms. To illustrate the utility of DES-TOMATO, we provide several examples of how one can efficiently use this KB to retrieve known or potentially novel information. DES-TOMATO is free for academic and nonprofit users and can be accessed at http://cbrc.kaust.edu.sa/des_tomato/ using any of the mainstream web browsers, including Firefox, Safari and Chrome.
Affiliation(s)
- Adil Salhi
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Sónia Negrão
  - King Abdullah University of Science and Technology (KAUST), Division of Biological and Environmental Sciences and Engineering, Thuwal, 23955-6900, Saudi Arabia
- Magbubah Essack
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Mitchell J L Morton
  - King Abdullah University of Science and Technology (KAUST), Division of Biological and Environmental Sciences and Engineering, Thuwal, 23955-6900, Saudi Arabia
- Salim Bougouffa
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Rozaimi Razali
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Aleksandar Radovanovic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Maxat Kulmanov
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
- Robert Hoehndorf
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
  - King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal, 23955-6900, Saudi Arabia
- Mark Tester
  - King Abdullah University of Science and Technology (KAUST), Division of Biological and Environmental Sciences and Engineering, Thuwal, 23955-6900, Saudi Arabia
- Vladimir B Bajic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, 23955-6900, Saudi Arabia
  - King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal, 23955-6900, Saudi Arabia
|
11
|
Salhi A, Essack M, Alam T, Bajic VP, Ma L, Radovanovic A, Marchand B, Schmeier S, Zhang Z, Bajic VB. DES-ncRNA: A knowledgebase for exploring information about human micro and long noncoding RNAs based on literature-mining. RNA Biol 2017; 14:963-971. [PMID: 28387604 PMCID: PMC5546543 DOI: 10.1080/15476286.2017.1312243] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 01/04/2017] [Revised: 02/23/2017] [Accepted: 03/24/2017] [Indexed: 01/08/2023] Open
Abstract
Noncoding RNAs (ncRNAs), particularly microRNAs (miRNAs) and long ncRNAs (lncRNAs), are important players in diseases and are emerging as novel drug targets. Thus, unraveling the relationships between ncRNAs and other biomedical entities in cells is critical for better understanding ncRNA roles, which may eventually help develop their use in medicine. To support ncRNA research and facilitate retrieval of relevant information regarding miRNAs and lncRNAs from the plethora of published ncRNA-related research, we developed DES-ncRNA (www.cbrc.kaust.edu.sa/des_ncrna). DES-ncRNA is a knowledgebase containing text- and data-mined information from public scientific literature and other public resources. Exploration of mined information is enabled through terms and pairs of terms from 19 topic-specific dictionaries including, for example, antibiotics, toxins, drugs, enzymes, mutations, pathways, human genes and proteins, drug indications and side effects, diseases, etc. DES-ncRNA contains approximately 878,000 associations of terms from these dictionaries, of which 36,222 (5,373) relate to miRNAs (lncRNAs). We provide users with several ways to explore information regarding ncRNAs, including controlled generation of association networks as well as hypothesis generation. We show an example of how DES-ncRNA can aid research on Alzheimer disease and suggest a potential therapeutic role for Fasudil. DES-ncRNA is a powerful tool that can be used on its own or as a complement to the existing resources to support research in human ncRNA. To our knowledge, this is the only knowledgebase dedicated to human miRNAs and lncRNAs derived primarily through literature-mining, enabling exploration of a broad spectrum of associated biomedical entities not paralleled by any other resource.
Affiliation(s)
- Adil Salhi
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Kingdom of Saudi Arabia
- Magbubah Essack
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Kingdom of Saudi Arabia
- Tanvir Alam
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Kingdom of Saudi Arabia
- Vladan P. Bajic
  - VINCA Institute of Nuclear Sciences, Belgrade, Republic of Serbia
- Lina Ma
  - BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences (CAS), Beijing, China
- Aleksandar Radovanovic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Kingdom of Saudi Arabia
- Sebastian Schmeier
  - Massey University Auckland, Institute of Natural and Mathematical Sciences, Albany, Auckland, New Zealand
- Zhang Zhang
  - BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, Beijing, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences (CAS), Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Collaborative Innovation Center of Genetics and Development, Fudan University, Shanghai, China
- Vladimir B. Bajic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Kingdom of Saudi Arabia
|
12
|
He H, Lin D, Zhang J, Wang YP, Deng HW. Comparison of statistical methods for subnetwork detection in the integration of gene expression and protein interaction network. BMC Bioinformatics 2017; 18:149. [PMID: 28253853 PMCID: PMC5335754 DOI: 10.1186/s12859-017-1567-2] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 09/01/2016] [Accepted: 02/24/2017] [Indexed: 11/10/2022] Open
Abstract
Background: With the advancement of high-throughput technologies and the enrichment of popular public databases, bioinformatics research has increasingly focused on the computational integration of network and gene expression profiles for extracting context-dependent active subnetworks. Many methods for subnetwork searching have been developed. Scoring and searching algorithms present a range of computational considerations and implementations. The primary goal of the present study is to comprehensively evaluate the performance of different subnetwork detection methods. Eleven popular methods were selected for comprehensive comparison. Results: First, taking into account the dependence of genes given a protein-protein interaction (PPI) network, we simulated microarray gene expression data under case and control conditions. Then each method was applied to the simulated data for subnetwork identification. Second, a large microarray data set of prostate cancer was used to assess the practical performance of each method. Using both simulation studies and a real data application, we evaluated the performance of different methods in terms of recall and precision. Conclusions: jActiveModules, PinnacleZ and WMAXC performed well in identifying subnetworks with relatively high precision and recall. BioNet performed very well only in precision. As none of the methods outperformed the others overall, users should choose an appropriate method based on the purposes of their studies.
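The abstract above evaluates methods by recall and precision. As a minimal sketch of how those two quantities could be computed for one detected subnetwork against a simulated ground truth, assuming both are represented simply as sets of gene identifiers (the function name and the gene labels below are hypothetical and are not taken from the study):

```python
def precision_recall(detected, truth):
    """Return (precision, recall) for a detected gene set against a truth set.

    precision = true positives / number of detected genes
    recall    = true positives / number of truly active genes
    Both default to 0.0 when the corresponding denominator is empty.
    """
    detected, truth = set(detected), set(truth)
    true_positives = len(detected & truth)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical example: four genes are truly active, a method reports three.
truth_genes = {"G1", "G2", "G3", "G4"}
detected_genes = {"G2", "G3", "G5"}
precision, recall = precision_recall(detected_genes, truth_genes)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

A method that reports few genes can score high precision with low recall, which is consistent with the abstract's remark that one tool performed very well only in precision.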
Affiliation(s)
- Hao He
  - Department of Biostatistics and Bioinformatics, Center for Bioinformatics and Genomics, Tulane University School of Public Health and Tropical Medicine, 1440 Canal St., Suite 2001, New Orleans, LA, 70112, USA
- Dongdong Lin
  - Department of Biomedical Engineering, Tulane University, New Orleans, LA, 70118, USA
- Jigang Zhang
  - Department of Biostatistics and Bioinformatics, Center for Bioinformatics and Genomics, Tulane University School of Public Health and Tropical Medicine, 1440 Canal St., Suite 2001, New Orleans, LA, 70112, USA
- Yu-Ping Wang
  - Department of Biostatistics and Bioinformatics, Center for Bioinformatics and Genomics, Tulane University School of Public Health and Tropical Medicine, 1440 Canal St., Suite 2001, New Orleans, LA, 70112, USA
  - Department of Biomedical Engineering, Tulane University, New Orleans, LA, 70118, USA
- Hong-Wen Deng
  - Department of Biostatistics and Bioinformatics, Center for Bioinformatics and Genomics, Tulane University School of Public Health and Tropical Medicine, 1440 Canal St., Suite 2001, New Orleans, LA, 70112, USA
|
13
|
Agarwal R, Kumar B, Jayadev M, Raghav D, Singh A. CoReCG: a comprehensive database of genes associated with colon-rectal cancer. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2016; 2016:baw059. [PMID: 27114494 PMCID: PMC4843536 DOI: 10.1093/database/baw059] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 06/24/2015] [Accepted: 03/23/2016] [Indexed: 12/19/2022]
Abstract
Cancer of the large intestine is commonly referred to as colorectal cancer, which is also the third most prevalent neoplasm across the globe. Although much work is being carried out to understand the mechanism of carcinogenesis and the advancement of this disease, fewer studies have been performed to collate the scattered information on alterations in tumorigenic cells, such as genes, mutations, expression changes, epigenetic alterations, post-translational modifications and genetic heterogeneity. Earlier findings were mostly focused on understanding the etiology of colorectal carcinogenesis, but less emphasis was given to a comprehensive review of the existing findings of individual studies, which can provide better diagnostics based on the markers suggested in discrete studies. The Colon Rectal Cancer Gene Database (CoReCG) contains information on 2056 colon-rectal cancer genes involved in distinct colorectal cancer stages, sourced from published literature, with an effective knowledge-based information retrieval system. Additionally, an interactive web interface, enriched with various browsing sections and augmented with an advanced search facility for querying the database, is provided for user-friendly browsing; online tools for sequence similarity searches and a knowledge-based schema ensure a researcher-friendly information retrieval mechanism. CoReCG is expected to be a single-point source for identification of colorectal cancer-related genes, thereby helping with the improvement of classification, diagnosis and treatment of human cancers. Database URL: lms.snu.edu.in/corecg
Affiliation(s)
- Rahul Agarwal
  - Department of Life Science, Shiv Nadar University, Greater Noida, India
- Binayak Kumar
  - Department of Life Science, Shiv Nadar University, Greater Noida, India
- Msk Jayadev
  - Department of Life Science, Shiv Nadar University, Greater Noida, India
- Dhwani Raghav
  - Department of Health Research (Ministry of Health & Family Welfare), Division of Epidemiology and Communicable Diseases, Indian Council of Medical Research, Ansari Nagar, New Delhi, India
- Ashutosh Singh
  - Department of Life Science, Shiv Nadar University, Greater Noida, India
|
14
|
Text Mining for Precision Medicine: Bringing Structure to EHRs and Biomedical Literature to Understand Genes and Health. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2016; 939:139-166. [PMID: 27807747 DOI: 10.1007/978-981-10-1503-8_7] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Indexed: 01/30/2023]
Abstract
The key question of precision medicine is whether it is possible to find clinically actionable granularity in diagnosing disease and classifying patient risk. The advent of next-generation sequencing and the widespread adoption of electronic health records (EHRs) have provided clinicians and researchers a wealth of data and made possible the precise characterization of individual patient genotypes and phenotypes. Unstructured text, found in biomedical publications and clinical notes, is an important component of genotype and phenotype knowledge. Publications in the biomedical literature provide essential information for interpreting genetic data. Likewise, clinical notes contain the richest source of phenotype information in EHRs. Text mining can render these texts computationally accessible and support information extraction and hypothesis generation. This chapter reviews the mechanics of text mining in precision medicine and discusses several specific use cases, including database curation for personalized cancer medicine, patient outcome prediction from EHR-derived cohorts, and pharmacogenomic research. Taken as a whole, these use cases demonstrate how text mining enables effective utilization of existing knowledge sources and thus promotes increased value for patients and healthcare systems. Text mining is an indispensable tool for translating genotype-phenotype data into effective clinical care that will undoubtedly play an important role in the eventual realization of precision medicine.
|
15
|
Salhi A, Essack M, Radovanovic A, Marchand B, Bougouffa S, Antunes A, Simoes MF, Lafi FF, Motwalli OA, Bokhari A, Malas T, Amoudi SA, Othum G, Allam I, Mineta K, Gao X, Hoehndorf R, C Archer JA, Gojobori T, Bajic VB. DESM: portal for microbial knowledge exploration systems. Nucleic Acids Res 2015; 44:D624-33. [PMID: 26546514 PMCID: PMC4702830 DOI: 10.1093/nar/gkv1147] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 08/15/2015] [Accepted: 10/19/2015] [Indexed: 12/31/2022] Open
Abstract
Microorganisms produce an enormous variety of chemical compounds. It is of general interest for microbiology and biotechnology researchers to have the means to explore information about the molecular and genetic basis of the functioning of different microorganisms and their ability for bioproduction. To enable such exploration, we compiled 45 topic-specific knowledgebases (KBs) accessible through the DESM portal (www.cbrc.kaust.edu.sa/desm). The KBs contain information derived through text-mining of PubMed information and complemented by information data-mined from various other resources (e.g. ChEBI, Entrez Gene, GO, KOBAS, KEGG, UniPathways, BioGrid). All PubMed records were indexed using 4 538 278 concepts from 29 dictionaries, with 1 638 986 records utilized in KBs. Concepts used are normalized whenever possible. Most of the KBs focus on a particular type of microbial activity, such as production of biocatalysts or nutraceuticals. Others are focused on specific categories of microorganisms, e.g. streptomyces or cyanobacteria. KBs are all structured in a uniform manner and have a standardized user interface. Information exploration is enabled through various searches. Users can explore the statistically most significant concepts or pairs of concepts, generate hypotheses, create interactive networks of associated concepts and export results. We believe DESM will be a useful complement to the existing resources to benefit microbiology and biotechnology research.
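The abstract above describes indexing records with dictionary concepts and exploring pairs of concepts. A rough sketch of that general idea follows, assuming naive case-insensitive substring matching and plain co-occurrence counts; the abstract does not describe DESM's actual matching, normalization, or significance statistics, and every name and record below is hypothetical:

```python
from collections import Counter
from itertools import combinations


def index_record(text, dictionary):
    """Return the set of dictionary concepts found in one record's text.

    Assumption: simple lowercase substring matching stands in for real
    concept recognition and normalization.
    """
    lowered = text.lower()
    return {concept for concept in dictionary if concept.lower() in lowered}


def count_concept_pairs(records, dictionary):
    """Count how many records mention each unordered pair of concepts."""
    pair_counts = Counter()
    for text in records:
        concepts = sorted(index_record(text, dictionary))
        pair_counts.update(combinations(concepts, 2))
    return pair_counts


# Hypothetical toy records and dictionary terms, for illustration only.
records = [
    "Streptomyces strains produce antibiotics.",
    "Cyanobacteria can produce nutraceuticals.",
    "Antibiotics from Streptomyces and cyanobacteria.",
]
dictionary = ["streptomyces", "cyanobacteria", "antibiotics", "nutraceuticals"]

for pair, count in count_concept_pairs(records, dictionary).most_common():
    print(pair, count)
```

Raw counts like these would only be a starting point; ranking pairs by statistical significance, as the abstract mentions, requires a background model that is not sketched here.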
Affiliation(s)
- Adil Salhi
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Magbubah Essack
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Aleksandar Radovanovic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Salim Bougouffa
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Andre Antunes
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Marta Filipa Simoes
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Feras F Lafi
  - King Abdullah University of Science and Technology (KAUST), Center for Desert Agriculture (CDA), Thuwal 23955-6900, Kingdom of Saudi Arabia
  - King Abdullah University of Science and Technology (KAUST), Biological and Environmental Sciences and Engineering Division (BESE), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Olaa A Motwalli
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Ameerah Bokhari
  - King Abdullah University of Science and Technology (KAUST), Center for Desert Agriculture (CDA), Thuwal 23955-6900, Kingdom of Saudi Arabia
  - King Abdullah University of Science and Technology (KAUST), Biological and Environmental Sciences and Engineering Division (BESE), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Tariq Malas
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Soha Al Amoudi
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Ghofran Othum
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Intikhab Allam
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Katsuhiko Mineta
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
  - King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Xin Gao
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
  - King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Robert Hoehndorf
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
  - King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal 23955-6900, Kingdom of Saudi Arabia
- John A C Archer
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
  - King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Takashi Gojobori
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
  - King Abdullah University of Science and Technology (KAUST), Biological and Environmental Sciences and Engineering Division (BESE), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Vladimir B Bajic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Kingdom of Saudi Arabia
  - King Abdullah University of Science and Technology (KAUST), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Thuwal 23955-6900, Kingdom of Saudi Arabia
|
16
|
Laitinen VH, Rantapero T, Fischer D, Vuorinen EM, Tammela TL, Wahlfors T, Schleutker J. Fine-mapping the 2q37 and 17q11.2-q22 loci for novel genes and sequence variants associated with a genetic predisposition to prostate cancer. Int J Cancer 2015; 136:2316-27. [PMID: 25335771 PMCID: PMC4355047 DOI: 10.1002/ijc.29276] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 04/17/2014] [Accepted: 10/01/2014] [Indexed: 01/13/2023]
Abstract
The 2q37 and 17q12-q22 loci are linked to an increased prostate cancer (PrCa) risk. No candidate gene has been localized at 2q37 and the HOXB13 variant G84E only partially explains the linkage to 17q21-q22 observed in Finland. We screened these regions by targeted DNA sequencing to search for cancer-associated variants. Altogether, four novel susceptibility alleles were identified. Two ZNF652 (17q21.3) variants, rs116890317 and rs79670217, increased the risk of both sporadic and hereditary PrCa (rs116890317: OR = 3.3-7.8, p = 0.003-3.3 × 10⁻⁵; rs79670217: OR = 1.6-1.9, p = 0.002-0.009). The HDAC4 (2q37.2) variant rs73000144 (OR = 14.6, p = 0.018) and the EFCAB13 (17q21.3) variant rs118004742 (OR = 1.8, p = 0.048) were overrepresented in patients with familial PrCa. To map the variants within 2q37 and 17q11.2-q22 that may regulate PrCa-associated genes, we combined DNA sequencing results with transcriptome data obtained by RNA sequencing. This expression quantitative trait locus (eQTL) analysis identified 272 single-nucleotide polymorphisms (SNPs) possibly regulating six genes that were differentially expressed between cases and controls. In a modified approach, prefiltered PrCa-associated SNPs were exploited and interestingly, a novel eQTL targeting ZNF652 was identified. The novel variants identified in this study could be utilized for PrCa risk assessment, and they further validate the suggested role of ZNF652 as a PrCa candidate gene. The regulatory regions discovered by eQTL mapping increase our understanding of the relationship between regulation of gene expression and susceptibility to PrCa and provide a valuable starting point for future functional research.
Affiliation(s)
- Virpi H. Laitinen
  - BioMediTech, University of Tampere and Fimlab Laboratories, FI-33520 Tampere, Finland
- Tommi Rantapero
  - BioMediTech, University of Tampere and Fimlab Laboratories, FI-33520 Tampere, Finland
- Daniel Fischer
  - School of Health Sciences, University of Tampere, FI-33014 Tampere, Finland
- Elisa M. Vuorinen
  - BioMediTech, University of Tampere and Fimlab Laboratories, FI-33520 Tampere, Finland
- Teuvo L.J. Tammela
  - Department of Urology, Tampere University Hospital and Medical School, University of Tampere, FI-33520 Tampere, Finland
- Tiina Wahlfors
  - BioMediTech, University of Tampere and Fimlab Laboratories, FI-33520 Tampere, Finland
- Johanna Schleutker
  - BioMediTech, University of Tampere and Fimlab Laboratories, FI-33520 Tampere, Finland
  - Medical Biochemistry and Genetics, Institute of Biomedicine, FI-20014 University of Turku, Turku, Finland
|
17
|
Blank AT, Gage M, Tejwani N, McLaurin T. Overlapping Dislocation of the Pubic Symphysis with an Open Reduction and Anterior and Posterior Pelvic Ring Fixation: A Case Report. JBJS Case Connect 2015; 5:e6. [PMID: 29252342 DOI: 10.2106/jbjs.cc.n.00082] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022]
Abstract
CASE: We present a case of a patient who sustained overlapping dislocation of the pubic symphysis (ODPS), which required an open reduction as well as anterior and posterior pelvic ring fixation. CONCLUSION: This case report is a valuable addition to the current literature on ODPS because we believe it to be the first report describing a patient who required both anterior and posterior fixation because of pelvic instability.
Affiliation(s)
- Alan T Blank
  - NYU Hospital for Joint Diseases, 301 East 17th Street, New York, NY 10003
|
18
|
Jin D, Lee H. A computational approach to identifying gene-microRNA modules in cancer. PLoS Comput Biol 2015; 11:e1004042. [PMID: 25611546 PMCID: PMC4303261 DOI: 10.1371/journal.pcbi.1004042] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 02/06/2014] [Accepted: 11/16/2014] [Indexed: 11/21/2022] Open
Abstract
MicroRNAs (miRNAs) play key roles in the initiation and progression of various cancers by regulating genes. Regulatory interactions between genes and miRNAs are complex, as multiple miRNAs can regulate multiple genes. In addition, these interactions vary from patient to patient and even among patients with the same cancer type, as cancer development is a heterogeneous process. These relationships are more complicated because transcription factors and other regulatory molecules can also regulate miRNAs and genes. Hence, it is important to identify the complex relationships between genes and miRNAs in cancer. In this study, we propose a computational approach to constructing modules that represent these relationships by integrating the expression data of genes and miRNAs with gene-gene interaction data. First, we used a biclustering algorithm to construct modules consisting of a subset of genes and a subset of samples to incorporate the heterogeneity of cancer cells. Second, we combined gene-gene interactions to include genes that play important roles in cancer-related pathways. Then, we selected miRNAs that are closely associated with genes in the modules based on a Gaussian Bayesian network and Bayesian Information Criteria. When we applied our approach to ovarian cancer and glioblastoma (GBM) data sets, 33 and 54 modules were constructed, respectively. In these modules, 91% and 94% of ovarian cancer and GBM modules, respectively, were explained either by direct regulation between genes and miRNAs or by indirect relationships via transcription factors. In addition, 48.4% and 74.0% of modules from ovarian cancer and GBM, respectively, were enriched with cancer-related pathways, and 51.7% and 71.7% of miRNAs in modules were ovarian cancer-related miRNAs and GBM-related miRNAs, respectively. Finally, we extensively analyzed significant modules and showed that most genes in these modules were related to ovarian cancer and GBM.
A microRNA (miRNA) is a small RNA molecule that regulates the expression of mRNA genes. A miRNA can regulate multiple genes, and a gene can be regulated by multiple miRNAs. The regulation of genes by miRNAs may vary from patient to patient, even if they suffer from the same type of cancer. In this study, we identify the relationships between genes and miRNAs in cancer patients using expression data. Because these relationships are complicated by the involvement of transcription factors, which are among the most influential regulators of genes, we also attempt to explain the triple relationship among genes, miRNAs, and transcription factors. We constructed modules consisting of a set of genes and miRNAs, in which the expression levels are highly correlated. In most of these modules, genes and miRNAs are related to specific cancer types; their relationships are explained both by direct regulation of genes by miRNAs and by indirect relationships via transcription factors.
Affiliation(s)
- Daeyong Jin
- School of Information and Communications, Gwangju Institute of Science and Technology, Gwangju, South Korea
- Hyunju Lee
- School of Information and Communications, Gwangju Institute of Science and Technology, Gwangju, South Korea
|
19
|
Pavlopoulou A, Spandidos DA, Michalopoulos I. Human cancer databases (review). Oncol Rep 2014; 33:3-18. [PMID: 25369839 PMCID: PMC4254674 DOI: 10.3892/or.2014.3579] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 10/03/2014] [Accepted: 10/31/2014] [Indexed: 12/20/2022]
Abstract
Cancer is one of the four major non‑communicable diseases (NCD), responsible for ~14.6% of all human deaths. Currently, there are >100 different known types of cancer and >500 genes involved in cancer. Ongoing research efforts have been focused on cancer etiology and therapy. As a result, there is an exponential growth of cancer‑associated data from diverse resources, such as scientific publications, genome‑wide association studies, gene expression experiments, gene‑gene or protein‑protein interaction data, enzymatic assays, epigenomics, immunomics and cytogenetics, stored in relevant repositories. These data are complex and heterogeneous, ranging from unprocessed, unstructured data in the form of raw sequences and polymorphisms to well‑annotated, structured data. Consequently, the storage, mining, retrieval and analysis of these data in an efficient and meaningful manner pose a major challenge to biomedical investigators. In the current review, we present the central, publicly accessible databases that contain data pertinent to cancer, the resources available for delivering and analyzing information from these databases, as well as databases dedicated to specific types of cancer. Examples for this wealth of cancer‑related information and bioinformatic tools have also been provided.
Affiliation(s)
- Athanasia Pavlopoulou
- Center of Systems Biology, Biomedical Research Foundation, Academy of Athens, Athens 11527, Greece
- Demetrios A Spandidos
- Laboratory of Clinical Virology, Medical School, University of Crete, Heraklion 71003, Crete, Greece
- Ioannis Michalopoulos
- Center of Systems Biology, Biomedical Research Foundation, Academy of Athens, Athens 11527, Greece
|
20
|
Amgalan B, Lee H. WMAXC: a weighted maximum clique method for identifying condition-specific sub-network. PLoS One 2014; 9:e104993. [PMID: 25148538 PMCID: PMC4141761 DOI: 10.1371/journal.pone.0104993] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 03/24/2014] [Accepted: 07/07/2014] [Indexed: 11/19/2022]
Abstract
Sub-networks can expose complex patterns in an entire bio-molecular network by extracting interactions that depend on temporal or condition-specific contexts. When genes interact with each other during cellular processes, they may form differential co-expression patterns with other genes across different cell states. The identification of condition-specific sub-networks is of great importance in investigating how a living cell adapts to environmental changes. In this work, we propose the weighted MAXimum clique (WMAXC) method to identify a condition-specific sub-network. WMAXC first proposes scoring functions that jointly measure condition-specific changes to both individual genes and gene-gene co-expressions. It then employs a weaker formula of a general maximum clique problem and relates the maximum scored clique of a weighted graph to the optimization of a quadratic objective function under sparsity constraints. We combine a continuous genetic algorithm and a projection procedure to obtain a single optimal sub-network that maximizes the objective function (scoring function) over the standard simplex (sparsity constraints). We applied the WMAXC method to both simulated data and real data sets of ovarian and prostate cancer. Compared with previous methods, WMAXC selected a large fraction of cancer-related genes, which were enriched in cancer-related pathways. The results demonstrated that our method efficiently captured a subset of genes relevant under the investigated condition.
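The link between cliques and a quadratic objective over the standard simplex can be illustrated with the classical Motzkin-Straus formulation. The sketch below is a stand-in under stated assumptions, not WMAXC itself: the abstract does not give the method's scoring functions or its weaker formulation, so the unweighted adjacency matrix, the toy graph, and the helper names `quadratic_objective` and `uniform_on` are all hypothetical.

```python
def quadratic_objective(weights, x):
    """Return x^T W x for a square weight matrix W given as nested lists."""
    n = len(x)
    return sum(weights[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def uniform_on(support, n):
    """Point of the standard simplex that spreads mass evenly over `support`."""
    share = 1.0 / len(support)
    return [share if i in support else 0.0 for i in range(n)]


# Hypothetical toy graph: a triangle {0, 1, 2} plus node 3 attached only to node 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n_nodes = 4
adjacency = [[0.0] * n_nodes for _ in range(n_nodes)]
for i, j in edges:
    adjacency[i][j] = adjacency[j][i] = 1.0

# Compare the objective at simplex points supported on different node subsets.
candidates = {
    "triangle": {0, 1, 2},
    "single edge": {2, 3},
    "all nodes": {0, 1, 2, 3},
}
scores = {
    name: quadratic_objective(adjacency, uniform_on(support, n_nodes))
    for name, support in candidates.items()
}
best = max(scores, key=scores.get)
print(best, scores)
```

For an unweighted graph, Motzkin and Straus showed that the maximum of this objective over the simplex equals 1 - 1/k, where k is the size of the largest clique; the triangle here attains 1 - 1/3. A weighted, condition-specific variant would replace the adjacency entries with gene and co-expression scores.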
Affiliation(s)
- Bayarbaatar Amgalan
- School of Information and Communications, Gwangju Institute of Science and Technology, Gwangju, South Korea
- Hyunju Lee
- School of Information and Communications, Gwangju Institute of Science and Technology, Gwangju, South Korea
|
21
|
Lee HJ, Dang TC, Lee H, Park JC. OncoSearch: cancer gene search engine with literature evidence. Nucleic Acids Res 2014; 42:W416-21. [PMID: 24813447 PMCID: PMC4086113 DOI: 10.1093/nar/gku368] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 12/18/2022]
Abstract
In order to identify genes that are involved in oncogenesis and to understand how such genes affect cancers, abnormal gene expressions in cancers are actively studied. For an efficient access to the results of such studies that are reported in biomedical literature, the relevant information is accumulated via text-mining tools and made available through the Web. However, current Web tools are not yet tailored enough to allow queries that specify how a cancer changes along with the change in gene expression level, which is an important piece of information to understand an involved gene's role in cancer progression or regression. OncoSearch is a Web-based engine that searches Medline abstracts for sentences that mention gene expression changes in cancers, with queries that specify (i) whether a gene expression level is up-regulated or down-regulated, (ii) whether a certain type of cancer progresses or regresses along with such gene expression change and (iii) the expected role of the gene in the cancer. OncoSearch is available through http://oncosearch.biopathway.org.
Affiliation(s)
- Hee-Jin Lee
- Department of Computer Science, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701, Republic of Korea
- Tien Cuong Dang
- Department of Computer Science, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701, Republic of Korea
- Hyunju Lee
- School of Information and Communications, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 500-712, Republic of Korea
- Jong C Park
- Department of Computer Science, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701, Republic of Korea
|
22
|
Li Y, Vongsangnak W, Chen L, Shen B. Integrative analysis reveals disease-associated genes and biomarkers for prostate cancer progression. BMC Med Genomics 2014; 7 Suppl 1:S3. [PMID: 25080090 PMCID: PMC4110715 DOI: 10.1186/1755-8794-7-s1-s3] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 11/10/2022]
|
23
|
Lee HJ, Shim SH, Song MR, Lee H, Park JC. CoMAGC: a corpus with multi-faceted annotations of gene-cancer relations. BMC Bioinformatics 2013; 14:323. [PMID: 24225062 PMCID: PMC3833657 DOI: 10.1186/1471-2105-14-323] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 05/23/2013] [Accepted: 11/05/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND: In order to access the large amount of information in biomedical literature about genes implicated in various cancers both efficiently and accurately, the aid of text mining (TM) systems is invaluable. Current TM systems target either gene-cancer relations or biological processes involving genes and cancers, but the former type produces information not comprehensive enough to explain how a gene affects a cancer, and the latter does not provide a concise summary of gene-cancer relations. RESULTS: In this paper, we present a corpus for the development of TM systems that specifically target gene-cancer relations but are still able to capture complex information in biomedical sentences. We describe CoMAGC, a corpus with multi-faceted annotations of gene-cancer relations. In CoMAGC, a piece of annotation is composed of four semantically orthogonal concepts that together express 1) how a gene changes, 2) how a cancer changes and 3) the causality between the gene and the cancer. The multi-faceted annotations are shown to have high inter-annotator agreement. In addition, we show that the annotations in CoMAGC allow us to infer the prospective roles of genes in cancers and to classify the genes into three classes according to the inferred roles. We encode the mapping between multi-faceted annotations and gene classes into 10 inference rules. The inference rules produce results with high accuracy as measured against human annotations. CoMAGC consists of 821 sentences on prostate, breast and ovarian cancers. Currently, we deal with changes in gene expression levels among other types of gene changes. The corpus is available at http://biopathway.org/CoMAGC under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0). CONCLUSIONS: The corpus will be an important resource for the development of advanced TM systems on gene-cancer relations.
Affiliation(s)
- Jong C Park
- Department of Computer Science, KAIST, 291 Daehak-ro, Daejeon, Republic of Korea
|
24
|
Chen J, Zhang D, Yan W, Yang D, Shen B. Translational bioinformatics for diagnostic and prognostic prediction of prostate cancer in the next-generation sequencing era. BioMed Research International 2013; 2013:901578. [PMID: 23957008 PMCID: PMC3727129 DOI: 10.1155/2013/901578] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 05/01/2013] [Accepted: 06/22/2013] [Indexed: 01/13/2023]
Abstract
The discovery of prostate cancer biomarkers has been boosted by the advent of next-generation sequencing (NGS) technologies. Nevertheless, many challenges still exist in exploiting the flood of sequence data and translating them into routine diagnostics and prognosis of prostate cancer. Here we review the recent developments in prostate cancer biomarkers by high throughput sequencing technologies. We highlight some fundamental issues of translational bioinformatics and the potential use of cloud computing in NGS data processing for the improvement of prostate cancer treatment.
Affiliation(s)
- Jiajia Chen
- Center for Systems Biology, Soochow University, Suzhou 215006, China
- School of Chemistry, Biology and Material Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
- Daqing Zhang
- Center for Systems Biology, Soochow University, Suzhou 215006, China
- Wenying Yan
- Center for Systems Biology, Soochow University, Suzhou 215006, China
- Dongrong Yang
- Department of Urology, The Second Affiliated Hospital of Soochow University, Suzhou 215004, China
- Bairong Shen
- Center for Systems Biology, Soochow University, Suzhou 215006, China
|
25
|
Kim J, So S, Lee HJ, Park JC, Kim JJ, Lee H. DigSee: Disease gene search engine with evidence sentences (version cancer). Nucleic Acids Res 2013; 41:W510-7. [PMID: 23761452 PMCID: PMC3692119 DOI: 10.1093/nar/gkt531] [Citation(s) in RCA: 74] [Impact Index Per Article: 6.2] [Indexed: 11/17/2022]
Abstract
Biological events such as gene expression, regulation, phosphorylation, localization and protein catabolism play important roles in the development of diseases. Understanding the association between diseases and genes can be enhanced with the identification of involved biological events in this association. Although biological knowledge has been accumulated in several databases and can be accessed through the Web, there is no specialized Web tool yet allowing for a query into the relationship among diseases, genes and biological events. For this task, we developed DigSee to search MEDLINE abstracts for evidence sentences describing that ‘genes’ are involved in the development of ‘cancer’ through ‘biological events’. DigSee is available through http://gcancer.org/digsee.
Affiliation(s)
- Jeongkyun Kim
- School of Information and Communications, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 500-712, Republic of Korea
|
26
|
Information exploration system for sickle cell disease and repurposing of hydroxyfasudil. PLoS One 2013; 8:e65190. [PMID: 23762313 PMCID: PMC3677893 DOI: 10.1371/journal.pone.0065190] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 01/30/2013] [Accepted: 04/22/2013] [Indexed: 11/19/2022]
Abstract
Background: Sickle cell disease (SCD) is a fatal monogenic disorder with no effective cure and thus high rates of morbidity and sequelae. Efforts toward discovery of disease modifying drugs and curative strategies can be augmented by leveraging the plethora of information contained in available biomedical literature. To facilitate research in this direction we have developed a resource, Dragon Exploration System for Sickle Cell Disease (DESSCD) (http://cbrc.kaust.edu.sa/desscd/) that aims to promote the easy exploration of SCD-related data. Description: The Dragon Exploration System (DES), developed based on text mining and complemented by data mining, processed 419,612 MEDLINE abstracts retrieved from a PubMed query using SCD-related keywords. The processed SCD-related data has been made available via the DESSCD web query interface that enables: (a) information retrieval using specified concepts, keywords and phrases, and (b) the generation of inferred association networks and hypotheses. The usefulness of the system is demonstrated by: (a) reproducing a known scientific fact, the “Sickle_Cell_Anemia–Hydroxyurea” association, and (b) generating a novel and plausible “Sickle_Cell_Anemia–Hydroxyfasudil” hypothesis. A PCT patent (PCT/US12/55042) has been filed for the latter drug repurposing for SCD treatment. Conclusion: We developed the DESSCD resource dedicated to exploration of text-mined and data-mined information about SCD. No similar SCD-related resource exists. Thus, we anticipate that DESSCD will serve as a valuable tool for physicians and researchers interested in SCD.
|
27
|
Shi J, Hu J, Zhou Q, Du Y, Jiang C. PEpiD: a prostate epigenetic database in mammals. PLoS One 2013; 8:e64289. [PMID: 23696878 PMCID: PMC3655999 DOI: 10.1371/journal.pone.0064289] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 12/25/2012] [Accepted: 04/10/2013] [Indexed: 11/18/2022]
Abstract
Epigenetic mechanisms play key roles in initiation and progression of prostate cancer by changing gene expression. The Prostate Epigenetic Database (PEpiD: http://wukong.tongji.edu.cn/pepid) archives the three extensively characterized epigenetic mechanisms DNA methylation, histone modification, and microRNA implicated in prostate cancer of human, mouse, and rat. PEpiD uses a distinct color scheme to present the three types of epigenetic data and provides a user-friendly interface for flexible query. The retrieved information includes Refseq ID, gene symbol, gene alias, genomic loci of epigenetic changes, tissue source, experimental method, and supportive references. The change of histone modification (hyper or hypo) and the corresponding gene expression change (up or down) are also indicated. A graphic view of DNA methylation with exon-intron structure and predicted CpG islands is provided as well. Moreover, the prostate-related ENCODE tracks (DNA methylation, histone modifications, chromatin remodelers), and other key transcription factors with reported roles in prostate are displayed in the browser as well. The reversibility of epigenetic aberrations has made them potential markers for diagnosis and prognosis, and targets for treatment of cancers. This curated information will improve our understanding of epigenetic mechanisms of gene regulation in prostate cancer, and serve as an important resource for epigenetic research in prostate cancer.
Affiliation(s)
- Jiejun Shi
- Shanghai Key Laboratory of Signaling and Disease Research, Department of Bioinformatics, Shanghai Tenth People's Hospital, The School of Life Sciences and Technology, Tongji University, Shanghai, China
- Jian Hu
- Shanghai Key Laboratory of Signaling and Disease Research, Department of Bioinformatics, Shanghai Tenth People's Hospital, The School of Life Sciences and Technology, Tongji University, Shanghai, China
- Qing Zhou
- Shanghai Key Laboratory of Signaling and Disease Research, Department of Bioinformatics, Shanghai Tenth People's Hospital, The School of Life Sciences and Technology, Tongji University, Shanghai, China
- Yanhua Du
- Shanghai Key Laboratory of Signaling and Disease Research, Department of Bioinformatics, Shanghai Tenth People's Hospital, The School of Life Sciences and Technology, Tongji University, Shanghai, China
- Cizhong Jiang
- Shanghai Key Laboratory of Signaling and Disease Research, Department of Bioinformatics, Shanghai Tenth People's Hospital, The School of Life Sciences and Technology, Tongji University, Shanghai, China
|
28
|
Biomedical text mining and its applications in cancer research. J Biomed Inform 2013; 46:200-11. [PMID: 23159498 DOI: 10.1016/j.jbi.2012.10.007] [Citation(s) in RCA: 159] [Impact Index Per Article: 13.3] [Received: 03/28/2012] [Revised: 10/30/2012] [Accepted: 10/30/2012] [Indexed: 11/21/2022]
|
29
|
Hu Y, Zhong D, Pang F, Ning Q, Zhang Y, Li G, Wu J, Mo Z. HNF1b is involved in prostate cancer risk via modulating androgenic hormone effects and coordination with other genes. Genetics and Molecular Research 2013; 12:1327-35. [DOI: 10.4238/2013.april.25.4] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 11/03/2022]
|
30
|
Vazquez M, Krallinger M, Leitner F, Valencia A. Text Mining for Drugs and Chemical Compounds: Methods, Tools and Applications. Mol Inform 2011; 30:506-19. [PMID: 27467152 DOI: 10.1002/minf.201100005] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Received: 01/10/2011] [Accepted: 06/07/2011] [Indexed: 11/10/2022]
Abstract
Providing prior knowledge about biological properties of chemicals, such as kinetic values, protein targets, or toxic effects, can facilitate many aspects of drug development. Chemical information is rapidly accumulating in all sorts of free text documents like patents, industry reports, or scientific articles, which has motivated the development of specifically tailored text mining applications. Despite the potential gains, chemical text mining still faces significant challenges. One of the most salient is the recognition of chemical entities mentioned in text. To help practitioners contribute to this area, a good portion of this review is devoted to this issue, and presents the basic concepts and principles underlying the main strategies. The technical details are introduced and accompanied by relevant bibliographic references. Other tasks discussed are retrieving relevant articles, identifying relationships between chemicals and other entities, or determining the chemical structures of chemicals mentioned in text. This review also introduces a number of published applications that can be used to build pipelines in topics like drug side effects, toxicity, and protein-disease-compound network analysis. We conclude the review with an outlook on how we expect the field to evolve, discussing its possibilities and its current limitations.
Affiliation(s)
- Miguel Vazquez
- Centro Nacional de Investigaciones Oncológicas, Biología Computacional y Estructural, Madrid, Spain
- Martin Krallinger
- Centro Nacional de Investigaciones Oncológicas, Biología Computacional y Estructural, Madrid, Spain
- Florian Leitner
- Centro Nacional de Investigaciones Oncológicas, Biología Computacional y Estructural, Madrid, Spain
- Alfonso Valencia
- Centro Nacional de Investigaciones Oncológicas, Biología Computacional y Estructural, Madrid, Spain
|
31
|
Galperin MY, Cochrane GR. The 2011 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection. Nucleic Acids Res 2011; 39:D1-6. [PMID: 21177655 PMCID: PMC3013748 DOI: 10.1093/nar/gkq1243] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Indexed: 01/09/2023]
Abstract
The current 18th Database Issue of Nucleic Acids Research features descriptions of 96 new and 83 updated online databases covering various areas of molecular biology. It includes two editorials, one that discusses COMBREX, a new exciting project aimed at figuring out the functions of the ‘conserved hypothetical’ proteins, and one concerning BioDBcore, a proposed description of the ‘minimal information about a biological database’. Papers from the members of the International Nucleotide Sequence Database collaboration (INSDC) describe each of the participating databases, DDBJ, ENA and GenBank, principles of data exchange within the collaboration, and the recently established Sequence Read Archive. A testament to the longevity of databases, this issue includes updates on the RNA modification database, Definition of Secondary Structure of Proteins (DSSP) and Homology-derived Secondary Structure of Proteins (HSSP) databases, which have not been featured here in >12 years. There is also a block of papers describing recent progress in protein structure databases, such as Protein DataBank (PDB), PDB in Europe (PDBe), CATH, SUPERFAMILY and others, as well as databases on protein structure modeling, protein–protein interactions and the organization of inter-protein contact sites. Other highlights include updates of the popular gene expression databases, GEO and ArrayExpress, several cancer gene databases and a detailed description of the UK PubMed Central project. The Nucleic Acids Research online Database Collection, available at: http://www.oxfordjournals.org/nar/database/a/, now lists 1330 carefully selected molecular biology databases. The full content of the Database Issue is freely available online at the Nucleic Acids Research web site (http://nar.oxfordjournals.org/).
Affiliation(s)
- Michael Y Galperin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
|
32
|
Ma H, Schadt EE, Kaplan LM, Zhao H. COSINE: COndition-SpecIfic sub-NEtwork identification using a global optimization method. Bioinformatics 2011; 27:1290-8. [PMID: 21414987 DOI: 10.1093/bioinformatics/btr136] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.2] [Indexed: 12/24/2022]
Abstract
MOTIVATION: The identification of condition-specific sub-networks from gene expression profiles has important biological applications, ranging from the selection of disease-related biomarkers to the discovery of pathway alterations across different phenotypes. Although many methods exist for extracting these sub-networks, very few existing approaches simultaneously consider both the differential expression of individual genes and the differential correlation of gene pairs, losing potentially valuable information in the data. RESULTS: In this article, we propose a new method, COSINE (COndition SpecIfic sub-NEtwork), which employs a scoring function that jointly measures the condition-specific changes of both 'nodes' (individual genes) and 'edges' (gene-gene co-expression). It uses the genetic algorithm to search for the single optimal sub-network which maximizes the scoring function. We applied COSINE to both simulated datasets with various differential expression patterns and three real datasets: a prostate cancer dataset, an across-tissue comparison of morbidly obese patients, and an across-population comparison of the HapMap samples. Compared with previous methods, COSINE is more powerful in identifying truly significant sub-networks of appropriate size and meaningful biological relevance. AVAILABILITY: The R code is available as the COSINE package on CRAN: http://cran.r-project.org/web/packages/COSINE/index.html.
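The idea of scoring a candidate sub-network on both its nodes and its edges can be sketched as follows. This is a minimal illustration under stated assumptions, not the published COSINE scoring function: the convex combination with weight `lam`, the use of simple averages, the toy scores, and the name `subnetwork_score` are all hypothetical, and the genetic-algorithm search is not reproduced.

```python
from itertools import combinations


def subnetwork_score(genes, node_score, edge_score, lam=0.5):
    """Blend the mean node score and mean pairwise edge score of a gene set.

    `lam` (assumed) weights nodes against edges; gene pairs absent from
    `edge_score` are treated as having zero differential co-expression.
    """
    genes = sorted(genes)
    if not genes:
        raise ValueError("sub-network must contain at least one gene")
    node_term = sum(node_score[g] for g in genes) / len(genes)
    pairs = list(combinations(genes, 2))
    if pairs:
        edge_term = sum(edge_score.get(frozenset(p), 0.0) for p in pairs) / len(pairs)
    else:
        edge_term = 0.0
    return lam * node_term + (1.0 - lam) * edge_term


# Hypothetical per-gene and per-pair condition-specific change statistics.
node_score = {"A": 2.0, "B": 4.0, "C": 0.0}
edge_score = {frozenset({"A", "B"}): 3.0, frozenset({"A", "C"}): 1.0}

small = subnetwork_score({"A", "B"}, node_score, edge_score)
large = subnetwork_score({"A", "B", "C"}, node_score, edge_score)
print(small, large)
```

In this toy setting, adding gene C lowers the score because C contributes neither a node-level change nor strong edge-level changes, which is the kind of trade-off a search procedure over sub-networks would exploit.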
Affiliation(s)
- Haisu Ma
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06511, USA
|