1
Zheng H, Wang Y, Li F. C-C Motif Chemokine Ligand 5 (CCL5): A Potential Biomarker and Immunotherapy Target for Osteosarcoma. Curr Cancer Drug Targets 2024; 24:308-318. [PMID: 37581517 DOI: 10.2174/1568009623666230815115755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 07/10/2023] [Accepted: 07/12/2023] [Indexed: 08/16/2023]
Abstract
BACKGROUND Osteosarcoma (OS) is the most common primary malignant tumor of bone tissue. It has an insidious onset, is difficult to detect early, and has few early diagnostic markers with high specificity and sensitivity. Therefore, this study aims to identify potential biomarkers that can help diagnose OS in its early stages and improve the prognosis of patients.
METHODS The data sets GSE12789, GSE28424, GSE33382 and GSE36001 were combined and normalized to identify Differentially Expressed Genes (DEGs). The data were analyzed by Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Disease Ontology (DO). The hub genes were selected as the DEGs common to the results of two regression methods: the Least Absolute Shrinkage and Selection Operator (LASSO) and Support Vector Machine (SVM). The diagnostic value of the hub genes was then evaluated in the GSE42572 data set. Finally, the correlation between immunocyte infiltration and key genes was analyzed by CIBERSORT.
RESULTS LASSO and SVM jointly selected three DEGs: FK506 binding protein 51 (FKBP5), C-C motif chemokine ligand 5 (CCL5), and complement component 1 Q subcomponent B chain (C1QB). We evaluated the diagnostic performance of these three biomarkers for osteosarcoma using receiver operating characteristic (ROC) analysis. In the training group, the area under the curve (AUC) of FKBP5, CCL5 and C1QB was 0.907, 0.874 and 0.676, respectively. In the validation group, the AUC of FKBP5, CCL5 and C1QB was 0.618, 0.932 and 0.895, respectively. Notably, these genes were more highly expressed in tumor tissues than in normal tissues across various immune cell types, such as plasma cells, CD8+ T cells, T regulatory cells (Tregs), activated NK cells, activated dendritic cells and activated mast cells. These immune cell types are also associated with the expression levels of the three diagnostic genes that we identified.
CONCLUSION We found that CCL5 can be considered an early diagnostic gene of osteosarcoma, and CCL5 interacts with immune cells to influence tumor occurrence and development. These findings have important implications for the early detection of osteosarcoma and the identification of novel therapeutic targets.
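The AUC values reported above summarize how well a single gene's expression separates tumor from normal samples. A minimal sketch of that calculation follows, using the rank-based (Mann-Whitney) definition of AUC. The function name `roc_auc` and the expression values are hypothetical illustrations, not data or code from the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression values for one gene (1 = tumor, 0 = normal).
expression = [7.9, 8.4, 6.1, 9.0, 5.2, 5.8, 6.5, 4.9]
is_tumor = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(expression, is_tumor)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which is why a gene can look strong in a training set yet weak in a validation set.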
Affiliation(s)
- Heng Zheng
- The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People's Hospital, Qingyuan, China
- Yichong Wang
- Department of Orthopedics, The Second Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Fengfeng Li
- Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, The Affiliated Hospital of Nanjing University Medical School, Nanjing, China
2
Liu ZH, Ji CM, Ni JC, Wang YT, Qiao LJ, Zheng CH. Convolution Neural Networks Using Deep Matrix Factorization for Predicting CircRNA-Disease Association. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:277-284. [PMID: 34951853 DOI: 10.1109/tcbb.2021.3138339] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 05/04/2023]
Abstract
CircRNAs have a stable structure, which gives them a higher tolerance to nucleases. Therefore, the properties of circular RNAs are beneficial in disease diagnosis. However, there are few known associations between circRNAs and disease, and identifying new associations through biological experiments is time-consuming and costly. As a result, there is a need to build efficient and feasible computational models to predict potential circRNA-disease associations. In this paper, we design a novel convolutional neural network framework (DMFCNNCD) that learns features from deep matrix factorization to predict circRNA-disease associations. First, we decompose the circRNA-disease association matrix to obtain the original features of the disease and circRNA, and use the mapping module to extract potential nonlinear features. Then, we integrate these features with the similarity information to form a training set. Finally, we apply convolutional neural networks to predict the unknown associations between circRNAs and diseases. Five-fold cross-validation across various experiments shows that our method can predict circRNA-disease associations and outperforms state-of-the-art methods.
3
Kirsten T, Meineke FA, Loeffler-Wirth H, Beger C, Uciteli A, Stäubert S, Löbe M, Hänsel R, Rauscher FG, Schuster J, Peschel T, Herre H, Wagner J, Zachariae S, Engel C, Scholz M, Rahm E, Binder H, Loeffler M. The Leipzig Health Atlas-An Open Platform to Present, Archive, and Share Biomedical Data, Analyses, and Models Online. Methods Inf Med 2022; 61:e103-e115. [PMID: 35915977 PMCID: PMC9788914 DOI: 10.1055/a-1914-1985] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022]
Abstract
BACKGROUND Clinical trials, epidemiological studies, clinical registries, and other prospective research projects, together with patient care services, are the main sources of data in the medical research domain. They often serve as a basis for secondary research in evidence-based medicine and for prediction models for disease and its progression. These data are often neither sufficiently described nor accessible. Related models are often not accessible as a functional program tool for interested users from the health care and biomedical domains.
OBJECTIVE The interdisciplinary project Leipzig Health Atlas (LHA) was developed to close this gap. LHA is an online platform that serves as a sustainable archive providing medical data, metadata, models, and novel phenotypes from clinical trials, epidemiological studies, and other medical research projects.
METHODS Data, models, and phenotypes are described by semantically rich metadata. The platform prefers to share data and models presented in original publications but is also open for nonpublished data. LHA provides and associates unique permanent identifiers for each dataset and model. Hence, the platform can be used to share prepared, quality-assured datasets and models while they are referenced in publications. All managed data, models, and phenotypes in LHA follow the FAIR principles, with public availability or restricted access for specific user groups.
RESULTS The LHA platform is in productive mode (https://www.health-atlas.de/). It is already used by a variety of clinical trial and research groups and is also becoming increasingly popular in the biomedical community. LHA is an integral part of the forthcoming initiative building a national research data infrastructure for health in Germany.
Affiliation(s)
- Toralf Kirsten
- Department of Medical Data Science, Leipzig University Medical Center, Leipzig, Germany; Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany; Interdisciplinary Centre for Bioinformatics, Leipzig University, Leipzig, Germany. Address for correspondence: Toralf Kirsten, Department of Medical Data Science, Leipzig University, Härtelstraße 16-18, 04107 Leipzig, Germany
- Frank A. Meineke
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Henry Loeffler-Wirth
- LIFE Research Centre for Civilization Diseases, Leipzig University, Leipzig, Germany
- Christoph Beger
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Alexandr Uciteli
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Sebastian Stäubert
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Matthias Löbe
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany; Interdisciplinary Centre for Bioinformatics, Leipzig University, Leipzig, Germany
- René Hänsel
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Franziska G. Rauscher
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany; Interdisciplinary Centre for Bioinformatics, Leipzig University, Leipzig, Germany
- Judith Schuster
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Thomas Peschel
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Heinrich Herre
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Jonas Wagner
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany; Interdisciplinary Centre for Bioinformatics, Leipzig University, Leipzig, Germany
- Silke Zachariae
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Christoph Engel
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany; Interdisciplinary Centre for Bioinformatics, Leipzig University, Leipzig, Germany
- Markus Scholz
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany
- Erhard Rahm
- Department of Computer Sciences, Leipzig University, Leipzig, Germany
- Hans Binder
- LIFE Research Centre for Civilization Diseases, Leipzig University, Leipzig, Germany
- Markus Loeffler
- Institute for Medical Informatics, Statistics, and Epidemiology, Leipzig University, Leipzig, Germany; Interdisciplinary Centre for Bioinformatics, Leipzig University, Leipzig, Germany; LIFE Research Centre for Civilization Diseases, Leipzig University, Leipzig, Germany
4
Casaletto J, Parsons M, Markello C, Iwasaki Y, Momozawa Y, Spurdle AB, Cline M. Federated analysis of BRCA1 and BRCA2 variation in a Japanese cohort. Cell Genomics 2022; 2:110882. [PMID: 35373174 PMCID: PMC8975122 DOI: 10.1016/j.xgen.2022.100109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2021] [Revised: 10/21/2021] [Accepted: 02/09/2022] [Indexed: 10/31/2022]
Abstract
More than 40% of the germline variants in ClinVar today are variants of uncertain significance (VUSs). These variants remain unclassified in part because the patient-level data needed for their interpretation are siloed. Federated analysis can overcome this problem by "bringing the code to the data": analyzing the sensitive patient-level data computationally within its secure home institution and providing researchers with valuable insights from data that would not otherwise be accessible. We tested this principle with a federated analysis of breast cancer clinical data at RIKEN, derived from the BioBank Japan repository. We were able to analyze these data within RIKEN's secure computational framework without the need to transfer the data, gathering evidence for the interpretation of several variants. This exercise represents an approach to help realize the core charter of the Global Alliance for Genomics and Health (GA4GH): to responsibly share genomic data for the benefit of human health.
Affiliation(s)
- James Casaletto
- UC Santa Cruz Genomics Institute, Mail Stop: Genomics, University of California, 1156 High Street, Santa Cruz, CA 95064, USA
- Michael Parsons
- QIMR Berghofer Medical Research Institute, 300 Herston Rd., Herston, QLD 4006, Australia
- Charles Markello
- UC Santa Cruz Genomics Institute, Mail Stop: Genomics, University of California, 1156 High Street, Santa Cruz, CA 95064, USA
- Yusuke Iwasaki
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa 230-0045, Japan
- Yukihide Momozawa
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa 230-0045, Japan
- Amanda B. Spurdle
- QIMR Berghofer Medical Research Institute, 300 Herston Rd., Herston, QLD 4006, Australia
- Melissa Cline
- UC Santa Cruz Genomics Institute, Mail Stop: Genomics, University of California, 1156 High Street, Santa Cruz, CA 95064, USA
5
A Framework for Supporting Well-being using the Character Computing Ontology - Anxiety and Sleep Quality during COVID-19. Open Psychology 2022. [DOI: 10.1515/psych-2022-0011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022] Open
Abstract
The COVID-19 pandemic is affecting human behavior, increasing the demand for cooperation between psychologists and computer scientists to develop technology solutions that help people promote well-being and behavior change. According to the conceptual Character-Behavior-Situation (CBS) triad of Character Computing, behavior is driven by an individual’s character (trait and state markers) and the situation. In previous work, a computational ontology for Character Computing (CCOnto) was introduced. The ontology can be extended with domain-specific knowledge to develop applications that infer certain human behaviors for different purposes. In this paper, we present a framework for developing applications that deal with changes in well-being during the COVID-19 pandemic. The framework can be used by psychology domain experts and application developers. The proposed model allows the input of heuristic rules as well as data-based rule extraction for inferring behavior. We show how CCOnto is extended with components of physical and mental well-being and how the framework uses the extended domain ontologies in applications for evaluating sleep habits, anxiety, and depression predisposition during the COVID-19 pandemic based on user-input data.
6
Manoharan S, Iyyappan OR. A Hybrid Protocol for Finding Novel Gene Targets for Various Diseases Using Microarray Expression Data Analysis and Text Mining. Methods Mol Biol 2022; 2496:41-70. [PMID: 35713858 DOI: 10.1007/978-1-0716-2305-3_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Advances in technology for scientific experiments have produced enormous amounts of raw data, giving rise to various subsets of biologists working with genome, proteome, transcriptome, expression, and pathway data, among others. This has led to exponential growth in the scientific literature, which is moving beyond the means of manual curation and annotation for extracting important information. Microarray data are expression data whose analysis yields lists of up- and downregulated genes that are functionally annotated to ascertain their biological meaning. These genes, represented as vocabularies and/or Gene Ontology terms, need relational and conceptual understanding with respect to a disease when associated with pathway enrichment analysis. The chapter deals with a hybrid approach we designed for identifying novel drug-disease targets. Microarray data for muscular dystrophy are explored here as an example, and text mining approaches are used to identify promising novel drug targets. Our main objective is to give a basic overview from the perspective of a biologist for whom text mining approaches to data mining and information retrieval are a fairly new concept. The chapter aims to bridge the gap between biologists and computational text miners and to bring them together for more informative research in a fast and time-efficient manner.
Affiliation(s)
- Sharanya Manoharan
- Department of Bioinformatics, Stella Maris College (Autonomous), Chennai, Tamilnadu, India
- Oviya Ramalakshmi Iyyappan
- Department of Sciences, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Chennai, Tamilnadu, India
7
RNA-sequencing and bioinformatics analysis of long noncoding RNAs and mRNAs in the asthenozoospermia. Biosci Rep 2021; 40:225687. [PMID: 32614449 PMCID: PMC7364483 DOI: 10.1042/bsr20194041] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 11/20/2019] [Revised: 05/15/2020] [Accepted: 06/01/2020] [Indexed: 12/26/2022] Open
Abstract
Asthenozoospermia is one of the major causes of human male infertility. Long noncoding RNAs (lncRNAs) play critical roles in spermatogenesis. The present study aims to investigate the intricate regulatory network associated with asthenozoospermia. The lncRNA expression profile of asthenozoospermia seminal plasma exosomes was analyzed by RNA-sequencing, and the functions of differentially expressed genes (DEGs) were analyzed by Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Disease Ontology (DO) enrichment analyses. Pearson’s correlation test was used to calculate the correlation coefficients between lncRNAs and mRNAs. Moreover, the lncRNA–miRNA–mRNA co-expression network was constructed with bioinformatics, and cis-regulated lncRNA–mRNA correlation pairs were identified from the co-expression analyses. To confirm the sequencing results, five of the identified DElncRNAs were verified with quantitative reverse-transcription polymerase chain reaction (qRT-PCR). We identified 4228 significant DEGs, 995 known DElncRNAs, 2338 DEmRNAs and 11,706 novel DElncRNAs between the asthenozoospermia and normal groups. GO and KEGG analyses showed that the DEGs were mainly associated with metabolism, transcription, ribosome and channel activity. We found 254,981 positively correlated lncRNA–mRNA pairs through correlation analysis. The detailed lncRNA–miRNA–mRNA regulatory network included 11 lncRNAs, 35 miRNAs and 59 mRNAs. From the co-expression analyses, we identified 7 cis-regulated lncRNA–mRNA correlation pairs. Additionally, the qRT-PCR analysis confirmed our sequencing results. Our study constructed the lncRNA–mRNA–miRNA regulation networks in asthenozoospermia. Therefore, the study findings provide a set of pivotal lncRNAs for future investigation into the molecular mechanisms of asthenozoospermia.
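The lncRNA–mRNA pairing step described above rests on Pearson's correlation coefficient between two expression profiles measured across the same samples. A minimal sketch of that coefficient follows. The function name `pearson_r` and the expression vectors are hypothetical, and the study's actual thresholds and software are not stated in the abstract.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression of one lncRNA and one mRNA across five samples.
lncrna = [1.0, 2.0, 3.0, 4.0, 5.0]
mrna = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(lncrna, mrna)
print(f"r = {r:.4f}")
```

A pair would count as positively correlated when `r` exceeds whatever cutoff the analysis chooses; that cutoff is an analysis decision and is not fixed by the coefficient itself.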
8
Jarmusch AK, Wang M, Aceves CM, Advani RS, Aguirre S, Aksenov AA, Aleti G, Aron AT, Bauermeister A, Bolleddu S, Bouslimani A, Caraballo Rodriguez AM, Chaar R, Coras R, Elijah EO, Ernst M, Gauglitz JM, Gentry EC, Husband M, Jarmusch SA, Jones KL, Kamenik Z, Le Gouellec A, Lu A, McCall LI, McPhail KL, Meehan MJ, Melnik AV, Menezes RC, Montoya Giraldo YA, Nguyen NH, Nothias LF, Nothias-Esposito M, Panitchpakdi M, Petras D, Quinn RA, Sikora N, van der Hooft JJJ, Vargas F, Vrbanac A, Weldon KC, Knight R, Bandeira N, Dorrestein PC. ReDU: a framework to find and reanalyze public mass spectrometry data. Nat Methods 2020; 17:901-904. [PMID: 32807955 DOI: 10.1038/s41592-020-0916-7] [Citation(s) in RCA: 65] [Impact Index Per Article: 16.3] [Received: 08/28/2019] [Accepted: 07/10/2020] [Indexed: 11/09/2022]
Abstract
We present ReDU (https://redu.ucsd.edu/), a system for metadata capture of public mass spectrometry-based metabolomics data, with validated controlled vocabularies. Systematic capture of knowledge enables the reanalysis of public data and/or co-analysis of one's own data. ReDU enables multiple types of analyses, including finding chemicals and associated metadata, comparing the shared and different chemicals between groups of samples, and metadata-filtered, repository-scale molecular networking.
Affiliation(s)
- Alan K Jarmusch
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Mingxun Wang
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Christine M Aceves
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Rohit S Advani
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Shaden Aguirre
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Alexander A Aksenov
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Gajender Aleti
- Department of Psychiatry, Stein Clinical Research, University of California, San Diego, La Jolla, CA, USA
- Allegra T Aron
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Anelize Bauermeister
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Institute of Biomedical Sciences, Universidade de São Paulo, São Paulo, Brazil
- Sanjana Bolleddu
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Amina Bouslimani
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Andres Mauricio Caraballo Rodriguez
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Rama Chaar
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Roxana Coras
- Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Emmanuel O Elijah
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Madeleine Ernst
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Center for Newborn Screening, Department of Congenital Disorders, Statens Serum Institut, Copenhagen, Denmark
- Julia M Gauglitz
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Emily C Gentry
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Makhai Husband
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Scott A Jarmusch
- Marine Biodiscovery Centre, Department of Chemistry, University of Aberdeen, Old Aberdeen, UK
- Kenneth L Jones
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Zdenek Kamenik
- Institute of Microbiology, Czech Academy of Sciences, Videnska, Czech Republic
- Audrey Le Gouellec
- TIMC-IMAG, Univ. Grenoble Alpes, CNRS, Grenoble INP, CHU Grenoble Alpes, Grenoble, France
- Aileen Lu
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Laura-Isobel McCall
- Department of Chemistry and Biochemistry, Department of Microbiology and Plant Biology, and Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, USA
- Kerry L McPhail
- Department of Pharmaceutical Sciences, College of Pharmacy, Oregon State University, Corvallis, OR, USA
- Michael J Meehan
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Alexey V Melnik
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Riya C Menezes
- Research Group Mass Spectrometry, Max Planck Institute for Chemical Ecology, Jena, Germany
- Yessica Alejandra Montoya Giraldo
- Grupo de Investigación en Ciencias Biológicas y Bioprocesos (CIBIOP), Department of Biological Sciences, Universidad EAFIT, Medellín, Colombia
- Ngoc Hung Nguyen
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Louis Felix Nothias
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Mélissa Nothias-Esposito
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Morgan Panitchpakdi
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Daniel Petras
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA, USA
- Robert A Quinn
- Department of Biochemistry and Molecular Biology, Michigan State University, Lansing, MI, USA
- Nicole Sikora
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA
- Justin J J van der Hooft
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Bioinformatics Group, Wageningen University, Wageningen, the Netherlands
- Fernando Vargas
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Division of Biological Sciences, University of California, San Diego, La Jolla, CA, USA
- Alison Vrbanac
- Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA
- Kelly C Weldon
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, USA
- Rob Knight
- Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA; Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, USA; Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA; Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Nuno Bandeira
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, USA; Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
- Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, University of California, San Diego, La Jolla, CA, USA; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA; Department of Pediatrics, School of Medicine, University of California, San Diego, La Jolla, CA, USA; Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, USA
9
Cisplatin-resistant triple-negative breast cancer subtypes: multiple mechanisms of resistance. BMC Cancer 2019; 19:1039. [PMID: 31684899 PMCID: PMC6829976 DOI: 10.1186/s12885-019-6278-9] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Received: 01/24/2019] [Accepted: 10/21/2019] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Understanding mechanisms underlying specific chemotherapeutic responses in subtypes of cancer may improve identification of treatment strategies most likely to benefit particular patients. For example, triple-negative breast cancer (TNBC) patients have variable response to the chemotherapeutic agent cisplatin. Understanding the basis of treatment response in cancer subtypes will lead to more informed decisions about selection of treatment strategies.
METHODS In this study we used an integrative functional genomics approach to investigate the molecular mechanisms underlying known cisplatin-response differences among subtypes of TNBC. To identify changes in gene expression that could explain mechanisms of resistance, we examined 102 evolutionarily conserved cisplatin-associated genes, evaluating their differential expression in the two cisplatin-sensitive subtypes, basal-like 1 (BL1) and basal-like 2 (BL2), and the two cisplatin-resistant subtypes, luminal androgen receptor (LAR) and mesenchymal (M), of TNBC.
RESULTS We found 20 genes that were differentially expressed in at least one subtype. Fifteen of the 20 genes are associated with cell death and are distributed among all TNBC subtypes. The less cisplatin-responsive LAR and M TNBC subtypes show different regulation of 13 genes compared to the more sensitive BL1 and BL2 subtypes. These 13 genes identify a variety of cisplatin-resistance mechanisms, including increased transport and detoxification of cisplatin and mis-regulation of the epithelial to mesenchymal transition.
CONCLUSIONS We identified gene signatures in resistant TNBC subtypes indicative of mechanisms of cisplatin resistance. Our results indicate that response to cisplatin in TNBC has a complex foundation based on the impact of treatment on distinct cellular pathways. We find that examination of expression data in the context of heterogeneous data such as drug-gene interactions leads to a better understanding of mechanisms at work in cancer therapy response.
10
Abstract
Multifunctional genes are important genes because of their essential roles in human cells. Studying and analyzing multifunctional genes can help understand disease mechanisms and drug discovery. We propose a computational method for scoring gene multifunctionality based on functional annotations of the target gene from the Gene Ontology. The method is based on identifying pairs of GO annotations that represent semantically different biological functions and any gene annotated with two annotations from one pair is considered multifunctional. The proposed method can be employed to identify multifunctional genes in the entire human genome using solely the GO annotations. We evaluated the proposed method in scoring multifunctionality of all human genes using four criteria: gene-disease associations; protein-protein interactions; gene studies with PubMed publications; and published known multifunctional gene sets. The evaluation results confirm the validity and reliability of the proposed method for identifying multifunctional human genes. The results across all four evaluation criteria were statistically significant in determining multifunctionality. For example, the method confirmed that multifunctional genes tend to be associated with diseases more than other genes, with significance [Formula: see text]. Moreover, consistent with all previous studies, proteins encoded by multifunctional genes, based on our method, are involved in protein-protein interactions significantly more ([Formula: see text]) than other proteins.
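The pair-based rule described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the semantically different GO pairs are derived or how the score is computed, so the pair set below is a hypothetical, hand-picked input, and counting the matched pairs is an assumed scoring rule rather than the author's formula.

```python
from itertools import combinations

# Hypothetical input: GO term pairs already judged "semantically different".
# The abstract does not describe how such pairs are identified.
DISTINCT_PAIRS = {
    frozenset({"GO:0006915", "GO:0006260"}),
    frozenset({"GO:0006915", "GO:0007165"}),
}


def multifunctionality_score(annotations, distinct_pairs=DISTINCT_PAIRS):
    """Count the distinct-function pairs fully covered by a gene's GO annotations.

    Assumption: the score is the number of matched pairs; the source only
    states that matching at least one pair marks a gene as multifunctional.
    """
    terms = sorted(set(annotations))
    return sum(
        1 for a, b in combinations(terms, 2) if frozenset({a, b}) in distinct_pairs
    )


def is_multifunctional(annotations, distinct_pairs=DISTINCT_PAIRS):
    """A gene annotated with both terms of any pair is considered multifunctional."""
    return multifunctionality_score(annotations, distinct_pairs) > 0
```

For example, a gene annotated with all three illustrative terms covers both pairs, while a gene carrying only one of them covers none and is not flagged.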
Affiliation(s)
- Hisham Al-Mubaid
- Computer Science Department, University of Houston-Clear Lake, Houston, TX 77062, USA
11
Lee RYN, Howe KL, Harris TW, Arnaboldi V, Cain S, Chan J, Chen WJ, Davis P, Gao S, Grove C, Kishore R, Muller HM, Nakamura C, Nuin P, Paulini M, Raciti D, Rodgers F, Russell M, Schindelman G, Tuli MA, Van Auken K, Wang Q, Williams G, Wright A, Yook K, Berriman M, Kersey P, Schedl T, Stein L, Sternberg PW. WormBase 2017: molting into a new stage. Nucleic Acids Res 2019; 46:D869-D874. [PMID: 29069413 PMCID: PMC5753391 DOI: 10.1093/nar/gkx998] [Citation(s) in RCA: 130] [Impact Index Per Article: 26.0] [Received: 09/29/2017] [Accepted: 10/11/2017] [Indexed: 11/13/2022] Open
Abstract
WormBase (http://www.wormbase.org) is an important knowledge resource for biomedical researchers worldwide. To accommodate the ever increasing amount and complexity of research data, WormBase continues to advance its practices on data acquisition, curation and retrieval to most effectively deliver comprehensive knowledge about Caenorhabditis elegans, and genomic information about other nematodes and parasitic flatworms. Recent notable enhancements include user-directed submission of data, such as micropublication; genomic data curation and presentation, including additional genomes and JBrowse, respectively; new query tools, such as SimpleMine, Gene Enrichment Analysis; new data displays, such as the Person Lineage browser and the Summary of Ontology-based Annotations. Anticipating more rapid data growth ahead, WormBase continues the process of migrating to a cutting-edge database technology to achieve better stability, scalability, reproducibility and a faster response time. To better serve the broader research community, WormBase, with five other Model Organism Databases and The Gene Ontology project, have begun to collaborate formally as the Alliance of Genome Resources.
Affiliation(s)
- Raymond Y N Lee
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Kevin L Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Todd W Harris
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Valerio Arnaboldi
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Scott Cain
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Juancarlos Chan
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Wen J Chen
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Paul Davis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sibyl Gao
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Christian Grove
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Ranjana Kishore
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Hans-Michael Muller
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Cecilia Nakamura
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Paulo Nuin
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Michael Paulini
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniela Raciti
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Faye Rodgers
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Matt Russell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Gary Schindelman
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Mary Ann Tuli
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Kimberly Van Auken
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Qinghua Wang
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Gary Williams
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Adam Wright
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Karen Yook
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Matthew Berriman
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Paul Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tim Schedl
- Department of Genetics, Washington University School of Medicine, St Louis, MO 63110, USA
- Lincoln Stein
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Paul W Sternberg
- Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
12
Smith CL, Blake JA, Kadin JA, Richardson JE, Bult CJ. Mouse Genome Database (MGD)-2018: knowledgebase for the laboratory mouse. Nucleic Acids Res 2019; 46:D836-D842. [PMID: 29092072 PMCID: PMC5753350 DOI: 10.1093/nar/gkx1006] [Citation(s) in RCA: 163] [Impact Index Per Article: 32.6] [Received: 09/22/2017] [Accepted: 10/19/2017] [Indexed: 12/23/2022] Open
Abstract
The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the key community mouse database which supports basic, translational and computational research by providing integrated data on the genetics, genomics, and biology of the laboratory mouse. MGD serves as the source for biological reference data sets related to mouse genes, gene functions, phenotypes and disease models with an increasing emphasis on the association of these data to human biology and disease. We report here on recent enhancements to this resource, including improved access to mouse disease model and human phenotype data and enhanced relationships of mouse models to human disease.
Affiliation(s)
- Cynthia L Smith
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Judith A Blake
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- James A Kadin
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
- Carol J Bult
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
13
Lo Surdo P, Calderone A, Iannuccelli M, Licata L, Peluso D, Castagnoli L, Cesareni G, Perfetto L. DISNOR: a disease network open resource. Nucleic Acids Res 2019; 46:D527-D534. [PMID: 29036667 PMCID: PMC5753342 DOI: 10.1093/nar/gkx876] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 08/07/2017] [Accepted: 09/25/2017] [Indexed: 12/13/2022] Open
Abstract
DISNOR is a new resource that aims to exploit the explosion of data on the identification of disease-associated genes to assemble inferred disease pathways. This may help dissect the signaling events whose disruption causes the pathological phenotypes and may contribute to building a platform for precision medicine. To this end we combine the gene-disease association (GDA) data annotated in the DisGeNET resource with a new curation effort aimed at populating the SIGNOR database with causal interactions related to disease genes with the highest possible coverage. DISNOR can be freely accessed at http://DISNOR.uniroma2.it/ where >3700 disease-networks, linking ∼2600 disease genes, can be explored. For each disease curated in DisGeNET, DISNOR links disease genes by manually annotated causal relationships and offers an intuitive visualization of the inferred ‘patho-pathways’ at different complexity levels. User-defined gene lists are also accepted in the query pipeline. In addition, for each list of query genes (either annotated in DisGeNET or user-defined) DISNOR performs a gene set enrichment analysis on KEGG-defined pathways or on the lists of proteins associated with the inferred disease pathways. This function offers additional information on disease-associated cellular pathways and disease similarity.
Affiliation(s)
- Prisca Lo Surdo
- Bioinformatics and Computational Biology Unit, Department of Biology, University of Rome 'Tor Vergata', 00133 Rome, Italy
- Alberto Calderone
- Bioinformatics and Computational Biology Unit, Department of Biology, University of Rome 'Tor Vergata', 00133 Rome, Italy
- Marta Iannuccelli
- Bioinformatics and Computational Biology Unit, Department of Biology, University of Rome 'Tor Vergata', 00133 Rome, Italy
- Luana Licata
- Bioinformatics and Computational Biology Unit, Department of Biology, University of Rome 'Tor Vergata', 00133 Rome, Italy
- Daniele Peluso
- Bioinformatics and Computational Biology Unit, Department of Biology, University of Rome 'Tor Vergata', 00133 Rome, Italy; Laboratory of Bioinformatic, IRCCS Fondazione Santa Lucia, 00143 Rome, Italy
- Luisa Castagnoli
- Bioinformatics and Computational Biology Unit, Department of Biology, University of Rome 'Tor Vergata', 00133 Rome, Italy
- Gianni Cesareni
- Bioinformatics and Computational Biology Unit, Department of Biology, University of Rome 'Tor Vergata', 00133 Rome, Italy
- Livia Perfetto
- Bioinformatics and Computational Biology Unit, Department of Biology, University of Rome 'Tor Vergata', 00133 Rome, Italy
14
Finke MT, Filice RW, Kahn CE. Integrating ontologies of human diseases, phenotypes, and radiological diagnosis. J Am Med Inform Assoc 2019; 26:149-154. [PMID: 30624645 DOI: 10.1093/jamia/ocy161] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/21/2018] [Accepted: 11/13/2018] [Indexed: 11/12/2022] Open
Abstract
Mappings between ontologies enable reuse and interoperability of biomedical knowledge. The Radiology Gamuts Ontology (RGO), an ontology of 16 918 diseases, interventions, and imaging observations, provides a resource for differential diagnosis and automated textual report understanding in radiology. An automated process with subsequent manual review was used to identify exact and partial matches of RGO entities to the Disease Ontology (DO) and the Human Phenotype Ontology (HPO). Exact mappings identified equivalent concepts; partial mappings identified subclass and superclass relationships. A total of 7913 distinct RGO entities (46.8%) were mapped to one or both of the two target ontologies. Integration of RGO's causal knowledge resulted in 9605 axioms that expressed direct causal relationships between DO diseases and HPO phenotypic abnormalities, and allowed one to formulate queries about causal relations using the abstraction properties in those two ontologies. The mappings can be used to support automated diagnostic reasoning, data mining, and knowledge discovery.
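As a quick arithmetic check on the reported coverage, the short sketch below recomputes the mapped share from the two counts given in the abstract (7913 mapped entities out of 16 918). The function name is a hypothetical helper introduced only for illustration.

```python
def mapped_percentage(mapped: int, total: int) -> float:
    """Share of ontology entities with at least one mapping, in percent."""
    return 100.0 * mapped / total


# Counts taken from the abstract: 7913 of 16 918 RGO entities were mapped.
coverage = mapped_percentage(7913, 16918)
print(f"{coverage:.1f}%")  # prints 46.8%
```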
Affiliation(s)
- Michael T Finke
- Pacific Northwest University of Health Sciences, Yakima, WA, USA
- Ross W Filice
- Department of Radiology, MedStar Georgetown University Hospital, Washington, DC, USA
- Charles E Kahn
- Department of Radiology and Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, USA
15
Chen F, Li Z, Zhou H. Identification of prognostic miRNA biomarkers for predicting overall survival of colon adenocarcinoma and bioinformatics analysis: A study based on The Cancer Genome Atlas database. J Cell Biochem 2018; 120:9839-9849. [DOI: 10.1002/jcb.28264] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 07/23/2018] [Accepted: 10/24/2018] [Indexed: 12/16/2022]
Affiliation(s)
- Fangyao Chen
- Department of Epidemiology and Health Statistics, School of Public Health, Xi’an Jiaotong University Health Science Center, Xi’an, Shaanxi, China
- Zhe Li
- First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
- Hui Zhou
- Department of Pharmacy, First Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
16
Chen F, Zhou H, Wu C, Yan H. Identification of miRNA profiling in prediction of tumor recurrence and progress and bioinformatics analysis for patients with primary esophageal cancer: Study based on TCGA database. Pathol Res Pract 2018; 214:2081-2086. [PMID: 30477645 DOI: 10.1016/j.prp.2018.10.009] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 07/29/2018] [Revised: 09/13/2018] [Accepted: 10/17/2018] [Indexed: 02/07/2023]
Abstract
OBJECT This study focused on the identification of prognostic miRNAs for the prediction of tumor recurrence and progress in esophageal cancer. METHODS MiRNA profiling and clinical characteristics of esophageal cancer patients were downloaded from the TCGA database. Univariate analysis was performed to select potential prognostic miRNAs and covariates. LASSO-based logistic regression was conducted to identify the prognostic miRNAs given covariates. Bioinformatics analyses, including gene ontology, disease ontology and pathway enrichment analysis, were performed. A nomogram was generated based on multivariate logistic regression to illustrate the association between the identified miRNAs and the risk of tumor recurrence and progress. RESULTS A total of 1881 miRNAs and 10 clinical characteristics were obtained from the TCGA database. Eighteen miRNAs were finally identified, of which 6 were identified for the first time as being associated with the tumor recurrence and progress of esophageal cancer given covariates. Bioinformatics analysis suggested that the identified miRNAs were associated with the tumor recurrence and progress of esophageal cancer. The association between the identified miRNAs and the risk of tumor recurrence and progress was presented in a nomogram. CONCLUSION The 6 newly identified miRNAs may be potential biomarkers for the prediction of tumor recurrence and progress of esophageal cancer.
Affiliation(s)
- Fangyao Chen
- Department of Epidemiology and Health Statistics, School of Public Health, Xi'an Jiaotong University Health Science Center, 76 Yanta Xilu Road, Xi'an, Shaanxi, 710061, China
- Hui Zhou
- Department of Pharmacy, First Affiliated Hospital of Xi'an Jiaotong University, 277 Yanta Xilu Road, Xi'an, Shaanxi, 710061, China
- Chenqiuzi Wu
- First Affiliated Hospital of Xi'an Jiaotong University, 277 Yanta Xilu Road, Xi'an, Shaanxi, 710061, China
- Hong Yan
- Department of Epidemiology and Health Statistics, School of Public Health, Xi'an Jiaotong University Health Science Center, 76 Yanta Xilu Road, Xi'an, Shaanxi, 710061, China.
17
Gkoutos GV, Schofield PN, Hoehndorf R. The anatomy of phenotype ontologies: principles, properties and applications. Brief Bioinform 2018; 19:1008-1021. [PMID: 28387809 PMCID: PMC6169674 DOI: 10.1093/bib/bbx035] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 01/01/2017] [Revised: 02/05/2017] [Indexed: 12/14/2022] Open
Abstract
The past decade has seen an explosion in the collection of genotype data in domains as diverse as medicine, ecology, livestock and plant breeding. Along with this comes the challenge of dealing with the related phenotype data, which is not only large but also highly multidimensional. Computational analysis of phenotypes has therefore become critical for our ability to understand the biological meaning of genomic data in the biological sciences. At the heart of computational phenotype analysis are the phenotype ontologies. A large number of these ontologies have been developed across many domains, and we are now at a point where the knowledge captured in the structure of these ontologies can be used for the integration and analysis of large interrelated data sets. The Phenotype And Trait Ontology framework provides a method for formal definitions of phenotypes and associated data sets and has proved to be key to our ability to develop methods for the integration and analysis of phenotype data. Here, we describe the development and products of the ontological approach to phenotype capture, the formal content of phenotype ontologies and how their content can be used computationally.
Affiliation(s)
- Robert Hoehndorf
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal
18
Abburu S. Ontology Driven Cross-Linked Domain Data Integration and Spatial Semantic Multi Criteria Query System for Geospatial Public Health. INT J SEMANT WEB INF 2018. [DOI: 10.4018/ijswis.2018070101] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
Abstract
This article describes how public health information management is an interdisciplinary application which deals with cross-linked application domains. Geospatial environment, place and meteorology parameters affect public health. Effective decision making plays a vital role and requires disease data analysis, which in turn requires an effective Public Health Knowledge Base (PHKB) and a strong, efficient query engine. Ontologies enhance the performance of the retrieval system and achieve application interoperability. The current research aims at building a PHKB through ontology-based cross-linked domain integration. It designs dynamic GeoSPARQL query building from simple form-based query composition. The spatial semantic multi-criteria query engine is developed by identifying all possible query patterns considering the ontology elements and multiple criteria from cross-linked application domains. The research has adopted OGC, W3C, WHO and mHealth standards.
19
Hu Y, Zhao T, Zhang N, Zang T, Zhang J, Cheng L. Identifying diseases-related metabolites using random walk. BMC Bioinformatics 2018; 19:116. [PMID: 29671398 PMCID: PMC5907145 DOI: 10.1186/s12859-018-2098-1] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Indexed: 02/06/2023] Open
Abstract
Background Metabolites disrupted by an abnormal state of the human body are regarded as the effect of diseases. Compared with causes of disease such as genes, these markers are easier to capture for the prevention and diagnosis of metabolic diseases. Currently, a large number of metabolic markers of diseases remain to be explored, which drives us to do this work. Methods The existing metabolite-disease associations were extracted from the Human Metabolome Database (HMDB) using the text mining tool NCBO annotator as prior knowledge. Next, we calculated the similarity of each pair of metabolites based on the similarity of their disease sets. Then, all the similarities of metabolite pairs were used to construct a weighted metabolite association network (WMAN). Subsequently, the network was used to predict novel metabolic markers of diseases using random walk. Results In total, 604 metabolites and 228 diseases were extracted from HMDB. From the 604 metabolites, 453 were selected to construct the WMAN, where each metabolite is treated as a node, and the similarity of two metabolites as the weight of the edge linking them. The performance of the network was validated using the leave-one-out method, achieving an area under the receiver operating characteristic curve (AUC) of 0.7048. Novel metabolites of diabetes mellitus identified in further case studies were supported by recent studies. Conclusion In this paper, we presented a novel method for prioritizing metabolite-disease pairs. Its performance supports its reliability for exploring novel metabolic markers of diseases.
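The prediction step on the weighted network can be sketched as follows. The abstract says only "random walk", so the restart variant, the restart probability, the convergence tolerance, and the toy edge weights below are all assumptions chosen for illustration; this is not the authors' implementation.

```python
def random_walk_with_restart(weights, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score nodes of a weighted graph by proximity to a set of seed nodes.

    weights: dict mapping node -> {neighbor: edge weight}; every neighbor
             must also appear as a key. Assumed symmetric for an undirected
             network such as a metabolite similarity network.
    seeds:   nodes already associated with the disease of interest.
    """
    nodes = sorted(weights)
    out_sum = {u: sum(weights[u].values()) for u in nodes}
    # Initial distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if out_sum[u] == 0:
                continue  # isolated node: its probability mass is dropped
            for v, w in weights[u].items():
                # Otherwise it moves along an edge, proportional to its weight.
                nxt[v] += (1.0 - restart) * p[u] * w / out_sum[u]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy network with made-up similarity weights; "A" is the known marker.
toy = {
    "A": {"B": 0.9, "C": 0.1},
    "B": {"A": 0.9, "C": 0.5},
    "C": {"A": 0.1, "B": 0.5},
}
scores = random_walk_with_restart(toy, seeds={"A"})
```

In this toy network, node "B" ends up with a higher score than "C" because it is more strongly connected to the seed; ranking non-seed nodes by score gives the candidate list.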
Affiliation(s)
- Yang Hu
- School of Life Science and Technology, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, People's Republic of China
- Tianyi Zhao
- School of Life Science and Technology, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, People's Republic of China
- Ningyi Zhang
- School of Life Science and Technology, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, People's Republic of China
- Tianyi Zang
- School of Life Science and Technology, Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, People's Republic of China.
- Jun Zhang
- Department of Rehabilitation, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, 150001, People's Republic of China.
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150001, China.
20
Bello SM, Shimoyama M, Mitraka E, Laulederkind SJF, Smith CL, Eppig JT, Schriml LM. Disease Ontology: improving and unifying disease annotations across species. Dis Model Mech 2018; 11:dmm.032839. [PMID: 29590633 PMCID: PMC5897730 DOI: 10.1242/dmm.032839] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 10/31/2017] [Accepted: 02/08/2018] [Indexed: 11/20/2022] Open
Abstract
Model organisms are vital to uncovering the mechanisms of human disease and developing new therapeutic tools. Researchers collecting and integrating relevant model organism and/or human data often apply disparate terminologies (vocabularies and ontologies), making comparisons and inferences difficult. A unified disease ontology is required that connects data annotated using diverse disease terminologies, and in which the terminology relationships are continuously maintained. The Mouse Genome Database (MGD, http://www.informatics.jax.org), Rat Genome Database (RGD, http://rgd.mcw.edu) and Disease Ontology (DO, http://www.disease-ontology.org) projects are collaborating to augment DO, aligning and incorporating disease terms used by MGD and RGD, and improving DO as a tool for unifying disease annotations across species. Coordinated assessment of MGD's and RGD's disease term annotations identified new terms that enhance DO's representation of human diseases. Expansion of DO term content and cross-references to clinical vocabularies (e.g. OMIM, ORDO, MeSH) has enriched the DO's domain coverage and utility for annotating many types of data generated from experimental and clinical investigations. The extension of anatomy-based DO classification structure of disease improves accessibility of terms and facilitates application of DO for computational research. A consistent representation of disease associations across data types from cellular to whole organism, generated from clinical and model organism studies, will promote the integration, mining and comparative analysis of these data. The coordinated enrichment of the DO and adoption of DO by MGD and RGD demonstrates DO's usability across human data, MGD, RGD and the rest of the model organism database community. Summary: Analyzing diverse disease data requires a comprehensive, robust disease ontology to integrate annotations and retrieve accurate, interpretable results. 
MGD, RGD and DO are working in collaboration to achieve this goal.
Affiliation(s)
- Mary Shimoyama
- Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI, USA
- Elvira Mitraka
- Department of Epidemiology and Public Health, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Lynn M Schriml
- Department of Epidemiology and Public Health, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
21
El-Sappagh S, Kwak D, Ali F, Kwak KS. DMTO: a realistic ontology for standard diabetes mellitus treatment. J Biomed Semantics 2018; 9:8. [PMID: 29409535 PMCID: PMC5800094 DOI: 10.1186/s13326-018-0176-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 08/15/2017] [Accepted: 01/04/2018] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND Treatment of type 2 diabetes mellitus (T2DM) is a complex problem. A clinical decision support system (CDSS) based on massive and distributed electronic health record data can facilitate the automation of this process and enhance its accuracy. The most important component of any CDSS is its knowledge base. This knowledge base can be formulated using ontologies. The formal description logic of ontology supports the inference of hidden knowledge. Building a complete, coherent, consistent, interoperable, and sharable ontology is a challenge. RESULTS This paper introduces the first version of the newly constructed Diabetes Mellitus Treatment Ontology (DMTO) as a basis for shared-semantics, domain-specific, standard, machine-readable, and interoperable knowledge relevant to T2DM treatment. It is a comprehensive ontology and provides the highest coverage and the most complete picture of coded knowledge about T2DM patients' current conditions, previous profiles, and T2DM-related aspects, including complications, symptoms, lab tests, interactions, treatment plan (TP) frameworks, and glucose-related diseases and medications. It adheres to the design principles recommended by the Open Biomedical Ontologies Foundry and is based on ontological realism that follows the principles of the Basic Formal Ontology and the Ontology for General Medical Science. DMTO is implemented under Protégé 5.0 in Web Ontology Language (OWL) 2 format and is publicly available through the National Center for Biomedical Ontology's BioPortal at http://bioportal.bioontology.org/ontologies/DMTO . The current version of DMTO includes more than 10,700 classes, 277 relations, 39,425 annotations, 214 semantic rules, and 62,974 axioms. We provide proof of concept for this approach to modeling TPs. CONCLUSION The ontology is able to collect and analyze most features of T2DM as well as customize chronic TPs with the most appropriate drugs, foods, and physical exercises. 
DMTO is ready to be used as a knowledge base for semantically intelligent and distributed CDSS systems.
Affiliation(s)
- Shaker El-Sappagh
- Information Systems Department, Faculty of Computers and Informatics, Benha University, Banha Mansura Road, Meit Ghamr - Benha, Banha, Al Qalyubia Governorate 3000-104 Egypt
- Daehan Kwak
- Department of Computer Science, Kean University, Union, NJ 07083 USA
- Farman Ali
- Department of Information and Communication Engineering, Inha University, 100 Inharo, Nam-gu, Incheon, 22212 South Korea
- Kyung-Sup Kwak
- Department of Information and Communication Engineering, Inha University, 100 Inharo, Nam-gu, Incheon, 22212 South Korea
22
Van Slyke CE, Bradford YM, Howe DG, Fashena DS, Ramachandran S, Ruzicka L. Using ZFIN: Data Types, Organization, and Retrieval. Methods Mol Biol 2018; 1757:307-347. [PMID: 29761463 PMCID: PMC6319390 DOI: 10.1007/978-1-4939-7737-6_11] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 01/13/2023]
Abstract
The Zebrafish Model Organism Database (ZFIN; zfin.org) was established in 1994 as the primary genetic and genomic resource for the zebrafish research community. Some of the earliest records in ZFIN were for people and laboratories. Since that time, services and data types provided by ZFIN have grown considerably. Today, ZFIN provides the official nomenclature for zebrafish genes, mutants, and transgenics and curates many data types including gene expression, phenotypes, Gene Ontology, models of human disease, orthology, knockdown reagents, transgenic constructs, and antibodies. Ontologies are used throughout ZFIN to structure these expertly curated data. An integrated genome browser provides genomic context for genes, transgenics, mutants, and knockdown reagents. ZFIN also supports a community wiki where the research community can post new antibody records and research protocols. Data in ZFIN are accessible via web pages, download files, and the ZebrafishMine (zebrafishmine.org), an installation of the InterMine data warehousing software. Searching for data at ZFIN utilizes both parameterized search forms and a single box search for searching or browsing data quickly. This chapter aims to describe the primary ZFIN data and services, and provide insight into how to use and interpret ZFIN searches, data, and web pages.
Affiliation(s)
- Ceri E Van Slyke
- The Zebrafish Information Network, University of Oregon, Eugene, OR, USA.
| | - Yvonne M Bradford
- The Zebrafish Information Network, University of Oregon, Eugene, OR, USA
| | - Douglas G Howe
- The Zebrafish Information Network, University of Oregon, Eugene, OR, USA
| | - David S Fashena
- The Zebrafish Information Network, University of Oregon, Eugene, OR, USA
| | - Leyla Ruzicka
- The Zebrafish Information Network, University of Oregon, Eugene, OR, USA
|
23
|
Eppig JT. Mouse Genome Informatics (MGI) Resource: Genetic, Genomic, and Biological Knowledgebase for the Laboratory Mouse. ILAR J 2017; 58:17-41. [PMID: 28838066 PMCID: PMC5886341 DOI: 10.1093/ilar/ilx013] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 07/25/2016] [Revised: 03/14/2017] [Accepted: 03/28/2017] [Indexed: 12/13/2022]
Abstract
The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes the nexus of the body of growing genetic and biological data and the advances in computer technology in the late 1980s, including the World Wide Web, that together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to assure research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided.
Affiliation(s)
- Janan T. Eppig
- Janan T. Eppig, PhD, is Professor Emeritus at The Jackson Laboratory in Bar Harbor, Maine
|
24
|
Bradford YM, Toro S, Ramachandran S, Ruzicka L, Howe DG, Eagle A, Kalita P, Martin R, Taylor Moxon SA, Schaper K, Westerfield M. Zebrafish Models of Human Disease: Gaining Insight into Human Disease at ZFIN. ILAR J 2017; 58:4-16. [PMID: 28838067 PMCID: PMC5886338 DOI: 10.1093/ilar/ilw040] [Citation(s) in RCA: 89] [Impact Index Per Article: 12.7] [Received: 07/29/2016] [Revised: 12/12/2016] [Accepted: 12/19/2016] [Indexed: 12/18/2022]
Abstract
The Zebrafish Model Organism Database (ZFIN; https://zfin.org) is the central resource for genetic, genomic, and phenotypic data for zebrafish (Danio rerio) research. ZFIN continuously assesses trends in zebrafish research, adding new data types and providing data repositories and tools that members of the research community can use to navigate data. The many research advantages and flexibility of manipulation of zebrafish have made them an increasingly attractive animal to model and study human disease. To facilitate disease-related research, ZFIN developed support to provide human disease information as well as annotation of zebrafish models of human disease. Human disease term pages at ZFIN provide information about disease names, synonyms, and references to other databases as well as a list of publications reporting studies of human diseases in which zebrafish were used. Zebrafish orthologs of human genes that are implicated in human disease etiology are routinely studied to provide an understanding of the molecular basis of disease. Therefore, a list of human genes involved in the disease with their corresponding zebrafish ortholog is displayed on the disease page, with links to additional information regarding the genes and existing mutations. Studying human disease often requires the use of models that recapitulate some or all of the pathologies observed in human diseases. Access to information regarding existing and published models can be critical, because they provide a tractable way to gain insight into the phenotypic outcomes of the disease. ZFIN annotates zebrafish models of human disease and supports retrieval of these published models by listing zebrafish models on the disease term page as well as by providing search interfaces and data download files to access the data.
The improvements ZFIN has made to annotate, display, and search data related to human disease, especially zebrafish models for disease and disease-associated gene information, should be helpful to researchers and clinicians considering the use of zebrafish to study human disease.
Affiliation(s)
- Yvonne M. Bradford
- Yvonne M. Bradford, PhD, is a scientific curator and senior research associate at the Zebrafish Model Organism Database at the University of Oregon in Eugene, Oregon.
| | - Sabrina Toro
- Sabrina Toro, PhD, is a scientific curator for the Zebrafish Model Organism Database at the University of Oregon in Eugene, Oregon.
| | - Sridhar Ramachandran
- Sridhar Ramachandran, MS, is a scientific curator at the Zebrafish Model Organism Database at the University of Oregon in Eugene, Oregon.
| | - Leyla Ruzicka
- Leyla Ruzicka, PhD, is a scientific curator and senior research associate at the Zebrafish Model Organism Database at the University of Oregon in Eugene, Oregon.
| | - Douglas G. Howe
- Douglas G. Howe, PhD, is the Data Curation Manager at the Zebrafish Model Organism Database at the University of Oregon in Eugene, Oregon.
| | - Anne Eagle
- Anne Eagle, MSCS, is the Software Development and Project Manager at the Zebrafish Model Organism Database at the University of Oregon in Eugene, Oregon.
| | - Patrick Kalita
- Patrick Kalita, MS, is a software developer at the Zebrafish Model Organism Database at the University of Oregon in Eugene, Oregon.
| | - Ryan Martin
- Ryan Martin, MS, is a systems administrator at the Zebrafish Model Organism Database at the University of Oregon in Eugene, Oregon.
| | - Sierra A. Taylor Moxon
- Sierra A. Taylor Moxon, BA, is the Database Administrator at the Zebrafish Model Organism Database at the University of Oregon in Eugene, Oregon.
| | - Kevin Schaper
- Kevin Schaper, BS, is a software engineer at the Zebrafish Model Organism Database at the University of Oregon in Eugene, Oregon.
| | - Monte Westerfield
- Monte Westerfield, PhD, is a Professor of Biology in the Institute of Neuroscience and directs the Zebrafish Model Organism Database at the University of Oregon in Eugene, Oregon.
|
25
|
Barcellos Almeida M, Farinelli F. Ontologies for the representation of electronic medical records: The obstetric and neonatal ontology. J Assoc Inf Sci Technol 2017. [DOI: 10.1002/asi.23900] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]
Affiliation(s)
- Mauricio Barcellos Almeida
- Graduate Program of Knowledge Management & Organization; School of Information Science, Federal University of Minas Gerais, Av. Antônio Carlos, 6627, Campus Pampulha; Belo Horizonte Minas Gerais CEP 31.270-901 Brazil
| | - Fernanda Farinelli
- Graduate Program of Knowledge Management & Organization; School of Information Science, Federal University of Minas Gerais, Av. Antônio Carlos, 6627, Campus Pampulha; Belo Horizonte Minas Gerais CEP 31.270-901 Brazil
|
26
|
Jia J, Shi T. Towards efficiency in rare disease research: what is distinctive and important? SCIENCE CHINA-LIFE SCIENCES 2017. [PMID: 28639105 DOI: 10.1007/s11427-017-9099-3] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 12/13/2022]
Abstract
Characterized by their low prevalence, rare diseases are often chronically debilitating or life threatening. Despite their low prevalence, the aggregate number of individuals suffering from a rare disease is estimated to be nearly 400 million worldwide. Over the past decades, efforts from researchers, clinicians, and pharmaceutical industries have been focused on both the diagnosis and therapy of rare diseases. However, because of the lack of data and medical records for individual rare diseases and the high cost of orphan drug development, only limited progress has been achieved. In recent years, the rapid development of next-generation sequencing (NGS)-based technologies, as well as the popularity of precision medicine has facilitated a better understanding of rare diseases and their molecular etiology. As a result, molecular subclassification can be identified within each disease more clearly, significantly improving diagnostic accuracy. However, providing appropriate care for patients with rare diseases is still an enormous challenge. In this review, we provide a brief introduction to the challenges of rare disease research and make suggestions on where and how our efforts should be focused.
Affiliation(s)
- Jinmeng Jia
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
| | - Tieliu Shi
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China.
|
27
|
Howe DG, Bradford YM, Eagle A, Fashena D, Frazer K, Kalita P, Mani P, Martin R, Moxon ST, Paddock H, Pich C, Ramachandran S, Ruzicka L, Schaper K, Shao X, Singer A, Toro S, Van Slyke C, Westerfield M. The Zebrafish Model Organism Database: new support for human disease models, mutation details, gene expression phenotypes and searching. Nucleic Acids Res 2016; 45:D758-D768. [PMID: 27899582 PMCID: PMC5210580 DOI: 10.1093/nar/gkw1116] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 09/09/2016] [Revised: 10/25/2016] [Accepted: 10/27/2016] [Indexed: 12/16/2022]
Abstract
The Zebrafish Model Organism Database (ZFIN; http://zfin.org) is the central resource for zebrafish (Danio rerio) genetic, genomic, phenotypic and developmental data. ZFIN curators provide expert manual curation and integration of comprehensive data involving zebrafish genes, mutants, transgenic constructs and lines, phenotypes, genotypes, gene expressions, morpholinos, TALENs, CRISPRs, antibodies, anatomical structures, models of human disease and publications. We integrate curated, directly submitted, and collaboratively generated data, making these available to the zebrafish research community. Among the vertebrate model organisms, zebrafish are superbly suited for rapid generation of sequence-targeted mutant lines, characterization of phenotypes including gene expression patterns, and generation of human disease models. The recent rapid adoption of zebrafish as human disease models is making management of these data particularly important to both the research and clinical communities. Here, we describe recent enhancements to ZFIN including use of the zebrafish experimental conditions ontology, ‘Fish’ records in the ZFIN database, support for gene expression phenotypes, models of human disease, mutation details at the DNA, RNA and protein levels, and updates to the ZFIN single box search.
Affiliation(s)
- Douglas G Howe
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Yvonne M Bradford
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Anne Eagle
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - David Fashena
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Ken Frazer
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Patrick Kalita
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Prita Mani
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Ryan Martin
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Sierra Taylor Moxon
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Holly Paddock
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Christian Pich
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Leyla Ruzicka
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Kevin Schaper
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Xiang Shao
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Amy Singer
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Sabrina Toro
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Ceri Van Slyke
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Monte Westerfield
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
|
28
|
Hochheiser H, Castine M, Harris D, Savova G, Jacobson RS. An information model for computable cancer phenotypes. BMC Med Inform Decis Mak 2016; 16:121. [PMID: 27629872 PMCID: PMC5024416 DOI: 10.1186/s12911-016-0358-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 06/06/2016] [Accepted: 09/01/2016] [Indexed: 12/27/2022]
Abstract
BACKGROUND Standards, methods, and tools supporting the integration of clinical data and genomic information are an area of significant need and rapid growth in biomedical informatics. Integration of cancer clinical data and cancer genomic information poses unique challenges, because of the high volume and complexity of clinical data, as well as the heterogeneity and instability of cancer genome data when compared with germline data. Current information models of clinical and genomic data are not sufficiently expressive to represent individual observations and to aggregate those observations into longitudinal summaries over the course of cancer care. These models are acutely needed to support the development of systems and tools for generating the so called clinical "deep phenotype" of individual cancer patients, a process which remains almost entirely manual in cancer research and precision medicine. METHODS Reviews of existing ontologies and interviews with cancer researchers were used to inform iterative development of a cancer phenotype information model. We translated a subset of the Fast Healthcare Interoperability Resources (FHIR) models into the OWL 2 Description Logic (DL) representation, and added extensions as needed for modeling cancer phenotypes with terms derived from the NCI Thesaurus. Models were validated with domain experts and evaluated against competency questions. RESULTS The DeepPhe Information model represents cancer phenotype data at increasing levels of abstraction from mention level in clinical documents to summaries of key events and findings. We describe the model using breast cancer as an example, depicting methods to represent phenotypic features of cancers, tumors, treatment regimens, and specific biologic behaviors that span the entire course of a patient's disease. 
CONCLUSIONS We present a multi-scale information model for representing individual document mentions, document level classifications, episodes along a disease course, and phenotype summarization, linking individual observations to high-level summaries in support of subsequent integration and analysis.
Affiliation(s)
- Harry Hochheiser
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Boulevard, Rm 523, Pittsburgh, 15206-3701, PA, USA; Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA
- Melissa Castine
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Boulevard, Rm 523, Pittsburgh, 15206-3701, PA, USA
- David Harris
- Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Guergana Savova
- Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Rebecca S Jacobson
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, 5607 Baum Boulevard, Rm 523, Pittsburgh, 15206-3701, PA, USA; Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA; University of Pittsburgh Cancer Institute, Pittsburgh, PA, USA
29
Vitali F, Cohen LD, Demartini A, Amato A, Eterno V, Zambelli A, Bellazzi R. A Network-Based Data Integration Approach to Support Drug Repurposing and Multi-Target Therapies in Triple Negative Breast Cancer. PLoS One 2016; 11:e0162407. [PMID: 27632168 PMCID: PMC5025072 DOI: 10.1371/journal.pone.0162407] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 05/30/2016] [Accepted: 08/22/2016] [Indexed: 01/08/2023]
Abstract
The integration of data and knowledge from heterogeneous sources can be a key success factor in drug design, drug repurposing and multi-target therapies. In this context, biological networks provide a useful instrument to highlight the relationships and to model the phenomena underlying therapeutic action in cancer. In our work, we applied network-based modeling within a novel bioinformatics pipeline to identify promising multi-target drugs. Given a certain tumor type/subtype, we derive a disease-specific Protein-Protein Interaction (PPI) network by combining different databases and knowledge repositories. Next, the application of suitable graph-based algorithms allows selecting a set of potentially interesting combinations of drug targets. A list of drug candidates is then extracted by applying a recent data fusion approach based on matrix tri-factorization. Available knowledge about the selected drugs' mechanisms of action is finally exploited to identify the most promising candidates for planning in vitro studies. We applied this approach to the case of Triple Negative Breast Cancer (TNBC), a subtype of breast cancer whose biology is poorly understood and that lacks specific molecular targets. Our "in-silico" findings have been confirmed by a number of in vitro experiments, whose results demonstrated the ability of the method to select candidates for drug repurposing.
Affiliation(s)
- Francesca Vitali
- Dipartimento di Ingegneria Industriale e dell'Informazione, Università di Pavia, Pavia, Italy
- Laurie D. Cohen
- Dipartimento di Ingegneria Industriale e dell'Informazione, Università di Pavia, Pavia, Italy
- Andrea Demartini
- Dipartimento di Ingegneria Industriale e dell'Informazione, Università di Pavia, Pavia, Italy
- Alberto Zambelli
- IRCCS-Fondazione S. Maugeri, Pavia, Italy
- Oncologia Medica, ASST Papa Giovanni XXIII, Bergamo, Italy
- Riccardo Bellazzi
- Dipartimento di Ingegneria Industriale e dell'Informazione, Università di Pavia, Pavia, Italy
- IRCCS-Fondazione S. Maugeri, Pavia, Italy
30
Wang Q, Abdul SS, Almeida L, Ananiadou S, Balderas-Martínez YI, Batista-Navarro R, Campos D, Chilton L, Chou HJ, Contreras G, Cooper L, Dai HJ, Ferrell B, Fluck J, Gama-Castro S, George N, Gkoutos G, Irin AK, Jensen LJ, Jimenez S, Jue TR, Keseler I, Madan S, Matos S, McQuilton P, Milacic M, Mort M, Natarajan J, Pafilis E, Pereira E, Rao S, Rinaldi F, Rothfels K, Salgado D, Silva RM, Singh O, Stefancsik R, Su CH, Subramani S, Tadepally HD, Tsaprouni L, Vasilevsky N, Wang X, Chatr-Aryamontri A, Laulederkind SJF, Matis-Mitchell S, McEntyre J, Orchard S, Pundir S, Rodriguez-Esteban R, Van Auken K, Lu Z, Schaeffer M, Wu CH, Hirschman L, Arighi CN. Overview of the interactive task in BioCreative V. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2016; 2016:baw119. [PMID: 27589961 PMCID: PMC5009325 DOI: 10.1093/database/baw119] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 04/04/2016] [Accepted: 07/28/2016] [Indexed: 11/14/2022]
Abstract
Fully automated text mining (TM) systems promote efficient literature searching, retrieval, and review but are not sufficient to produce ready-to-consume curated documents. These systems are not meant to replace biocurators, but instead to assist them in one or more literature curation steps. To do so, the user interface is an important aspect that needs to be considered for tool adoption. The BioCreative Interactive task (IAT) is a track designed for exploring user-system interactions, promoting development of useful TM tools, and providing a communication channel between the biocuration and the TM communities. In BioCreative V, the IAT track followed a format similar to previous interactive tracks, where the utility and usability of TM tools, as well as the generation of use cases, have been the focal points. The proposed curation tasks are user-centric and formally evaluated by biocurators. In BioCreative V IAT, seven TM systems and 43 biocurators participated. Two levels of user participation were offered to broaden curator involvement and obtain more feedback on usability aspects. The full level participation involved training on the system, curation of a set of documents with and without TM assistance, tracking of time-on-task, and completion of a user survey. The partial level participation was designed to focus on usability aspects of the interface and not the performance per se. In this case, biocurators navigated the system by performing pre-designed tasks and then were asked whether they were able to achieve the task and the level of difficulty in completing the task. In this manuscript, we describe the development of the interactive task, from planning to execution and discuss major findings for the systems tested. Database URL: http://www.biocreative.org
Affiliation(s)
- Qinghua Wang
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, 19711, USA; Department of Computer and Information Sciences, University of Delaware, Newark, DE, 19711, USA
- Shabbir S Abdul
- International Centre of Health Information Technology, Taipei Medical University, Taipei, Taiwan
- Lara Almeida
- DETI/IEETA, University of Aveiro, Campus Universitário de Santiago, Aveiro 3810-193, Portugal
- Sophia Ananiadou
- National Centre for Text Mining, University of Manchester, Manchester, UK
- Lucy Chilton
- Northern Institute for Cancer Research, Newcastle University, Newcastle, UK
- Hui-Jou Chou
- Rutgers University-Camden, Camden, NJ 08102, USA
- Gabriela Contreras
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, 04510 Ciudad de México, México
- Laurel Cooper
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Hong-Jie Dai
- Department of Computer Science and Information Engineering, National Taitung University, Taitung, Taiwan
- Barbra Ferrell
- College of Agriculture and Natural Resources, University of Delaware, Newark, DE 19711, USA
- Juliane Fluck
- Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53754 St. Augustin, Germany
- Socorro Gama-Castro
- Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, 04510 Ciudad de México, México
- Georgios Gkoutos
- College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham B15 2TT, UK; Institute of Translational Medicine, University Hospitals Birmingham NHS Foundation Trust, Birmingham B15 2TT, UK
- Afroza K Irin
- Life Science Informatics, University of Bonn, Bonn, Germany
- Lars J Jensen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Silvia Jimenez
- Blue Brain Project, École Polytechnique Fédérale de Lausanne (EPFL) Biotech Campus, Geneva, Switzerland
- Toni R Jue
- Prince of Wales Clinical School, University of New South Wales NSW, Sydney, New South Wales, Australia
- Sumit Madan
- Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53754 St. Augustin, Germany
- Sérgio Matos
- DETI/IEETA, University of Aveiro, Campus Universitário de Santiago, Aveiro 3810-193, Portugal
- Marija Milacic
- Department of Informatics and Bio-Computing, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Matthew Mort
- HGMD, Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, UK
- Jeyakumar Natarajan
- Department of Bioinformatics, Bharathiar University, Coimbatore, Tamil Nadu, India
- Evangelos Pafilis
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, Greece
- Emiliano Pereira
- Microbial Genomics and Bioinformatics Group, Max Planck Institute for Marine Microbiology, Bremen, Germany
- Shruti Rao
- Innovation Center for Biomedical Informatics (ICBI), Georgetown University, Washington, DC 20007, USA
- Fabio Rinaldi
- Institute of Computational Linguistics, University of Zurich, Zurich, Switzerland
- Karen Rothfels
- Department of Informatics and Bio-Computing, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- David Salgado
- GMGF, Aix-Marseille Universite, 13385 Marseille, France; Inserm, UMR_S 910, 13385 Marseille, France
- Raquel M Silva
- Department of Medical Sciences, iBiMED & IEETA, University of Aveiro, 3810-193 Aveiro, Portugal
- Onkar Singh
- Taipei Medical University Graduate Institute of Biomedical informatics, Taipei, Taiwan
- Chu-Hsien Su
- Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Suresh Subramani
- Department of Bioinformatics, Bharathiar University, Coimbatore, Tamil Nadu, India
- Loukia Tsaprouni
- Institute of Sport and Physical Activity Research (ISPAR), University of Bedfordshire, Bedford, UK
- Nicole Vasilevsky
- Ontology Development Group, Oregon Health & Science University, Portland, OR 97239, USA
- Xiaodong Wang
- WormBase Consortium, Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Sandra Orchard
- European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Sangya Pundir
- European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Kimberly Van Auken
- WormBase Consortium, Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Zhiyong Lu
- National Center for Biotechnology Information (NCBI), National Institutes of Health, Bethesda, MD 20894, USA
- Mary Schaeffer
- MaizeGDB USDA ARS and University of Missouri, Columbia, MO 65211, USA
- Cathy H Wu
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, 19711, USA; Department of Computer and Information Sciences, University of Delaware, Newark, DE, 19711, USA
- Cecilia N Arighi
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, 19711, USA; Department of Computer and Information Sciences, University of Delaware, Newark, DE, 19711, USA
31
Mattingly CJ, Boyles R, Lawler CP, Haugen AC, Dearry A, Haendel M. Laying a Community-Based Foundation for Data-Driven Semantic Standards in Environmental Health Sciences. ENVIRONMENTAL HEALTH PERSPECTIVES 2016; 124:1136-40. [PMID: 26871594 PMCID: PMC4977056 DOI: 10.1289/ehp.1510438] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 07/07/2015] [Revised: 12/17/2015] [Accepted: 02/03/2016] [Indexed: 05/19/2023]
Abstract
BACKGROUND Despite increasing availability of environmental health science (EHS) data, development and implementation of relevant semantic standards, such as ontologies or hierarchical vocabularies, have lagged. Consequently, integration and analysis of information needed to better model environmental influences on human health remains a significant challenge. OBJECTIVES We aimed to identify a committed community and mechanisms needed to develop EHS semantic standards that will advance understanding about the impacts of environmental exposures on human disease. METHODS The National Institute of Environmental Health Sciences sponsored the "Workshop for the Development of a Framework for Environmental Health Science Language" hosted at North Carolina State University on 15-16 September 2014. Through the assembly of data generators, users, publishers, and funders, we aimed to develop a foundation for enabling the development of community-based and data-driven standards that will ultimately improve standardization, sharing, and interoperability of EHS information. DISCUSSION Creating and maintaining an EHS common language is a continuous and iterative process, requiring community building around research interests and needs, enabling integration and reuse of existing data, and providing a low barrier of access for researchers needing to use or extend such a resource. CONCLUSIONS Recommendations included developing a community-supported web-based toolkit that would enable a) collaborative development of EHS research questions and use cases, b) construction of user-friendly tools for searching and extending existing semantic resources, c) education and guidance about standards and their implementation, and d) creation of a plan for governance and sustainability. CITATION Mattingly CJ, Boyles R, Lawler CP, Haugen AC, Dearry A, Haendel M. 2016. Laying a community-based foundation for data-driven semantic standards in environmental health sciences. Environ Health Perspect 124:1136-1140; http://dx.doi.org/10.1289/ehp.1510438.
Affiliation(s)
- Carolyn J. Mattingly
- Department of Biological Sciences, and
- Center for Human Health and the Environment, North Carolina State University, Raleigh, North Carolina, USA
- Address correspondence to C.J. Mattingly, Department of Biological Sciences, North Carolina State University, Campus Box 7633, Raleigh, NC 27695-7617 USA. Telephone: (919) 515-1509.
- Rebecca Boyles
- National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Human Services, Research Triangle Park, North Carolina, USA
- Cindy P. Lawler
- National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Human Services, Research Triangle Park, North Carolina, USA
- Astrid C. Haugen
- National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Human Services, Research Triangle Park, North Carolina, USA
- Allen Dearry
- National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Human Services, Research Triangle Park, North Carolina, USA
- Melissa Haendel
- Library, and
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA
32
Howe DG, Bradford YM, Eagle A, Fashena D, Frazer K, Kalita P, Mani P, Martin R, Moxon ST, Paddock H, Pich C, Ramachandran S, Ruzicka L, Schaper K, Shao X, Singer A, Toro S, Van Slyke C, Westerfield M. A scientist's guide for submitting data to ZFIN. Methods Cell Biol 2016; 135:451-81. [PMID: 27443940 DOI: 10.1016/bs.mcb.2016.04.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 12/14/2022]
Abstract
The Zebrafish Model Organism Database (ZFIN; zfin.org) serves as the central repository for genetic and genomic data produced using zebrafish (Danio rerio). Data in ZFIN are either manually curated from peer-reviewed publications or submitted directly to ZFIN from various data repositories. Data types currently supported include mutants, transgenic lines, DNA constructs, gene expression, phenotypes, antibodies, morpholinos, TALENs, CRISPRs, disease models, movies, and images. The rapidly changing methods of genomic science have increased the production of data that cannot readily be represented in standard journal publications. These large data sets require web-based presentation. As the central repository for zebrafish research data, it has become increasingly important for ZFIN to provide the zebrafish research community with support for their data sets and guidance on what is required to submit these data to ZFIN. Regardless of their volume, all data that are submitted for inclusion in ZFIN must include a minimum set of information that describes the data. The aim of this chapter is to identify data types that fit into the current ZFIN database and explain how to provide those data in the optimal format for integration. We identify the required and optional data elements, define jargon, and present tools and templates that can help with the acquisition and organization of data as they are being prepared for submission to ZFIN. This information will also appear in the ZFIN wiki, where it will be updated as our services evolve over time.
Affiliation(s)
- D G Howe
- University of Oregon, Eugene, OR, United States
- A Eagle
- University of Oregon, Eugene, OR, United States
- D Fashena
- University of Oregon, Eugene, OR, United States
- K Frazer
- University of Oregon, Eugene, OR, United States
- P Kalita
- University of Oregon, Eugene, OR, United States
- P Mani
- University of Oregon, Eugene, OR, United States
- R Martin
- University of Oregon, Eugene, OR, United States
- S T Moxon
- University of Oregon, Eugene, OR, United States
- H Paddock
- University of Oregon, Eugene, OR, United States
- C Pich
- University of Oregon, Eugene, OR, United States
- L Ruzicka
- University of Oregon, Eugene, OR, United States
- K Schaper
- University of Oregon, Eugene, OR, United States
- X Shao
- University of Oregon, Eugene, OR, United States
- A Singer
- University of Oregon, Eugene, OR, United States
- S Toro
- University of Oregon, Eugene, OR, United States
- C Van Slyke
- University of Oregon, Eugene, OR, United States
33
Hayman GT, Laulederkind SJF, Smith JR, Wang SJ, Petri V, Nigam R, Tutaj M, De Pons J, Dwinell MR, Shimoyama M. The Disease Portals, disease-gene annotation and the RGD disease ontology at the Rat Genome Database. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2016; 2016:baw034. [PMID: 27009807 PMCID: PMC4805243 DOI: 10.1093/database/baw034] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 09/24/2015] [Accepted: 02/29/2016] [Indexed: 12/23/2022]
Abstract
The Rat Genome Database (RGD; http://rgd.mcw.edu/) provides critical datasets and software tools to a diverse community of rat and non-rat researchers worldwide. To meet the needs of the many users whose research is disease oriented, RGD has created a series of Disease Portals and has prioritized its curation efforts on the datasets important to understanding the mechanisms of various diseases. Gene-disease relationships for three species, rat, human and mouse, are annotated to capture biomarkers, genetic associations, molecular mechanisms and therapeutic targets. To generate gene-disease annotations more effectively and in greater detail, RGD initially adopted the MEDIC disease vocabulary from the Comparative Toxicogenomics Database and adapted it for use by expanding this framework with the addition of over 1000 terms to create the RGD Disease Ontology (RDO). The RDO provides the foundation for, at present, 10 comprehensive disease area-related dataset and analysis platforms at RGD, the Disease Portals. Two major disease areas are the focus of data acquisition and curation efforts each year, leading to the release of the related Disease Portals. Collaborative efforts to realize a more robust disease ontology are underway. Database URL: http://rgd.mcw.edu.
Affiliation(s)
- G Thomas Hayman
- Medical College of Wisconsin, Human and Molecular Genetics Center; Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Stanley J F Laulederkind
- Medical College of Wisconsin, Human and Molecular Genetics Center; Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Jennifer R Smith
- Medical College of Wisconsin, Human and Molecular Genetics Center; Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Shur-Jen Wang
- Medical College of Wisconsin, Human and Molecular Genetics Center; Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Victoria Petri
- Medical College of Wisconsin, Human and Molecular Genetics Center; Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Rajni Nigam
- Medical College of Wisconsin, Human and Molecular Genetics Center; Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Marek Tutaj
- Medical College of Wisconsin, Human and Molecular Genetics Center; Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Jeff De Pons
- Medical College of Wisconsin, Human and Molecular Genetics Center; Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Melinda R Dwinell
- Medical College of Wisconsin, Human and Molecular Genetics Center; Department of Physiology, Medical College of Wisconsin
- Mary Shimoyama
- Medical College of Wisconsin, Human and Molecular Genetics Center; Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
34
Semantic interrogation of a multi knowledge domain ontological model of tendinopathy identifies four strong candidate risk genes. Sci Rep 2016; 6:19820. [PMID: 26804977 PMCID: PMC4726433 DOI: 10.1038/srep19820] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/18/2015] [Accepted: 11/11/2015] [Indexed: 01/10/2023]
Abstract
Tendinopathy is a multifactorial syndrome characterised by tendon pain and thickening, and impaired performance during activity. Candidate gene association studies have identified genetic factors that contribute to intrinsic risk of developing tendinopathy upon exposure to extrinsic factors. Bioinformatics approaches that data-mine existing knowledge for biological relationships may assist with the identification of candidate genes. The aim of this study was to data-mine functional annotation of human genes and identify candidate genes by ontology-seeded queries capturing the features of tendinopathy. Our BioOntological Relationship Graph database (BORG) integrates multiple sources of genomic and biomedical knowledge into an on-disk semantic network where human genes and their orthologs in mouse and rat are central concepts mapped to ontology terms. The BORG was used to screen all human genes for potential links to tendinopathy. Following further prioritisation, four strong candidate genes (COL11A2, ELN, ITGB3, LOX) were identified. These genes are differentially expressed in tendinopathy, functionally linked to features of tendinopathy and previously implicated in other connective tissue diseases. In conclusion, cross-domain semantic integration of multiple sources of biomedical knowledge, and interrogation of phenotypes and gene functions associated with disease, may significantly increase the probability of identifying strong and unobvious candidate genes in genetic association studies.
35
Patterson SE, Liu R, Statz CM, Durkin D, Lakshminarayana A, Mockus SM. The clinical trial landscape in oncology and connectivity of somatic mutational profiles to targeted therapies. Hum Genomics 2016; 10:4. [PMID: 26772741 PMCID: PMC4715272 DOI: 10.1186/s40246-016-0061-7] [Citation(s) in RCA: 83] [Impact Index Per Article: 10.4] [Received: 11/18/2015] [Accepted: 01/10/2016] [Indexed: 12/24/2022]
Abstract
Background Precision medicine in oncology relies on rapid associations between patient-specific variations and targeted therapeutic efficacy. Due to the advancement of genomic analysis, a vast literature characterizing cancer-associated molecular aberrations and relative therapeutic relevance has been published. However, data are not uniformly reported or readily available, and accessing relevant information in a clinically acceptable time-frame is a daunting proposition, hampering connections between patients and appropriate therapeutic options. One important therapeutic avenue for oncology patients is through clinical trials. Accordingly, a global view into the availability of targeted clinical trials would provide insight into strengths and weaknesses and potentially enable research focus. However, data regarding the landscape of clinical trials in oncology is not readily available, and as a result, a comprehensive understanding of clinical trial availability is difficult. Results To support clinical decision-making, we have developed a data loader and mapper that connects sequence information from oncology patients to data stored in an in-house database, the JAX Clinical Knowledgebase (JAX-CKB), which can be queried readily to access comprehensive data for clinical reporting via customized reporting queries. JAX-CKB functions as a repository to house expertly curated clinically relevant data surrounding our 358-gene panel, the JAX Cancer Treatment Profile (JAX CTP), and supports annotation of functional significance of molecular variants. Through queries of data housed in JAX-CKB, we have analyzed the landscape of clinical trials relevant to our 358-gene targeted sequencing panel to evaluate strengths and weaknesses in current molecular targeting in oncology. Through this analysis, we have identified patient indications, molecular aberrations, and targeted therapy classes that have strong or weak representation in clinical trials. 
Conclusions Here, we describe the development of, and disseminate, system methods for associating patient genomic sequence data with clinically relevant information, facilitating interpretation and providing a mechanism for informing therapeutic decision-making. Additionally, through customized queries, we have the capability to rapidly analyze the landscape of targeted therapies in clinical trials, enabling a unique view into current therapeutic availability in oncology.
Affiliation(s)
- Sara E Patterson
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT, 06032, USA.
- Rangjiao Liu
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT, 06032, USA.
- Cara M Statz
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT, 06032, USA.
- Daniel Durkin
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT, 06032, USA.
- Susan M Mockus
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Dr., Farmington, CT, 06032, USA.
36
Li WV, Razaee ZS, Li JJ. Epigenome overlap measure (EPOM) for comparing tissue/cell types based on chromatin states. BMC Genomics 2016; 17 Suppl 1:10. [PMID: 26817822 PMCID: PMC4895267 DOI: 10.1186/s12864-015-2303-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/21/2022]
Abstract
Background The dynamics of epigenomic marks in their relevant chromatin states regulate distinct gene expression patterns, biological functions and phenotypic variations in biological processes. The availability of high-throughput epigenomic data generated by next-generation sequencing technologies allows a data-driven approach to evaluate the similarities and differences of diverse tissue and cell types in terms of epigenomic features. While ChromImpute has allowed for the imputation of large-scale epigenomic information to yield more robust data to capture meaningful relationships between biological samples, widely used methods such as hierarchical clustering and correlation analysis cannot adequately utilize epigenomic data to accurately reveal the distinction and grouping of different tissue and cell types. Methods We utilize a three-step testing procedure (ANOVA, t test and overlap test) to identify tissue/cell-type-associated enhancers and promoters and to calculate a newly defined Epigenomic Overlap Measure (EPOM). EPOM results in a clear correspondence map of biological samples from different tissue and cell types through comparison of epigenomic marks evaluated in their relevant chromatin states. Results Correspondence maps by EPOM show strong capability in distinguishing and grouping different tissue and cell types and reveal biologically meaningful similarities between Heart and Muscle, Blood & T-cell and HSC & B-cell, Brain and Neurosphere, etc. The gene ontology enrichment analysis both supports and explains the discoveries made by EPOM and suggests that the associated enhancers and promoters demonstrate distinguishable functions across tissue and cell types. Moreover, the tissue/cell-type-associated enhancers and promoters show enrichment in the disease-related SNPs that are also associated with the corresponding tissue or cell types.
This agreement suggests the potential of identifying causal genetic variants relevant to cell-type-specific diseases from our identified associated enhancers and promoters. Conclusions The proposed EPOM measure demonstrates superior capability in grouping and finding a clear correspondence map of biological samples from different tissue and cell types. The identified associated enhancers and promoters provide a comprehensive catalog to study distinct biological processes and disease variants in different tissue and cell types. Our results also find that the associated promoters exhibit more cell-type-specific functions than the associated enhancers do, suggesting that the non-associated promoters have more housekeeping functions than the non-associated enhancers. Electronic supplementary material The online version of this article (doi:10.1186/s12864-015-2303-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Wei Vivian Li
- Department of Statistics, 8125 Math Sciences Bldg., University of California, Los Angeles, CA, 90095-1554, USA.
- Zahra S Razaee
- Department of Statistics, 8125 Math Sciences Bldg., University of California, Los Angeles, CA, 90095-1554, USA.
- Jingyi Jessica Li
- Department of Statistics, 8125 Math Sciences Bldg., University of California, Los Angeles, CA, 90095-1554, USA; Department of Human Genetics, University of California, Los Angeles, CA, 90095-7088, USA.
| |
37
Orechia J, Pathak A, Shi Y, Nawani A, Belozerov A, Fontes C, Lakhiani C, Jawale C, Patel C, Quinn D, Botvinnik D, Mei E, Cotter E, Byleckie J, Ullman-Cullere M, Chhetri P, Chalasani P, Karnam P, Beaudoin R, Sahu S, Belozerova Y, Mathew JP. OncDRS: An integrative clinical and genomic data platform for enabling translational research and precision medicine. Appl Transl Genom 2015; 6:18-25. [PMID: 27054074 PMCID: PMC4803771 DOI: 10.1016/j.atg.2015.08.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 07/29/2015] [Accepted: 08/05/2015] [Indexed: 02/01/2023]
Abstract
We live in the genomic era of medicine, where a patient's genomic and molecular data are becoming increasingly important for disease diagnosis, identification of targeted therapy, and risk assessment for adverse reactions. However, decoding genomic test results and integrating them with clinical data, both for retrospective studies and for cohort identification in prospective clinical trials, remains challenging. To overcome these barriers, we developed an overarching enterprise informatics framework for translational research and personalized medicine called Synergistic Patient and Research Knowledge Systems (SPARKS) and a suite of tools called Oncology Data Retrieval Systems (OncDRS). OncDRS enables seamless data integration and secure, self-navigated query and extraction of clinical and genomic data from heterogeneous sources. Within a year of release, the system has facilitated more than 1500 research queries and has delivered data for more than 50 research studies.
Affiliation(s)
- John Orechia
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Ameet Pathak
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Yunling Shi
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Aniket Nawani
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Andrey Belozerov
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Caitlin Fontes
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Camille Lakhiani
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Chetan Jawale
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Chetansharan Patel
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Daniel Quinn
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Dmitry Botvinnik
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Eddie Mei
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Elizabeth Cotter
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- James Byleckie
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Padam Chhetri
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Poornima Chalasani
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Purushotham Karnam
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Ronald Beaudoin
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Sandeep Sahu
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Yelena Belozerova
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States
- Jomol P Mathew
- Dana-Farber Cancer Institute, 450 Brookline Ave., Boston, MA-02215, United States