1
Abstract
Dermatomyositis is a common connective tissue disease. Multiple factors contribute to its occurrence and development, but its exact pathogenesis has not been fully elucidated. Here, we used bioinformatics methods to explore and predict the major disease-related genes of dermatomyositis and to identify the underlying pathogenic molecular mechanisms. Gene expression data from GDS1956, GDS2153, GDS2855, and GDS3417, comprising 94 specimens (66 dermatomyositis specimens and 28 normal specimens), were obtained from the Gene Expression Omnibus database. The 4 microarray datasets were combined to identify differentially expressed genes (DEGs). Gene ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the DEGs were performed with the Database for Annotation, Visualization and Integrated Discovery and the KEGG Orthology Based Annotation System, respectively. The protein-protein interaction networks of the DEGs were built with the STRING website. A total of 4097 DEGs were extracted from the 4 Gene Expression Omnibus datasets, of which 2213 were upregulated and 1884 were downregulated. Gene ontology analysis indicated that the biological functions of the DEGs focused primarily on response to virus, the type I interferon signaling pathway, and negative regulation of viral genome replication. The main cellular components were extracellular space, cytoplasm, and blood microparticle. The molecular functions included protein binding, double-stranded RNA binding, and MHC class I protein binding. KEGG pathway analysis showed that these DEGs were mainly involved in the toll-like receptor signaling pathway, cytosolic DNA-sensing pathway, RIG-I-like receptor signaling pathway, complement and coagulation cascades, arginine and proline metabolism, and phagosome signaling pathway. The following 13 closely related genes were key nodes in the protein-protein interaction network: XAF1, NT5E, UGCG, GBP2, TLR3, DDX58, STAT1, GBP1, PLSCR1, OAS3, SP100, IGK, and RSAD2. This research suggests that identifying DEGs and pathways in dermatomyositis with integrated bioinformatics methods could help clarify the molecular mechanisms underlying its development, could be of practical value for its early detection and prevention, and could provide reliable targets for its treatment.
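The enrichment step described in this abstract asks whether a gene ontology term or KEGG pathway contains more DEGs than chance would predict. A minimal sketch of one common way to score this, a one-sided hypergeometric test, is given here. It is an illustration only: the abstract does not state which statistic the annotation tools applied, the function name `hypergeom_enrichment_p` is hypothetical, and the counts in the usage lines are toy values, not results from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_term, n_degs, n_overlap):
    """Return P(X >= n_overlap) under the hypergeometric distribution.

    n_background: number of genes in the background (assumed universe)
    n_in_term:    background genes annotated to the term or pathway
    n_degs:       number of differentially expressed genes drawn
    n_overlap:    DEGs that are annotated to the term or pathway
    """
    upper = min(n_in_term, n_degs)
    total = comb(n_background, n_degs)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers only: 10 background genes, 4 in the term, 3 DEGs, 2 of them in the term.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
print(f"enrichment p-value: {p_value:.4f}")
```

In practice the p-values for all tested terms would also need multiple-testing correction, which this sketch leaves out.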
Affiliation(s)
- Wei Liu
  - First Teaching Hospital of Tianjin University of Traditional Chinese Medicine
  - Tianjin Key Laboratory of Translational Research of TCM Prescription and Syndrome, Tianjin, China
- Wen-Jia Zhao
  - First Teaching Hospital of Tianjin University of Traditional Chinese Medicine
- Yuan-Hao Wu
  - First Teaching Hospital of Tianjin University of Traditional Chinese Medicine
  - Tianjin Key Laboratory of Translational Research of TCM Prescription and Syndrome, Tianjin, China
2
Plyusnin I, Holm L, Törönen P. Novel comparison of evaluation metrics for gene ontology classifiers reveals drastic performance differences. PLoS Comput Biol 2019; 15:e1007419. [PMID: 31682632 PMCID: PMC6855565 DOI: 10.1371/journal.pcbi.1007419] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/08/2018] [Revised: 11/14/2019] [Accepted: 09/24/2019] [Indexed: 11/18/2022]
Abstract
Automated protein annotation using the Gene Ontology (GO) plays an important role in the biosciences. Evaluation has always been considered central to developing novel annotation methods, but little attention has been paid to the evaluation metrics themselves. Evaluation metrics define how well an annotation method performs and allow methods to be ranked against one another. Unfortunately, most of these metrics were adopted from the machine learning literature without establishing whether they were appropriate for GO annotations. We propose a novel approach for comparing GO evaluation metrics called Artificial Dilution Series (ADS). Our approach uses existing annotation data to generate a series of annotation sets with different levels of correctness (referred to as their signal level). We calculate the evaluation metric being tested for each annotation set in the series, allowing us to identify whether it can separate different signal levels. Finally, we contrast these results with several false positive annotation sets, which are designed to expose systematic weaknesses in GO assessment. We compared 37 evaluation metrics for GO annotation using ADS and identified drastic differences between metrics. We show that some metrics struggle to differentiate between different signal levels, while others give erroneously high scores to the false positive data sets. Based on our findings, we provide guidelines on which evaluation metrics perform well with the Gene Ontology and propose improvements to several well-known evaluation metrics. In general, we argue that evaluation metrics should be tested for their performance, and we provide software for this purpose (https://bitbucket.org/plyusnin/ads/). ADS is applicable to other areas of science where the evaluation of prediction results is non-trivial.
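The dilution idea in this abstract (build annotation sets with known signal levels, then check whether a metric separates them) can be illustrated with a heavily simplified sketch. This is not the authors' ADS implementation, which is available at the URL above. Here it is assumed that noise is introduced by swapping a true term for a random decoy term and that the metric under test is the Jaccard index; the names `dilute`, `jaccard`, and `dilution_series`, and all term labels, are hypothetical.

```python
import random


def dilute(true_terms, decoy_terms, signal, rng):
    """Keep each true term with probability `signal`; otherwise swap in a decoy."""
    return {
        term if rng.random() < signal else rng.choice(decoy_terms)
        for term in sorted(true_terms)  # sorted for a reproducible draw order
    }


def jaccard(predicted, truth):
    """Jaccard index between a predicted and a true annotation set."""
    union = predicted | truth
    return len(predicted & truth) / len(union) if union else 1.0


def dilution_series(truth, decoy_terms, signal_levels, seed=0):
    """Mean metric score over all proteins at each signal level."""
    rng = random.Random(seed)
    series = {}
    for signal in signal_levels:
        scores = [
            jaccard(dilute(terms, decoy_terms, signal, rng), terms)
            for terms in truth.values()
        ]
        series[signal] = sum(scores) / len(scores)
    return series


# Toy data: placeholder protein and term labels, not real GO annotations.
truth = {"P1": {"T1", "T2", "T3"}, "P2": {"T2", "T4"}, "P3": {"T5"}}
decoys = ["D1", "D2", "D3", "D4"]
series = dilution_series(truth, decoys, [1.0, 0.75, 0.5, 0.25, 0.0])
for level, score in series.items():
    print(f"signal {level:.2f}: mean Jaccard {score:.3f}")
```

A metric that behaves well in this toy setting gives scores that fall as the signal level falls; a metric whose scores stay flat or overlap across levels would be flagged as unable to separate them.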
Affiliation(s)
- Ilya Plyusnin
  - Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
- Liisa Holm
  - Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
  - Research Programme in Organismal and Evolutionary Biology, Faculty of Biosciences, University of Helsinki, Helsinki, Finland
- Petri Törönen
  - Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Helsinki, Finland
4
Abstract
OBJECTIVE To identify novel clinically relevant genes in papillary thyroid carcinoma from public databases. METHODS Four original microarray datasets, GSE3678, GSE3467, GSE33630 and GSE58545, were downloaded. Differentially expressed genes (DEGs) were filtered from the integrated data. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed, followed by protein-protein interaction (PPI) network construction. The CentiScaPe plug-in was used to calculate node degree. The genes at the top of the degree distribution (≥ 95th percentile) in the significantly perturbed networks were defined as central genes. UALCAN and The Cancer Genome Atlas Clinical Explorer were used to verify clinically relevant genes and perform survival analysis. RESULTS A total of 225 commonly changed DEGs (111 up-regulated and 114 down-regulated) were identified. The DEGs were classified into three groups by GO terms. KEGG pathway enrichment analysis showed that the DEGs were mainly enriched in the PI3K-Akt signaling pathway, pathways in cancer, focal adhesion and proteoglycans in cancer. A PPI network of the DEGs was constructed, and six central genes (BCL2, CCND1, FN1, IRS1, COL1A1, CXCL12) were identified. Among them, BCL2, CCND1 and COL1A1 were identified as clinically relevant genes. CONCLUSION BCL2, CCND1 and COL1A1 may be key genes for papillary thyroid carcinoma. Further molecular biological experiments are required to confirm the function of the identified genes.
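The central-gene step described in the METHODS (rank nodes of the PPI network by degree and keep those at or above the 95th percentile) can be sketched in a few lines. This is an illustration under stated assumptions, not the plug-in's own computation: the percentile uses the nearest-rank definition, the function names `degrees` and `central_genes` are hypothetical, and the edge list is a toy network with placeholder node labels rather than real protein interactions.

```python
from collections import defaultdict


def degrees(edges):
    """Count distinct neighbours per node in an undirected edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


def central_genes(edges, percentile=95):
    """Return nodes whose degree is at or above the given percentile.

    Assumes the nearest-rank percentile: the value at position
    ceil(percentile / 100 * n) in the ascending list of degrees.
    """
    degree = degrees(edges)
    if not degree:
        return []
    ranked = sorted(degree.values())
    rank = -(-percentile * len(ranked) // 100)  # integer ceiling
    threshold = ranked[max(rank, 1) - 1]
    return sorted(node for node, d in degree.items() if d >= threshold)


# Toy network: G0 interacts with G1..G9, plus one extra G1-G2 edge.
toy_edges = [("G0", f"G{i}") for i in range(1, 10)] + [("G1", "G2")]
hubs = central_genes(toy_edges)
print(hubs)
```

Other percentile definitions (for example, linear interpolation) can give a slightly different cut-off on small networks, so the choice should be stated alongside any list of central genes.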
Affiliation(s)
- W Liang
  - Department of Endocrinology, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310000, People's Republic of China
- F Sun
  - Key Laboratory of Cancer Prevention and Intervention, China National Ministry of Education, The Second Affiliated Hospital, Cancer Institute, Zhejiang University School of Medicine, Hangzhou, 310009, People's Republic of China
5
Jacobson M, Sedeño-Cortés AE, Pavlidis P. Monitoring changes in the Gene Ontology and their impact on genomic data analysis. Gigascience 2018; 7:5069393. [PMID: 30107399 PMCID: PMC6113503 DOI: 10.1093/gigascience/giy103] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/05/2018] [Revised: 07/27/2018] [Accepted: 08/06/2018] [Indexed: 01/01/2023]
Abstract
Background The Gene Ontology (GO) is one of the most widely used resources in molecular and cellular biology, largely through the use of "enrichment analysis." To facilitate informed use of GO, we present GOtrack (https://gotrack.msl.ubc.ca), which provides access to historical records and trends in the GO and GO annotations. Findings GOtrack gives users access to gene- and term-level information on annotations for nine model organisms as well as an interactive tool that measures the stability of enrichment results over time for user-provided "hit lists" of genes. To document the effects of GO evolution on enrichment, we analyzed more than 2,500 published hit lists of human genes (most older than 9 years); 53% of hit lists were considered to yield significantly stable enrichment results. Conclusions Because stability is far from assured for any individual hit list, GOtrack can lead to more informed and cautious application of GO to genomics research.
Affiliation(s)
- Matthew Jacobson
  - Michael Smith Laboratories, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
  - Department of Psychiatry, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
- Adriana Estela Sedeño-Cortés
  - Graduate Program in Bioinformatics, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
- Paul Pavlidis
  - Michael Smith Laboratories, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
  - Department of Psychiatry, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
6
Groza T, Köhler S, Moldenhauer D, Vasilevsky N, Baynam G, Zemojtel T, Schriml LM, Kibbe WA, Schofield PN, Beck T, Vasant D, Brookes AJ, Zankl A, Washington NL, Mungall CJ, Lewis SE, Haendel MA, Parkinson H, Robinson PN. The Human Phenotype Ontology: Semantic Unification of Common and Rare Disease. Am J Hum Genet 2015; 97:111-24. [PMID: 26119816 PMCID: PMC4572507 DOI: 10.1016/j.ajhg.2015.05.020] [Citation(s) in RCA: 149] [Impact Index Per Article: 16.6] [Received: 03/20/2015] [Accepted: 05/22/2015] [Indexed: 12/24/2022]
Abstract
The Human Phenotype Ontology (HPO) is widely used in the rare disease community for differential diagnostics, phenotype-driven analysis of next-generation sequence-variation data, and translational research, but a comparable resource has not been available for common disease. Here, we have developed a concept-recognition procedure that analyzes the frequencies of HPO disease annotations as identified in over five million PubMed abstracts by employing an iterative procedure to optimize precision and recall of the identified terms. We derived disease models for 3,145 common human diseases comprising a total of 132,006 HPO annotations. The HPO now comprises over 250,000 phenotypic annotations for over 10,000 rare and common diseases and can be used for examining the phenotypic overlap among common diseases that share risk alleles, as well as between Mendelian diseases and common diseases linked by genomic location. The annotations, as well as the HPO itself, are freely available.
Affiliation(s)
- Tudor Groza
  - School of Information Technology and Electrical Engineering, University of Queensland, St. Lucia, QLD 4072, Australia; Garvan Institute of Medical Research, Darlinghurst, Sydney, NSW 2010, Australia
- Sebastian Köhler
  - Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
- Dawid Moldenhauer
  - Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany; University of Applied Sciences, Wiesenstrasse 14, 35390 Giessen, Germany
- Nicole Vasilevsky
  - Library, Oregon Health & Science University, Portland, OR 97239, USA
- Gareth Baynam
  - School of Paediatrics and Child Health, University of Western Australia, Perth, WA 6840, Australia; Institute for Immunology and Infectious Diseases, Murdoch University, Perth, WA 6150, Australia; Office of Population Health Genomics, Public Health and Clinical Services Division, Department of Health, Perth, WA 6004, Australia; Genetic Services of Western Australia, King Edward Memorial Hospital, Perth, WA 6008, Australia; Telethon Kids Institute, Perth, WA 6008, Australia
- Tomasz Zemojtel
  - Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany; Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznań, Poland
- Lynn Marie Schriml
  - Department of Epidemiology and Public Health, School of Medicine, University of Maryland, Baltimore, MD 21201, USA; Institute for Genome Sciences, School of Medicine, University of Maryland, Baltimore, MD 21201, USA
- Warren Alden Kibbe
  - Center for Biomedical Informatics and Information Technology, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD 20850, USA
- Paul N Schofield
  - Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3EG, UK; The Jackson Laboratory, Bar Harbor, ME 04609, USA
- Tim Beck
  - Department of Genetics, University of Leicester, Leicester LE1 7RH, UK
- Drashtti Vasant
  - European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK
- Anthony J Brookes
  - Department of Genetics, University of Leicester, Leicester LE1 7RH, UK
- Andreas Zankl
  - Garvan Institute of Medical Research, Darlinghurst, Sydney, NSW 2010, Australia; Academic Department of Medical Genetics, The Children's Hospital at Westmead, Sydney, NSW 2145, Australia; Discipline of Genetic Medicine, Sydney Medical School, University of Sydney, Sydney, NSW 2145, Australia
- Nicole L Washington
  - Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
- Christopher J Mungall
  - Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
- Suzanna E Lewis
  - Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
- Melissa A Haendel
  - Library, Oregon Health & Science University, Portland, OR 97239, USA
- Helen Parkinson
  - European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK
- Peter N Robinson
  - Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany; Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany; Berlin Brandenburg Center for Regenerative Therapies, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany; Institute of Bioinformatics, Department of Mathematics and Computer Science, Freie Universität Berlin, Takustrasse 9, 14195 Berlin, Germany