1
|
Chen J, Goudey B, Geard N, Verspoor K. Integration of background knowledge for automatic detection of inconsistencies in gene ontology annotation. Bioinformatics 2024; 40:i390-i400. [PMID: 38940182 PMCID: PMC11256942 DOI: 10.1093/bioinformatics/btae246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024]
Abstract
MOTIVATION Biological background knowledge plays an important role in the manual quality assurance (QA) of biological database records. One such QA task is the detection of inconsistencies in literature-based Gene Ontology Annotation (GOA). This manual verification ensures the accuracy of the GO annotations based on a comprehensive review of the literature used as evidence, Gene Ontology (GO) terms, and annotated genes in GOA records. While automatic approaches for the detection of semantic inconsistencies in GOA have been developed, they operate within predetermined contexts, lacking the ability to leverage broader evidence, especially relevant domain-specific background knowledge. This paper investigates various types of background knowledge that could improve the detection of prevalent inconsistencies in GOA. In addition, the paper proposes several approaches to integrate background knowledge into the automatic GOA inconsistency detection process. RESULTS We have extended a previously developed GOA inconsistency dataset with several kinds of GOA-related background knowledge, including GeneRIF statements, biological concepts mentioned within evidence texts, GO hierarchy and existing GO annotations of the specific gene. We have proposed several effective approaches to integrate background knowledge as part of the automatic GOA inconsistency detection process. The proposed approaches can improve automatic detection of self-consistency and several of the most prevalent types of inconsistencies. This is the first study to explore the advantages of utilizing background knowledge and to propose a practical approach to incorporate knowledge in automatic GOA inconsistency detection. We establish a new benchmark for performance on this task. Our methods may be applicable to various tasks that involve incorporating biological background knowledge. AVAILABILITY AND IMPLEMENTATION https://github.com/jiyuc/de-inconsistency.
Affiliation(s)
- Jiyu Chen
  - School of Computing and Information Systems, The University of Melbourne, Parkville 3010, VIC, Australia
  - Data61, The Commonwealth Scientific and Industrial Research Organisation, Marsfield 2122, NSW, Australia
- Benjamin Goudey
  - School of Computing and Information Systems, The University of Melbourne, Parkville 3010, VIC, Australia
- Nicholas Geard
  - School of Computing and Information Systems, The University of Melbourne, Parkville 3010, VIC, Australia
- Karin Verspoor
  - School of Computing Technologies, RMIT University, Melbourne, Victoria 3000, Australia
|
2
|
Shishkova D, Lobov A, Repkin E, Markova V, Markova Y, Sinitskaya A, Sinitsky M, Kondratiev E, Torgunakova E, Kutikhin A. Calciprotein Particles Induce Cellular Compartment-Specific Proteome Alterations in Human Arterial Endothelial Cells. J Cardiovasc Dev Dis 2023; 11:5. [PMID: 38248875 PMCID: PMC10816121 DOI: 10.3390/jcdd11010005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 12/06/2023] [Accepted: 12/07/2023] [Indexed: 01/23/2024]
Abstract
Calciprotein particles (CPPs) are indispensable scavengers of excessive Ca²⁺ and PO₄³⁻ ions in blood, being internalised and recycled by liver and spleen macrophages, monocytes, and endothelial cells (ECs). Here, we performed a pathway enrichment analysis of cellular compartment-specific proteomes in primary human coronary artery ECs (HCAEC) and human internal thoracic artery ECs (HITAEC) treated with primary (amorphous) or secondary (crystalline) CPPs (CPP-P and CPP-S, respectively). Exposure to CPP-P and CPP-S induced notable upregulation of: (1) cytokine- and chemokine-mediated signaling, Ca²⁺-dependent events, and apoptosis in cytosolic and nuclear proteomes; (2) H⁺ and Ca²⁺ transmembrane transport, generation of reactive oxygen species, mitochondrial outer membrane permeabilisation, and intrinsic apoptosis in the mitochondrial proteome; (3) oxidative, calcium, and endoplasmic reticulum (ER) stress, unfolded protein binding, and apoptosis in the ER proteome. In contrast, transcription, post-transcriptional regulation, translation, cell cycle, and cell-cell adhesion pathways were underrepresented in cytosol and nuclear compartments, whilst biosynthesis of amino acids, mitochondrial translation, fatty acid oxidation, pyruvate dehydrogenase activity, and energy generation were downregulated in the mitochondrial proteome of CPP-treated ECs. Differentially expressed organelle-specific pathways were coherent in HCAEC and HITAEC and between ECs treated with CPP-P or CPP-S. Proteomic analysis of mitochondrial and nuclear lysates from CPP-treated ECs confirmed bioinformatic filtration findings.
Affiliation(s)
- Daria Shishkova
  - Department of Experimental Medicine, Research Institute for Complex Issues of Cardiovascular Diseases, 6 Sosnovy Boulevard, 650002 Kemerovo, Russia; (D.S.); (V.M.); (Y.M.); (A.S.); (M.S.); (E.K.); (E.T.)
- Arseniy Lobov
  - Laboratory of Regenerative Biomedicine, Institute of Cytology of the RAS, 4 Tikhoretskiy Prospekt, 194064 St. Petersburg, Russia
- Egor Repkin
  - Centre for Molecular and Cell Technologies, St. Petersburg State University, Universitetskaya Embankment, 7/9, 199034 St. Petersburg, Russia
- Victoria Markova
  - Department of Experimental Medicine, Research Institute for Complex Issues of Cardiovascular Diseases, 6 Sosnovy Boulevard, 650002 Kemerovo, Russia; (D.S.); (V.M.); (Y.M.); (A.S.); (M.S.); (E.K.); (E.T.)
- Yulia Markova
  - Department of Experimental Medicine, Research Institute for Complex Issues of Cardiovascular Diseases, 6 Sosnovy Boulevard, 650002 Kemerovo, Russia; (D.S.); (V.M.); (Y.M.); (A.S.); (M.S.); (E.K.); (E.T.)
- Anna Sinitskaya
  - Department of Experimental Medicine, Research Institute for Complex Issues of Cardiovascular Diseases, 6 Sosnovy Boulevard, 650002 Kemerovo, Russia; (D.S.); (V.M.); (Y.M.); (A.S.); (M.S.); (E.K.); (E.T.)
- Maxim Sinitsky
  - Department of Experimental Medicine, Research Institute for Complex Issues of Cardiovascular Diseases, 6 Sosnovy Boulevard, 650002 Kemerovo, Russia; (D.S.); (V.M.); (Y.M.); (A.S.); (M.S.); (E.K.); (E.T.)
- Egor Kondratiev
  - Department of Experimental Medicine, Research Institute for Complex Issues of Cardiovascular Diseases, 6 Sosnovy Boulevard, 650002 Kemerovo, Russia; (D.S.); (V.M.); (Y.M.); (A.S.); (M.S.); (E.K.); (E.T.)
- Evgenia Torgunakova
  - Department of Experimental Medicine, Research Institute for Complex Issues of Cardiovascular Diseases, 6 Sosnovy Boulevard, 650002 Kemerovo, Russia; (D.S.); (V.M.); (Y.M.); (A.S.); (M.S.); (E.K.); (E.T.)
- Anton Kutikhin
  - Department of Experimental Medicine, Research Institute for Complex Issues of Cardiovascular Diseases, 6 Sosnovy Boulevard, 650002 Kemerovo, Russia; (D.S.); (V.M.); (Y.M.); (A.S.); (M.S.); (E.K.); (E.T.)
|
3
|
Liu J, Tang X, Guan X. Grain protein function prediction based on self-attention mechanism and bidirectional LSTM. Brief Bioinform 2023; 24:6886418. [PMID: 36567619 DOI: 10.1093/bib/bbac493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Revised: 10/13/2022] [Accepted: 10/18/2022] [Indexed: 12/27/2022]
Abstract
With the development of genome sequencing technology, using computing technology to predict grain protein function has become one of the important tasks of bioinformatics. The protein data of four grains, soybean, maize, indica and japonica are selected in this experimental dataset. In this paper, a novel neural network algorithm Chemical-SA-BiLSTM is proposed for grain protein function prediction. The Chemical-SA-BiLSTM algorithm fuses the chemical properties of proteins on the basis of amino acid sequences, and combines the self-attention mechanism with the bidirectional Long Short-Term Memory network. The experimental results show that the Chemical-SA-BiLSTM algorithm is superior to other classical neural network algorithms, and can more accurately predict the protein function, which proves the effectiveness of the Chemical-SA-BiLSTM algorithm in the prediction of grain protein function. The source code of our method is available at https://github.com/HwaTong/Chemical-SA-BiLSTM.
Affiliation(s)
- Jing Liu
  - College of Information Engineering, Shanghai Maritime University, 201306, Shanghai, China
- Xinghua Tang
  - College of Information Engineering, Shanghai Maritime University, 201306, Shanghai, China
- Xiao Guan
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, 200093, Shanghai, China
|
4
|
Biological mass spectrometry analysis for traceability of production method and harvesting seasons of sea cucumber (Apostichopus japonicus). Food Control 2023. [DOI: 10.1016/j.foodcont.2022.109297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
5
|
Gnilopyat S, DePietro PJ, Parry TK, McLaughlin WA. The Pharmacorank Search Tool for the Retrieval of Prioritized Protein Drug Targets and Drug Repositioning Candidates According to Selected Diseases. Biomolecules 2022; 12:1559. [PMID: 36358909 PMCID: PMC9687941 DOI: 10.3390/biom12111559] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/27/2022] [Revised: 10/19/2022] [Accepted: 10/22/2022] [Indexed: 08/13/2023]
Abstract
We present the Pharmacorank search tool as an objective means to obtain prioritized protein drug targets and their associated medications according to user-selected diseases. This tool could be used to obtain prioritized protein targets for the creation of novel medications or to predict novel indications for medications that already exist. To prioritize the proteins associated with each disease, a gene similarity profiling method based on protein functions is implemented. The priority scores of the proteins are found to correlate well with the likelihoods that the associated medications are clinically relevant in the disease's treatment. When the protein priority scores are plotted against the percentage of protein targets that are known to bind medications currently indicated to treat the disease, which we termed the pertinency score, a strong correlation was observed. The correlation coefficient was found to be 0.9978 when using a weighted second-order polynomial fit. As the highly predictive fit was made using a broad range of diseases, we were able to identify a general threshold for the pertinency score as a starting point for considering drug repositioning candidates. Several repositioning candidates are described for proteins that have high predicted pertinency scores, and these provide illustrative examples of the applications of the tool. We also describe focused reviews of repositioning candidates for Alzheimer's disease. Via the tool's URL, https://protein.som.geisinger.edu/Pharmacorank/, an open online interface is provided for interactive use; and there is a site for programmatic access.
Affiliation(s)
- William A. McLaughlin
  - Department of Medical Education, Geisinger Commonwealth School of Medicine, 525 Pine Street, Scranton, PA 18509, USA
|
6
|
Lai YL, Liu CH, Wang SC, Huang SP, Cho YC, Bao BY, Su CC, Yeh HC, Lee CH, Teng PC, Chuu CP, Chen DN, Li CY, Cheng WC. Identification of a Steroid Hormone-Associated Gene Signature Predicting the Prognosis of Prostate Cancer through an Integrative Bioinformatics Analysis. Cancers (Basel) 2022; 14:cancers14061565. [PMID: 35326723 PMCID: PMC8946240 DOI: 10.3390/cancers14061565] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/27/2022] [Revised: 03/12/2022] [Accepted: 03/17/2022] [Indexed: 02/05/2023]
Abstract
Simple Summary Prostate cancer (PC) is the second most common cancer worldwide and steroid hormone plays an important role in prostate carcinogenesis. Most patients with PC are initially sensitive to androgen deprivation therapy (ADT) but eventually become hormone refractory and reflect disease progression. The aim of the study was to investigate the genes which regulate the steroid hormone functional pathways and associate with the disease progression of PC. We identified a panel of eight-gene signatures that modulated steroid-hormone pathways and predicted the prognosis of PC using integrative bioinformatics analysis of multiple datasets validated from external cohorts. This panel could be used for predicting the prognosis of PC patients and might be associated with the drug response of hormonal therapies. Moreover, these genes in the signature could be potential targets to develop a novel treatment for castration-resistant PC therapy. Abstract The importance of anti-androgen therapy for prostate cancer (PC) has been well recognized. However, the mechanisms underlying prostate cancer resistance to anti-androgens are not completely understood. Therefore, identifying pharmacological targets in driving the development of castration-resistant PC is necessary. In the present study, we sought to identify core genes in regulating steroid hormone pathways and associating them with the disease progression of PC. The selection of steroid hormone-associated genes was identified from functional databases, including gene ontology, KEGG, and Reactome. The gene expression profiles and relevant clinical information of patients with PC were obtained from TCGA and used to examine the genes associated with steroid hormone. The machine-learning algorithm was performed for key feature selection and signature construction. With the integrative bioinformatics analysis, an eight-gene signature, including CA2, CYP2E1, HSD17B, SSTR3, SULT1E1, TUBB3, UCN, and UGT2B7 was established. 
Patients with higher expression of this gene signature had worse progression-free interval in both univariate and multivariate cox models adjusted for clinical variables. The expression of the gene signatures also showed the aggressiveness consistently in two external cohorts, PCS and PAM50. Our findings demonstrated a validated eight-gene signature could successfully predict PC prognosis and regulate the steroid hormone pathway.
Affiliation(s)
- Yo-Liang Lai
  - Graduate Institute of Biomedical Science, China Medical University, Taichung 40403, Taiwan
  - Department of Radiation Oncology, China Medical University Hospital, Taichung 40403, Taiwan
- Chia-Hsin Liu
  - Research Center for Cancer Biology, China Medical University, Taichung 40403, Taiwan; (C.-H.L.); (Y.-C.C.)
- Shu-Chi Wang
  - Department of Medical Laboratory Science and Biotechnology, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
- Shu-Pin Huang
  - Department of Urology, School of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 80708, Taiwan; (S.-P.H.); (H.-C.Y.)
  - Department of Urology, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
  - Graduate Institute of Clinical Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
  - Ph.D. Program in Environmental and Occupational Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
- Yi-Chun Cho
  - Research Center for Cancer Biology, China Medical University, Taichung 40403, Taiwan; (C.-H.L.); (Y.-C.C.)
- Bo-Ying Bao
  - Department of Pharmacy, China Medical University, Taichung 40403, Taiwan
- Chia-Cheng Su
  - Department of Surgery, Division of Urology, Chi-Mei Medical Center, Tainan 71004, Taiwan
- Hsin-Chih Yeh
  - Department of Urology, School of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 80708, Taiwan; (S.-P.H.); (H.-C.Y.)
  - Department of Urology, Kaohsiung Municipal Ta-Tung Hospital, Kaohsiung 80145, Taiwan
- Cheng-Hsueh Lee
  - Department of Urology, Kaohsiung Medical University Hospital, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
- Pai-Chi Teng
  - Taipei City Hospital Renai Branch, Taipei 106243, Taiwan
- Chih-Pin Chuu
  - Institute of Cellular and System Medicine, National Health Research Institutes, Miaoli 350401, Taiwan
- Deng-Neng Chen
  - Department of Management Information Systems, National Pingtung University of Science and Technology, Pingtung 912301, Taiwan
- Chia-Yang Li
  - Graduate Institute of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
  - Department of Medical Research, Kaohsiung Medical University Hospital, Kaohsiung 80756, Taiwan
  - Correspondence: (C.-Y.L.); (W.-C.C.)
- Wei-Chung Cheng
  - Graduate Institute of Biomedical Science, China Medical University, Taichung 40403, Taiwan
  - Department of Radiation Oncology, China Medical University Hospital, Taichung 40403, Taiwan
  - Ph.D. Program for Cancer Biology and Drug Discovery, China Medical University and Academia Sinica, Taichung 40403, Taiwan
  - Correspondence: (C.-Y.L.); (W.-C.C.)
|
7
|
Exploiting protein family and protein network data to identify novel drug targets for bladder cancer. Oncotarget 2022; 13:105-117. [PMID: 35035776 PMCID: PMC8758182 DOI: 10.18632/oncotarget.28175] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/01/2021] [Accepted: 12/08/2021] [Indexed: 12/11/2022]
Abstract
Bladder cancer remains one of the most common forms of cancer and yet there are limited small molecule targeted therapies. Here, we present a computational platform to identify new potential targets for bladder cancer therapy. Our method initially exploited a set of known driver genes for bladder cancer combined with predicted bladder cancer genes from mutationally enriched protein domain families. We enriched this initial set of genes using protein network data to identify a comprehensive set of 323 putative bladder cancer targets. Pathway and cancer hallmarks analyses highlighted putative mechanisms in agreement with those previously reported for this cancer and revealed protein network modules highly enriched in potential drivers likely to be good targets for targeted therapies. 21 of our potential drug targets are targeted by FDA approved drugs for other diseases — some of them are known drivers or are already being targeted for bladder cancer (FGFR3, ERBB3, HDAC3, EGFR). A further 4 potential drug targets were identified by inheriting drug mappings across our in-house CATH domain functional families (FunFams). Our FunFam data also allowed us to identify drug targets in families that are less prone to side effects i.e., where structurally similar protein domain relatives are less dispersed across the human protein network. We provide information on our novel potential cancer driver genes, together with information on pathways, network modules and hallmarks associated with the predicted and known bladder cancer drivers and we highlight those drivers we predict to be likely drug targets.
|
8
|
Walsh AT, Triant DA, Le Tourneau JJ, Shamimuzzaman M, Elsik CG. Hymenoptera Genome Database: new genomes and annotation datasets for improved go enrichment and orthologue analyses. Nucleic Acids Res 2021; 50:D1032-D1039. [PMID: 34747465 PMCID: PMC8728238 DOI: 10.1093/nar/gkab1018] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 09/02/2021] [Revised: 10/06/2021] [Accepted: 10/12/2021] [Indexed: 01/02/2023]
Abstract
We report an update of the Hymenoptera Genome Database (HGD; http://HymenopteraGenome.org), a genomic database of hymenopteran insect species. The number of species represented in HGD has nearly tripled, with fifty-eight hymenopteran species, including twenty bees, twenty-three ants, eleven wasps and four sawflies. With a reorganized website, HGD continues to provide the HymenopteraMine genomic data mining warehouse and JBrowse/Apollo genome browsers integrated with BLAST. We have computed Gene Ontology (GO) annotations for all species, greatly enhancing the GO annotation data gathered from UniProt with more than a ten-fold increase in the number of GO-annotated genes. We have also generated orthology datasets that encompass all HGD species and provide orthologue clusters for fourteen taxonomic groups. The new GO annotation and orthology data are available for searching in HymenopteraMine, and as bulk file downloads.
Affiliation(s)
- Amy T Walsh
  - Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
- Deborah A Triant
  - Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
- Md Shamimuzzaman
  - Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
- Christine G Elsik
  - Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
  - Division of Plant Science & Technology, University of Missouri, Columbia, MO 65211, USA
  - MU Institute for Data Science & Informatics, University of Missouri, Columbia, MO 65211, USA
|
9
|
Wei X, Zhang C, Freddolino PL, Zhang Y. Detecting Gene Ontology misannotations using taxon-specific rate ratio comparisons. Bioinformatics 2021; 36:4383-4388. [PMID: 32470107 DOI: 10.1093/bioinformatics/btaa548] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/03/2019] [Revised: 03/24/2020] [Accepted: 05/26/2020] [Indexed: 02/05/2023]
Abstract
MOTIVATION Many protein function databases are built on automated or semi-automated curations and can contain various annotation errors. The correction of such misannotations is critical to improving the accuracy and reliability of the databases. RESULTS We proposed a new approach to detect potentially incorrect Gene Ontology (GO) annotations by comparing the ratio of annotation rates (RAR) for the same GO term across different taxonomic groups, where those with a relatively low RAR usually correspond to incorrect annotations. As an illustration, we applied the approach to 20 commonly studied species in two recent UniProt-GOA releases and identified 250 potential misannotations in the 2018-11-6 release, where only 25% of them were corrected in the 2019-6-3 release. Importantly, 56% of the misannotations are 'Inferred from Biological aspect of Ancestor (IBA)' which is in contradiction with previous observations that attributed misannotations mainly to 'Inferred from Sequence or structural Similarity (ISS)', probably reflecting an error source shift due to the new developments of function annotation databases. The results demonstrated a simple but efficient misannotation detection approach that is useful for large-scale comparative protein function studies. AVAILABILITY AND IMPLEMENTATION https://zhanglab.ccmb.med.umich.edu/RAR. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
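The rate-ratio idea in this abstract can be illustrated with a small Python sketch. The abstract does not give the formula, so the definitions here are assumptions made for illustration only: the annotation rate is taken as the fraction of a taxon's proteins carrying a GO term, and the ratio divides that rate by the mean rate across a set of reference taxa. The function names `annotation_rates` and `rate_ratio` are hypothetical and are not taken from the authors' RAR tool.

```python
from collections import Counter


def annotation_rates(annotations, proteome_sizes):
    """Fraction of each taxon's proteins annotated with each GO term.

    annotations: iterable of (taxon, protein_id, go_term) tuples.
    proteome_sizes: dict mapping taxon -> number of proteins in that taxon.
    Assumed definition for illustration, not the published RAR formula.
    """
    unique = set(annotations)  # count each protein/term pair once
    counts = Counter((taxon, term) for taxon, _protein, term in unique)
    return {
        (taxon, term): n / proteome_sizes[taxon]
        for (taxon, term), n in counts.items()
    }


def rate_ratio(rates, term, taxon, reference_taxa):
    """Rate of `term` in `taxon` divided by its mean rate in the reference taxa.

    Returns None when the term is absent from every reference taxon,
    because the ratio is then undefined.
    """
    reference = [rates.get((t, term), 0.0) for t in reference_taxa]
    reference_mean = sum(reference) / len(reference)
    if reference_mean == 0.0:
        return None
    return rates.get((taxon, term), 0.0) / reference_mean
```

Under these assumptions, a term annotated on 1% of one species' proteins but on 20% of the proteins in related species yields a ratio near 0.05, which is the kind of low value the abstract associates with a possible misannotation.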
Affiliation(s)
- Xiaoqiong Wei
  - State Key Laboratory of Biotherapy and Cancer Center/Collaborative Innovation Center of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
  - Department of Computational Medicine and Bioinformatics
- Peter L Freddolino
  - Department of Computational Medicine and Bioinformatics
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
|
10
|
Zhou G, Wang J, Zhang X, Guo M, Yu G. Predicting functions of maize proteins using graph convolutional network. BMC Bioinformatics 2020; 21:420. [PMID: 33323113 PMCID: PMC7739465 DOI: 10.1186/s12859-020-03745-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/27/2022]
Abstract
Background Maize (Zea mays ssp. mays L.) is the most widely grown and highest-yielding crop in the world, as well as an important model organism for fundamental research on gene function. The functions of maize proteins are annotated using the Gene Ontology (GO), which has more than 40000 terms and organizes them in a directed acyclic graph (DAG). It is a huge challenge to accurately assign relevant GO terms to a maize protein from such a large number of candidate terms. Some deep learning models have been proposed to predict protein function, but the effectiveness of these approaches is unsatisfactory. One major reason is that they inadequately utilize the GO hierarchy. Results To use the knowledge encoded in the GO hierarchy, we propose a deep Graph Convolutional Network (GCN) based model (DeepGOA) to predict GO annotations of proteins. DeepGOA first quantifies the correlations (or edges) between GO terms and updates the edge weights of the DAG by leveraging GO annotations and hierarchy, then learns the semantic representation and latent inter-relations of GO terms by applying GCN to the updated DAG. Meanwhile, a Convolutional Neural Network (CNN) is used to learn the feature representation of amino acid sequences with respect to the semantic representations. After that, DeepGOA computes the dot product of the two representations, which enables the whole network to be trained end-to-end. Extensive experiments show that DeepGOA can effectively integrate GO structural information and amino acid information, and then annotate proteins accurately. Conclusions Experiments on the maize PH207 inbred line and a human protein sequence dataset show that DeepGOA outperforms state-of-the-art deep learning based methods. The ablation study shows that GCN can employ the knowledge of GO and boost performance. Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=DeepGOA.
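The final scoring step described in this abstract, a dot product between a protein's sequence representation and each GO term's representation, can be sketched in plain Python. This is only a toy illustration of that one step, not the DeepGOA implementation: the vectors stand in for the CNN and GCN outputs, the sigmoid squashing is an assumption the abstract does not state, and the function name `score_terms` is hypothetical.

```python
import math


def score_terms(protein_vec, term_vecs):
    """Score each GO term for one protein.

    protein_vec: list of floats, standing in for a learned sequence embedding.
    term_vecs: dict mapping GO term ID -> list of floats of the same length,
        standing in for learned term embeddings.
    Returns a dict mapping GO term ID -> score in (0, 1).
    The sigmoid is an assumed choice for turning dot products into scores.
    """
    scores = {}
    for term, vec in term_vecs.items():
        dot = sum(p * t for p, t in zip(protein_vec, vec))
        scores[term] = 1.0 / (1.0 + math.exp(-dot))
    return scores
```

Because both representations feed a single differentiable score, a loss on these scores can in principle update the sequence encoder and the term encoder together, which is the end-to-end property the abstract refers to.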
Affiliation(s)
- Guangjie Zhou
  - School of Software, Shandong University, Jinan, China
  - College of Computer and Information Sciences, Chongqing, China
- Jun Wang
  - College of Computer and Information Sciences, Chongqing, China
- Xiangliang Zhang
  - CEMSE, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Maozu Guo
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Guoxian Yu
  - School of Software, Shandong University, Jinan, China
  - College of Computer and Information Sciences, Chongqing, China
  - CEMSE, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
|
11
|
Mishra SK, Muthye V, Kandoi G. Computational Methods for Predicting Functions at the mRNA Isoform Level. Int J Mol Sci 2020; 21:ijms21165686. [PMID: 32784445 PMCID: PMC7460821 DOI: 10.3390/ijms21165686] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/13/2020] [Revised: 08/05/2020] [Accepted: 08/06/2020] [Indexed: 11/16/2022]
Abstract
Multiple mRNA isoforms of the same gene are produced via alternative splicing, a biological mechanism that regulates protein diversity while maintaining genome size. Alternatively spliced mRNA isoforms of the same gene may sometimes have very similar sequence, but they can have significantly diverse effects on cellular function and regulation. The products of alternative splicing have important and diverse functional roles, such as response to environmental stress, regulation of gene expression, human heritable, and plant diseases. The mRNA isoforms of the same gene can have dramatically different functions. Despite the functional importance of mRNA isoforms, very little has been done to annotate their functions. The recent years have however seen the development of several computational methods aimed at predicting mRNA isoform level biological functions. These methods use a wide array of proteo-genomic data to develop machine learning-based mRNA isoform function prediction tools. In this review, we discuss the computational methods developed for predicting the biological function at the individual mRNA isoform level.
|
12
|
Abstract
MOTIVATION With the ever-increasing number and diversity of sequenced species, the challenge to characterize genes with functional information is even more important. In most species, this characterization almost entirely relies on automated electronic methods. As such, it is critical to benchmark the various methods. The Critical Assessment of protein Function Annotation algorithms (CAFA) series of community experiments provide the most comprehensive benchmark, with a time-delayed analysis leveraging newly curated experimentally supported annotations. However, the definition of a false positive in CAFA has not fully accounted for the open world assumption (OWA), leading to a systematic underestimation of precision. The main reason for this limitation is the relative paucity of negative experimental annotations. RESULTS This article introduces a new, OWA-compliant, benchmark based on a balanced test set of positive and negative annotations. The negative annotations are derived from expert-curated annotations of protein families on phylogenetic trees. This approach results in a large increase in the average information content of negative annotations. The benchmark has been tested using the naïve and BLAST baseline methods, as well as two orthology-based methods. This new benchmark could complement existing ones in future CAFA experiments. AVAILABILITY AND IMPLEMENTATION All data, as well as code used for analysis, is available from https://lab.dessimoz.org/20_not. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
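The effect of the open world assumption on precision can be shown with a short Python sketch. It is a simplified illustration, not the benchmark's actual scoring code: predictions and annotations are modelled as sets of (protein, term) pairs, and the function names `closed_world_precision` and `owa_precision` are hypothetical.

```python
def closed_world_precision(predicted, positives):
    """Precision when every prediction lacking a positive annotation
    is counted as a false positive."""
    if not predicted:
        return None
    return len(predicted & positives) / len(predicted)


def owa_precision(predicted, positives, negatives):
    """Precision when only predictions contradicted by an explicit
    negative annotation are counted as false positives.

    Predictions with no annotation either way are treated as unknown
    and left out of the calculation.
    """
    true_pos = len(predicted & positives)
    false_pos = len(predicted & negatives)
    if true_pos + false_pos == 0:
        return None
    return true_pos / (true_pos + false_pos)
```

With three predictions for one protein, of which one is a known positive, one a known negative, and one unannotated, the closed-world figure is 1/3 while the open-world figure is 1/2. The gap comes entirely from the unannotated prediction, which is the underestimation the abstract describes.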
Affiliation(s)
- Alex Warwick Vesztrocy
  - Department of Genetics, Evolution and Environment, University College London, London, WC1E 6BT, UK
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
- Christophe Dessimoz
  - Department of Genetics, Evolution and Environment, University College London, London, WC1E 6BT, UK
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland
  - Department of Computer Science, University College London, London, WC1E 6BT, UK
  - Centre for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
|
13
|
Palmer LD, Jordan AT, Maloney KN, Farrow MA, Gutierrez DB, Gant-Branum R, Burns WJ, Romer CE, Tsui T, Allen JL, Beavers WN, Nei YW, Sherrod SD, Lacy DB, Norris JL, McLean JA, Caprioli RM, Skaar EP. Zinc intoxication induces ferroptosis in A549 human lung cells. Metallomics 2020; 11:982-993. [PMID: 30968088 DOI: 10.1039/c8mt00360b] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 12/30/2022]
Abstract
Zinc (Zn) is an essential trace metal required for all forms of life, but is toxic at high concentrations. While the toxic effects of high levels of Zn are well documented, the mechanism of cell death appears to vary based on the study and concentration of Zn. Zn has been proposed as an anti-cancer treatment against non-small cell lung cancer (NSCLC). The goal of this analysis was to determine the effects of Zn on metabolism and cell death in A549 cells. Here, high throughput multi-omics analysis identified the molecular effects of Zn intoxication on the proteome, metabolome, and transcriptome of A549 human NSCLC cells after 5 min to 24 h of Zn exposure. Multi-omics analysis combined with additional experimental evidence suggests Zn intoxication induces ferroptosis, an iron and lipid peroxidation-dependent programmed cell death, demonstrating the utility of multi-omics analysis to identify cellular response to intoxicants.
Affiliation(s)
- Lauren D Palmer
  - Vanderbilt Institute for Infection, Immunology and Inflammation and Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN 37232, USA.
14
Zhao Y, Wang J, Chen J, Zhang X, Guo M, Yu G. A Literature Review of Gene Function Prediction by Modeling Gene Ontology. Front Genet 2020; 11:400. [PMID: 32391061 PMCID: PMC7193026 DOI: 10.3389/fgene.2020.00400] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 01/02/2020] [Accepted: 03/30/2020] [Indexed: 12/14/2022] Open
Abstract
Annotating the functional properties of gene products, i.e., RNAs and proteins, is a fundamental task in biology. The Gene Ontology database (GO) was developed to systematically describe the functional properties of gene products across species, and to facilitate the computational prediction of gene function. As GO is routinely updated, it serves as the gold standard and main knowledge source in functional genomics. Many gene function prediction methods making use of GO have been proposed. But no literature review has summarized these methods and the possibilities for future efforts from the perspective of GO. To bridge this gap, we review the existing methods with an emphasis on recent solutions. First, we introduce the conventions of GO and the widely adopted evaluation metrics for gene function prediction. Next, we summarize current methods of gene function prediction that apply GO in different ways, such as using hierarchical or flat inter-relationships between GO terms, compressing massive GO terms and quantifying semantic similarities. Although many efforts have improved performance by harnessing GO, we conclude that there remain many largely overlooked but important topics for future research.
Affiliation(s)
- Yingwen Zhao
  - College of Computer and Information Science, Southwest University, Chongqing, China
- Jun Wang
  - College of Computer and Information Science, Southwest University, Chongqing, China
- Jian Chen
  - State Key Laboratory of Agrobiotechnology and National Maize Improvement Center, China Agricultural University, Beijing, China
- Xiangliang Zhang
  - CBRC, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Maozu Guo
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Guoxian Yu
  - College of Computer and Information Science, Southwest University, Chongqing, China
  - CBRC, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
15
Yu G, Wang K, Fu G, Guo M, Wang J. NMFGO: Gene Function Prediction via Nonnegative Matrix Factorization with Gene Ontology. IEEE/ACM Trans Comput Biol Bioinform 2020; 17:238-249. [PMID: 30059316 DOI: 10.1109/tcbb.2018.2861379] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 06/08/2023]
Abstract
Gene Ontology (GO) is a controlled vocabulary of terms that describe the molecular functions, biological roles, and cellular locations of gene products (i.e., proteins and RNAs); it hierarchically organizes more than 43,000 GO terms in a directed acyclic graph. A gene is generally annotated with several of these GO terms, so accurately predicting associations between genes and this massive set of terms is a difficult challenge. To address this challenge, we propose a matrix factorization-based approach called NMFGO. NMFGO stores the available GO annotations of genes in a gene-term association matrix and adopts an ontological structure-based taxonomic similarity measure to capture the GO hierarchy. Next, it factorizes the association matrix into two low-rank matrices via nonnegative matrix factorization regularized with the GO hierarchy. After that, it employs a semantic similarity-based k nearest neighbor classifier in the subspace approximated by the low-rank matrices to predict gene functions. An empirical study on three model species (S. cerevisiae, H. sapiens, and A. thaliana) shows that NMFGO is robust to the input parameters and achieves significantly better prediction performance than GIC, TO, dRW-kNN, and NtN, which were re-implemented based on the instructions of the original papers. The supplementary file and demo codes of NMFGO are available at http://mlda.swu.edu.cn/codes.php?name=NMFGO.
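The factorization step named in this abstract can be sketched with plain nonnegative matrix factorization. The Python below is a hedged illustration only, not the NMFGO implementation: it uses the standard Lee-Seung multiplicative updates, leaves out the GO-hierarchy regularizer and the k nearest neighbor classifier, and all function names and the toy gene-term matrix are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(a, rank, iters=200, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix a (genes x terms) as w @ h.

    Uses unregularized multiplicative updates; w is genes x rank and
    h is rank x terms, both kept nonnegative throughout.
    """
    rng = random.Random(seed)
    n, m = len(a), len(a[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def frobenius_error(a, w, h):
    """Frobenius norm of the reconstruction residual a - w @ h."""
    approx = matmul(w, h)
    return sum((x - y) ** 2 for ra, rb in zip(a, approx) for x, y in zip(ra, rb)) ** 0.5


# Hypothetical toy data: 4 genes x 4 GO terms, 1 = known annotation.
annotations = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
w, h = nmf(annotations, rank=2)
scores = matmul(w, h)  # reconstructed gene-term association scores
print("reconstruction error:", round(frobenius_error(annotations, w, h), 4))
```

Entries of `scores` that are high where `annotations` holds a zero could be read as candidate annotations. In the method the abstract describes, prediction is instead made by a semantic similarity-based kNN classifier in the low-rank subspace.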
16
Almehmadi KA, Tsilioni I, Theoharides TC. Increased Expression of miR‐155p5 in Amygdala of Children With Autism Spectrum Disorder. Autism Res 2019; 13:18-23. [DOI: 10.1002/aur.2205] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 05/21/2019] [Revised: 08/19/2019] [Accepted: 08/24/2019] [Indexed: 01/23/2023]
Affiliation(s)
- Khulood Abdullah Almehmadi
  - Graduate Program in Pharmacology and Drug Development, Sackler School of Graduate Biomedical Sciences, Tufts University, Boston, Massachusetts
  - Molecular Immunopharmacology and Drug Discovery Laboratory, Department of Immunology, Tufts University School of Medicine, Boston, Massachusetts
  - Department of Pharmacology, Faculty of Pharmacy, King Abdulaziz University, Jeddah, Saudi Arabia
- Irene Tsilioni
  - Molecular Immunopharmacology and Drug Discovery Laboratory, Department of Immunology, Tufts University School of Medicine, Boston, Massachusetts
- Theoharis C. Theoharides
  - Graduate Program in Pharmacology and Drug Development, Sackler School of Graduate Biomedical Sciences, Tufts University, Boston, Massachusetts
  - Molecular Immunopharmacology and Drug Discovery Laboratory, Department of Immunology, Tufts University School of Medicine, Boston, Massachusetts
  - Department of Internal Medicine, Tufts University School of Medicine and Tufts Medical Center, Boston, Massachusetts
17
Chauhan S, Ahmad S. Enabling full‐length evolutionary profiles based deep convolutional neural network for predicting DNA‐binding proteins from sequence. Proteins 2019; 88:15-30. [DOI: 10.1002/prot.25763] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 02/10/2019] [Revised: 06/01/2019] [Accepted: 06/15/2019] [Indexed: 12/22/2022]
Affiliation(s)
- Sucheta Chauhan
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
- Shandar Ahmad
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
18
Tian T, Liu Y, Yan H, You Q, Yi X, Du Z, Xu W, Su Z. agriGO v2.0: a GO analysis toolkit for the agricultural community, 2017 update. Nucleic Acids Res 2017; 45:W122-W129. [PMID: 28472432 PMCID: PMC5793732 DOI: 10.1093/nar/gkx382] [Citation(s) in RCA: 1390] [Impact Index Per Article: 278.0] [Received: 01/14/2017] [Accepted: 04/25/2017] [Indexed: 01/30/2023] Open
Abstract
The agriGO platform, which has been serving the scientific community for >10 years, specifically focuses on gene ontology (GO) enrichment analyses of plant and agricultural species. We continuously maintain and update the databases and accommodate the various requests of our global users. Here, we present our updated agriGO, which supports a greatly expanded number of species (394) and datatypes (865). In addition, a larger number of species have been classified into groups covering crops, vegetables, fish, birds and insects closely related to the agricultural community. We further improved the computational efficiency, including the batch analysis and P-value distribution (PVD), and the user-friendliness of the web pages. More visualization features were added to the platform, including SEACOMPARE (cross comparison of singular enrichment analysis), directed acyclic graph (DAG) and Scatter Plots, which can be merged by choosing any significant GO term. The updated platform agriGO v2.0 is now publicly accessible at http://systemsbiology.cau.edu.cn/agriGOv2/.
Affiliation(s)
- Tian Tian
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Yue Liu
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Hengyu Yan
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Qi You
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xin Yi
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Zhou Du
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Wenying Xu
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Zhen Su
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
19
Identification of Proteins Differentially Expressed by Adipose-derived Mesenchymal Stem Cells Isolated from Immunodeficient Mice. Int J Mol Sci 2019; 20:ijms20112672. [PMID: 31151297 PMCID: PMC6600271 DOI: 10.3390/ijms20112672] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/16/2019] [Revised: 05/28/2019] [Accepted: 05/28/2019] [Indexed: 12/13/2022] Open
Abstract
Although cell therapy using adipose-derived mesenchymal stem cells (AdMSCs) regulates immunity, the degree to which cell quality and function are affected by differences in immunodeficiency of donors is unknown. We used liquid chromatography tandem-mass spectrometry (LC-MS/MS) to identify the proteins expressed by mouse AdMSCs (mAdMSCs) isolated from normal (C57BL/6) mice and mice with severe combined immunodeficiency (SCID). The protein expression profiles of each strain were 98%–100% identical, indicating that the expression levels of major proteins potentially associated with the therapeutic effects of mAdMSCs were highly similar. Further, comparable levels of cell surface markers (CD44, CD90.2) were detected using flow cytometry or LC-MS/MS. MYH9, ACTN1, CANX, GPI, TPM1, EPRS, ITGB1, ANXA3, CNN2, MAPK1, PSME2, CTPS1, OTUB1, PSMB6, HMGB1, RPS19, SEC61A1, CTNNB1, GLO1, RPL22, PSMA2, SYNCRIP, PRDX3, SAMHD1, TCAF2, MAPK3, RPS24, and MYO1E, which are associated with immunity, were expressed at higher levels by the SCID mAdMSCs compared with the C57BL/6 mAdMSCs. In contrast, ANXA9, PCBP2, LGALS3, PPP1R14B, and PSMA6, which are also associated with immunity, were more highly expressed by C57BL/6 mAdMSCs than SCID mAdMSCs. These findings implicate these two sets of proteins in the pathogenesis and maintenance of immunodeficiency.
20
Mangul S, Martin LS, Hill BL, Lam AKM, Distler MG, Zelikovsky A, Eskin E, Flint J. Systematic benchmarking of omics computational tools. Nat Commun 2019; 10:1393. [PMID: 30918265 PMCID: PMC6437167 DOI: 10.1038/s41467-019-09406-4] [Citation(s) in RCA: 86] [Impact Index Per Article: 17.2] [Received: 07/23/2018] [Accepted: 03/06/2019] [Indexed: 01/11/2023] Open
Abstract
Computational omics methods packaged as software have become essential to modern biological research. The increasing dependence of scientists on these powerful software tools creates a need for systematic assessment of these methods, known as benchmarking. Adopting a standardized benchmarking practice could help researchers who use omics data to better leverage recent technological innovations. Our review summarizes benchmarking practices from 25 recent studies and discusses the challenges, advantages, and limitations of benchmarking across various domains of biology. We also propose principles that can make computational biology benchmarking studies more sustainable and reproducible, ultimately increasing the transparency of biomedical data and results. Benchmarking studies are important for comprehensively understanding and evaluating different computational omics methods. Here, the authors review practices from 25 recent studies and propose principles to improve the quality of benchmarking studies.
Affiliation(s)
- Serghei Mangul
  - Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA, 90095, USA
  - Institute for Quantitative and Computational Biosciences, University of California Los Angeles, 611 Charles E Young Drive East, Los Angeles, CA, 90095, USA
- Lana S Martin
  - Institute for Quantitative and Computational Biosciences, University of California Los Angeles, 611 Charles E Young Drive East, Los Angeles, CA, 90095, USA
- Brian L Hill
  - Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA, 90095, USA
- Angela Ka-Mei Lam
  - Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA, 90095, USA
- Margaret G Distler
  - Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Alex Zelikovsky
  - Department of Computer Science, Georgia State University, Atlanta, GA, 30303, USA
  - The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
- Eleazar Eskin
  - Department of Computer Science, University of California Los Angeles, 580 Portola Plaza, Los Angeles, CA, 90095, USA
  - Department of Human Genetics, University of California Los Angeles, 695 Charles E. Young, Los Angeles, CA, USA
- Jonathan Flint
  - Department of Psychiatry and Biobehavioral Sciences, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA
21
Tamary E, Nevo R, Naveh L, Levin‐Zaidman S, Kiss V, Savidor A, Levin Y, Eyal Y, Reich Z, Adam Z. Chlorophyll catabolism precedes changes in chloroplast structure and proteome during leaf senescence. Plant Direct 2019; 3:e00127. [PMID: 31245770 PMCID: PMC6508775 DOI: 10.1002/pld3.127] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 01/09/2019] [Revised: 02/25/2019] [Accepted: 02/26/2019] [Indexed: 05/18/2023]
Abstract
The earliest visual changes of leaf senescence occur in the chloroplast as chlorophyll is degraded and photosynthesis declines. Yet, a comprehensive understanding of the sequence of catabolic events occurring in chloroplasts during natural leaf senescence is still missing. Here, we combined confocal and electron microscopy together with proteomics and biochemistry to follow structural and molecular changes during Arabidopsis leaf senescence. We observed that initiation of chlorophyll catabolism precedes other breakdown processes. Chloroplast size, stacking of thylakoids, and efficiency of PSII remain stable until late stages of senescence, whereas the number and size of plastoglobules increase. Unlike catabolic enzymes, whose level increase, the level of most proteins decreases during senescence, and chloroplast proteins are overrepresented among these. However, the rate of their disappearance is variable, mostly uncoordinated and independent of their inherent stability during earlier developmental stages. Unexpectedly, degradation of chlorophyll-binding proteins lags behind chlorophyll catabolism. Autophagy and vacuole proteins are retained at relatively high levels, highlighting the role of extra-plastidic degradation processes especially in late stages of senescence. The observation that chlorophyll catabolism precedes all other catabolic events may suggest that this process enables or signals further catabolic processes in chloroplasts.
Affiliation(s)
- Eyal Tamary
  - The Robert H. Smith Institute of Plant Sciences and Genetics in Agriculture, The Hebrew University, Rehovot, Israel
- Reinat Nevo
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Leah Naveh
  - The Robert H. Smith Institute of Plant Sciences and Genetics in Agriculture, The Hebrew University, Rehovot, Israel
- Smadar Levin‐Zaidman
  - Department of Chemical Research Support, Weizmann Institute of Science, Rehovot, Israel
- Vladimir Kiss
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Alon Savidor
  - de Botton Institute for Protein Profiling, The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot, Israel
- Yishai Levin
  - de Botton Institute for Protein Profiling, The Nancy and Stephen Grand Israel National Center for Personalized Medicine, Weizmann Institute of Science, Rehovot, Israel
- Yoram Eyal
  - Institute of Plant Sciences, The Volcani Center ARO, Rishon LeZion, Israel
- Ziv Reich
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Zach Adam
  - The Robert H. Smith Institute of Plant Sciences and Genetics in Agriculture, The Hebrew University, Rehovot, Israel
22
The temporal profile of activity-dependent presynaptic phospho-signalling reveals long-lasting patterns of poststimulus regulation. PLoS Biol 2019; 17:e3000170. [PMID: 30822303 PMCID: PMC6415872 DOI: 10.1371/journal.pbio.3000170] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Revised: 03/13/2019] [Indexed: 12/23/2022] Open
Abstract
Depolarization of presynaptic terminals stimulates calcium influx, which evokes neurotransmitter release and activates phosphorylation-based signalling. Here, we present the first global temporal profile of presynaptic activity-dependent phospho-signalling, which includes two KCl stimulation levels and analysis of the poststimulus period. We profiled 1,917 regulated phosphopeptides and bioinformatically identified six temporal patterns of co-regulated proteins. The presynaptic proteins with large changes in phospho-status were again prominently regulated in the analysis of 7,070 activity-dependent phosphopeptides from KCl-stimulated cultured hippocampal neurons. Active zone scaffold proteins showed a high level of activity-dependent phospho-regulation that far exceeded the response from postsynaptic density scaffold proteins. Accordingly, bassoon was identified as the major target of neuronal phospho-signalling. We developed a probabilistic computational method, KinSwing, which matched protein kinase substrate motifs to regulated phosphorylation sites to reveal underlying protein kinase activity. This approach allowed us to link protein kinases to profiles of co-regulated presynaptic protein networks. Ca2+- and calmodulin-dependent protein kinase IIα (CaMKIIα) responded rapidly, scaled with stimulus strength, and had long-lasting activity. Mitogen-activated protein kinase (MAPK)/extracellular signal–regulated kinase (ERK) was the main protein kinase predicted to control a distinct and significant pattern of poststimulus up-regulation of phosphorylation. This work provides a unique resource of activity-dependent phosphorylation sites of synaptosomes and neurons, the vast majority of which have not been investigated with regard to their functional impact. This resource will enable detailed characterization of the phospho-regulated mechanisms impacting the plasticity of neurotransmitter release. 
Analysis of activity-dependent phosphorylation-based signalling in synaptosomes revealed six patterns of long-lasting presynaptic regulation from 1,917 phosphopeptides. The authors identified patterns most likely to be regulated by CamKII and MAPK/ERK and showed the active zone scaffold protein bassoon to be a major signalling target. Neurobiological processes are altered by linking neuronal activity to regulated changes in protein phosphorylation levels that influence protein function. Although some of the major targets of activity-dependent phospho-signalling have been identified, a large number of substrates remain unknown. Here, we have screened systematically for these substrates and extended the list from hundreds to thousands of phosphorylation sites, thereby providing a new depth of understanding. We monitored phospho-signalling for 15 min after the stimulation, which to our knowledge had not been attempted at a large scale. We focused on presynaptic protein substrates of phospho-signalling by isolating the presynaptic terminal. We also stimulated hippocampal neurons but did not monitor the poststimulus. Although the phospho-signalling is immensely complex, the findings could be simplified through data exploration. We identified distinct patterns of presynaptic phospho-regulation across the time course that may constitute co-regulated protein networks. In addition, we found a subset of proteins that had many more phosphorylation sites than the average and high-magnitude responses, implying major signalling or functional roles for these proteins. We also determined the likely protein kinases with the strongest responses to the stimulus at different times using KinSwing, a computational tool that we developed. This resource reveals a new depth of activity-dependent phospho-signalling and identifies major signalling targets, major protein kinases, and co-regulated phosphoprotein networks.
23
Kramarz B, Roncaglia P, Meldal BHM, Huntley RP, Martin MJ, Orchard S, Parkinson H, Brough D, Bandopadhyay R, Hooper NM, Lovering RC. Improving the Gene Ontology Resource to Facilitate More Informative Analysis and Interpretation of Alzheimer's Disease Data. Genes (Basel) 2018; 9:E593. [PMID: 30501127 PMCID: PMC6315915 DOI: 10.3390/genes9120593] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 10/31/2018] [Revised: 11/22/2018] [Accepted: 11/23/2018] [Indexed: 12/28/2022] Open
Abstract
The analysis and interpretation of high-throughput datasets relies on access to high-quality bioinformatics resources, as well as processing pipelines and analysis tools. Gene Ontology (GO, geneontology.org) is a major resource for gene enrichment analysis. The aim of this project, funded by the Alzheimer's Research United Kingdom (ARUK) foundation and led by the University College London (UCL) biocuration team, was to enhance the GO resource by developing new neurological GO terms, and use GO terms to annotate gene products associated with dementia. Specifically, proteins and protein complexes relevant to processes involving amyloid-beta and tau have been annotated and the resulting annotations are denoted in GO databases as 'ARUK-UCL'. Biological knowledge presented in the scientific literature was captured through the association of GO terms with dementia-relevant protein records; GO itself was revised, and new GO terms were added. This literature biocuration increased the number of Alzheimer's-relevant gene products that were being associated with neurological GO terms, such as 'amyloid-beta clearance' or 'learning or memory', as well as neuronal structures and their compartments. Of the total 2055 annotations that we contributed for the prioritised gene products, 526 have associated proteins and complexes with neurological GO terms. To ensure that these descriptive annotations could be provided for Alzheimer's-relevant gene products, over 70 new GO terms were created. Here, we describe how the improvements in ontology development and biocuration resulting from this initiative can benefit the scientific community and enhance the interpretation of dementia data.
Affiliation(s)
- Barbara Kramarz
  - UCL Institute of Cardiovascular Science, University College London, Rayne Building, 5 University Street, London WC1E 6JF, UK.
- Paola Roncaglia
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
- Birgit H M Meldal
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
- Rachael P Huntley
  - UCL Institute of Cardiovascular Science, University College London, Rayne Building, 5 University Street, London WC1E 6JF, UK.
- Maria J Martin
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
- Sandra Orchard
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
- Helen Parkinson
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
- David Brough
  - Division of Neuroscience and Experimental Psychology, School of Biological Sciences, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, University of Manchester, AV Hill Building, Oxford Road, Manchester M13 9PT, UK.
- Rina Bandopadhyay
  - UCL Queen Square Institute of Neurology and Reta Lila Weston Institute of Neurological Studies, 1 Wakefield Street, London WC1N 1PJ, UK.
- Nigel M Hooper
  - Division of Neuroscience and Experimental Psychology, School of Biological Sciences, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, University of Manchester, AV Hill Building, Oxford Road, Manchester M13 9PT, UK.
- Ruth C Lovering
  - UCL Institute of Cardiovascular Science, University College London, Rayne Building, 5 University Street, London WC1E 6JF, UK.
24
Nahar S, Nakashima Y, Miyagi-Shiohira C, Kinjo T, Toyoda Z, Kobayashi N, Saitoh I, Watanabe M, Noguchi H, Fujita J. Cytokines in adipose-derived mesenchymal stem cells promote the healing of liver disease. World J Stem Cells 2018; 10:146-159. [PMID: 30631390 PMCID: PMC6325075 DOI: 10.4252/wjsc.v10.i11.146] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 08/10/2018] [Revised: 09/07/2018] [Accepted: 10/11/2018] [Indexed: 02/06/2023] Open
Abstract
Adipose-derived mesenchymal stem cells (ADSCs) are a treatment cell source for patients with chronic liver injury. ADSCs are characterized by being harvested from the patient's own subcutaneous adipose tissue, a high cell yield (i.e., reduced immune rejection response), accumulation at a disease nidus, suppression of excessive immune response, production of various growth factors and cytokines, angiogenic effects, anti-apoptotic effects, and control of immune cells via cell-cell interaction. We previously showed that conditioned medium of ADSCs promoted hepatocyte proliferation and improved the liver function in a mouse model of acute liver failure. Furthermore, as found by many other groups, the administration of ADSCs improved liver tissue fibrosis in a mouse model of liver cirrhosis. A comprehensive protein expression analysis by liquid chromatography with tandem mass spectrometry showed that the various cytokines and chemokines produced by ADSCs promote the healing of liver disease. In this review, we examine the ability of expressed protein components of ADSCs to promote healing in cell therapy for liver disease. Previous studies demonstrated that ADSCs are a treatment cell source for patients with chronic liver injury. This review describes the various cytokines and chemokines produced by ADSCs that promote the healing of liver disease.
Affiliation(s)
- Saifun Nahar
  - Department of Infectious, Respiratory, and Digestive Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan
- Yoshiki Nakashima
  - Department of Regenerative Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan
- Chika Miyagi-Shiohira
  - Department of Regenerative Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan
- Takao Kinjo
  - Department of Basic Laboratory Sciences, School of Health Sciences in the Faculty of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan
- Zensei Toyoda
  - Department of Basic Laboratory Sciences, School of Health Sciences in the Faculty of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan
- Issei Saitoh
  - Division of Pediatric Dentistry, Graduate School of Medical and Dental Science, Niigata University, Niigata 951-8514, Japan
- Masami Watanabe
  - Department of Urology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Sciences, Okayama 700-8558, Japan
- Hirofumi Noguchi
  - Department of Regenerative Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan
- Jiro Fujita
  - Department of Infectious, Respiratory, and Digestive Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan
25
Nahar S, Nakashima Y, Miyagi-Shiohira C, Kinjo T, Kobayashi N, Saitoh I, Watanabe M, Noguchi H, Fujita J. A Comparison of Proteins Expressed between Human and Mouse Adipose-Derived Mesenchymal Stem Cells by a Proteome Analysis through Liquid Chromatography with Tandem Mass Spectrometry. Int J Mol Sci 2018; 19:E3497. [PMID: 30404232 PMCID: PMC6274862 DOI: 10.3390/ijms19113497] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/29/2018] [Revised: 10/27/2018] [Accepted: 11/04/2018] [Indexed: 12/20/2022] Open
Abstract
Adipose-derived mesenchymal stem cells (ADSCs) have become a common cell source for cell transplantation therapy. Clinical studies have used ADSCs to develop treatments for tissue fibrosis, such as liver cirrhosis and pulmonary fibroma. The need to examine and compare basic research data using clinical research data derived from mice and humans is expected to increase in the future. Here, to better characterize the cells, the protein components expressed by human ADSCs used for treatment, and mouse ADSCs used for research, were comprehensively analyzed by liquid chromatography with tandem mass spectrometry. We found that 92% (401 type proteins) of the proteins expressed by ADSCs in humans and mice were consistent. When classified by the protein functions in a gene ontology analysis, the items that differed by >5% between human and mouse ADSCs were "biological adhesion, locomotion" in biological processes, "plasma membrane" in cellular components, and "antioxidant activity, molecular transducer activity" in molecular functions. Most of the listed proteins were sensitive to cell isolation processes. These results show that the proteins expressed by human and murine ADSCs showed a high degree of correlation.
Affiliation(s)
- Saifun Nahar
- Department of Infectious, Respiratory and Digestive Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan.
| | - Yoshiki Nakashima
- Department of Regenerative Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan.
| | - Chika Miyagi-Shiohira
- Department of Regenerative Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan.
| | - Takao Kinjo
- Department of Basic Laboratory Sciences, School of Health Sciences in the Faculty of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan.
| | | | - Issei Saitoh
- Division of Pediatric Dentistry, Graduate School of Medical and Dental Science, Niigata University, Niigata 951-8514, Japan.
| | - Masami Watanabe
- Department of Urology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Sciences, Okayama 700-8558, Japan.
| | - Hirofumi Noguchi
- Department of Regenerative Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan.
| | - Jiro Fujita
- Department of Infectious, Respiratory and Digestive Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan.
| |
|
26
|
Alexander-Dann B, Pruteanu LL, Oerton E, Sharma N, Berindan-Neagoe I, Módos D, Bender A. Developments in toxicogenomics: understanding and predicting compound-induced toxicity from gene expression data. Mol Omics 2018; 14:218-236. [PMID: 29917034 PMCID: PMC6080592 DOI: 10.1039/c8mo00042e] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 02/16/2018] [Accepted: 05/08/2018] [Indexed: 12/12/2022]
Abstract
The toxicogenomics field aims to understand and predict toxicity by using 'omics' data in order to study systems-level responses to compound treatments. In recent years there has been a rapid increase in publicly available toxicological and 'omics' data, particularly gene expression data, and a corresponding development of methods for its analysis. In this review, we summarize recent progress relating to the analysis of RNA-Seq and microarray data, review relevant databases, and highlight recent applications of toxicogenomics data for understanding and predicting compound toxicity. These include the analysis of differentially expressed genes and their enrichment, signature matching, methods based on interaction networks, and the analysis of co-expression networks. In the future, these state-of-the-art methods will likely be combined with new technologies, such as whole human body models, to produce a comprehensive systems-level understanding of toxicity that reduces the necessity of in vivo toxicity assessment in animal models.
Affiliation(s)
- Benjamin Alexander-Dann
- University of Cambridge, Centre for Molecular Informatics, Department of Chemistry, Lensfield Road, Cambridge CB2 1EW, UK.
| | - Lavinia Lorena Pruteanu
- University of Cambridge, Centre for Molecular Informatics, Department of Chemistry, Lensfield Road, Cambridge CB2 1EW, UK.
- Babeş-Bolyai University, Institute for Doctoral Studies, 1 Kogălniceanu Street, Cluj-Napoca 400084, Romania
- University of Medicine and Pharmacy “Iuliu Haţieganu”, MedFuture Research Centre for Advanced Medicine, 23 Marinescu Street/4-6 Pasteur Street, Cluj-Napoca 400337, Romania
| | - Erin Oerton
- University of Cambridge, Centre for Molecular Informatics, Department of Chemistry, Lensfield Road, Cambridge CB2 1EW, UK.
| | - Nitin Sharma
- University of Cambridge, Centre for Molecular Informatics, Department of Chemistry, Lensfield Road, Cambridge CB2 1EW, UK.
| | - Ioana Berindan-Neagoe
- University of Medicine and Pharmacy “Iuliu Haţieganu”, MedFuture Research Centre for Advanced Medicine, 23 Marinescu Street/4-6 Pasteur Street, Cluj-Napoca 400337, Romania
- University of Medicine and Pharmacy “Iuliu Haţieganu”, Research Center for Functional Genomics, Biomedicine and Translational Medicine, 23 Marinescu Street, Cluj-Napoca 400337, Romania
- The Oncology Institute “Prof. Dr Ion Chiricuţă”, Department of Functional Genomics and Experimental Pathology, 34-36 Republicii Street, Cluj-Napoca 400015, Romania
| | - Dezső Módos
- University of Cambridge, Centre for Molecular Informatics, Department of Chemistry, Lensfield Road, Cambridge CB2 1EW, UK.
| | - Andreas Bender
- University of Cambridge, Centre for Molecular Informatics, Department of Chemistry, Lensfield Road, Cambridge CB2 1EW, UK.
| |
|
27
|
Jacobson M, Sedeño-Cortés AE, Pavlidis P. Monitoring changes in the Gene Ontology and their impact on genomic data analysis. Gigascience 2018; 7:5069393. [PMID: 30107399 PMCID: PMC6113503 DOI: 10.1093/gigascience/giy103] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/05/2018] [Revised: 07/27/2018] [Accepted: 08/06/2018] [Indexed: 01/01/2023] Open
Abstract
Background: The Gene Ontology (GO) is one of the most widely used resources in molecular and cellular biology, largely through the use of "enrichment analysis." To facilitate informed use of GO, we present GOtrack (https://gotrack.msl.ubc.ca), which provides access to historical records and trends in the GO and GO annotations. Findings: GOtrack gives users access to gene- and term-level information on annotations for nine model organisms as well as an interactive tool that measures the stability of enrichment results over time for user-provided "hit lists" of genes. To document the effects of GO evolution on enrichment, we analyzed more than 2,500 published hit lists of human genes (most older than 9 years); 53% of hit lists were considered to yield significantly stable enrichment results. Conclusions: Because stability is far from assured for any individual hit list, GOtrack can lead to more informed and cautious application of GO to genomics research.
Affiliation(s)
- Matthew Jacobson
- Michael Smith Laboratories, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
- Department of Psychiatry, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
| | - Adriana Estela Sedeño-Cortés
- Graduate Program in Bioinformatics, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
| | - Paul Pavlidis
- Michael Smith Laboratories, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
- Department of Psychiatry, 177 Michael Smith Laboratories, 2185 East Mall, University of British Columbia, Vancouver BC V6T1Z4
| |
|
28
|
Chiu CY, Chen TP, Chen JR, Wang CJ, Yin SY, Lai SH, Wong KS. Overexpression of matrix metalloproteinase-9 in adolescents with primary spontaneous pneumothorax for surgical intervention. J Thorac Cardiovasc Surg 2018; 156:2328-2336.e2. [PMID: 30033106 DOI: 10.1016/j.jtcvs.2018.05.083] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 06/24/2017] [Revised: 05/06/2018] [Accepted: 05/09/2018] [Indexed: 11/26/2022]
Abstract
OBJECTIVE To determine gene expression profiles associated with bullae formation in patients with primary spontaneous pneumothorax and to identify candidate genes associated with surgical intervention. METHODS Twenty-four adolescents with primary spontaneous pneumothorax were enrolled prospectively. A global gene expression analysis of 9 paired lung biopsies (lesion and normal adjacent sites) was performed to identify differentially expressed genes associated with spontaneous pneumothorax. Pathway and network analysis was performed using the Database for Annotation, Visualization and Integrated Discovery web tool. Candidate genes and encoding proteins were assessed in blood samples and compared between patients with pneumothorax and healthy control patients. RESULTS A total of 1519 differentially expressed transcripts corresponding to known genes were identified comparing the lesion lung with paired adjacent normal lung. The altered genes were mainly associated with focal adhesion and extracellular matrix-receptor interaction pathways. Genes involved in proteolysis and peptidase activity were up-regulated predominantly, especially matrix metalloproteinase-1 and -9 genes. Compared with the recovery stage, blood levels of matrix metalloproteinase-9/tissue inhibitor of metalloproteinase-1 were increased at the acute stage in patients with pneumothorax and, when compared between patients treated operatively with those treated nonoperatively, were also significantly greater. In addition, ratios of their serum levels were significantly greater in patients with pneumothorax compared with healthy control patients. Furthermore, matrix metalloproteinase-9 was predominantly overexpressed in neutrophils, alveolar macrophages, and mesothelial cells of lung biopsies. CONCLUSIONS An imbalance of cell-extracellular matrix interactions appears to be associated with primary spontaneous pneumothorax. 
Matrix metalloproteinase-9 overexpression may particularly play a role in contributing to pleural porosity for surgical intervention.
Affiliation(s)
- Chih-Yung Chiu
- Department of Pediatrics, Chang Gung Memorial Hospital at Keelung, College of Medicine, Chang Gung University, Taoyuan, Taiwan; Graduate Institute of Clinical Medical Sciences, College of Medicine, Chang Gung University, Taoyuan, Taiwan; Division of Pediatric Pulmonology, Department of Pediatrics, Chang Gung Memorial Hospital at Linkou, College of Medicine, Chang Gung University, Taoyuan, Taiwan
| | - Tzu-Ping Chen
- Graduate Institute of Clinical Medical Sciences, College of Medicine, Chang Gung University, Taoyuan, Taiwan; Department of Thoracic & Cardiovascular Surgery, Chang Gung Memorial Hospital at Keelung, Keelung City, Taiwan.
| | - Jim-Ray Chen
- Department of Pathology, Chang Gung Memorial Hospital at Keelung, College of Medicine, Chang Gung University, Taoyuan, Taiwan
| | - Chia-Jung Wang
- Department of Pediatrics, Chang Gung Memorial Hospital at Keelung, College of Medicine, Chang Gung University, Taoyuan, Taiwan
| | - Shun-Ying Yin
- Department of Thoracic & Cardiovascular Surgery, Chang Gung Memorial Hospital at Keelung, Keelung City, Taiwan
| | - Shen-Hao Lai
- Division of Pediatric Pulmonology, Department of Pediatrics, Chang Gung Memorial Hospital at Linkou, College of Medicine, Chang Gung University, Taoyuan, Taiwan
| | - Kin-Sun Wong
- Division of Pediatric Pulmonology, Department of Pediatrics, Chang Gung Memorial Hospital at Linkou, College of Medicine, Chang Gung University, Taoyuan, Taiwan.
| |
|
29
|
Nakashima Y, Nahar S, Miyagi-Shiohira C, Kinjo T, Kobayashi N, Saitoh I, Watanabe M, Fujita J, Noguchi H. A Liquid Chromatography with Tandem Mass Spectrometry-Based Proteomic Analysis of Cells Cultured in DMEM 10% FBS and Chemically Defined Medium Using Human Adipose-Derived Mesenchymal Stem Cells. Int J Mol Sci 2018; 19:ijms19072042. [PMID: 30011845 PMCID: PMC6073410 DOI: 10.3390/ijms19072042] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 05/31/2018] [Revised: 07/09/2018] [Accepted: 07/11/2018] [Indexed: 02/07/2023] Open
Abstract
Human adipose-derived mesenchymal stem cells (hADSCs) are representative cell sources for cell therapy. Classically, Dulbecco's Modified Eagle's medium (DMEM) containing 10% fetal bovine serum (FBS) has been used as culture medium for hADSCs. A chemically defined medium (CDM) containing no heterologous animal components has recently been used to produce therapeutic hADSCs. However, how the culture environment using a medium without FBS affects the protein expression of hADSC is unclear. We subjected hADSCs cultured in CDM and DMEM (10% FBS) to a protein expression analysis by tandem mass spectrometry liquid chromatography and noted 98.2% agreement in the proteins expressed by the CDM and DMEM groups. We classified 761 proteins expressed in both groups by their function in a gene ontology analysis. Thirty-one groups of proteins were classified as growth-related proteins in the CDM and DMEM groups, 16 were classified as antioxidant activity-related, 147 were classified as immune system process-related, 557 were involved in biological regulation, 493 were classified as metabolic process-related, and 407 were classified as related to stimulus responses. These results show that the trend in the expression of major proteins related to the therapeutic effect of hADSCs correlated strongly in both groups.
Affiliation(s)
- Yoshiki Nakashima
- Department of Regenerative Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan.
| | - Saifun Nahar
- Department of Infectious, Respiratory, and Digestive Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan.
| | - Chika Miyagi-Shiohira
- Department of Regenerative Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan.
| | - Takao Kinjo
- Department of Basic Laboratory Sciences, School of Health Sciences in the Faculty of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan.
| | | | - Issei Saitoh
- Division of Pediatric Dentistry, Graduate School of Medical and Dental Science, Niigata University, Niigata 951-8514, Japan.
| | - Masami Watanabe
- Department of Urology, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Sciences, Okayama 700-8558, Japan.
| | - Jiro Fujita
- Department of Infectious, Respiratory, and Digestive Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan.
| | - Hirofumi Noguchi
- Department of Regenerative Medicine, Graduate School of Medicine, University of the Ryukyus, Okinawa 903-0215, Japan.
| |
|
30
|
Al Manir MS, Brenas JH, Baker CJ, Shaban-Nejad A. A Surveillance Infrastructure for Malaria Analytics: Provisioning Data Access and Preservation of Interoperability. JMIR Public Health Surveill 2018; 4:e10218. [PMID: 29907554 PMCID: PMC6026300 DOI: 10.2196/10218] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 02/26/2018] [Revised: 04/16/2018] [Accepted: 05/08/2018] [Indexed: 12/19/2022] Open
Abstract
Background According to the World Health Organization, malaria surveillance is weakest in countries and regions with the highest malaria burden. A core obstacle is that the data required to perform malaria surveillance are fragmented in multiple data silos distributed across geographic regions. Furthermore, consistent integrated malaria data sources are few, and a low degree of interoperability exists between them. As a result, it is difficult to identify disease trends and to plan for effective interventions. Objective We propose the Semantics, Interoperability, and Evolution for Malaria Analytics (SIEMA) platform for use in malaria surveillance based on semantic data federation. Using this approach, it is possible to access distributed data, extend and preserve interoperability between multiple dynamic distributed malaria sources, and facilitate detection of system changes that can interrupt mission-critical global surveillance activities. Methods We used Semantic Automated Discovery and Integration (SADI) Semantic Web Services to enable data access and improve interoperability, and the graphical user interface-enabled semantic query engine HYDRA to implement the target queries typical of malaria programs. We implemented a custom algorithm to detect changes to community-developed terminologies, data sources, and services that are core to SIEMA. This algorithm reports to a dashboard. Valet SADI is used to mitigate the impact of changes by rebuilding affected services. Results We developed a prototype surveillance and change management platform from a combination of third-party tools, community-developed terminologies, and custom algorithms. We illustrated a methodology and core infrastructure to facilitate interoperable access to distributed data sources using SADI Semantic Web services. This degree of access makes it possible to implement complex queries needed by our user community with minimal technical skill. 
We implemented a dashboard that reports on terminology changes that can render the services inactive, jeopardizing system interoperability. Using this information, end users can control and reactively rebuild services to preserve interoperability and minimize service downtime. Conclusions We introduce a framework suitable for use in malaria surveillance that supports the creation of flexible surveillance queries across distributed data resources. The platform provides interoperable access to target data sources, is domain agnostic, and with updates to core terminological resources is readily transferable to other surveillance activities. A dashboard enables users to review changes to the infrastructure and invoke system updates. The platform significantly extends the range of functionalities offered by malaria information systems, beyond the state-of-the-art.
Affiliation(s)
| | - Jon Haël Brenas
- Oak Ridge National Laboratory Center for Biomedical Informatics, Department of Pediatrics, The University of Tennessee Health Science Center, Memphis, TN, United States
| | - Christopher Jo Baker
- Department of Computer Science, University of New Brunswick, Saint John, NB, Canada; IPSNP Computing Inc, Saint John, NB, Canada
| | - Arash Shaban-Nejad
- Oak Ridge National Laboratory Center for Biomedical Informatics, Department of Pediatrics, The University of Tennessee Health Science Center, Memphis, TN, United States
| |
|
31
|
You R, Huang X, Zhu S. DeepText2GO: Improving large-scale protein function prediction with deep semantic text representation. Methods 2018; 145:82-90. [PMID: 29883746 DOI: 10.1016/j.ymeth.2018.05.026] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 02/13/2018] [Revised: 04/30/2018] [Accepted: 05/31/2018] [Indexed: 11/16/2022] Open
Abstract
As of April 2018, UniProtKB has collected more than 115 million protein sequences. Less than 0.15% of these proteins, however, have been associated with experimental GO annotations. As such, the use of automatic protein function prediction (AFP) to reduce this huge gap becomes increasingly important. Previous studies conclude that sequence homology-based methods are highly effective in AFP. In addition, mining motif, domain, and functional information from protein sequences has been found very helpful for AFP. Beyond sequences, however, alternative information sources such as text may also be useful for AFP. Instead of using the bag-of-words (BOW) representation of traditional text-based AFP, we propose a new method called DeepText2GO that relies on deep semantic text representation, together with different kinds of available protein information such as sequence homology, families, domains, and motifs, to improve large-scale AFP. Furthermore, DeepText2GO integrates text-based methods with sequence-based ones by means of a consensus approach. Extensive experiments on the benchmark dataset extracted from UniProt/SwissProt have demonstrated that DeepText2GO significantly outperformed both text-based and sequence-based methods, validating its superiority.
Affiliation(s)
- Ronghui You
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, China; Center for Computational System Biology, ISTBI, Fudan University, Shanghai 200433, China
| | - Xiaodi Huang
- School of Computing and Mathematics, Charles Sturt University, Albury, NSW 2640, Australia
| | - Shanfeng Zhu
- School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, China; Center for Computational System Biology, ISTBI, Fudan University, Shanghai 200433, China.
| |
|
32
|
Genetic variation in 117 myelination-related genes in schizophrenia: Replication of association to lipid biosynthesis genes. Sci Rep 2018; 8:6915. [PMID: 29720671 PMCID: PMC5931982 DOI: 10.1038/s41598-018-25280-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 10/31/2017] [Accepted: 04/10/2018] [Indexed: 01/18/2023] Open
Abstract
Schizophrenia is a serious psychotic disorder with high heritability. Several common genetic variants, rare copy number variants and ultra-rare gene-disrupting mutations have been linked to disease susceptibility, but there is still a large gap between the estimated and explained heritability. Since several studies have indicated brain myelination abnormalities in schizophrenia, we aimed to examine whether variants in myelination-related genes could be associated with risk for schizophrenia. We established a set of 117 myelination genes by database searches and manual curation. We used a combination of GWAS (SCZ_N = 35,476; CTRL_N = 46,839), exome chip (SCZ_N = 269; CTRL_N = 336) and exome sequencing data (SCZ_N = 2,527; CTRL_N = 2,536) from schizophrenia cases and healthy controls to examine common and rare variants. We found that a subset of lipid-related genes was nominally associated with schizophrenia (p = 0.037), but this signal did not survive multiple testing correction (FWER = 0.16) and was mainly driven by the SREBF1 and SREBF2 genes that have already been linked to schizophrenia. Further analysis demonstrated that the lowest nominal p-values were p = 0.0018 for a single common variant (rs8539) and p = 0.012 for burden of rare variants (LRP1 gene), but none of them survived multiple testing correction. Our findings suggest that variation in myelination-related genes is not a major risk factor for schizophrenia.
|
33
|
Tomczak A, Mortensen JM, Winnenburg R, Liu C, Alessi DT, Swamy V, Vallania F, Lofgren S, Haynes W, Shah NH, Musen MA, Khatri P. Interpretation of biological experiments changes with evolution of the Gene Ontology and its annotations. Sci Rep 2018; 8:5115. [PMID: 29572502 PMCID: PMC5865181 DOI: 10.1038/s41598-018-23395-2] [Citation(s) in RCA: 72] [Impact Index Per Article: 12.0] [Received: 11/08/2017] [Accepted: 03/12/2018] [Indexed: 12/12/2022] Open
Abstract
Gene Ontology (GO) enrichment analysis is ubiquitously used for interpreting high throughput molecular data and generating hypotheses about underlying biological phenomena of experiments. However, the two building blocks of this analysis (the ontology and the annotations) evolve rapidly. We used gene signatures derived from 104 disease analyses to systematically evaluate how enrichment analysis results were affected by evolution of the GO over a decade. We found low consistency between enrichment analysis results obtained with early and more recent GO versions. Furthermore, there continues to be a strong annotation bias in the GO annotations, where 58% of the annotations are for 16% of the human genes. Our analysis suggests that GO evolution may have affected the interpretation and possibly reproducibility of experiments over time. Hence, researchers must exercise caution when interpreting GO enrichment analyses and should reexamine previous analyses with the most recent GO version.
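The sensitivity of enrichment results to annotation changes described in this abstract can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test, a common choice for GO enrichment but not necessarily the one used in the study; the gene identifiers, the two "releases" of a single term, and the function name `hypergeom_enrichment_p` are all hypothetical.

```python
from math import comb  # requires Python 3.8+

def hypergeom_enrichment_p(hits_in_term, hit_list_size, term_size, background_size):
    """Upper-tail hypergeometric probability P(X >= hits_in_term)."""
    upper = min(hit_list_size, term_size)
    total = comb(background_size, hit_list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, hit_list_size - k)
        for k in range(hits_in_term, upper + 1)
    )
    return tail / total

# Hypothetical toy data: 20 background genes and a 4-gene hit list.
background = {f"g{i}" for i in range(20)}
hit_list = {"g0", "g1", "g2", "g3"}

# The same GO term as annotated in two hypothetical releases.
term_old = {"g0", "g1", "g2", "g10"}           # early release
term_new = {"g0", "g10", "g11", "g12", "g13"}  # later release

def p_for(term_genes):
    # Count hit-list genes annotated to the term, then test for over-representation.
    overlap = len(hit_list & term_genes)
    return hypergeom_enrichment_p(
        overlap, len(hit_list), len(term_genes), len(background)
    )

p_old = p_for(term_old)
p_new = p_for(term_new)
print(f"early release p = {p_old:.4f}, later release p = {p_new:.4f}")
```

With these toy inputs the term passes a 0.05 threshold under the early annotations and fails it under the later ones, which is the kind of instability the abstract reports at scale.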
Affiliation(s)
- Aurelie Tomczak
- Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA; Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA
| | - Jonathan M Mortensen
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA
| | - Rainer Winnenburg
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA
| | - Charles Liu
- Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA
| | - Dominique T Alessi
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA
| | - Varsha Swamy
- Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA
| | - Francesco Vallania
- Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA
| | - Shane Lofgren
- Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA
| | - Winston Haynes
- Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA
| | - Nigam H Shah
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA
| | - Mark A Musen
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA
| | - Purvesh Khatri
- Stanford Institute for Immunity, Transplantation and Infection (ITI), Stanford University, Stanford, CA, 94305, USA; Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University, Stanford, CA, 94305, USA.
| |
|
34
|
Zhao Y, Fu G, Wang J, Guo M, Yu G. Gene function prediction based on Gene Ontology Hierarchy Preserving Hashing. Genomics 2018; 111:334-342. [PMID: 29477548 DOI: 10.1016/j.ygeno.2018.02.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 11/20/2017] [Revised: 02/02/2018] [Accepted: 02/16/2018] [Indexed: 12/27/2022]
Abstract
Gene Ontology (GO) uses structured vocabularies (or terms) to describe the molecular functions, biological roles, and cellular locations of gene products in a hierarchical ontology. GO annotations associate genes with GO terms and indicate that the given gene products carry out the biological functions described by the relevant terms. However, predicting correct GO annotations for genes from the massive set of terms defined by GO is a difficult challenge. To address this challenge, we introduce a Gene Ontology Hierarchy Preserving Hashing (HPHash) based semantic method for gene function prediction. HPHash first measures the taxonomic similarity between GO terms. It then uses a hierarchy preserving hashing technique to keep the hierarchical order between GO terms, and to optimize a series of hashing functions that encode massive GO terms via compact binary codes. After that, HPHash utilizes these hashing functions to project the gene-term association matrix into a low-dimensional one and performs semantic similarity based gene function prediction in the low-dimensional space. Experimental results on three model species (Homo sapiens, Mus musculus and Rattus norvegicus) for interspecies gene function prediction show that HPHash performs better than other related approaches and that it is robust to the number of hash functions. In addition, we also take HPHash as a plugin for BLAST based gene function prediction. In these experiments, HPHash again significantly improves the prediction performance. The code of HPHash is available at: http://mlda.swu.edu.cn/codes.php?name=HPHash.
Affiliation(s)
- Yingwen Zhao
- College of Computer and Information Science, Southwest University, Chongqing 400715, China
| | - Guangyuan Fu
- College of Computer and Information Science, Southwest University, Chongqing 400715, China
| | - Jun Wang
- College of Computer and Information Science, Southwest University, Chongqing 400715, China
| | - Maozu Guo
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China; Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing 100044, China.
| | - Guoxian Yu
- College of Computer and Information Science, Southwest University, Chongqing 400715, China.
| |
|
35
|
HashGO: hashing gene ontology for protein function prediction. Comput Biol Chem 2017; 71:264-273. [DOI: 10.1016/j.compbiolchem.2017.09.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 09/07/2017] [Accepted: 09/25/2017] [Indexed: 10/18/2022]
|
36
|
Yu G, Lu C, Wang J. NoGOA: predicting noisy GO annotations using evidences and sparse representation. BMC Bioinformatics 2017; 18:350. [PMID: 28732468 PMCID: PMC5521088 DOI: 10.1186/s12859-017-1764-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 05/02/2017] [Accepted: 07/14/2017] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Due to resources limitation, only a small portion of annotations are manually checked by curators, and the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently report that there are still considerable noisy (or incorrect) annotations. Given the wide application of annotations, however, how to identify noisy annotations is an important but yet seldom studied open problem. RESULTS We introduce a novel approach called NoGOA to predict noisy annotations. NoGOA applies sparse representation on the gene-term association matrix to reduce the impact of noisy annotations, and takes advantage of sparse representation coefficients to measure the semantic similarity between genes. Secondly, it preliminarily predicts noisy annotations of a gene based on aggregated votes from semantic neighborhood genes of that gene. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived on different periods, and then weights entries of the association matrix via estimated ratios and propagates weights to ancestors of direct annotations using GO hierarchy. Finally, it integrates evidence-weighted association matrix and aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. Taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods and removing noisy annotations improves the performance of gene function prediction. CONCLUSIONS The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations. 
Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA .
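The step in which weights are propagated from direct annotations to their ancestors in the GO hierarchy can be illustrated with a minimal sketch. This is not the NoGOA implementation: the function name, the toy term IDs, and the rule that an ancestor keeps the largest weight among its annotated descendants are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate_weights(direct, parents):
    """Propagate annotation weights from direct GO terms to all ancestors.

    direct:  dict mapping a GO term to the weight of a gene's direct annotation
    parents: dict mapping a GO term to the list of its parent terms
    Returns a dict with a weight for every direct term and every ancestor.
    """
    weights = defaultdict(float)
    for term, w in direct.items():
        stack, seen = [term], set()
        while stack:
            t = stack.pop()
            if t in seen:
                continue
            seen.add(t)
            # Assumption: an ancestor keeps the largest weight among its descendants.
            weights[t] = max(weights[t], w)
            stack.extend(parents.get(t, []))
    return dict(weights)


# Toy hierarchy with hypothetical term IDs: C -> B -> A, and D -> A.
parents = {"GO:C": ["GO:B"], "GO:B": ["GO:A"], "GO:D": ["GO:A"]}
direct = {"GO:C": 0.9, "GO:D": 0.4}
result = propagate_weights(direct, parents)
```

In this toy case the shared ancestor `GO:A` ends up with the weight of its most reliable descendant annotation; a different aggregation rule (sum, mean) could be substituted on the marked line.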
Affiliation(s)
- Guoxian Yu
  - College of Computer and Information Sciences, Southwest University, Chongqing, China
- Chang Lu
  - College of Computer and Information Sciences, Southwest University, Chongqing, China
- Jun Wang
  - College of Computer and Information Sciences, Southwest University, Chongqing, China
|
37
|
Brohi RD, Wang L, Hassine NB, Cao J, Talpur HS, Wu D, Huang CJ, Rehman ZU, Bhattarai D, Huo LJ. Expression, Localization of SUMO-1, and Analyses of Potential SUMOylated Proteins in Bubalus bubalis Spermatozoa. Front Physiol 2017; 8:354. [PMID: 28659810 PMCID: PMC5468435 DOI: 10.3389/fphys.2017.00354] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/16/2017] [Accepted: 05/15/2017] [Indexed: 11/19/2022]
Abstract
Mature spermatozoa have highly condensed DNA that is essentially silent both transcriptionally and translationally. Therefore, post-translational modifications are very important for regulating sperm motility, morphology, and for male fertility in general. Protein sumoylation was recently demonstrated in human and rodent spermatozoa, with potential consequences for sperm motility and DNA integrity. We examined the expression and localization of small ubiquitin-related modifier-1 (SUMO-1) in the sperm of water buffalo (Bubalus bubalis) using immunofluorescence analysis. We confirmed the expression of SUMO-1 in the acrosome. We further found that SUMO-1 was lost if the acrosome reaction was induced by calcium ionophore A23187. Proteins modified or conjugated by SUMO-1 in water buffalo sperm were pulled down and analyzed by mass spectrometry. Sixty proteins were identified, including proteins important for sperm morphology and motility, such as relaxin receptors and cytoskeletal proteins, including tubulin chains, actins, and dyneins. Forty-six proteins were predicted as potential sumoylation targets. The expression of SUMO-1 in the acrosome region of water buffalo sperm and the identification of potentially SUMOylated proteins important for sperm function implicate sumoylation as a crucial PTM related to sperm function.
Affiliation(s)
- Rahim Dad Brohi
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Education Ministry of China, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Department of Hubei Province's Engineering Research Center in Buffalo Breeding and Products, Wuhan, China
- Li Wang
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Education Ministry of China, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Department of Hubei Province's Engineering Research Center in Buffalo Breeding and Products, Wuhan, China
- Jing Cao
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Education Ministry of China, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Department of Hubei Province's Engineering Research Center in Buffalo Breeding and Products, Wuhan, China
- Hira Sajjad Talpur
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Education Ministry of China, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Department of Hubei Province's Engineering Research Center in Buffalo Breeding and Products, Wuhan, China
- Di Wu
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Education Ministry of China, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Department of Hubei Province's Engineering Research Center in Buffalo Breeding and Products, Wuhan, China
- Chun-Jie Huang
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Education Ministry of China, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Department of Hubei Province's Engineering Research Center in Buffalo Breeding and Products, Wuhan, China
- Zia-Ur Rehman
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Education Ministry of China, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Department of Hubei Province's Engineering Research Center in Buffalo Breeding and Products, Wuhan, China
- Dinesh Bhattarai
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Education Ministry of China, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Department of Hubei Province's Engineering Research Center in Buffalo Breeding and Products, Wuhan, China
- Li-Jun Huo
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Education Ministry of China, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Department of Hubei Province's Engineering Research Center in Buffalo Breeding and Products, Wuhan, China
|
38
|
Koç I, Caetano-Anollés G. The natural history of molecular functions inferred from an extensive phylogenomic analysis of gene ontology data. PLoS One 2017; 12:e0176129. [PMID: 28467492 PMCID: PMC5414959 DOI: 10.1371/journal.pone.0176129] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 08/02/2016] [Accepted: 04/05/2017] [Indexed: 11/18/2022]
Abstract
The origin and natural history of molecular functions hold the key to the emergence of cellular organization and modern biochemistry. Here we use a genomic census of Gene Ontology (GO) terms to reconstruct phylogenies at the three highest (1, 2 and 3) and the lowest (terminal) levels of the hierarchy of molecular functions, which reflect the broadest and the most specific GO definitions, respectively. These phylogenies define evolutionary timelines of functional innovation. We analyzed 249 free-living organisms comprising the three superkingdoms of life, Archaea, Bacteria, and Eukarya. Phylogenies indicate catalytic, binding and transport functions were the oldest, suggesting a 'metabolism-first' origin scenario for biochemistry. Metabolism made use of increasingly complicated organic chemistry. Primordial features of ancient molecular functions and functional recruitments were further distilled by studying the oldest child terms of the oldest level 1 GO definitions. Network analyses showed the existence of an hourglass pattern of enzyme recruitment in the molecular functions of the directed acyclic graph of molecular functions. Older high-level molecular functions were thoroughly recruited at younger lower levels, while very young high-level functions were used throughout the timeline. This pattern repeated in every one of the three mappings, which gave a criss-cross pattern. The timelines and their mappings were remarkable. They revealed the progressive evolutionary development of functional toolkits, starting with the early rise of metabolic activities, followed chronologically by the rise of macromolecular biosynthesis, the establishment of controlled interactions with the environment and self, adaptation to oxygen, and enzyme coordinated regulation, and ending with the rise of structural and cellular complexity. This historical account holds important clues for dissection of the emergence of biocomplexity and life.
Affiliation(s)
- Ibrahim Koç
  - Molecular Biology and Genetics, Gebze Technical University, Kocaeli, Turkey
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America
|
39
|
Comparing Relational and Ontological Triple Stores in Healthcare Domain. Entropy 2017. [DOI: 10.3390/e19010030] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/16/2022]
|
40
|
Abstract
The Gene Ontology (GO) is a formidable resource, but there are several considerations about it that are essential to understand the data and interpret it correctly. The GO is sufficiently simple that it can be used without deep understanding of its structure or how it is developed, which is both a strength and a weakness. In this chapter, we discuss some common misinterpretations of the ontology and the annotations. A better understanding of the pitfalls and the biases in the GO should help users make the most of this very rich resource. We also review some of the misconceptions and misleading assumptions commonly made about GO, including the effect of data incompleteness, the importance of annotation qualifiers, and the transitivity or lack thereof associated with different ontology relations. We also discuss several biases that can confound aggregate analyses such as gene enrichment analyses. For each of these pitfalls and biases, we suggest remedies and best practices.
Affiliation(s)
- Pascale Gaudet
  - CALIPHO group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel-Servet, 1211, Geneva 4, Switzerland
  - Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, 1211, Geneva, Switzerland
- Christophe Dessimoz
  - Department of Genetics, Evolution & Environment, University College London, Gower St, London, WC1E 6BT, UK
  - Swiss Institute of Bioinformatics, Biophore Building, 1015, Lausanne, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Street Biophore, 1015, Lausanne, Switzerland
  - Center of Integrative Genomics, University of Lausanne, Biophore, 1015, Lausanne, Switzerland
  - Department of Computer Science, University College London, Gower St, WC1E 6BT, London, UK
|
41
|
Lu C, Wang J, Zhang Z, Yang P, Yu G. NoisyGOA: Noisy GO annotations prediction using taxonomic and semantic similarity. Comput Biol Chem 2016; 65:203-211. [PMID: 27670689 DOI: 10.1016/j.compbiolchem.2016.09.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 09/02/2016] [Accepted: 09/07/2016] [Indexed: 10/21/2022]
Abstract
Gene Ontology (GO) provides GO annotations (GOA) that associate gene products with GO terms that summarize their cellular, molecular and functional aspects in the context of biological pathways. The GO Consortium (GOC) relies on various quality assurance measures to ensure the correctness of annotations. Due to resource limitations, only a small portion of annotations are manually added or checked by GO curators, and a large portion of available annotations are computationally inferred. While computationally inferred annotations provide greater coverage of known genes, they may also introduce annotation errors (noise) that could mislead the interpretation of gene functions and their roles in cellular and biological processes. In this paper, we investigate how to identify noisy annotations, a rarely addressed problem, and propose a novel approach called NoisyGOA. NoisyGOA first measures taxonomic similarity between ontological terms using the GO hierarchy and semantic similarity between genes. Next, it leverages the taxonomic similarity and semantic similarity to predict noisy annotations. We compare NoisyGOA with alternative methods for identifying noisy annotations under different simulated cases of noisy annotations, and on archived GO annotations. NoisyGOA achieved higher accuracy than the alternative methods. These results demonstrate that both taxonomic similarity and semantic similarity contribute to the identification of noisy annotations. Our study shows that annotation errors are predictable and that removing noisy annotations improves the performance of gene function prediction. This study can prompt the community to study methods for removing inaccurate annotations, a critical step for annotating gene and pathway functions.
Affiliation(s)
- Chang Lu
  - College of Computer and Information Science, Southwest University, Chongqing 400715, China
- Jun Wang
  - College of Computer and Information Science, Southwest University, Chongqing 400715, China
- Zili Zhang
  - College of Computer and Information Science, Southwest University, Chongqing 400715, China
- Pengyi Yang
  - School of Mathematics and Statistics, The University of Sydney, New South Wales, Australia
- Guoxian Yu
  - College of Computer and Information Science, Southwest University, Chongqing 400715, China
|
42
|
Cozzetto D, Minneci F, Currant H, Jones DT. FFPred 3: feature-based function prediction for all Gene Ontology domains. Sci Rep 2016; 6:31865. [PMID: 27561554 PMCID: PMC4999993 DOI: 10.1038/srep31865] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Received: 05/05/2016] [Accepted: 07/25/2016] [Indexed: 11/09/2022]
Abstract
Predicting protein function has been a major goal of bioinformatics for several decades, and it has gained fresh momentum thanks to recent community-wide blind tests aimed at benchmarking available tools on a genomic scale. Sequence-based predictors, especially those performing homology-based transfers, remain the most popular but increasing understanding of their limitations has stimulated the development of complementary approaches, which mostly exploit machine learning. Here we present FFPred 3, which is intended for assigning Gene Ontology terms to human protein chains, when homology with characterized proteins can provide little aid. Predictions are made by scanning the input sequences against an array of Support Vector Machines (SVMs), each examining the relationship between protein function and biophysical attributes describing secondary structure, transmembrane helices, intrinsically disordered regions, signal peptides and other motifs. This update features a larger SVM library that extends its coverage to the cellular component sub-ontology for the first time, prompted by the establishment of a dedicated evaluation category within the Critical Assessment of Functional Annotation. The effectiveness of this approach is demonstrated through benchmarking experiments, and its usefulness is illustrated by analysing the potential functional consequences of alternative splicing in human and their relationship to patterns of biological features.
Affiliation(s)
- Domenico Cozzetto
  - Bioinformatics Group, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK
- Federico Minneci
  - Bioinformatics Group, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK
- Hannah Currant
  - Bioinformatics Group, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK
- David T Jones
  - Bioinformatics Group, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK
|
43
|
Falda M, Lavezzo E, Fontana P, Bianco L, Berselli M, Formentin E, Toppo S. Eliciting the Functional Taxonomy from protein annotations and taxa. Sci Rep 2016; 6:31971. [PMID: 27534507 PMCID: PMC4989186 DOI: 10.1038/srep31971] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/12/2016] [Accepted: 08/01/2016] [Indexed: 11/30/2022]
Abstract
The advances of omics technologies have triggered the production of an enormous volume of data coming from thousands of species. Meanwhile, joint international efforts like the Gene Ontology (GO) consortium have worked to provide functional information for a vast number of proteins. With these data available, we have developed FunTaxIS, a tool that is the first attempt to infer functional taxonomy (i.e. how functions are distributed over taxa) combining functional and taxonomic information. FunTaxIS is able to define a taxon-specific functional space by exploiting annotation frequencies in order to establish if a function can or cannot be used to annotate a certain species. The tool generates constraints between GO terms and taxa and then propagates these relations over the taxonomic tree and the GO graph. Since these constraints nearly cover the whole taxonomy, it is possible to obtain the mapping of a function over the taxonomy. FunTaxIS can be used to make functional comparative analyses among taxa, to detect improper associations between taxa and functions, and to discover how functional knowledge is either distributed or missing. A benchmark test set based on six different model species has been devised to get useful insights on the generated taxonomic rules.
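How a term-taxon constraint might be applied after propagation over both hierarchies can be sketched as follows. This is a hedged illustration, not FunTaxIS code: it assumes a "never in taxon" style of constraint that is inherited by descendant terms and descendant taxa, and the function names, the toy hierarchies, and the single example constraint are hypothetical.

```python
def ancestors(node, parents):
    """Return the node together with all of its ancestors."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen


def violates(term, taxon, never_in, term_parents, taxon_parents):
    """True if any (ancestor term, ancestor taxon) pair is a stated constraint.

    Assumption: a constraint on (term, taxon) is inherited by every
    descendant term and every descendant taxon.
    """
    term_line = ancestors(term, term_parents)
    taxon_line = ancestors(taxon, taxon_parents)
    return any((t, x) in never_in for t in term_line for x in taxon_line)


# Toy hierarchies and one toy constraint (all illustrative).
term_parents = {"photosynthesis, light reaction": ["photosynthesis"]}
taxon_parents = {
    "Homo sapiens": ["Mammalia"],
    "Mammalia": ["Metazoa"],
    "Arabidopsis thaliana": ["Viridiplantae"],
}
never_in = {("photosynthesis", "Metazoa")}

human_flagged = violates("photosynthesis, light reaction", "Homo sapiens",
                         never_in, term_parents, taxon_parents)
plant_flagged = violates("photosynthesis, light reaction", "Arabidopsis thaliana",
                         never_in, term_parents, taxon_parents)
```

Checking ancestors at query time is equivalent to pushing each constraint down to all descendants in advance, but avoids materializing the propagated constraint set.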
Affiliation(s)
- Marco Falda
  - Department of Molecular Medicine, University of Padova, Padova, 35131, Italy
- Enrico Lavezzo
  - Department of Molecular Medicine, University of Padova, Padova, 35131, Italy
- Paolo Fontana
  - Istituto Agrario San Michele all'Adige Research and Innovation Centre, Foundation Edmund Mach, Trento, 38010, Italy
- Luca Bianco
  - Istituto Agrario San Michele all'Adige Research and Innovation Centre, Foundation Edmund Mach, Trento, 38010, Italy
- Michele Berselli
  - Department of Molecular Medicine, University of Padova, Padova, 35131, Italy
- Elide Formentin
  - Department of Biology, University of Padova, Padova, 35131, Italy
- Stefano Toppo
  - Department of Molecular Medicine, University of Padova, Padova, 35131, Italy
|
44
|
Kumar V, Khan AW, Saxena RK, Garg V, Varshney RK. First-generation HapMap in Cajanus spp. reveals untapped variations in parental lines of mapping populations. Plant Biotechnology Journal 2016; 14:1673-81. [PMID: 26821983 PMCID: PMC5066660 DOI: 10.1111/pbi.12528] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 09/08/2015] [Revised: 12/06/2015] [Accepted: 12/10/2015] [Indexed: 05/02/2023]
Abstract
Whole genome re-sequencing (WGRS) was conducted on a panel of 20 Cajanus spp. accessions (crossing parentals of recombinant inbred lines, introgression lines, multiparent advanced generation intercross and nested association mapping population) comprising two wild species and 18 cultivated species accessions. A total of 791.77 million paired-end reads were generated with an effective mapping depth of ~12X per accession. Analysis of WGRS data provided 5 465 676 genome-wide variations including 4 686 422 SNPs and 779 254 InDels across the accessions. Large structural variations in the form of copy number variations (2598) and presence and absence variations (970) were also identified. Additionally, 2 630 904 accession-specific variations comprising 2 278 571 SNPs (86.6%), 166 243 deletions (6.3%) and 186 090 insertions (7.1%) were also reported. Identified polymorphic sites in this study provide the first-generation HapMap in Cajanus spp., which will be useful in mapping the genomic regions responsible for important traits.
Affiliation(s)
- Vinay Kumar
  - International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Aamir W Khan
  - International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Rachit K Saxena
  - International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Vanika Garg
  - International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Rajeev K Varshney
  - International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
  - School of Plant Biology and Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia
|
45
|
Sangrador-Vegas A, Mitchell AL, Chang HY, Yong SY, Finn RD. GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database: The Journal of Biological Databases and Curation 2016; 2016:baw027. [PMID: 26994912 PMCID: PMC4799721 DOI: 10.1093/database/baw027] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 11/06/2015] [Accepted: 02/19/2016] [Indexed: 11/17/2022]
Abstract
The removal of annotation from biological databases is often perceived as an indicator of erroneous annotation. As a corollary, annotation stability is considered to be a measure of reliability. However, diverse data-driven events can affect the stability of annotations in both primary protein sequence databases and the protein family databases that are built upon the sequence databases and used to help annotate them. Here, we describe some of these events and their consequences for the InterPro database, and demonstrate that annotation removal or reassignment is not always linked to incorrect annotation by the curator. Database URL: http://www.ebi.ac.uk/interpro
Affiliation(s)
- Amaia Sangrador-Vegas
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Alex L Mitchell
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Hsin-Yu Chang
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Siew-Yit Yong
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Robert D Finn
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
|
46
|
Yu MK, Kramer M, Dutkowski J, Srivas R, Licon K, Kreisberg J, Ng CT, Krogan N, Sharan R, Ideker T. Translation of Genotype to Phenotype by a Hierarchy of Cell Subsystems. Cell Syst 2016; 2:77-88. [PMID: 26949740 PMCID: PMC4772745 DOI: 10.1016/j.cels.2016.02.003] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Indexed: 11/26/2022]
Abstract
Accurately translating genotype to phenotype requires accounting for the functional impact of genetic variation at many biological scales. Here we present a strategy for genotype-phenotype reasoning based on existing knowledge of cellular subsystems. These subsystems and their hierarchical organization are defined by the Gene Ontology or a complementary ontology inferred directly from previously published datasets. Guided by the ontology's hierarchical structure, we organize genotype data into an "ontotype," that is, a hierarchy of perturbations representing the effects of genetic variation at multiple cellular scales. The ontotype is then interpreted using logical rules generated by machine learning to predict phenotype. This approach substantially outperforms previous, non-hierarchical methods for translating yeast genotype to cell growth phenotype, and it accurately predicts the growth outcomes of two new screens of 2,503 double gene knockouts impacting DNA repair or nuclear lumen. Ontotypes also generalize to larger knockout combinations, setting the stage for interpreting the complex genetics of disease.
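The idea of organizing a genotype into a hierarchy of perturbations can be sketched in a few lines. This is an illustrative reading of the abstract, not the authors' code: it assumes the state of each term is the number of perturbed genes annotated to that term or to any of its descendants, and the function name, term names, and gene names are hypothetical.

```python
def build_ontotype(knocked_out, term_genes, children):
    """Count perturbed genes under every term of a hierarchy.

    knocked_out: set of perturbed gene names (the genotype)
    term_genes:  dict mapping a term to the genes annotated directly to it
    children:    dict mapping a term to its child terms
    """
    cache = {}

    def genes_under(term):
        # Genes annotated to the term itself or to any descendant term.
        if term not in cache:
            genes = set(term_genes.get(term, ()))
            for child in children.get(term, ()):
                genes |= genes_under(child)
            cache[term] = genes
        return cache[term]

    terms = set(term_genes) | set(children)
    for kids in children.values():
        terms.update(kids)
    return {t: len(genes_under(t) & knocked_out) for t in terms}


# Toy hierarchy: "root" has two sub-terms (all names hypothetical).
children = {"root": ["sub1", "sub2"]}
term_genes = {"root": ["geneD"], "sub1": ["geneA", "geneB"], "sub2": ["geneC"]}
ontotype = build_ontotype({"geneA", "geneD"}, term_genes, children)
```

The resulting per-term counts could then serve as features for a downstream learner; the choice of learner is outside the scope of this sketch.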
Affiliation(s)
- Michael Ku Yu
  - Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla CA 92093, USA
  - Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
- Michael Kramer
  - Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
  - Biomedical Sciences Program, University of California San Diego, La Jolla CA 92093, USA
- Janusz Dutkowski
  - Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
  - Data4Cure, La Jolla, CA 92037, USA
- Rohith Srivas
  - Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
  - Department of Bioengineering, University of California San Diego, La Jolla CA 92093, USA
- Katherine Licon
  - Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
- Jason Kreisberg
  - Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
- Nevan Krogan
  - Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco 94143, USA
- Roded Sharan
  - Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv 69978, Israel
- Trey Ideker
  - Department of Medicine, University of California San Diego, La Jolla CA 92093, USA
|
47
|
Milano M, Agapito G, Guzzi PH, Cannataro M. An experimental study of information content measurement of gene ontology terms. Int J Mach Learn Cyb 2016. [DOI: 10.1007/s13042-015-0482-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
|
48
|
Ceusters W, Nasri-Heir C, Alnaas D, Cairns BE, Michelotti A, Ohrbach R. Perspectives on next steps in classification of oro-facial pain - Part 3: biomarkers of chronic oro-facial pain - from research to clinic. J Oral Rehabil 2015; 42:956-66. [PMID: 26200973 PMCID: PMC4715524 DOI: 10.1111/joor.12324] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Accepted: 05/31/2015] [Indexed: 11/28/2022]
Abstract
The purpose of this study was to review the current status of biomarkers used in oro-facial pain conditions. Specifically, we critically appraise their relative strengths and weaknesses for assessing mechanisms associated with the oro-facial pain conditions and interpret that information in the light of their current value for use in diagnosis. In the third section, we explore biomarkers through the perspective of ontological realism. We discuss ontological problems of biomarkers as currently widely conceptualised and implemented. This leads to recommendations for research practice aimed at a better understanding of the potential contribution that biomarkers might make to oro-facial pain diagnosis and thereby fulfil our goal for an expanded multidimensional framework for oro-facial pain conditions that would include a third axis.
Affiliation(s)
- Werner Ceusters
  - Department of Biomedical Informatics, University at Buffalo, NY, USA
- Brian E Cairns
  - Faculty of Pharmaceutical Sciences, University of British Columbia, Vancouver, Canada
- Ambra Michelotti
  - Section of Orthodontics, School of Dentistry, University of Naples Federico II, Naples, Italy
- Richard Ohrbach
  - Department of Oral Diagnostic Sciences, University at Buffalo, NY, USA
|
49
|
Lavezzo E, Falda M, Fontana P, Bianco L, Toppo S. Enhancing protein function prediction with taxonomic constraints - The Argot2.5 web server. Methods 2015; 93:15-23. [PMID: 26318087 DOI: 10.1016/j.ymeth.2015.08.021] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 04/27/2015] [Revised: 08/14/2015] [Accepted: 08/25/2015] [Indexed: 10/23/2022]
Abstract
Argot2.5 (Annotation Retrieval of Gene Ontology Terms) is a web server designed to predict protein function. It is an updated version of the previous Argot2 enriched with new features in order to enhance its usability and its overall performance. The algorithmic strategy exploits the grouping of Gene Ontology terms by means of semantic similarity to infer protein function. The tool has been challenged over two independent benchmarks and compared to Argot2, PANNZER, and a baseline method relying on BLAST, proving to obtain a better performance thanks to the contribution of some key interventions in critical steps of the working pipeline. The most effective changes regard: (a) the selection of the input data from sequence similarity searches performed against a clustered version of UniProt databank and a remodeling of the weights given to Pfam hits, (b) the application of taxonomic constraints to filter out annotations that cannot be applied to proteins belonging to the species under investigation. The taxonomic rules are derived from our in-house developed tool, FunTaxIS, that extends those provided by the Gene Ontology consortium. The web server is free for academic users and is available online at http://www.medcomp.medicina.unipd.it/Argot2-5/.
Affiliation(s)
- Enrico Lavezzo
  - Department of Molecular Medicine, University of Padova, Padova, Italy
- Marco Falda
  - Department of Molecular Medicine, University of Padova, Padova, Italy
- Paolo Fontana
  - Istituto Agrario San Michele all'Adige Research and Innovation Centre, Foundation Edmund Mach, Trento, Italy
- Luca Bianco
  - Istituto Agrario San Michele all'Adige Research and Innovation Centre, Foundation Edmund Mach, Trento, Italy
- Stefano Toppo
  - Department of Molecular Medicine, University of Padova, Padova, Italy
|
50
|
Li W, Freudenberg J, Oswald M. Principles for the organization of gene-sets. Comput Biol Chem 2015; 59 Pt B:139-49. [PMID: 26188561 DOI: 10.1016/j.compbiolchem.2015.04.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 02/24/2015] [Accepted: 04/08/2015] [Indexed: 12/23/2022]
Abstract
A gene-set, an important concept in microarray expression analysis and systems biology, is a collection of genes and/or their products (i.e. proteins) that have some features in common. There are many different ways to construct gene-sets, but a systematic organization of these ways is lacking. Gene-sets are mainly organized ad hoc in current public-domain databases, with group header names often determined by practical reasons (such as the types of technology in obtaining the gene-sets or a balanced number of gene-sets under a header). Here we aim at providing a gene-set organization principle according to the level at which genes are connected: homology, physical map proximity, chemical interaction, biological, and phenotypic-medical levels. We also distinguish two types of connections between genes: actual connection versus sharing of a label. Actual connections denote direct biological interactions, whereas shared label connection denotes shared membership in a group. Some extensions of the framework are also addressed such as overlapping of gene-sets, modules, and the incorporation of other non-protein-coding entities such as microRNAs.
Affiliation(s)
- Wentian Li
  - The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System, Manhasset, NY, USA
- Jan Freudenberg
  - The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System, Manhasset, NY, USA
- Michaela Oswald
  - The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System, Manhasset, NY, USA
|