1
Lin CX, Li HD, Deng C, Liu W, Erhardt S, Wu FX, Zhao XM, Guan Y, Wang J, Wang D, Hu B, Wang J. An integrated brain-specific network identifies genes associated with neuropathologic and clinical traits of Alzheimer’s disease. Brief Bioinform 2021; 23:6483067. [DOI: 10.1093/bib/bbab522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2021] [Revised: 10/26/2021] [Accepted: 11/13/2021] [Indexed: 11/12/2022] Open
Abstract
Alzheimer’s disease (AD) has a strong genetic predisposition. However, its risk genes remain incompletely identified. We developed an Alzheimer’s brain gene network-based approach to predict AD-associated genes by leveraging the functional pattern of known AD-associated genes. Our constructed network outperformed existing networks in predicting AD genes. We then systematically validated the predictions using independent genetic, transcriptomic, proteomic, neuropathological and clinical data. First, top-ranked genes were enriched in AD-associated pathways. Second, using external gene expression data from the Mount Sinai Brain Bank study, we found that the top-ranked genes were significantly associated with neuropathological and clinical traits, including the Consortium to Establish a Registry for Alzheimer’s Disease score, Braak stage score and clinical dementia rating. The analysis of Alzheimer’s brain single-cell RNA-seq data revealed cell-type-specific association of predicted genes with early pathology of AD. Third, by interrogating proteomic data in the Religious Orders Study and Memory and Aging Project and the Baltimore Longitudinal Study of Aging, we observed a significant association of protein expression level with cognitive function and AD clinical severity. The network, method and predictions could become a valuable resource to advance the identification of risk genes for AD.
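The idea of ranking candidate genes by their network connections to known disease genes can be illustrated with a toy guilt-by-association score. This is only a sketch under stated assumptions, not the authors' method: the function name `rank_candidates`, the scoring rule (fraction of a gene's total edge weight that goes to known genes), the placeholder genes G1 to G3 and all edge weights are hypothetical.

```python
from collections import defaultdict


def rank_candidates(edges, known_genes):
    """Rank non-seed genes by the share of their edge weight linked to seed genes.

    edges: iterable of (gene_a, gene_b, weight) for an undirected weighted network.
    known_genes: iterable of seed genes (e.g. known disease genes).
    Returns a list of (gene, score) sorted by descending score.
    """
    known = set(known_genes)
    to_known = defaultdict(float)  # edge weight connecting a gene to seed genes
    total = defaultdict(float)     # total edge weight of a gene
    for a, b, w in edges:
        total[a] += w
        total[b] += w
        if b in known:
            to_known[a] += w
        if a in known:
            to_known[b] += w
    scores = {
        g: to_known[g] / total[g]
        for g in total
        if g not in known and total[g] > 0
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy network: weights are invented for illustration only.
toy_edges = [
    ("APP", "G1", 0.9),
    ("G1", "G2", 0.1),
    ("G2", "G3", 0.8),
    ("PSEN1", "G2", 0.2),
]
ranking = rank_candidates(toy_edges, {"APP", "PSEN1"})
```

In this toy network G1 ranks first because most of its edge weight connects to a seed gene, while G3 has no direct seed connection and scores zero.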
Affiliation(s)
- Cui-Xiang Lin
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P. R. China
  - Hunan Provincial Key Lab of Bioinformatics, Central South University, Changsha, Hunan 410083, P. R. China
- Hong-Dong Li
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P. R. China
  - Hunan Provincial Key Lab of Bioinformatics, Central South University, Changsha, Hunan 410083, P. R. China
- Chao Deng
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P. R. China
  - Hunan Provincial Key Lab of Bioinformatics, Central South University, Changsha, Hunan 410083, P. R. China
- Weisheng Liu
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P. R. China
  - Hunan Provincial Key Lab of Bioinformatics, Central South University, Changsha, Hunan 410083, P. R. China
- Shannon Erhardt
  - Department of Pediatrics, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Fang-Xiang Wu
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
- Xing-Ming Zhao
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States
- Jun Wang
  - Department of Pediatrics, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Daifeng Wang
  - Department of Biostatistics and Medical Informatics and Waisman Center, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Bin Hu
  - Institute of Engineering Medicine, Beijing Institute of Technology, Beijing, China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P. R. China
  - Hunan Provincial Key Lab of Bioinformatics, Central South University, Changsha, Hunan 410083, P. R. China
2
Lin CX, Li HD, Deng C, Guan Y, Wang J. TissueNexus: a database of human tissue functional gene networks built with a large compendium of curated RNA-seq data. Nucleic Acids Res 2021; 50:D710-D718. [PMID: 34850130 PMCID: PMC8728275 DOI: 10.1093/nar/gkab1133] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/12/2021] [Revised: 10/10/2021] [Accepted: 11/18/2021] [Indexed: 01/02/2023] Open
Abstract
Mapping gene interactions within tissues/cell types plays a crucial role in understanding the genetic basis of human physiology and disease. Tissue functional gene networks (FGNs) are essential models for mapping complex gene interactions. We present TissueNexus, a database of 49 human tissue/cell line FGNs constructed by integrating heterogeneous genomic data. We adopted an advanced machine learning approach for data integration because Bayesian classifiers, which are the main approach used for constructing existing tissue gene networks, cannot capture the interaction and nonlinearity of genomic features well. A total of 1,341 RNA-seq datasets containing 52,087 samples were integrated for all of these networks. Because the tissue label for RNA-seq data may be annotated with different names or be missing, we performed intensive hand-curation to improve quality. We further developed a user-friendly database for network search, visualization, and functional analysis. We illustrate the application of TissueNexus in prioritizing disease genes. The database is publicly available at https://www.diseaselinks.com/TissueNexus/.
Affiliation(s)
- Cui-Xiang Lin
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P.R. China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, Hunan 410083, P.R. China
- Hong-Dong Li
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P.R. China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, Hunan 410083, P.R. China
- Chao Deng
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P.R. China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, Hunan 410083, P.R. China
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, P.R. China
  - Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, Hunan 410083, P.R. China
3
Nudelman I, Kudrin D, Nudelman G, Deshpande R, Hartmann BM, Kleinstein SH, Myers CL, Sealfon SC, Zaslavsky E. Comparing Host Module Activation Patterns and Temporal Dynamics in Infection by Influenza H1N1 Viruses. Front Immunol 2021; 12:691758. [PMID: 34335598 PMCID: PMC8317020 DOI: 10.3389/fimmu.2021.691758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2021] [Accepted: 06/14/2021] [Indexed: 11/13/2022] Open
Abstract
Influenza is a serious global health threat that shows varying pathogenicity among different virus strains. Understanding similarities and differences among activated functional pathways in the host responses can help elucidate therapeutic targets responsible for pathogenesis. To compare the types and timing of functional modules activated in host cells by four influenza viruses of varying pathogenicity, we developed a new DYNAmic MOdule (DYNAMO) method that addresses the need to compare functional module utilization over time. This integrative approach overlays whole genome time series expression data onto an immune-specific functional network, and extracts conserved modules exhibiting either different temporal patterns or overall transcriptional activity. We identified a common core response to influenza virus infection that is temporally shifted for different viruses. We also identified differentially regulated functional modules that reveal unique elements of responses to different virus strains. Our work highlights the usefulness of combining time series gene expression data with a functional interaction map to capture temporal dynamics of the same cellular pathways under different conditions. Our results help elucidate conservation of the immune response both globally and at a granular level, and provide mechanistic insight into the differences in the host response to infection by influenza strains of varying pathogenicity.
Affiliation(s)
- Irina Nudelman
  - Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
  - Division of General Internal Medicine, New York University Langone Medical Centre, New York, NY, United States
- Daniil Kudrin
  - Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- German Nudelman
  - Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Raamesh Deshpande
  - Department of Computer Science and Engineering, University of Minnesota - Twin Cities, Minneapolis, MN, United States
- Boris M Hartmann
  - Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
  - Center for Advanced Research on Diagnostic Assays (CARDA), Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Steven H Kleinstein
  - Department of Pathology, Yale University School of Medicine, New Haven, CT, United States
- Chad L Myers
  - Department of Computer Science and Engineering, University of Minnesota - Twin Cities, Minneapolis, MN, United States
  - Program in Biomedical Informatics and Computational Biology, University of Minnesota - Twin Cities, Minneapolis, MN, United States
- Stuart C Sealfon
  - Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
  - Center for Advanced Research on Diagnostic Assays (CARDA), Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Elena Zaslavsky
  - Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
  - Center for Advanced Research on Diagnostic Assays (CARDA), Icahn School of Medicine at Mount Sinai, New York, NY, United States
4
Wu L, Han L, Li Q, Wang G, Zhang H, Li L. Using Interactome Big Data to Crack Genetic Mysteries and Enhance Future Crop Breeding. Molecular Plant 2021; 14:77-94. [PMID: 33340690 DOI: 10.1016/j.molp.2020.12.012] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 09/14/2020] [Revised: 12/11/2020] [Accepted: 12/14/2020] [Indexed: 05/27/2023]
Abstract
The functional genes underlying phenotypic variation and their interactions represent "genetic mysteries". Understanding and utilizing these genetic mysteries are key solutions for mitigating the current threats to agriculture posed by population growth and individual food preferences. Due to advances in high-throughput multi-omics technologies, we are stepping into an Interactome Big Data era that is certain to revolutionize genetic research. In this article, we provide a brief overview of current strategies to explore genetic mysteries. We then introduce the methods for constructing and analyzing the Interactome Big Data and summarize currently available interactome resources. Next, we discuss how Interactome Big Data can be used as a versatile tool to dissect genetic mysteries. We propose an integrated strategy that could revolutionize genetic research by combining Interactome Big Data with machine learning, which involves mining information hidden in Big Data to identify the genetic models or networks that control various traits, and we also provide a detailed procedure for systematic dissection of genetic mysteries. Finally, we discuss three promising future breeding strategies utilizing the Interactome Big Data to improve crop yields and quality.
Affiliation(s)
- Leiming Wu
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
- Linqian Han
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
- Qing Li
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
- Guoying Wang
  - Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Hongwei Zhang
  - Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Lin Li
  - National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
5
Li HD, Bai T, Sandford E, Burmeister M, Guan Y. BaiHui: cross-species brain-specific network built with hundreds of hand-curated datasets. Bioinformatics 2020; 35:2486-2488. [PMID: 30521009 DOI: 10.1093/bioinformatics/bty1001] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/30/2018] [Revised: 11/28/2018] [Accepted: 12/04/2018] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Functional gene networks, representing how likely two genes are to work in the same biological process, are important models for studying gene interactions in complex tissues. However, a limitation of the current network-building scheme is the lack of leveraging evidence from multiple model organisms as well as the lack of expert curation and quality control of the input genomic data. RESULTS Here, we present BaiHui, a brain-specific functional gene network built by probabilistically integrating expertly hand-curated (by reading original publications) heterogeneous and multi-species genomic data in human, mouse and rat brains. To facilitate the use of this network, we deployed a web server through which users can query their genes of interest, visualize the network, gain functional insight from enrichment analysis and download network data. We also illustrated how this network could be used to generate testable hypotheses on disease gene prioritization of brain disorders. AVAILABILITY AND IMPLEMENTATION BaiHui is freely available at: http://guanlab.ccmb.med.umich.edu/BaiHui/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hong-Dong Li
  - Center for Bioinformatics, School of Information Science and Engineering, Central South University, Changsha, People's Republic of China
  - Department of Computational Medicine and Bioinformatics
- Tianjian Bai
  - Department of Computational Medicine and Bioinformatics
- Erin Sandford
  - Molecular and Behavioral Neuroscience Institute, University of Michigan, Ann Arbor, MI, USA
- Margit Burmeister
  - Department of Computational Medicine and Bioinformatics
  - Molecular and Behavioral Neuroscience Institute, University of Michigan, Ann Arbor, MI, USA
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics
6
Li H, Hu S, Neamati N, Guan Y. TAIJI: approaching experimental replicates-level accuracy for drug synergy prediction. Bioinformatics 2020; 35:2338-2339. [PMID: 30462169 DOI: 10.1093/bioinformatics/bty955] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 07/24/2018] [Revised: 11/16/2018] [Accepted: 11/20/2018] [Indexed: 01/01/2023] Open
Abstract
MOTIVATION Combination therapy is widely used in cancer treatment to overcome drug resistance. High-throughput drug screening is the standard approach to study the drug combination effects, yet it becomes impractical when the number of drugs under consideration is large. Therefore, accurate and fast computational tools for predicting drug synergistic effects are needed to guide experimental design for developing candidate drug pairs. RESULTS Here, we present TAIJI, a high-performance software for fast and accurate prediction of drug synergism. It is based on the winning algorithm in the AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge, which is a unique platform to unbiasedly evaluate the performance of current state-of-the-art methods, and includes 160 team-based submission methods. When tested across a broad spectrum of 85 different cancer cell lines and 1089 drug combinations, TAIJI achieved a high prediction correlation (0.53), approaching the accuracy level of experimental replicates (0.56). The runtime is at the scale of minutes to achieve this state-of-the-field performance. AVAILABILITY AND IMPLEMENTATION TAIJI is freely available on GitHub (https://github.com/GuanLab/TAIJI). It is functional with built-in Perl and Python. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hongyang Li
  - Department of Computational Medicine and Bioinformatics
- Shuai Hu
  - Department of Computational Medicine and Bioinformatics
  - Department of Medicinal Chemistry, College of Pharmacy, Rogel Cancer Center and
- Nouri Neamati
  - Department of Medicinal Chemistry, College of Pharmacy, Rogel Cancer Center and
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics
  - Nephrology Division, Department of Internal Medicine, University of Michigan, Ann Arbor, MI, USA
7
Zhang H, Li C, Xin Y, Cui X, Cui J, Zhou G. Suppression of NSDHL attenuates adipogenesis with a downregulation of LXR-SREBP1 pathway in 3T3-L1 cells. Biosci Biotechnol Biochem 2020; 84:980-988. [PMID: 31985358 DOI: 10.1080/09168451.2020.1719823] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/25/2022]
Abstract
Previous RNA-Seq analyses revealed that NAD(P)H steroid dehydrogenase-like (NSDHL) has a different expression during 3T3-L1 differentiation; however, its roles in adipogenesis are unknown. In the present study, using quantitative real-time PCR, we confirmed that NSDHL knockdown increased the proliferation of 3T3-L1 preadipocytes, but attenuated the differentiation of 3T3-L1 preadipocytes, as evidenced by reduced lipid accumulation and down-regulation of PPARγ gene expression. Further analyses showed that the expression peak of NSDHL was at the early stage of 3T3-L1 preadipocytes differentiation and LXR-SREBP1 signaling pathway was downregulated in NSDHL-knockdown 3T3-L1 cells. Collectively, our findings indicate that NSDHL is a novel modulator of adipogenesis. Moreover, our data provide insight into the complex relationships between sterol sensing, LXR-SREBP1 signaling pathway, and PPARγ in 3T3-L1 cells.
Affiliation(s)
- Haiyan Zhang
  - College of Life Science, Liaocheng University, Liaocheng, China
- Chengping Li
  - College of Life Science, Liaocheng University, Liaocheng, China
- Youzhi Xin
  - College of Life Science, Liaocheng University, Liaocheng, China
  - Chinese Academy of Geological Sciences, Beijing, China
- Xiao Cui
  - College of Life Science, Liaocheng University, Liaocheng, China
- Jianwei Cui
  - College of Life Science, Liaocheng University, Liaocheng, China
- Guoli Zhou
  - College of Life Science, Liaocheng University, Liaocheng, China
8
Li H, Siddiqui O, Zhang H, Guan Y. Joint learning improves protein abundance prediction in cancers. BMC Biol 2019; 17:107. [PMID: 31870366 PMCID: PMC6929375 DOI: 10.1186/s12915-019-0730-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/15/2019] [Accepted: 12/04/2019] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND The classic central dogma in biology is the information flow from DNA to mRNA to protein, yet complicated regulatory mechanisms underlying protein translation often lead to weak correlations between mRNA and protein abundances. This is particularly the case in cancer samples and when evaluating the same gene across multiple samples. RESULTS Here, we report a method for predicting proteome from transcriptome, using a training dataset provided by NCI-CPTAC and TCGA, consisting of transcriptome and proteome data from 77 breast and 105 ovarian cancer samples. First, we establish a generic model capturing the correlation between mRNA and protein abundance of a single gene. Second, we build a gene-specific model capturing the interdependencies among multiple genes in a regulatory network. Third, we create a cross-tissue model by joint learning the information of shared regulatory networks and pathways across cancer tissues. Our method ranked first in the NCI-CPTAC DREAM Proteogenomics Challenge, and the predictive performance is close to the accuracy of experimental replicates. Key functional pathways and network modules controlling the proteomic abundance in cancers were revealed, in particular metabolism-related genes. CONCLUSIONS We present a method to predict proteome from transcriptome, leveraging data from different cancer tissues to build a trans-tissue model, and suggest how to integrate information from multiple cancers to provide a foundation for further research.
Affiliation(s)
- Hongyang Li
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA
- Omer Siddiqui
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA
- Hongjiu Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA
  - Department of Internal Medicine, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109, USA
9
Sun C, Li H, Mills RE, Guan Y. Prognostic model for multiple myeloma progression integrating gene expression and clinical features. Gigascience 2019; 8:giz153. [PMID: 31886876 PMCID: PMC6936209 DOI: 10.1093/gigascience/giz153] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/11/2019] [Revised: 12/05/2019] [Accepted: 12/06/2019] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND Multiple myeloma (MM) is a hematological cancer caused by abnormal accumulation of monoclonal plasma cells in bone marrow. With the increase in treatment options, risk-adapted therapy is becoming more and more important. Survival analysis is commonly applied to study progression or other events of interest and stratify the risk of patients. RESULTS In this study, we present the current state-of-the-art model for MM prognosis and the molecular biomarker set for stratification: the winning algorithm in the 2017 Multiple Myeloma DREAM Challenge, Sub-Challenge 3. Specifically, we built a non-parametric complete hazard ranking model to map the right-censored data into a linear space, where commonplace machine learning techniques, such as Gaussian process regression and random forests, can play their roles. Our model integrated both the gene expression profile and clinical features to predict the progression of MM. Compared with conventional models, such as Cox model and random survival forests, our model achieved higher accuracy in 3 within-cohort predictions. In addition, it showed robust predictive power in cross-cohort validations. Key molecular signatures related to MM progression were identified from our model, which may function as the core determinants of MM progression and provide important guidance for future research and clinical practice. Functional enrichment analysis and mammalian gene-gene interaction network revealed crucial biological processes and pathways involved in MM progression. The model is dockerized and publicly available at https://www.synapse.org/#!Synapse:syn11459638. Both data and reproducible code are included in the docker. CONCLUSIONS We present the current state-of-the-art prognostic model for MM integrating gene expression and clinical features validated in an independent test set.
Affiliation(s)
- Chen Sun
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Hongyang Li
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Ryan E Mills
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
  - Department of Human Genetics, University of Michigan, 1241 East Catherine Street, Ann Arbor, MI 48109, USA
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
  - Department of Internal Medicine, Nephrology Division, University of Michigan, 1150 West Medical Center Drive, Ann Arbor, MI 48109, USA
10
Perlasca P, Frasca M, Ba CT, Notaro M, Petrini A, Casiraghi E, Grossi G, Gliozzo J, Valentini G, Mesiti M. UNIPred-Web: a web tool for the integration and visualization of biomolecular networks for protein function prediction. BMC Bioinformatics 2019; 20:422. [PMID: 31412768 PMCID: PMC6694573 DOI: 10.1186/s12859-019-2959-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 03/08/2019] [Accepted: 06/18/2019] [Indexed: 01/06/2023] Open
Abstract
Background One of the main issues in the automated protein function prediction (AFP) problem is the integration of multiple networked data sources. The UNIPred algorithm was thereby proposed to efficiently integrate, in a function-specific fashion, the protein networks by taking into account the imbalance that characterizes protein annotations, and to subsequently predict novel hypotheses about unannotated proteins. UNIPred is publicly available as R code, which might limit its use by non-expert users. Moreover, its application requires effort in acquiring and preparing the networks to be integrated. Finally, the UNIPred source code does not handle the visualization of the resulting consensus network, whereas suitable views of the network topology are necessary to explore and interpret existing protein relationships. Results We address the aforementioned issues by proposing UNIPred-Web, a user-friendly Web tool for the application of the UNIPred algorithm to a variety of biomolecular networks, already supplied by the system, and for the visualization and exploration of protein networks. We support different organisms and different types of networks (e.g., co-expression, shared domains and physical interaction networks). Users are supported in the different phases of the process, ranging from the selection of the networks and the protein function to be predicted, to the navigation of the integrated network. The system also supports the upload of user-defined protein networks. The vertex-centric and highly interactive approach of UNIPred-Web allows a narrow exploration of specific proteins, and an interactive analysis of large sub-networks with only a few mouse clicks. Conclusions UNIPred-Web offers practical and intuitive (visual) guidance to biologists interested in gaining insights into protein biomolecular functions. UNIPred-Web provides facilities for the integration of networks, and supplies a framework for the imbalance-aware protein network integration of nine organisms, the prediction of thousands of GO protein functions, and an easy-to-use graphical interface for the visual analysis, navigation and interpretation of the integrated networks and of the functional predictions.
Affiliation(s)
- Paolo Perlasca
  - Department of Computer Science, Università degli Studi di Milano, Via Celoria 18, Milano, 20133, Italy
- Marco Frasca
  - Department of Computer Science, Università degli Studi di Milano, Via Celoria 18, Milano, 20133, Italy
- Cheick Tidiane Ba
  - Department of Computer Science, Università degli Studi di Milano, Via Celoria 18, Milano, 20133, Italy
- Marco Notaro
  - Department of Computer Science, Università degli Studi di Milano, Via Celoria 18, Milano, 20133, Italy
- Alessandro Petrini
  - Department of Computer Science, Università degli Studi di Milano, Via Celoria 18, Milano, 20133, Italy
- Elena Casiraghi
  - Department of Computer Science, Università degli Studi di Milano, Via Celoria 18, Milano, 20133, Italy
- Giuliano Grossi
  - Department of Computer Science, Università degli Studi di Milano, Via Celoria 18, Milano, 20133, Italy
- Jessica Gliozzo
  - Department of Computer Science, Università degli Studi di Milano, Via Celoria 18, Milano, 20133, Italy
  - Fondazione IRCCS Ca' Granda - Ospedale Maggiore Policlinico, Università degli Studi di Milano, Via della Commenda 10, Milano, 20122, Italy
- Giorgio Valentini
  - Department of Computer Science, Università degli Studi di Milano, Via Celoria 18, Milano, 20133, Italy
- Marco Mesiti
  - Department of Computer Science, Università degli Studi di Milano, Via Celoria 18, Milano, 20133, Italy
11
Guala D, Ogris C, Müller N, Sonnhammer ELL. Genome-wide functional association networks: background, data & state-of-the-art resources. Brief Bioinform 2019; 21:1224-1237. [PMID: 31281921 PMCID: PMC7373183 DOI: 10.1093/bib/bbz064] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/05/2019] [Revised: 04/29/2019] [Accepted: 05/04/2019] [Indexed: 02/06/2023] Open
Abstract
The vast amount of experimental data from recent advances in the field of high-throughput biology begs for integration into more complex data structures such as genome-wide functional association networks. Such networks have been used for elucidation of the interplay of intra-cellular molecules to make advances ranging from the basic science understanding of evolutionary processes to the more translational field of precision medicine. The allure of the field has resulted in rapid growth of the number of available network resources, each with unique attributes exploitable to answer different biological questions. Unfortunately, the high volume of network resources makes it impossible for the intended user to select an appropriate tool for their particular research question. The aim of this paper is to provide an overview of the underlying data and representative network resources as well as to mention methods of integration, allowing a customized approach to resource selection. Additionally, this report will provide a primer for researchers venturing into the field of network integration.
Affiliation(s)
- Dimitri Guala
- Science for Life Laboratory, Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Box 1031, 17121 Solna, Sweden
| | - Christoph Ogris
- Computational Cell Maps, Institute of Computational Biology, Helmholtz Center Munich, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
| | - Nikola Müller
- Computational Cell Maps, Institute of Computational Biology, Helmholtz Center Munich, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
| | - Erik L L Sonnhammer
- Science for Life Laboratory, Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Box 1031, 17121 Solna, Sweden
| |
|
12
|
Shriwash N, Singh P, Arora S, Ali SM, Ali S, Dohare R. Identification of differentially expressed genes in small and non-small cell lung cancer based on meta-analysis of mRNA. Heliyon 2019; 5:e01707. [PMID: 31338439 PMCID: PMC6580189 DOI: 10.1016/j.heliyon.2019.e01707] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 10/26/2018] [Revised: 04/02/2019] [Accepted: 05/08/2019] [Indexed: 12/21/2022] Open
Abstract
Lung cancer has the lowest survival rate globally, resulting in a large number of deaths. This is attributed to factors such as the lack of early detection and chemoresistance in patients. It can be subdivided into two histological groups: Non-Small-Cell Lung Cancer (NSCLC), which is most prevalent (85% of all lung cancers) but less destructive; and Small-Cell Lung Cancer (SCLC), which is intermittently metastatic and less prevalent (15% of all lung cancers). The present study analyzes gene expression in the two subtypes to identify Differentially Expressed Genes (DEGs). For this study, we selected two datasets from the Omnibus database, which included 50 non-small cell lung cancer samples, 31 small cell lung cancer samples, and 48 samples from normal lung tissue. After identification using the meta-analysis approach, the DEGs were subjected to further analysis following p-value adjustment via the Benjamini-Hochberg method. We identified 440 overexpressed and 489 underexpressed genes in NSCLC, and 489 overexpressed and 525 underexpressed genes in SCLC, compared with normal lung tissues. Furthermore, we identified 3 overlapping genes between upregulated DEGs in NSCLC and downregulated DEGs in SCLC; and 8 overlapping genes between upregulated DEGs in SCLC and downregulated DEGs in NSCLC. Accordingly, a Protein-Protein Interaction (PPI) network of the overlapping genes was generated, which contained a total of 261 genes, of which the top five were TRIM29, ANK3, CSTA, FGG, and AGR2. These five candidate genes reported herein may prove to be potential therapeutic targets.
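The Benjamini-Hochberg p-value adjustment named in the abstract above can be illustrated with a minimal Python sketch. The function name and the example p-values are hypothetical and are not taken from the study; the sketch only shows the standard step-up adjustment, not the authors' pipeline.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four genes (illustration only).
example = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(example)
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
```

Genes whose adjusted value falls at or below the chosen false discovery rate (0.05 in this sketch, an assumed threshold) would be retained as candidate DEGs.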
Affiliation(s)
- Nitesh Shriwash
- Department of Computer Science, Faculty of Natural Science, Jamia Millia Islamia, New Delhi, 110025, India
| | - Prithvi Singh
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, 110025, India
| | - Shweta Arora
- Department of Biotechnology, Faculty of Natural Science, Jamia Millia Islamia, New Delhi, 110025, India
| | - Syed Mansoor Ali
- Department of Biotechnology, Faculty of Natural Science, Jamia Millia Islamia, New Delhi, 110025, India
| | - Sher Ali
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, 110025, India
| | - Ravins Dohare
- Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi, 110025, India
| |
|
13
|
Li H, Li T, Quang D, Guan Y. Network Propagation Predicts Drug Synergy in Cancers. Cancer Res 2018; 78:5446-5457. [PMID: 30054332 DOI: 10.1158/0008-5472.can-18-0740] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 03/09/2018] [Revised: 06/27/2018] [Accepted: 07/23/2018] [Indexed: 11/16/2022]
Abstract
Combination therapies are commonly used to treat patients with complex diseases that respond poorly to single-agent therapies. In vitro high-throughput drug screening is a standard method for preclinical prioritization of synergistic drug combinations, but it can be impractical for large drug sets. Computational methods are thus being actively explored; however, most published methods were built on a limited size of cancer cell lines or drugs, and it remains a challenge to predict synergism at a large scale where the diversity within the data escalates the difficulty of prediction. Here, we present a state-of-the-field synergy prediction algorithm, which ranked first in all subchallenges in the AstraZeneca-Sanger Drug Combination Prediction DREAM Challenge. The model was built and evaluated using the largest drug combination screening dataset at the time of the competition, consisting of approximately 11,500 experimentally tested synergy scores of 118 drugs in 85 cancer cell lines. We developed a novel feature extraction strategy by integrating the cross-cell and cross-drug information with a novel network propagation method and then assembled the information in monotherapy and simulated molecular data to predict drug synergy. This represents a significant conceptual advancement of synergy prediction, using extracted features in the form of simulated posttreatment molecular profiles when only the pretreatment molecular profile is available. Our cross-tissue synergism prediction algorithm achieves promising accuracy comparable with the correlation between experimental replicates and can be applied to other cancer cell lines and drugs to guide therapeutic choices. Significance: This study presents a novel network propagation-based method that predicts anticancer drug synergy to the accuracy of experimental replicates, which establishes a state-of-the-field method as benchmarked by the pharmacogenomics research community involving models generated by 160 teams.
Cancer Res; 78(18); 5446-57. ©2018 AACR.
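The abstract above does not describe how its network propagation method works, so the following Python sketch shows only a generic form of the idea, a random walk with restart, under stated assumptions. The function name, the restart probability, and the three-node toy network are all hypothetical and should not be read as the authors' algorithm.

```python
def propagate(adjacency, seeds, restart=0.5, iterations=50):
    """Random walk with restart on a weighted, undirected network.

    adjacency: square list of lists of non-negative edge weights.
    seeds: initial scores per node (for example, 1.0 for a drug target).
    """
    n = len(adjacency)
    # Column sums are used to normalize outgoing weight from each node.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    total = sum(seeds)
    p0 = [s / total for s in seeds]
    p = p0[:]
    for _ in range(iterations):
        spread = [
            sum(adjacency[i][j] / col_sums[j] * p[j]
                for j in range(n) if col_sums[j] > 0)
            for i in range(n)
        ]
        # Mix the diffused signal with the original seed distribution.
        p = [(1 - restart) * spread[i] + restart * p0[i] for i in range(n)]
    return p


# Toy path network 0 - 1 - 2 with the signal seeded at node 0.
toy_network = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = propagate(toy_network, seeds=[1.0, 0.0, 0.0])
```

In this toy case the score decays with distance from the seed node, which is the behavior that makes propagated scores usable as features for nodes that were not directly perturbed.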
Affiliation(s)
- Hongyang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
| | - Tingyang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
| | - Daniel Quang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
| | - Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan.
| |
|
14
|
Schäfer M, Klein HU, Schwender H. Integrative analysis of multiple genomic variables using a hierarchical Bayesian model. Bioinformatics 2018; 33:3220-3227. [PMID: 28582573 DOI: 10.1093/bioinformatics/btx356] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 10/05/2016] [Accepted: 05/31/2017] [Indexed: 12/13/2022] Open
Abstract
Motivation: Genes showing congruent differences in several genomic variables between two biological conditions are crucial to unravel causalities behind phenotypes of interest. Detecting such genes is important in biomedical research, e.g. when identifying genes responsible for cancer development. Small sample sizes common in next-generation sequencing studies are a key challenge, and there are still only very few statistical methods to analyze more than two genomic variables in an integrative, model-based way. Here, we present a novel bioinformatics approach to detect congruent differences between two biological conditions in a larger number of different measurements such as various epigenetic marks or mRNA transcript levels. Results: We propose a coefficient quantifying the degree to which genes present consistent alterations in multiple (more than two) genomic variables when comparing samples presenting a condition of interest (e.g. cancer) to a reference group. A hierarchical Bayesian model is employed to assess uncertainty on a gene level, incorporating information on functional relationships between genes. We demonstrate the approach on different data sets containing RNA-seq gene transcription and up to four ChIP-seq histone modification measurements. Both the coefficient-based ranking and the inference based on the model lead to a plausible prioritizing of candidate genes when analyzing multiple genomic variables. Availability and implementation: BUGS code in the Supplement. Contact: m.schaefer@uni-duesseldorf.de. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Martin Schäfer
- Mathematical Institute, Heinrich Heine University, D-40225 Düsseldorf, Germany
| | - Hans-Ulrich Klein
- Program in Translational Neuropsychiatric Genomics, Ann Romney Center for Neurologic Diseases, Department of Neurology, Brigham and Women's Hospital, Boston, MA 02115, USA; Harvard Medical School, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02141, USA
| | - Holger Schwender
- Mathematical Institute, Heinrich Heine University, D-40225 Düsseldorf, Germany
| |
|
15
|
Duda M, Zhang H, Li HD, Wall DP, Burmeister M, Guan Y. Brain-specific functional relationship networks inform autism spectrum disorder gene prediction. Transl Psychiatry 2018; 8:56. [PMID: 29507298 PMCID: PMC5838237 DOI: 10.1038/s41398-018-0098-6] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 08/11/2017] [Revised: 10/20/2017] [Accepted: 12/30/2017] [Indexed: 11/09/2022] Open
Abstract
Autism spectrum disorder (ASD) is a neuropsychiatric disorder with strong evidence of genetic contribution, and increased research efforts have resulted in an ever-growing list of ASD candidate genes. However, only a fraction of the hundreds of nominated ASD-related genes have identified de novo or transmitted loss of function (LOF) mutations that can be directly attributed to the disorder. For this reason, a means of prioritizing candidate genes for ASD would help filter out false-positive results and allow researchers to focus on genes that are more likely to be causative. Here we constructed a machine learning model by leveraging a brain-specific functional relationship network (FRN) of genes to produce a genome-wide ranking of ASD risk genes. We rigorously validated our gene ranking using results from two independent sequencing experiments, together representing over 5000 simplex and multiplex ASD families. Finally, through functional enrichment analysis on our highly prioritized candidate gene network, we identified a small number of pathways that are key in early neural development, providing further support for their potential role in ASD.
Affiliation(s)
- Marlena Duda
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Hongjiu Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Hong-Dong Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; Center for Bioinformatics, School of Information Science and Engineering, Central South University, Changsha, China
| | - Dennis P. Wall
- Department of Pediatrics, Division of Systems Medicine, Stanford University, Stanford, CA, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Margit Burmeister
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; Molecular and Behavioral Neuroscience Institute, University of Michigan, Ann Arbor, MI, USA; Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA; Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
| | - Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; Department of Internal Medicine, University of Michigan, Ann Arbor, MI, USA; Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA
| |
|
16
|
Pir P, Le Novère N. Mathematical Models of Pluripotent Stem Cells: At the Dawn of Predictive Regenerative Medicine. Methods Mol Biol 2016; 1386:331-50. [PMID: 26677190 DOI: 10.1007/978-1-4939-3283-2_15] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022]
Abstract
Regenerative medicine, ranging from stem cell therapy to organ regeneration, is promising to revolutionize treatments of diseases and aging. These approaches require a perfect understanding of cell reprogramming and differentiation. Predictive modeling of cellular systems has the potential to provide insights about the dynamics of cellular processes, and guide their control. Moreover, in many cases, it provides an alternative to experimental tests that are difficult to perform for practical or ethical reasons. The variety and accuracy of biological processes represented in mathematical models grew in line with the discovery of underlying molecular mechanisms. High-throughput data generation led to the development of models based on data analysis, as an alternative to more established modeling based on prior mechanistic knowledge. In this chapter, we give an overview of existing mathematical models of pluripotency and cell fate, to illustrate the variety of methods and questions. We conclude that current approaches are yet to overcome a number of limitations: most of the computational models have so far focused solely on understanding the regulation of pluripotency, and the differentiation of selected cell lineages. In addition, models generally interrogate only a few biological processes. However, a better understanding of the reprogramming process leading to ESCs and iPSCs is required to improve stem-cell therapies. One also needs to understand the links between signaling, metabolism, regulation of gene expression, and the epigenetics machinery.
Affiliation(s)
- Pınar Pir
- Babraham Institute, Babraham Research Campus, Cambridge, CB22 3AT, UK.
| | - Nicolas Le Novère
- Babraham Institute, Babraham Research Campus, Cambridge, CB22 3AT, UK.
| |
|
17
|
Zhu F, Panwar B, Dodge HH, Li H, Hampstead BM, Albin RL, Paulson HL, Guan Y. COMPASS: A computational model to predict changes in MMSE scores 24-months after initial assessment of Alzheimer's disease. Sci Rep 2016; 6:34567. [PMID: 27703197 PMCID: PMC5050516 DOI: 10.1038/srep34567] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 05/25/2016] [Accepted: 09/15/2016] [Indexed: 12/26/2022] Open
Abstract
We present COMPASS, a COmputational Model to Predict the development of Alzheimer’s diSease Spectrum, to model Alzheimer’s disease (AD) progression. This was the best-performing method in a recent crowdsourcing benchmark study, the DREAM Alzheimer’s Disease Big Data challenge, to predict changes in Mini-Mental State Examination (MMSE) scores over 24 months using standardized data. In the present study, we conducted three additional analyses beyond the DREAM challenge question to improve the clinical contribution of our approach, including: (1) adding pre-validated baseline cognitive composite scores of ADNI-MEM and ADNI-EF, (2) identifying subjects with significant declines in MMSE scores, and (3) incorporating SNPs of the top 10 genes connected to APOE identified from a functional-relationship network. For (1) above, we significantly improved predictive accuracy, especially for the Mild Cognitive Impairment (MCI) group. For (2), we achieved an area under the ROC curve of 0.814 in predicting significant MMSE decline: our model has 100% precision at 5% recall, and 91% accuracy at 10% recall. For (3), the “genetic only” model has a Pearson’s correlation of 0.15 in predicting progression in the MCI group. Even though the addition of this limited genetic model to COMPASS did not improve prediction of progression in the MCI group, the predictive ability of SNP information extended beyond the well-known APOE allele.
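Figures such as "precision at 5% recall" in the abstract above are read off a ranked list of predictions. A minimal Python sketch of that calculation follows; the function name, scores, and labels are hypothetical illustrations, ties between scores are not handled specially, and nothing here reproduces the COMPASS evaluation itself.

```python
def precision_at_recall(scores, labels, target_recall):
    """Precision at the first rank cutoff whose recall reaches target_recall.

    scores: predicted risk scores (higher means more likely to decline).
    labels: 1 for subjects who truly declined, 0 otherwise.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")
    # Rank subjects from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    for cutoff, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        if true_positives / total_positives >= target_recall:
            return true_positives / cutoff
    return true_positives / len(ranked)


# Hypothetical predictions for five subjects (illustration only).
demo_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
demo_labels = [1, 0, 1, 1, 0]
top_precision = precision_at_recall(demo_scores, demo_labels, 0.3)
```

A low recall target inspects only the most confidently ranked subjects, which is why precision is typically highest there and falls as the recall target increases.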
Affiliation(s)
- Fan Zhu
- Department of Computational Medicine and Bioinformatics, University of Michigan, USA
| | - Bharat Panwar
- Department of Computational Medicine and Bioinformatics, University of Michigan, USA
| | - Hiroko H Dodge
- Department of Neurology and Michigan Alzheimer's Disease Center, University of Michigan, USA; Department of Neurology and Layton Aging and Alzheimer's Disease Center, Oregon Health & Science University, USA
| | - Hongdong Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, USA
| | - Benjamin M Hampstead
- Department of Psychiatry, University of Michigan, USA; Mental Health Service, VA Ann Arbor Healthcare System, USA
| | - Roger L Albin
- Department of Neurology and Michigan Alzheimer's Disease Center, University of Michigan, USA; Neurology Service & Geriatric Research Education and Clinical Centers, VA Ann Arbor Healthcare System, USA; University of Michigan Udall Center, USA
| | - Henry L Paulson
- Department of Neurology and Michigan Alzheimer's Disease Center, University of Michigan, USA
| | - Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, USA; Departments of Internal Medicine and Human Genetics, and School of Public Health, University of Michigan, USA; Departments of Internal Medicine and of Electrical Engineering and Computer Science, University of Michigan, USA
| |
|
18
|
Guan Y, Martini S, Mariani LH. Genes Caught In Flagranti: Integrating Renal Transcriptional Profiles With Genotypes and Phenotypes. Semin Nephrol 2016. [PMID: 26215861 DOI: 10.1016/j.semnephrol.2015.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/27/2022]
Abstract
In the past decade, population genetics has gained tremendous success in identifying genetic variations that are statistically relevant to renal diseases and kidney function. However, it is challenging to interpret the functional relevance of the genetic variations found by population genetics studies. In this review, we discuss studies that integrate multiple levels of data, especially transcriptome profiles and phenotype data, to assign functional roles of genetic variations involved in kidney function. Furthermore, we introduce state-of-the-art machine learning algorithms, Bayesian networks, support vector machines, and Gaussian process regression, which have been applied successfully to integrating genetic, regulatory, and clinical information to predict clinical outcomes. These methods are likely to be deployed successfully in the nephrology field in the near future.
Affiliation(s)
- Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI; Department of Internal Medicine, University of Michigan, Ann Arbor, MI; Department of Computer Science and Engineering, University of Michigan, Ann Arbor, MI
| | - Sebastian Martini
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI; Nephrologisches Zentrum, Medizinische Klinik und Poliklinik IV, Klinikum der Universität München, Ludwig-Maximilians-University Munich, Munich, Germany
| | - Laura H Mariani
- Department of Internal Medicine, University of Michigan, Ann Arbor, MI
| |
|
19
|
Abstract
The laboratory mouse is the primary mammalian species used for studying alternative splicing events. Recent studies have generated computational models to predict functions for splice isoforms in the mouse. However, the functional relationship network, describing the probability of splice isoforms participating in the same biological process or pathway, has not yet been studied in the mouse. Here we describe a rich genome-wide resource of mouse networks at the isoform level, which was generated using a unique framework that was originally developed to infer isoform functions. This network was built through integrating heterogeneous genomic and protein data, including RNA-seq, exon array, protein docking and pseudo-amino acid composition. Through simulation and cross-validation studies, we demonstrated the accuracy of the algorithm in predicting isoform-level functional relationships. We showed that this network enables the users to reveal functional differences of the isoforms of the same gene, as illustrated by literature evidence with Anxa6 (annexin a6) as an example. We expect this work will become a useful resource for the mouse genetics community to understand gene functions. The network is publicly available at: http://guanlab.ccmb.med.umich.edu/isoformnetwork.
|
20
|
Candemir E, Kollert L, Weißflog L, Geis M, Müller A, Post AM, O'Leary A, Harro J, Reif A, Freudenberg F. Interaction of NOS1AP with the NOS-I PDZ domain: Implications for schizophrenia-related alterations in dendritic morphology. Eur Neuropsychopharmacol 2016; 26:741-55. [PMID: 26861996 DOI: 10.1016/j.euroneuro.2016.01.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 10/19/2015] [Revised: 12/23/2015] [Accepted: 01/23/2016] [Indexed: 12/19/2022]
Abstract
Schizophrenia involves morphological brain changes, including changes in synaptic plasticity and altered dendritic development. Amongst the most promising candidate molecules for schizophrenia are neuronal nitric oxide (NO) synthase (NOS-I, also known as nNOS) and its adapter protein NOS1AP (previously named CAPON). However, the precise molecular mechanisms by which NOS-I and NOS1AP affect disease pathology remain to be resolved. Interestingly, overexpression of NOS1AP affects dendritic morphology, possibly through increased association with the NOS-I PDZ domain. To investigate the effect of NOS1AP on dendritic morphology we overexpressed different NOS1AP isoforms, NOS1AP deletion mutants and the aminoterminal 133 amino acids of NOS-I (NOS-IN133) containing an extended PDZ domain. We examined the interaction of the overexpressed constructs with endogenous NOS-I by co-immunoprecipitation and the consequences of increased NOS-I/NOS1AP PDZ interaction in primary cultures of hippocampal and cortical neurons from C57BL/6J mice. Neurons overexpressing NOS1AP isoforms or deletion mutants showed highly altered spine morphology and excessive growth of filopodia-like protrusions. Sholl analysis of immunostained primary cultured neurons revealed that dendritic branching was mildly affected by NOS1AP overexpression. Our results hint towards an involvement of NOS-I/NOS1AP interaction in the regulation of dendritic spine plasticity. As altered dendritic spine development and filopodial outgrowth are important neuropathological features of schizophrenia, our findings may provide insight into part of the molecular mechanisms involved in brain morphology alterations observed in schizophrenia. As the NOS-I/NOS1AP interface can be targeted by small molecules, our findings ultimately might help to develop novel treatment strategies for schizophrenia patients.
Affiliation(s)
- Esin Candemir
- Department of Psychiatry, Psychosomatic Medicine and Psychotherapy, University Hospital of Frankfurt, 60528 Frankfurt am Main, Germany; Department of Psychiatry, Psychosomatics, and Psychotherapy, University Hospital of Würzburg, 97080 Würzburg, Germany; Graduate School of Life Sciences, University of Würzburg, 97080 Würzburg, Germany
| | - Leonie Kollert
- Department of Psychiatry, Psychosomatics, and Psychotherapy, University Hospital of Würzburg, 97080 Würzburg, Germany
| | - Lena Weißflog
- Department of Psychiatry, Psychosomatic Medicine and Psychotherapy, University Hospital of Frankfurt, 60528 Frankfurt am Main, Germany; Department of Psychiatry, Psychosomatics, and Psychotherapy, University Hospital of Würzburg, 97080 Würzburg, Germany
| | - Maria Geis
- Department of Psychiatry, Psychosomatics, and Psychotherapy, University Hospital of Würzburg, 97080 Würzburg, Germany
| | - Antje Müller
- Department of Psychiatry, Psychosomatics, and Psychotherapy, University Hospital of Würzburg, 97080 Würzburg, Germany
| | - Antonia M Post
- Department of Psychiatry, Psychosomatic Medicine and Psychotherapy, University Hospital of Frankfurt, 60528 Frankfurt am Main, Germany; Department of Psychiatry, Psychosomatics, and Psychotherapy, University Hospital of Würzburg, 97080 Würzburg, Germany
| | - Aet O'Leary
- Department of Psychiatry, Psychosomatics, and Psychotherapy, University Hospital of Würzburg, 97080 Würzburg, Germany; Division of Neuropsychopharmacology, Department of Psychology, University of Tartu, Ravila 14A, Tartu 50411 Estonia
| | - Jaanus Harro
- Division of Neuropsychopharmacology, Department of Psychology, University of Tartu, Ravila 14A, Tartu 50411 Estonia
| | - Andreas Reif
- Department of Psychiatry, Psychosomatic Medicine and Psychotherapy, University Hospital of Frankfurt, 60528 Frankfurt am Main, Germany; Department of Psychiatry, Psychosomatics, and Psychotherapy, University Hospital of Würzburg, 97080 Würzburg, Germany
| | - Florian Freudenberg
- Department of Psychiatry, Psychosomatic Medicine and Psychotherapy, University Hospital of Frankfurt, 60528 Frankfurt am Main, Germany; Department of Psychiatry, Psychosomatics, and Psychotherapy, University Hospital of Würzburg, 97080 Würzburg, Germany.
| |
|
21
|
Li HD, Omenn GS, Guan Y. A proteogenomic approach to understand splice isoform functions through sequence and expression-based computational modeling. Brief Bioinform 2016; 17:1024-1031. [PMID: 26740460 DOI: 10.1093/bib/bbv109] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/27/2015] [Revised: 11/03/2015] [Indexed: 01/23/2023] Open
Abstract
The products of multi-exon genes are a mixture of alternatively spliced isoforms, from which the translated proteins can have similar, different or even opposing functions. It is therefore essential to differentiate and annotate functions for individual isoforms. Computational approaches provide an efficient complement to expensive and time-consuming experimental studies. The input data of these methods range from DNA sequence, to RNA selection pressure, to expressed sequence tags, to full-length complementary DNA, to exon array, to RNA-seq expression, to proteomic data. Notably, RNA-seq technology generates quantitative profiling of transcript expression at the genome scale, with an unprecedented amount of expression data available for developing isoform function prediction methods. Integrative analysis of these data at different molecular levels enables a proteogenomic approach to systematically interrogate isoform functions. Here, we briefly review the state-of-the-art methods according to their input data sources, discuss their advantages and limitations and point out potential ways to improve prediction accuracies.
|
22
|
Kim E, Hwang S, Kim H, Shim H, Kang B, Yang S, Shim JH, Shin SY, Marcotte EM, Lee I. MouseNet v2: a database of gene networks for studying the laboratory mouse and eight other model vertebrates. Nucleic Acids Res 2016; 44:D848-54. [PMID: 26527726 PMCID: PMC4702832 DOI: 10.1093/nar/gkv1155] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 09/08/2015] [Revised: 10/05/2015] [Accepted: 10/19/2015] [Indexed: 12/26/2022] Open
Abstract
The laboratory mouse, Mus musculus, is one of the most important animal tools in biomedical research. Functional characterization of the mouse genes, hence, has been a long-standing goal in mammalian and human genetics. Although large-scale knockout phenotyping is in progress through international collaborative efforts, a large portion of the mouse genome is still poorly characterized for cellular functions and associations with disease phenotypes. A genome-scale functional network of mouse genes, MouseNet, was previously developed in the context of the MouseFunc competition, which allowed only limited input data for network inferences. Here, we present an improved mouse co-functional network, MouseNet v2 (available at http://www.inetbio.org/mousenet), which covers 17 714 genes (>88% of coding genome) with 788 080 links, along with a companion web server for network-assisted functional hypothesis generation. The network database has been substantially improved by a large expansion of genomics data. For example, the MouseNet v2 database contains 183 co-expression networks inferred from 8154 public microarray samples. We demonstrated that MouseNet v2 is predictive for mammalian phenotypes as well as human diseases, which suggests its usefulness in the discovery of novel disease genes and the dissection of disease pathways. Furthermore, the MouseNet v2 database provides functional networks for eight other vertebrate models used in various research fields.
Affiliation(s)
- Eiru Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Sohyun Hwang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea; Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, TX 78712, USA
| | - Hyojin Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Hongseok Shim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Byunghee Kang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Sunmo Yang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Jae Ho Shim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Seung Yeon Shin
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| | - Edward M Marcotte
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, TX 78712, USA
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
| |
|
23
|
Abstract
Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.
Affiliation(s)
- Ulf Schmitz
- Dept of Systems Biology & Bioinformatics, University of Rostock, Rostock, Germany
| | - Olaf Wolkenhauer
- Dept of Systems Biology & Bioinformatics, University of Rostock, Rostock, Germany
| |
|
24
|
Gonzalez GH, Tahsin T, Goodale BC, Greene AC, Greene CS. Recent Advances and Emerging Applications in Text and Data Mining for Biomedical Discovery. Brief Bioinform 2015; 17:33-42. [PMID: 26420781 PMCID: PMC4719073 DOI: 10.1093/bib/bbv087] [Citation(s) in RCA: 103] [Impact Index Per Article: 11.4] [Received: 02/17/2015] [Indexed: 02/06/2023]
Abstract
Precision medicine will revolutionize the way we treat and prevent disease. A major barrier to the implementation of precision medicine that clinicians and translational scientists face is understanding the underlying mechanisms of disease. We are starting to address this challenge through automatic approaches for information extraction, representation and analysis. Recent advances in text and data mining have been applied to a broad spectrum of key biomedical questions in genomics, pharmacogenomics and other fields. We present an overview of the fundamental methods for text and data mining, as well as recent advances and emerging applications toward precision medicine.
|
25
|
Li HD, Menon R, Govindarajoo B, Panwar B, Zhang Y, Omenn GS, Guan Y. Functional Networks of Highest-Connected Splice Isoforms: From The Chromosome 17 Human Proteome Project. J Proteome Res 2015. [PMID: 26216192 DOI: 10.1021/acs.jproteome.5b00494] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Indexed: 01/08/2023]
Abstract
Alternative splicing allows a single gene to produce multiple transcript-level splice isoforms from which the translated proteins may show differences in their expression and function. Identifying the major functional or canonical isoform is important for understanding gene and protein functions. Identification and characterization of splice isoforms is a stated goal of the HUPO Human Proteome Project and of neXtProt. Multiple efforts have catalogued splice isoforms as "dominant", "principal", or "major" isoforms based on expression or evolutionary traits. In contrast, we recently proposed highest connected isoforms (HCIs) as a new class of canonical isoforms that have the strongest interactions in a functional network and revealed their significantly higher (differential) transcript-level expression compared to nonhighest connected isoforms (NCIs) regardless of tissues/cell lines in the mouse. HCIs and their expression behavior in the human remain unexplored. Here we identified HCIs for 6157 multi-isoform genes using a human isoform network that we constructed by integrating a large compendium of heterogeneous genomic data. We present examples for pairs of transcript isoforms of ABCC3, RBM34, ERBB2, and ANXA7. We found that functional networks of isoforms of the same gene can show large differences. Interestingly, differential expression between HCIs and NCIs was also observed in the human on an independent set of 940 RNA-seq samples across multiple tissues, including heart, kidney, and liver. Using proteomic data from normal human retina and placenta, we showed that HCIs are a promising indicator of expressed protein isoforms exemplified by NUDFB6 and M6PR. Furthermore, we found that a significant percentage (20%, p = 0.0003) of human and mouse HCIs are homologues, suggesting their conservation between species. 
Our identified HCIs expand the repertoire of canonical isoforms and are expected to facilitate studying main protein products, understanding gene regulation, and possibly evolution. The network is available through our web server as a rich resource for investigating isoform functional relationships (http://guanlab.ccmb.med.umich.edu/hisonet). All MS/MS data were available at ProteomeXchange Web site (http://www.proteomexchange.org) through their identifiers (retina: PXD001242, placenta: PXD000754).
Affiliation(s)
- Hong-Dong Li
- Department of Computational Medicine and Bioinformatics, Department of Internal Medicine, Department of Human Genetics and School of Public Health, and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Rajasree Menon
- Department of Computational Medicine and Bioinformatics, Department of Internal Medicine, Department of Human Genetics and School of Public Health, and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Brandon Govindarajoo
- Department of Computational Medicine and Bioinformatics, Department of Internal Medicine, Department of Human Genetics and School of Public Health, and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Bharat Panwar
- Department of Computational Medicine and Bioinformatics, Department of Internal Medicine, Department of Human Genetics and School of Public Health, and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, Department of Internal Medicine, Department of Human Genetics and School of Public Health, and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Gilbert S Omenn
- Department of Computational Medicine and Bioinformatics, Department of Internal Medicine, Department of Human Genetics and School of Public Health, and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan 48109, United States
| | - Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, Department of Internal Medicine, Department of Human Genetics and School of Public Health, and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan 48109, United States
| |
|
26
|
Zhu F, Panwar B, Guan Y. Algorithms for modeling global and context-specific functional relationship networks. Brief Bioinform 2015; 17:686-95. [PMID: 26254431 DOI: 10.1093/bib/bbv065] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/26/2015] [Indexed: 02/07/2023]
Abstract
Functional genomics has enormous potential to facilitate our understanding of normal and disease-specific physiology. In the past decade, intensive research efforts have been focused on modeling functional relationship networks, which summarize the probability of gene co-functionality relationships. Such modeling can be based on either expression data only or heterogeneous data integration. Numerous methods have been deployed to infer the functional relationship networks, while most of them target the global (non-context-specific) functional relationship networks. However, it is expected that functional relationships consistently reprogram under different tissues or biological processes. Thus, advanced methods have been developed targeting tissue-specific or developmental stage-specific networks. This article brings together the state-of-the-art functional relationship network modeling methods, emphasizes the need for heterogeneous genomic data integration and context-specific network modeling and outlines future directions for functional relationship networks.
|
27
|
Gui J, Greene CS, Sullivan C, Taylor W, Moore JH, Kim C. Testing multiple hypotheses through IMP weighted FDR based on a genetic functional network with application to a new zebrafish transcriptome study. BioData Min 2015; 8:17. [PMID: 26097506 PMCID: PMC4474579 DOI: 10.1186/s13040-015-0050-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/03/2014] [Accepted: 06/08/2015] [Indexed: 11/10/2022]
Abstract
In genome-wide studies, hundreds of thousands of hypothesis tests are performed simultaneously. Bonferroni correction and False Discovery Rate (FDR) can effectively control type I error but often yield a high false negative rate. We aim to develop a more powerful method to detect differentially expressed genes. We present a Weighted False Discovery Rate (WFDR) method that incorporates biological knowledge from genetic networks. We first identify weights using Integrative Multi-species Prediction (IMP) and then apply the weights in WFDR to identify differentially expressed genes through an IMP-WFDR algorithm. We performed a gene expression experiment to identify zebrafish genes that change expression in the presence of arsenic during a systemic Pseudomonas aeruginosa infection. Zebrafish were exposed to arsenic at 10 parts per billion and/or infected with P. aeruginosa. Appropriate controls were included. We then applied IMP-WFDR during the analysis of differentially expressed genes. We compared the mRNA expression for each group and found over 200 differentially expressed genes and several enriched pathways including defense response pathways, arsenic response pathways, and the Notch signaling pathway.
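The weighting idea in the abstract above can be illustrated with one common formulation of weighted FDR control: divide each p-value by a mean-one weight, then apply the Benjamini-Hochberg step-up rule. This is a minimal sketch under that assumption. The abstract does not state how IMP-WFDR derives or applies its weights, so the function name `weighted_bh` and the example weights are hypothetical.

```python
def weighted_bh(pvalues, weights, alpha=0.05):
    """Return sorted indices of hypotheses rejected by weighted Benjamini-Hochberg.

    Assumption: weights are non-negative prior scores (for example, derived
    from a functional network); they are rescaled here to average 1.
    """
    m = len(pvalues)
    if m == 0 or len(weights) != m:
        raise ValueError("pvalues and weights must be non-empty and equal in length")
    mean_w = sum(weights) / m
    if mean_w <= 0:
        raise ValueError("at least one weight must be positive")
    scaled = [w / mean_w for w in weights]
    # Weighted p-values: a larger weight makes a gene easier to reject.
    # A zero weight means the gene can never be rejected.
    q = [p / w if w > 0 else float("inf") for p, w in zip(pvalues, scaled)]
    order = sorted(range(m), key=lambda i: q[i])
    # Step-up rule: find the largest rank k with q_(k) <= k * alpha / m.
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if q[i] <= rank * alpha / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Hypothetical p-values for four genes; the second gene has network support.
pvals = [0.001, 0.03, 0.04, 0.30]
print(weighted_bh(pvals, [1.0, 2.0, 0.5, 0.5]))  # weighted
print(weighted_bh(pvals, [1.0, 1.0, 1.0, 1.0]))  # plain Benjamini-Hochberg
```

With equal weights the procedure reduces to ordinary Benjamini-Hochberg, which makes it easy to compare the two on the same p-values.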
Affiliation(s)
- Jiang Gui
- Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, USA; Dartmouth-Hitchcock Medical Center, 883 Rubin Bldg, HB7927, One Medical Center Dr., Lebanon, NH, USA
| | - Casey S Greene
- Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH, USA
| | - Con Sullivan
- Department of Molecular and Biomedical Sciences, University of Maine, Orono, ME, USA; Graduate School of Biomedical Science and Engineering, University of Maine, Orono, ME, USA
| | - Walter Taylor
- Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, NH, USA
| | - Jason H Moore
- Department of Biostatistics and Epidemiology, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Carol Kim
- Department of Molecular and Biomedical Sciences, University of Maine, Orono, ME, USA; Graduate School of Biomedical Science and Engineering, University of Maine, Orono, ME, USA
| |
|
28
|
Musungu B, Bhatnagar D, Brown RL, Fakhoury AM, Geisler M. A predicted protein interactome identifies conserved global networks and disease resistance subnetworks in maize. Front Genet 2015; 6:201. [PMID: 26089837 PMCID: PMC4454876 DOI: 10.3389/fgene.2015.00201] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 02/23/2015] [Accepted: 05/21/2015] [Indexed: 12/30/2022]
Abstract
Interactomes are genome-wide roadmaps of protein-protein interactions. They have been produced for humans, yeast, the fruit fly, and Arabidopsis thaliana and have become invaluable tools for generating and testing hypotheses. A predicted interactome for Zea mays (PiZeaM) is presented here as an aid to the research community for this valuable crop species. PiZeaM was built using a proven method of interologs (interacting orthologs) that were identified using both one-to-one and many-to-many orthology between genomes of maize and reference species. Where both maize orthologs occurred for an experimentally determined interaction in the reference species, we predicted a likely interaction in maize. A total of 49,026 unique interactions for 6004 maize proteins were predicted. These interactions are enriched for processes that are evolutionarily conserved, but include many otherwise poorly annotated proteins in maize. The predicted maize interactions were further analyzed by comparing annotation of interacting proteins, including different layers of ontology. A map of pairwise gene co-expression was also generated and compared to predicted interactions. Two global subnetworks were constructed for highly conserved interactions. These subnetworks showed clear clustering of proteins by function. Another subnetwork was created for disease response using a bait and prey strategy to capture interacting partners for proteins that respond to other organisms. Closer examination of this subnetwork revealed the connectivity between biotic and abiotic hormone stress pathways. We believe PiZeaM will provide a useful tool for the prediction of protein function and analysis of pathways for Z. mays researchers and is presented in this paper as a reference tool for the exploration of protein interactions in maize.
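The interolog rule described in the abstract above (predict an interaction in the target species when both partners of a reference interaction have orthologs there) can be sketched as follows. This is an illustrative sketch only: the function name `predict_interologs`, the protein identifiers, and the dictionary layout are hypothetical, and the many-to-many case is handled by taking every pairing of the orthologs.

```python
from itertools import product


def predict_interologs(reference_interactions, orthologs):
    """Map reference-species interactions onto a target species.

    reference_interactions: iterable of (protein_a, protein_b) pairs that were
        experimentally determined in the reference species.
    orthologs: dict from a reference protein to the set of its orthologs in
        the target species (one-to-one or many-to-many).
    Returns a set of unordered target-species pairs, stored as sorted tuples.
    """
    predicted = set()
    for a, b in reference_interactions:
        # An interaction is transferred only when BOTH partners have orthologs.
        for ta, tb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            predicted.add(tuple(sorted((ta, tb))))
    return predicted


# Hypothetical identifiers: AT* are reference proteins, Zm* are maize proteins.
reference = [("AT1", "AT2"), ("AT2", "AT3")]
ortholog_map = {"AT1": {"Zm1"}, "AT2": {"Zm2a", "Zm2b"}}  # AT3 has no ortholog
print(sorted(predict_interologs(reference, ortholog_map)))
```

Storing each pair as a sorted tuple keeps the predicted set free of duplicates that differ only in partner order.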
Affiliation(s)
- Bryan Musungu
- Department of Plant Biology, Southern Illinois University Carbondale, IL, USA
| | - Deepak Bhatnagar
- Food and Feed Safety Research, Southern Regional Research Center, United States Department of Agriculture, Agricultural Research Service New Orleans, LA, USA
| | - Robert L Brown
- Food and Feed Safety Research, Southern Regional Research Center, United States Department of Agriculture, Agricultural Research Service New Orleans, LA, USA
| | - Ahmad M Fakhoury
- Department of Plant Soil and Agriculture Systems, Southern Illinois University Carbondale, IL, USA
| | - Matt Geisler
- Department of Plant Biology, Southern Illinois University Carbondale, IL, USA
| |
|
29
|
Li HD, Omenn GS, Guan Y. MIsoMine: a genome-scale high-resolution data portal of expression, function and networks at the splice isoform level in the mouse. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2015; 2015:bav045. [PMID: 25953081 PMCID: PMC4423410 DOI: 10.1093/database/bav045] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 01/19/2015] [Accepted: 04/15/2015] [Indexed: 12/22/2022]
Abstract
Products of multiexon genes, especially in higher organisms, are a mixture of isoforms with different or even opposing functions, and therefore need to be treated separately. However, most studies and available resources such as Gene Ontology provide only gene-level function annotations, and therefore lose the differential information at the isoform level. Here we report MIsoMine, a high-resolution portal to multiple levels of functional information of alternatively spliced isoforms in the mouse. This data portal provides tissue-specific expression patterns and co-expression networks, along with such previously published functional genomic data as protein domains, predicted isoform-level functions and functional relationships. The core utility of MIsoMine is allowing users to explore a preprocessed, quality-controlled set of RNA-seq data encompassing diverse tissues and cell lineages. Tissue-specific co-expression networks were established, allowing a 2D ranking of isoforms and tissues by co-expression patterns. The results of the multiple isoforms of the same gene are presented in parallel to facilitate direct comparison, with cross-talking to prioritized functions at the isoform level. MIsoMine provides the first isoform-level resolution effort at genome-scale. We envision that this data portal will be a valuable resource for exploring functional genomic data, and will complement the existing functionalities of the mouse genome informatics database and the gene expression database for the laboratory mouse. Database URL: http://guanlab.ccmb.med.umich.edu/misomine/
Affiliation(s)
- Hong-Dong Li
- Department of Computational Medicine and Bioinformatics, Department of Internal Medicine and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
| | - Gilbert S Omenn
- Department of Computational Medicine and Bioinformatics, Department of Internal Medicine and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, Department of Internal Medicine and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
30
|
Zhu F, Shi L, Engel JD, Guan Y. Regulatory network inferred using expression data of small sample size: application and validation in erythroid system. Bioinformatics 2015; 31:2537-44. [PMID: 25840044 DOI: 10.1093/bioinformatics/btv186] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 11/07/2014] [Accepted: 03/27/2015] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Modeling regulatory networks using expression data observed in a differentiation process may help identify context-specific interactions. The outcome of the current algorithms highly depends on the quality and quantity of a single time-course dataset, and the performance may be compromised for datasets with a limited number of samples. RESULTS: In this work, we report a multi-layer graphical model that is capable of leveraging many publicly available time-course datasets, as well as a cell lineage-specific dataset with a small sample size, to model regulatory networks specific to a differentiation process. First, a collection of network inference methods is used to predict the regulatory relationships in individual public datasets. Then, the inferred directional relationships are weighted and integrated together by evaluating against the cell lineage-specific dataset. To test the accuracy of this algorithm, we collected a time-course RNA-Seq dataset during human erythropoiesis to infer regulatory relationships specific to this differentiation process. The resulting erythroid-specific regulatory network reveals novel regulatory relationships activated in erythropoiesis, which were further validated by genome-wide TR4 binding studies using ChIP-seq. These erythropoiesis-specific regulatory relationships were not identifiable by single dataset-based methods or context-independent integrations. Analysis of the predicted targets reveals that they are all closely associated with hematopoietic lineage differentiation.
Affiliation(s)
- Fan Zhu
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Lihong Shi
- State Key Laboratory of Experimental Hematology, Institute of Hematology and Blood Diseases Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin 300020, China
| | | | - Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA, Department of Internal Medicine, and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
31
|
Silva JV, Yoon S, Domingues S, Guimarães S, Goltsev AV, da Cruz E Silva EF, Mendes JFF, da Cruz E Silva OAB, Fardilha M. Amyloid precursor protein interaction network in human testis: sentinel proteins for male reproduction. BMC Bioinformatics 2015; 16:12. [PMID: 25591988 PMCID: PMC4384327 DOI: 10.1186/s12859-014-0432-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 05/22/2014] [Accepted: 12/16/2014] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Amyloid precursor protein (APP) is widely recognized for playing a central role in Alzheimer's disease pathogenesis. Although APP is expressed in several tissues outside the human central nervous system, the functions of APP and its family members in other tissues are still poorly understood. APP is involved in several biological functions that may be important for male fertility, such as cell adhesion, cell motility, signaling, and apoptosis. Furthermore, APP superfamily members are known to be associated with fertility. Knowledge of the protein networks of APP in human testis and spermatozoa will shed light on the function of APP in the male reproductive system. RESULTS: We performed a Yeast Two-Hybrid screen and a database search to study the interaction network of APP in human testis and sperm. To gain insights into the role of APP superfamily members in fertility, the study was extended to APP-like protein 2 (APLP2). We analyzed several topological properties of the APP interaction network, and characterized the biological and physiological properties of its proteins by gene ontology and pathway analyses. We classified significant features related to human male reproduction for the APP interacting proteins and identified modules of proteins with similar functional roles which may show cooperative behavior for male fertility. CONCLUSIONS: The present work provides the first report on the APP interactome in human testis. Our approach allowed the identification of novel interactions and recognition of key APP interacting proteins for male reproduction, particularly in sperm-oocyte interaction.
Affiliation(s)
- Joana Vieira Silva
- Laboratory of Signal Transduction, Centre for Cell Biology, Health Sciences Department and Biology Department, University of Aveiro, 3810-193, Aveiro, Portugal.
| | - Sooyeon Yoon
- Department of Physics, I3N, University of Aveiro, 3810-193, Aveiro, Portugal.
| | - Sara Domingues
- Laboratory of Neurosciences, Centre for Cell Biology, Health Sciences Department and Biology Department, University of Aveiro, 3810-193, Aveiro, Portugal.
| | - Sofia Guimarães
- Laboratory of Neurosciences, Centre for Cell Biology, Health Sciences Department and Biology Department, University of Aveiro, 3810-193, Aveiro, Portugal.
| | - Alexander V Goltsev
- Department of Physics, I3N, University of Aveiro, 3810-193, Aveiro, Portugal.
| | - Edgar Figueiredo da Cruz E Silva
- Laboratory of Signal Transduction, Centre for Cell Biology, Health Sciences Department and Biology Department, University of Aveiro, 3810-193, Aveiro, Portugal.
| | | | - Odete Abreu Beirão da Cruz E Silva
- Laboratory of Neurosciences, Centre for Cell Biology, Health Sciences Department and Biology Department, University of Aveiro, 3810-193, Aveiro, Portugal.
| | - Margarida Fardilha
- Laboratory of Signal Transduction, Centre for Cell Biology, Health Sciences Department and Biology Department, University of Aveiro, 3810-193, Aveiro, Portugal.
- Centro de Biologia Celular, SACS, Edifício 30, Universidade de Aveiro, 3810-193, Aveiro, Portugal.
| |
|
32
|
Dowell KG, Simons AK, Bai H, Kell B, Wang ZZ, Yun K, Hibbs MA. Novel insights into embryonic stem cell self-renewal revealed through comparative human and mouse systems biology networks. Stem Cells 2014; 32:1161-72. [PMID: 24307629 DOI: 10.1002/stem.1612] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 08/19/2013] [Accepted: 10/11/2013] [Indexed: 12/25/2022]
Abstract
Embryonic stem cells (ESCs), characterized by their ability to both self-renew and differentiate into multiple cell lineages, are a powerful model for biomedical research and developmental biology. Human and mouse ESCs share many features, yet have distinctive aspects, including fundamental differences in the signaling pathways and cell cycle controls that support self-renewal. Here, we explore the molecular basis of human ESC self-renewal using Bayesian network machine learning to integrate cell-type-specific, high-throughput data for gene function discovery. We integrated high-throughput ESC data from 83 human studies (~1.8 million data points collected under 1,100 conditions) and 62 mouse studies (~2.4 million data points collected under 1,085 conditions) into separate human and mouse predictive networks focused on ESC self-renewal to analyze shared and distinct functional relationships among protein-coding gene orthologs. Computational evaluations show that these networks are highly accurate, literature validation confirms their biological relevance, and reverse transcriptase polymerase chain reaction (RT-PCR) validation supports our predictions. Our results reflect the importance of key regulatory genes known to be strongly associated with self-renewal and pluripotency in both species (e.g., POU5F1, SOX2, and NANOG), identify metabolic differences between species (e.g., threonine metabolism), clarify differences between human and mouse ESC developmental signaling pathways (e.g., leukemia inhibitory factor (LIF)-activated JAK/STAT in mouse; NODAL/ACTIVIN-A-activated fibroblast growth factor in human), and reveal many novel genes and pathways predicted to be functionally associated with self-renewal in each species. These interactive networks are available online at www.StemSight.org for stem cell researchers to develop new hypotheses, discover potential mechanisms involving sparsely annotated genes, and prioritize genes of interest for experimental validation.
Affiliation(s)
- Karen G Dowell
- The Jackson Laboratory, Bar Harbor, Maine, USA; Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, Maine, USA
|
33
|
Xu Y, Guo M, Zou Q, Liu X, Wang C, Liu Y. System-level insights into the cellular interactome of a non-model organism: inferring, modelling and analysing functional gene network of soybean (Glycine max). PLoS One 2014; 9:e113907. [PMID: 25423109 PMCID: PMC4244207 DOI: 10.1371/journal.pone.0113907] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 06/23/2014] [Accepted: 10/24/2014] [Indexed: 01/30/2023]
Abstract
The cellular interactome, in which genes and/or their products interact on several levels to form transcriptional regulatory, protein interaction, metabolic, and signal transduction networks, among others, has been a focus of research for decades. However, any one specific type of network alone can hardly explain the various interactive activities among genes. These networks characterize different interaction relationships, implying their unique intrinsic properties and defects, and covering different slices of biological information. Functional gene networks (FGNs), consolidated interaction networks that model a fuzzy and more generalized notion of gene-gene relations, have been proposed to combine heterogeneous networks with the goal of identifying functional modules supported by multiple interaction types. There are as yet no successful precedents of FGNs for sparsely studied non-model organisms, such as soybean (Glycine max), due to the absence of sufficient heterogeneous interaction data. We present an alternative solution for inferring the FGNs of soybean (SoyFGNs), in a pioneering study on the soybean interactome, which is also applicable to other organisms. SoyFGNs exhibit the typical characteristics of biological networks: scale-free, small-world architecture and modularization. Verified by co-expression and KEGG pathways, SoyFGNs are more extensive and accurate than an orthology network derived from Arabidopsis. As a case study, network-guided disease-resistance gene discovery indicates that SoyFGNs can support system-level studies on gene functions and interactions. This work suggests that inferring and modelling the interactome of a non-model plant are feasible. It will speed up the discovery and definition of the functions and interactions of other genes that control important functions, such as nitrogen fixation and protein or lipid synthesis.
The efforts of the study are the basis of our further comprehensive studies on the soybean functional interactome at the genome and microRNome levels. Additionally, a web tool for information retrieval and analysis of SoyFGNs can be accessed at SoyFN: http://nclab.hit.edu.cn/SoyFN.
Affiliation(s)
- Yungang Xu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Maozu Guo
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Quan Zou
- School of Information Science and Technology, Xiamen University, Xiamen, China
| | - Xiaoyan Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Chunyu Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Yang Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| |
|
34
|
Li HD, Menon R, Omenn GS, Guan Y. Revisiting the identification of canonical splice isoforms through integration of functional genomics and proteomics evidence. Proteomics 2014; 14:2709-18. [PMID: 25265570 DOI: 10.1002/pmic.201400170] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 04/28/2014] [Revised: 08/11/2014] [Accepted: 09/23/2014] [Indexed: 01/08/2023]
Abstract
Canonical isoforms in different databases have been defined as the most prevalent, most conserved, most expressed, longest, or the one with the clearest description of domains or posttranslational modifications. In this article, we revisit these definitions of canonical isoforms based on functional genomics and proteomics evidence, focusing on mouse data. We report a novel functional relationship network-based approach for identifying the highest connected isoforms (HCIs). We show that 46% of these HCIs are not the longest transcripts. In addition, this approach revealed many genes that have more than one highly connected isoform. Averaged across 175 RNA-seq datasets covering diverse tissues and conditions, 65% of the HCIs show higher expression levels than nonhighest connected isoforms at the transcript level. At the protein level, these HCIs highly overlap with the expressed splice variants, based on proteomic data from eight different normal tissues. These results suggest that a more confident definition of canonical isoforms can be made through integration of multiple lines of evidence, including HCIs defined by biological processes and pathways, expression prevalence at the transcript level, and relative or absolute abundance at the protein level. This integrative proteogenomics approach can successfully identify principal isoforms that are responsible for the canonical functions of genes.
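The notion of a highest connected isoform in the abstract above can be illustrated with a small sketch that, for each gene, picks the isoform with the greatest total edge weight in an isoform-level network. Summed edge weight is an assumption made here for illustration; the abstract does not give the connectivity score the authors use. The function name `highest_connected_isoforms` and all identifiers are hypothetical.

```python
from collections import defaultdict


def highest_connected_isoforms(edges, isoform_to_gene):
    """Pick, per gene, the isoform with the largest summed edge weight.

    edges: iterable of (isoform_u, isoform_v, weight) from a functional network.
    isoform_to_gene: dict mapping each isoform identifier to its gene.
    Assumption: connectivity is approximated by the sum of incident weights.
    """
    strength = defaultdict(float)
    for u, v, w in edges:
        strength[u] += w
        strength[v] += w
    best = {}  # gene -> (isoform, strength)
    for isoform, gene in isoform_to_gene.items():
        s = strength.get(isoform, 0.0)
        if gene not in best or s > best[gene][1]:
            best[gene] = (isoform, s)
    return {gene: isoform for gene, (isoform, _) in best.items()}


# Hypothetical two-gene network with two isoforms per gene.
network = [("A.1", "B.1", 0.9), ("A.1", "B.2", 0.2), ("A.2", "B.2", 0.3)]
gene_of = {"A.1": "A", "A.2": "A", "B.1": "B", "B.2": "B"}
print(highest_connected_isoforms(network, gene_of))
```

Ties are broken by the iteration order of `isoform_to_gene`, so a real analysis would need an explicit tie rule, which matters given the abstract's note that some genes have more than one highly connected isoform.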
Affiliation(s)
- Hong-Dong Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
35
|
van Dam S, Craig T, de Magalhães JP. GeneFriends: a human RNA-seq-based gene and transcript co-expression database. Nucleic Acids Res 2014; 43:D1124-32. [PMID: 25361971 PMCID: PMC4383890 DOI: 10.1093/nar/gku1042] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.0] [Indexed: 12/20/2022] Open
Abstract
Co-expression networks have proven effective at assigning putative functions to genes based on the functional annotation of their co-expressed partners, in candidate gene prioritization studies and in improving our understanding of regulatory networks. The growing number of genome resequencing efforts and genome-wide association studies often identify loci containing novel genes and there is a need to infer their functions and interaction partners. To facilitate this we have expanded GeneFriends, an online database that allows users to identify co-expressed genes with one or more user-defined genes. This expansion entails an RNA-seq-based co-expression map that includes genes and transcripts that are not present in the microarray-based co-expression maps, including over 10 000 non-coding RNAs. The results users obtain from GeneFriends include a co-expression network as well as a summary of the functional enrichment among the co-expressed genes. Novel insights can be gathered from this database for different splice variants and ncRNAs, such as microRNAs and lincRNAs. Furthermore, our updated tool allows candidate transcripts to be linked to diseases and processes using a guilt-by-association approach. GeneFriends is freely available from http://www.GeneFriends.org and can be used to quickly identify and rank candidate targets relevant to the process or disease under study.
Affiliation(s)
- Sipko van Dam
- Integrative Genomics of Ageing Group, Institute of Integrative Biology, University of Liverpool, Liverpool, UK
- Thomas Craig
- Integrative Genomics of Ageing Group, Institute of Integrative Biology, University of Liverpool, Liverpool, UK
- João Pedro de Magalhães
- Integrative Genomics of Ageing Group, Institute of Integrative Biology, University of Liverpool, Liverpool, UK
36
|
Wang J, Yang J, Mao S, Chai X, Hu Y, Hou X, Tang Y, Bi C, Li X. MitProNet: A knowledgebase and analysis platform of proteome, interactome and diseases for mammalian mitochondria. PLoS One 2014; 9:e111187. [PMID: 25347823 PMCID: PMC4210245 DOI: 10.1371/journal.pone.0111187] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/04/2014] [Accepted: 09/26/2014] [Indexed: 12/18/2022] Open
Abstract
The mitochondrion plays a central role in diverse biological processes in most eukaryotes, and its dysfunctions are critically involved in a large number of diseases and the aging process. A systematic identification of mitochondrial proteomes and characterization of functional linkages among mitochondrial proteins are fundamental in understanding the mechanisms underlying biological functions and human diseases associated with mitochondria. Here we present MitProNet, a database that provides a comprehensive knowledgebase for mitochondrial proteome, interactome and human diseases. First, an inventory of mammalian mitochondrial proteins was compiled by widely collecting proteomic datasets, and the proteins were classified by machine learning to achieve a high-confidence list of mitochondrial proteins. The current version of MitProNet covers 1124 high-confidence proteins, and the remainder were further classified as middle- or low-confidence. An organelle-specific network of functional linkages among mitochondrial proteins was then generated by integrating genomic features encoded by a wide range of datasets including genomic context, gene expression profiles, protein-protein interactions, functional similarity and metabolic pathways. The functional-linkage network should be a valuable resource for the study of biological functions of mitochondrial proteins and human mitochondrial diseases. Furthermore, we utilized the network to predict candidate genes for mitochondrial diseases using prioritization algorithms. All proteins, functional linkages and disease candidate genes in MitProNet were annotated according to the information collected from their original sources including GO, GEO, OMIM, KEGG, MIPS, HPRD and so on. MitProNet features a user-friendly graphic visualization interface to present functional analysis of linkage networks. As an up-to-date database and analysis platform, MitProNet should be particularly helpful in comprehensive studies of complicated biological mechanisms underlying mitochondrial functions and human mitochondrial diseases. MitProNet is freely accessible at http://bio.scu.edu.cn:8085/MitProNet.
Affiliation(s)
- Jiabin Wang
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
- Jian Yang
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
- Song Mao
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
- Xiaoqiang Chai
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
- Yuling Hu
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
- Xugang Hou
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
- Yiheng Tang
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
- Cheng Bi
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
- Xiao Li
- College of Life Sciences, Sichuan University, Ministry of Education Key Laboratory for Bio-resource and Eco-environment, Sichuan Key Laboratory of Molecular Biology and Biotechnology, Chengdu, People’s Republic of China
37
|
Zhu F, Shi L, Li H, Eksi R, Engel JD, Guan Y. Modeling dynamic functional relationship networks and application to ex vivo human erythroid differentiation. Bioinformatics 2014; 30:3325-33. [PMID: 25115705 DOI: 10.1093/bioinformatics/btu542] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 01/05/2023]
Abstract
MOTIVATION: Functional relationship networks, which summarize the probability of co-functionality between any two genes in the genome, could complement the reductionist focus of modern biology for understanding diverse biological processes in an organism. One major limitation of the current networks is that they are static, while one might expect functional relationships to consistently reprogram during the differentiation of a cell lineage. To address this potential limitation, we developed a novel algorithm that leverages both differentiation stage-specific expression data and large-scale heterogeneous functional genomic data to model such dynamic changes. We then applied this algorithm to the time-course RNA-Seq data we collected for ex vivo human erythroid cell differentiation. RESULTS: Through computational cross-validation and literature validation, we show that the resulting networks correctly predict the (de)-activated functional connections between genes during erythropoiesis. We identified known critical genes, such as HBD and GATA1, and functional connections during erythropoiesis using these dynamic networks, while the traditional static network was not able to provide such information. Furthermore, by comparing the static and the dynamic networks, we identified novel genes (such as OSBP2 and PDZK1IP1) that are potential drivers of erythroid cell differentiation. This novel method of modeling dynamic networks is applicable to other differentiation processes where time-course genome-scale expression data are available, and should assist in generating greater understanding of the functional dynamics at play across the genome during development. AVAILABILITY AND IMPLEMENTATION: The network described in this article is available at http://guanlab.ccmb.med.umich.edu/stageSpecificNetwork.
Affiliation(s)
- Fan Zhu
- Department of Computational Medicine and Bioinformatics, Department of Cell and Developmental Biology, Department of Internal Medicine and Department of Computer Science and Engineering, University of Michigan, MI48109, USA
- Lihong Shi
- Department of Computational Medicine and Bioinformatics, Department of Cell and Developmental Biology, Department of Internal Medicine and Department of Computer Science and Engineering, University of Michigan, MI48109, USA
- Hongdong Li
- Department of Computational Medicine and Bioinformatics, Department of Cell and Developmental Biology, Department of Internal Medicine and Department of Computer Science and Engineering, University of Michigan, MI48109, USA
- Ridvan Eksi
- Department of Computational Medicine and Bioinformatics, Department of Cell and Developmental Biology, Department of Internal Medicine and Department of Computer Science and Engineering, University of Michigan, MI48109, USA
- James Douglas Engel
- Department of Computational Medicine and Bioinformatics, Department of Cell and Developmental Biology, Department of Internal Medicine and Department of Computer Science and Engineering, University of Michigan, MI48109, USA
- Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, Department of Cell and Developmental Biology, Department of Internal Medicine and Department of Computer Science and Engineering, University of Michigan, MI48109, USA
38
|
Li HD, Menon R, Omenn GS, Guan Y. The emerging era of genomic data integration for analyzing splice isoform function. Trends Genet 2014; 30:340-7. [PMID: 24951248 DOI: 10.1016/j.tig.2014.05.005] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 03/18/2014] [Revised: 05/21/2014] [Accepted: 05/23/2014] [Indexed: 01/17/2023]
Abstract
The vast majority of multi-exon genes in humans undergo alternative splicing, which greatly increases the functional diversity of protein species. Predicting functions at the isoform level is essential to further our understanding of developmental abnormalities and cancers, which frequently exhibit aberrant splicing and dysregulation of isoform expression. However, determination of isoform function is very difficult, and efforts to predict isoform function have been limited in the functional genomics field. Deep sequencing of RNA now provides an unprecedented amount of expression data at the transcript level. We describe here emerging computational approaches that integrate such large-scale whole-transcriptome sequencing (RNA-seq) data for predicting the functions of alternatively spliced isoforms, and we discuss their applications in developmental and cancer biology. We outline future directions for isoform function prediction, emphasizing the need for heterogeneous genomic data integration and tissue-specific, dynamic isoform-level network modeling, which will allow the field to realize its full potential.
Affiliation(s)
- Hong-Dong Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Rajasree Menon
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Gilbert S Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, MI, USA
- Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, MI, USA; Department of Electrical Engineering and Computer Science, Ann Arbor, MI, USA.
39
|
Recla JM, Robledo RF, Gatti DM, Bult CJ, Churchill GA, Chesler EJ. Precise genetic mapping and integrative bioinformatics in Diversity Outbred mice reveals Hydin as a novel pain gene. Mamm Genome 2014; 25:211-22. [PMID: 24700285 PMCID: PMC4032469 DOI: 10.1007/s00335-014-9508-0] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 01/14/2014] [Accepted: 03/05/2014] [Indexed: 12/21/2022]
Abstract
Mouse genetics is a powerful approach for discovering genes and other genome features influencing human pain sensitivity. Genetic mapping studies have historically been limited by low mapping resolution of conventional mouse crosses, resulting in pain-related quantitative trait loci (QTL) spanning several megabases and containing hundreds of candidate genes. The recently developed Diversity Outbred (DO) population is derived from the same eight inbred founder strains as the Collaborative Cross, including three wild-derived strains. DO mice offer increased genetic heterozygosity and allelic diversity compared to crosses involving standard mouse strains. The high rate of recombinatorial precision afforded by DO mice makes them an ideal resource for high-resolution genetic mapping, allowing the circumvention of costly fine-mapping studies. We utilized a cohort of ~300 DO mice to map a 3.8 Mbp QTL on chromosome 8 associated with acute thermal pain sensitivity, which we have tentatively named Tpnr6. We used haplotype block partitioning to narrow Tpnr6 to a width of ~230 Kbp, reducing the number of putative candidate genes from 44 to 3. The plausibility of each candidate gene’s role in pain response was assessed using an integrative bioinformatics approach, combining data related to protein domain, biological annotation, gene expression pattern, and protein functional interaction. Our results reveal a novel, putative role for the protein-coding gene, Hydin, in thermal pain response, possibly through the gene’s role in ciliary motility in the choroid plexus–cerebrospinal fluid system of the brain. Real-time quantitative-PCR analysis showed no expression differences in Hydin transcript levels between pain-sensitive and pain-resistant mice, suggesting that Hydin may influence hot-plate behavior through other biological mechanisms.
Affiliation(s)
- Jill M Recla
- IGERT Program in Functional Genomics, Graduate School of Biomedical Sciences and Engineering, The University of Maine, Orono, ME, 04469, USA
40
|
Musso G, Tasan M, Mosimann C, Beaver JE, Plovie E, Carr LA, Chua HN, Dunham J, Zuberi K, Rodriguez H, Morris Q, Zon L, Roth FP, MacRae CA. Novel cardiovascular gene functions revealed via systematic phenotype prediction in zebrafish. Development 2014; 141:224-35. [PMID: 24346703 DOI: 10.1242/dev.099796] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 12/25/2022]
Abstract
Comprehensive functional annotation of vertebrate genomes is fundamental to biological discovery. Reverse genetic screening has been highly useful for determination of gene function, but is untenable as a systematic approach in vertebrate model organisms given the number of surveyable genes and observable phenotypes. Unbiased prediction of gene-phenotype relationships offers a strategy to direct finite experimental resources towards likely phenotypes, thus maximizing de novo discovery of gene functions. Here we prioritized genes for phenotypic assay in zebrafish through machine learning, predicting the effect of loss of function of each of 15,106 zebrafish genes on 338 distinct embryonic anatomical processes. Focusing on cardiovascular phenotypes, the learning procedure predicted known knockdown and mutant phenotypes with high precision. In proof-of-concept studies we validated 16 high-confidence cardiac predictions using targeted morpholino knockdown and initial blinded phenotyping in embryonic zebrafish, confirming a significant enrichment for cardiac phenotypes as compared with morpholino controls. Subsequent detailed analyses of cardiac function confirmed these results, identifying novel physiological defects for 11 tested genes. Among these we identified tmem88a, a recently described attenuator of Wnt signaling, as a discrete regulator of the patterning of intercellular coupling in the zebrafish cardiac epithelium. Thus, we show that systematic prioritization in zebrafish can accelerate the pace of developmental gene function discovery.
Affiliation(s)
- Gabriel Musso
- Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
41
|
Xu Y, Guo M, Liu X, Wang C, Liu Y. SoyFN: a knowledge database of soybean functional networks. Database (Oxford) 2014; 2014:bau019. [PMID: 24618044 PMCID: PMC3949006 DOI: 10.1093/database/bau019] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 11/21/2013] [Revised: 01/22/2014] [Accepted: 02/06/2014] [Indexed: 01/08/2023]
Abstract
Many databases for soybean genomic analysis have been built and made publicly available, but few of them contain knowledge specifically targeting the omics-level gene-gene, gene-microRNA (miRNA) and miRNA-miRNA interactions. Here, we present SoyFN, a knowledge database of soybean functional gene networks and miRNA functional networks. SoyFN provides user-friendly interfaces to retrieve, visualize, analyze and download the functional networks of soybean genes and miRNAs. In addition, it incorporates much information about KEGG pathways, gene ontology annotations and 3'-UTR sequences as well as many useful tools including SoySearch, ID mapping, Genome Browser, eFP Browser and promoter motif scan. SoyFN is a schema-free database that can be accessed as a Web service from any modern programming language using a simple Hypertext Transfer Protocol call. The Web site is implemented in Java, JavaScript, PHP, HTML and Apache, with all major browsers supported. We anticipate that this database will be useful for members of research communities both in soybean experimental science and bioinformatics. Database URL: http://nclab.hit.edu.cn/SoyFN.
Affiliation(s)
- Yungang Xu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China and School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China
- Maozu Guo
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China and School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China
- Xiaoyan Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China and School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China
- Chunyu Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China and School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China
- Yang Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China and School of Life Science and Technology, Harbin Institute of Technology, Harbin 150001, P.R. China
42
|
Liu H, Beck TN, Golemis EA, Serebriiskii IG. Integrating in silico resources to map a signaling network. Methods Mol Biol 2014; 1101:197-245. [PMID: 24233784 DOI: 10.1007/978-1-62703-721-1_11] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 12/23/2022]
Abstract
The abundance of publicly available life science databases offers a wealth of information that can support interpretation of experimentally derived data and greatly enhance hypothesis generation. Protein interaction and functional networks are not simply new renditions of existing data: they provide the opportunity to gain insights into the specific physical and functional role a protein plays as part of the biological system. In this chapter, we describe different in silico tools that can quickly and conveniently retrieve data from existing data repositories and we discuss how the available tools are best utilized for different purposes. While emphasizing protein-protein interaction databases (e.g., BioGrid and IntAct), we also introduce metasearch platforms such as STRING and GeneMANIA, pathway databases (e.g., BioCarta and Pathway Commons), text mining approaches (e.g., PubMed and Chilibot), and resources for drug-protein interactions, genetic information for model organisms and gene expression information based on microarray data mining. Furthermore, we provide a simple step-by-step protocol for building customized protein-protein interaction networks in Cytoscape, a powerful network assembly and visualization program, integrating data retrieved from these various databases. As we illustrate, generation of composite interaction networks enables investigators to extract significantly more information about a given biological system than utilization of a single database or sole reliance on primary literature.
Affiliation(s)
- Hanqing Liu
- Fox Chase Cancer Center, Philadelphia, PA, USA
43
|
Parameswaran S, Kumar S, Verma RS, Sharma RK. Cardiomyocyte culture - an update on the in vitro cardiovascular model and future challenges. Can J Physiol Pharmacol 2013; 91:985-98. [PMID: 24289068 DOI: 10.1139/cjpp-2013-0161] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 12/11/2022]
Abstract
The success of any work with isolated cardiomyocytes depends on the reproducibility of cell isolation, because the cells do not divide. To date, there is no suitable in vitro model to study human adult cardiac cell biology. Although embryonic stem cells and induced pluripotent stem cells are able to differentiate into cardiomyocytes in vitro, the efficiency of this process is low. Isolation and expansion of human cardiomyocyte progenitor cells from cardiac surgical waste or, alternatively, from fetal heart tissue is another option. However, to overcome various issues related to human tissue usage, especially ethical concerns, researchers use large- and small-animal models to study cardiac pathophysiology. A simple model to study the changes at the cellular level is cultures of cardiomyocytes. Although primary murine cardiomyocyte cultures have their own advantages and drawbacks, alternative strategies have been developed in the last two decades to minimise animal usage and interspecies differences. This review discusses the use of freshly isolated murine cardiomyocytes and cardiomyocyte alternatives for use in cardiac disease models and other related studies.
Affiliation(s)
- Sreejit Parameswaran
- Department of Pathology and Laboratory Medicine, College of Medicine, University of Saskatchewan, Saskatoon, SK S7N 0W8, Canada
44
|
Huang J, Qin Y, Liu B, Li GY, Ouyang L, Wang JH. In silico analysis and experimental validation of molecular mechanisms of salvianolic acid A-inhibited LPS-stimulated inflammation, in RAW264.7 macrophages. Cell Prolif 2013; 46:595-605. [PMID: 24033467 DOI: 10.1111/cpr.12056] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 04/29/2013] [Accepted: 06/03/2013] [Indexed: 11/27/2022] Open
Abstract
OBJECTIVES: The aim of this study was to explore mechanisms by which salvianolic acid A (SAA) revealed its anti-inflammatory activity, in lipopolysaccharide (LPS)-stimulated RAW264.7 cells. MATERIALS AND METHODS: Nitric oxide (NO) concentration was determined by the Griess reaction and cell viability was assessed by MTT assay. Interleukin-6, TNFα and interleukin-1β were determined by ELISA. The RAW264.7 cells were transfected with siRNA against p38 or HO-1. Expressions of COX-2, inducible NO synthase (iNOS), NF-κB, HO-1, p-p38 and phosphorylation of IκB kinase α/β were detected by western blotting. Potential targets of SAA were analysed by homology modelling, target prediction, protein-protein interaction prediction and docking studies. RESULTS: Salvianolic acid A suppressed LPS-triggered production of NO, TNFα and interleukin-6. It also reduced protein expression of inducible NO synthase and COX-2, and reduced translocation of NF-κB to nuclei. Moreover, SAA promoted expression of phosphorylated p38, and downstream HO-1. Zn (II) protoporphyrin IX, a specific inhibitor of HO-1, or siRNA against HO-1 could effectively increase transfer of NF-κB. SAA was predicted to target amyloid-beta protein-like protein and arachidonate 5-lipoxygenase, that could regulate p38 and HO-1. CONCLUSIONS: In silico analysis and experimental validation together demonstrated that SAA exhibited its anti-inflammatory effect via the p38-HO-1 pathway in LPS-stimulated RAW264.7 cells, reduced transfer of NF-κB to the nuclei and thus reduced production of inflammatory mediators.
Affiliation(s)
- J Huang
- School of Traditional Chinese Materia Medica, Shenyang Pharmaceutical University, Shenyang, 110016, China
45
|
Dowell KG, Simons AK, Wang ZZ, Yun K, Hibbs MA. Cell-type-specific predictive network yields novel insights into mouse embryonic stem cell self-renewal and cell fate. PLoS One 2013; 8:e56810. [PMID: 23468881 PMCID: PMC3585227 DOI: 10.1371/journal.pone.0056810] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 09/21/2012] [Accepted: 01/14/2013] [Indexed: 01/25/2023] Open
Abstract
Self-renewal, the ability of a stem cell to divide repeatedly while maintaining an undifferentiated state, is a defining characteristic of all stem cells. Here, we clarify the molecular foundations of mouse embryonic stem cell (mESC) self-renewal by applying a proven Bayesian network machine learning approach to integrate high-throughput data for protein function discovery. By focusing on a single stem-cell system, at a specific developmental stage, within the context of well-defined biological processes known to be active in that cell type, we produce a consensus predictive network that reflects biological reality more closely than those made by prior efforts using more generalized, context-independent methods. In addition, we show how machine learning efforts may be misled if the tissue specific role of mammalian proteins is not defined in the training set and circumscribed in the evidential data. For this study, we assembled an extensive compendium of mESC data: ∼2.2 million data points, collected from 60 different studies, under 992 conditions. We then integrated these data into a consensus mESC functional relationship network focused on biological processes associated with embryonic stem cell self-renewal and cell fate determination. Computational evaluations, literature validation, and analyses of predicted functional linkages show that our results are highly accurate and biologically relevant. Our mESC network predicts many novel players involved in self-renewal and serves as the foundation for future pluripotent stem cell studies. This network can be used by stem cell researchers (at http://StemSight.org) to explore hypotheses about gene function in the context of self-renewal and to prioritize genes of interest for experimental validation.
Affiliation(s)
- Karen G. Dowell
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, Maine, United States of America
- Allen K. Simons
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Zack Z. Wang
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, Maine, United States of America
- Johns Hopkins University, Baltimore, Maryland, United States of America
- Kyuson Yun
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, Maine, United States of America
- Matthew A. Hibbs
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, Maine, United States of America
- Trinity University, Department of Computer Science, San Antonio, Texas, United States of America
46
|
Reyes-Palomares A, Rodríguez-López R, Ranea JAG, Jiménez FS, Medina MA. Global analysis of the human pathophenotypic similarity gene network merges disease module components. PLoS One 2013; 8:e56653. [PMID: 23437198 PMCID: PMC3578923 DOI: 10.1371/journal.pone.0056653] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 08/29/2012] [Accepted: 01/12/2013] [Indexed: 12/22/2022] Open
Abstract
The molecular complexity of genetic diseases requires novel approaches to break it down into coherent biological modules. For this purpose, many disease network models have been created and analyzed. We highlight two of them, "the human diseases networks" (HDN) and "the orphan disease networks" (ODN). However, in these models, each single node represents one disease or an ambiguous group of diseases. In these cases, the notion of diseases as unique entities reduces the usefulness of network-based methods. We hypothesize that using the clinical features (pathophenotypes) to define pathophenotypic connections between disease-causing genes improves our understanding of the molecular events originated by genetic disturbances. For this, we have built a pathophenotypic similarity gene network (PSGN) and compared it with the unipartite projections (based on gene-to-gene edges) similar to those used in previous network models (HDN and ODN). Unlike these disease network models, the PSGN uses semantic similarities. This pathophenotypic similarity has been calculated by comparing pathophenotypic annotations of genes (human abnormalities of HPO terms) in the "Human Phenotype Ontology". The resulting network contains 1075 genes (nodes) and 26197 significant pathophenotypic similarities (edges). A global analysis of this network reveals: unnoticed pairs of genes showing significant pathophenotypic similarity, a biologically meaningful re-arrangement of the pathological relationships between genes, correlations of biochemical interactions with higher similarity scores and functional biases in metabolic and essential genes toward the pathophenotypic specificity and the pleiotropy, respectively. Additionally, pathophenotypic similarities and metabolic interactions of genes associated with maple syrup urine disease (MSUD) have been used to merge them into a coherent pathological module. Our results indicate that pathophenotypes help to identify underlying co-dependencies among disease-causing genes that are useful to describe disease modularity.
Affiliation(s)
- Armando Reyes-Palomares
  - Department of Molecular Biology and Biochemistry, Faculty of Sciences, University of Málaga, Málaga, Spain
  - CIBER de Enfermedades Raras (CIBERER), Málaga, Spain
- Rocío Rodríguez-López
  - Department of Molecular Biology and Biochemistry, Faculty of Sciences, University of Málaga, Málaga, Spain
  - CIBER de Enfermedades Raras (CIBERER), Málaga, Spain
- Juan A. G. Ranea
  - Department of Molecular Biology and Biochemistry, Faculty of Sciences, University of Málaga, Málaga, Spain
  - CIBER de Enfermedades Raras (CIBERER), Málaga, Spain
- Francisca Sánchez Jiménez
  - Department of Molecular Biology and Biochemistry, Faculty of Sciences, University of Málaga, Málaga, Spain
  - CIBER de Enfermedades Raras (CIBERER), Málaga, Spain
- Miguel Angel Medina
  - Department of Molecular Biology and Biochemistry, Faculty of Sciences, University of Málaga, Málaga, Spain
  - CIBER de Enfermedades Raras (CIBERER), Málaga, Spain
|
47
|
Abstract
Complex diseases are caused by a combination of genetic and environmental factors. Uncovering the molecular pathways through which genetic factors affect a phenotype is always difficult, but in the case of complex diseases this is further complicated since genetic factors in affected individuals might be different. In recent years, systems biology approaches and, more specifically, network-based approaches emerged as powerful tools for studying complex diseases. These approaches are often built on the knowledge of physical or functional interactions between molecules, which are usually represented as an interaction network. An interaction network not only reports the binary relationships between individual nodes but also encodes hidden higher-level organization of cellular communication. Computational biologists were challenged with the task of uncovering this organization and utilizing it for the understanding of disease complexity, which prompted rich and diverse algorithmic approaches to be proposed. We start this chapter with a description of the general characteristics of complex diseases followed by a brief introduction to physical and functional networks. Next we will show how these networks are used to leverage genotype, gene expression, and other types of data to identify dysregulated pathways, infer the relationships between genotype and phenotype, and explain disease heterogeneity. We group the methods by common underlying principles and first provide a high-level description of the principles followed by more specific examples. We hope that this chapter will give readers an appreciation for the wealth of algorithmic techniques that have been developed for the purpose of studying complex diseases as well as insight into their strengths and limitations.
Affiliation(s)
- Dong-Yeon Cho
  - National Center for Biotechnology Information, NLM, NIH, Bethesda, Maryland, United States of America
- Yoo-Ah Kim
  - National Center for Biotechnology Information, NLM, NIH, Bethesda, Maryland, United States of America
- Teresa M. Przytycka
  - National Center for Biotechnology Information, NLM, NIH, Bethesda, Maryland, United States of America
|
48
|
Wang PI, Hwang S, Kincaid RP, Sullivan CS, Lee I, Marcotte EM. RIDDLE: reflective diffusion and local extension reveal functional associations for unannotated gene sets via proximity in a gene network. Genome Biol 2012; 13:R125. [PMID: 23268829 PMCID: PMC4056375 DOI: 10.1186/gb-2012-13-12-r125] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 06/27/2012] [Accepted: 12/26/2012] [Indexed: 01/08/2023]
Abstract
The growing availability of large-scale functional networks has promoted the development of many successful techniques for predicting functions of genes. Here we extend these network-based principles and techniques to functionally characterize whole sets of genes. We present RIDDLE (Reflective Diffusion and Local Extension), which uses well developed guilt-by-association principles upon a human gene network to identify associations of gene sets. RIDDLE is particularly adept at characterizing sets with no annotations, a major challenge where most traditional set analyses fail. Notably, RIDDLE found microRNA-450a to be strongly implicated in ocular diseases and development. A web application is available at http://www.functionalnet.org/RIDDLE.
|
49
|
Guan Y, Gorenshteyn D, Burmeister M, Wong AK, Schimenti JC, Handel MA, Bult CJ, Hibbs MA, Troyanskaya OG. Tissue-specific functional networks for prioritizing phenotype and disease genes. PLoS Comput Biol 2012; 8:e1002694. [PMID: 23028291 PMCID: PMC3459891 DOI: 10.1371/journal.pcbi.1002694] [Citation(s) in RCA: 88] [Impact Index Per Article: 7.3] [Received: 02/01/2012] [Accepted: 08/02/2012] [Indexed: 12/16/2022]
Abstract
Integrated analyses of functional genomics data have enormous potential for identifying phenotype-associated genes. Tissue-specificity is an important aspect of many genetic diseases, reflecting the potentially different roles of proteins and pathways in diverse cell lineages. Accounting for tissue specificity in global integration of functional genomics data is challenging, as “functionality” and “functional relationships” are often not resolved for specific tissue types. We address this challenge by generating tissue-specific functional networks, which can effectively represent the diversity of protein function for more accurate identification of phenotype-associated genes in the laboratory mouse. Specifically, we created 107 tissue-specific functional relationship networks through integration of genomic data utilizing knowledge of tissue-specific gene expression patterns. Cross-network comparison revealed significantly changed genes enriched for functions related to specific tissue development. We then utilized these tissue-specific networks to predict genes associated with different phenotypes. Our results demonstrate that prediction performance is significantly improved through using the tissue-specific networks as compared to the global functional network. We used a testis-specific functional relationship network to predict genes associated with male fertility and spermatogenesis phenotypes, and experimentally confirmed one top prediction, Mybl1. We then focused on a less-common genetic disease, ataxia, and identified candidates uniquely predicted by the cerebellum network, which are supported by both literature and experimental evidence. Our systems-level, tissue-specific scheme advances over traditional global integration and analyses and establishes a prototype to address the tissue-specific effects of genetic perturbations, diseases and drugs.
Tissue specificity is an important aspect of many genetic diseases, reflecting the potentially different roles of proteins and pathways in diverse cell lineages. We propose an effective strategy to model tissue-specific functional relationship networks in the laboratory mouse. We integrated large-scale genomics datasets as well as low-throughput tissue-specific expression profiles to estimate the probability that two proteins are co-functioning in the tissue under study. These networks can accurately reflect the diversity of protein functions across different organs and tissue compartments. By computationally exploring the tissue-specific networks, we can accurately predict novel phenotype-related gene candidates. We experimentally confirmed that a top candidate gene, Mybl1, predicted based on male-reproductive system-specific networks, affects several male fertility phenotypes, and we predicted candidates related to a rare genetic disease, ataxia, which are supported by experimental and literature evidence. The above results demonstrate the power of modeling tissue-specific dynamics of co-functionality through computational approaches.
Affiliation(s)
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
  - Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, United States of America
- Dmitriy Gorenshteyn
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
  - Department of Molecular Biology, Princeton University, Princeton, New Jersey, United States of America
- Margit Burmeister
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
  - Molecular & Behavioral Neuroscience Institution, Department of Psychiatry, and Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
- Aaron K. Wong
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- John C. Schimenti
  - Department of Biomedical Sciences, College of Veterinary Medicine, Cornell University, Ithaca, New York, United States of America
- Mary Ann Handel
  - The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Carol J. Bult
  - The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Matthew A. Hibbs
  - The Jackson Laboratory, Bar Harbor, Maine, United States of America
  - Trinity University, Computer Science Department, San Antonio, Texas, United States of America
  - * E-mail: (MAH); (OGT)
- Olga G. Troyanskaya
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
  - Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
  - * E-mail: (MAH); (OGT)
|
50
|
Abstract
Omics, including genomics, proteomics, and metabolomics, enable us to explain symbioses in terms of the underlying molecules and their interactions. The central task is to transform molecular catalogs of genes, metabolites, etc., into a dynamic understanding of symbiosis function. We review four exemplars of omics studies that achieve this goal, through defined biological questions relating to metabolic integration and regulation of animal-microbial symbioses, the genetic autonomy of bacterial symbionts, and symbiotic protection of animal hosts from pathogens. As omic datasets become increasingly complex, computationally sophisticated downstream analyses are essential to reveal interactions not evident from visual inspection of the data. We discuss two approaches, phylogenomics and transcriptional clustering, that can divide the primary output of omics studies (long lists of factors) into manageable subsets, and we describe how they have been applied to analyze large datasets and generate testable hypotheses.
Affiliation(s)
- J Chaston
  - Department of Entomology, Comstock Hall, Cornell University, Ithaca, New York 14853, USA
|