1. Weng Z, Yue Z, Zhu Y, Chen JY. DEMA: a distance-bounded energy-field minimization algorithm to model and layout biomolecular networks with quantitative features. Bioinformatics 2022; 38:i359-i368. [PMID: 35758816 PMCID: PMC9235497 DOI: 10.1093/bioinformatics/btac261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
SUMMARY In biology, graph layout algorithms can reveal comprehensive biological contexts by visually positioning graph nodes in their relevant neighborhoods. A layout software algorithm/engine commonly takes a set of nodes and edges and produces layout coordinates of nodes according to edge constraints. However, current layout engines normally do not consider node, edge or node-set properties during layout and only curate these properties after the layout is created. Here, we propose a new layout algorithm, the distance-bounded energy-field minimization algorithm (DEMA), to natively consider various biological factors, i.e., the strength of gene-to-gene association, the gene's relative contribution weight and the functional groups of genes, to enhance the interpretation of complex network graphs. In DEMA, we introduce a parameterized energy model where nodes are repelled by the network topology and attracted by a few biological factors, i.e., interaction coefficient, effect coefficient and fold change of gene expression. We generalize these factors as gene weights, protein-protein interaction weights, gene-to-gene correlations and the gene set annotations: four parameterized functional properties used in DEMA. Moreover, DEMA considers further attraction/repulsion/grouping coefficients to enable different preferences in generating network views. Applying DEMA, we performed two case studies using genetic data in autism spectrum disorder and Alzheimer's disease, respectively, for gene candidate discovery. Furthermore, we implement our algorithm as a plugin to Cytoscape, an open-source software platform for visualizing networks, which makes it convenient to use. Our software and demo can be freely accessed at http://discovery.informatics.uab.edu/dema. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
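The abstract describes an energy model in which nodes repel one another and biological weights pull them together. The abstract does not give DEMA's actual energy function, so the following is only a minimal sketch of a generic weighted force-directed layout: the logarithmic repulsion, the quadratic spring attraction, the step clamp, and every name (`layout`, `weights`, `attraction`, `repulsion`) are assumptions made for illustration. It does not implement DEMA's distance bound or its grouping term.

```python
import math
import random


def layout(nodes, edges, weights=None, attraction=1.0, repulsion=1.0,
           steps=500, lr=0.02, seed=0):
    """Toy layout: every node pair repels, weighted edges attract.

    This is NOT the DEMA energy function; it is an assumed stand-in.
    """
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    weights = weights or {}
    for _ in range(steps):
        force = {n: [0.0, 0.0] for n in nodes}
        # Pairwise repulsion with magnitude repulsion / distance.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d2 = dx * dx + dy * dy + 1e-9
                fx, fy = repulsion * dx / d2, repulsion * dy / d2
                force[a][0] += fx
                force[a][1] += fy
                force[b][0] -= fx
                force[b][1] -= fy
        # Spring-like attraction along edges, scaled by the edge weight.
        for a, b in edges:
            w = attraction * weights.get((a, b), 1.0)
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += w * dx
            force[a][1] += w * dy
            force[b][0] -= w * dx
            force[b][1] -= w * dy
        # Gradient step, clamped so near-coincident nodes cannot explode.
        for n in nodes:
            for k in (0, 1):
                step = max(-0.1, min(0.1, lr * force[n][k]))
                pos[n][k] += step
    return pos


def distance(pos, a, b):
    """Euclidean distance between two laid-out nodes."""
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])
```

With hypothetical nodes `["A", "B", "C"]`, edges `[("A", "B"), ("A", "C")]` and weights `{("A", "B"): 5.0, ("A", "C"): 0.5}`, the strongly weighted pair settles closer together than the weakly weighted one, which is the qualitative behavior the abstract attributes to association strength.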
Affiliation(s)
- Zhenyu Weng
- Communication and Information Security Lab, Institute of Big Data Technologies, Shenzhen Graduate School, Peking University, Shenzhen 518055, China
- Zongliang Yue
- Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Yuesheng Zhu
- Communication and Information Security Lab, Institute of Big Data Technologies, Shenzhen Graduate School, Peking University, Shenzhen 518055, China
- Jake Yue Chen
- Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA
2. Yue Z, Slominski R, Bharti S, Chen JY. PAGER Web APP: An Interactive, Online Gene Set and Network Interpretation Tool for Functional Genomics. Front Genet 2022; 13:820361. [PMID: 35495152 PMCID: PMC9039620 DOI: 10.3389/fgene.2022.820361] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/22/2021] [Accepted: 03/17/2022] [Indexed: 12/30/2022]
Abstract
Functional genomics studies have helped researchers annotate differentially expressed gene lists, extract gene expression signatures, and identify biological pathways from omics profiling experiments conducted on biological samples. The current geneset, network, and pathway analysis (GNPA) web servers, e.g., DAVID, EnrichR, WebGestaltR, or PAGER, do not allow automated integrative functional genomic downstream analysis. In this study, we developed a new web-based interactive application, "PAGER Web APP", which supports online R scripting of integrative GNPA. In a case study of melanoma drug resistance, we showed that the new PAGER Web APP enabled us to discover highly relevant pathways and network modules, leading to novel biological insights. We also compared PAGER Web APP's pathway analysis results retrieved among PAGER, EnrichR, and WebGestaltR to show its advantages in integrative GNPA. The interactive online web APP is publicly accessible from the link, https://aimed-lab.shinyapps.io/PAGERwebapp/.
Affiliation(s)
- Zongliang Yue
- Informatics Institute in the School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
- Radomir Slominski
- Informatics Institute in the School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
- Graduate Biomedical Sciences Program, The University of Alabama at Birmingham, Birmingham, AL, United States
- Samuel Bharti
- Informatics Institute in the School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
- Jake Y. Chen
- Informatics Institute in the School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
3. Ma X, Yan H, Yang J, Liu Y, Li Z, Sheng M, Cao Y, Yu X, Yi X, Xu W, Su Z. PlantGSAD: a comprehensive gene set annotation database for plant species. Nucleic Acids Res 2021; 50:D1456-D1467. [PMID: 34534340 PMCID: PMC8728169 DOI: 10.1093/nar/gkab794] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 07/15/2021] [Revised: 08/26/2021] [Accepted: 09/01/2021] [Indexed: 12/17/2022]
Abstract
With the accumulation of massive data sets from high-throughput experiments and the rapid emergence of new types of omics data, gene sets have become more diverse and essential for the refinement of gene annotation at multidimensional levels. Accordingly, we collected and defined 236 007 gene sets across different categories for 44 plant species in the Plant Gene Set Annotation Database (PlantGSAD). These gene sets were divided into nine main categories covering many functional subcategories, such as trait ontology, co-expression modules, chromatin states, and liquid-liquid phase separation. The annotations from the collected gene sets covered all of the genes in the Brassicaceae species Arabidopsis and Poaceae species Oryza sativa. Several GSEA tools are implemented in PlantGSAD to improve the efficiency of the analysis, including custom SEA for a flexible strategy based on customized annotations, SEACOMPARE for the cross-comparison of SEA results, and integrated visualization features for ontological analysis that intuitively reflects their parent-child relationships. In summary, PlantGSAD provides numerous gene sets for multiple plant species and highly efficient analysis tools. We believe that PlantGSAD will become a multifunctional analysis platform that can be used to predict and elucidate the functions and mechanisms of genes of interest. PlantGSAD is publicly available at http://systemsbiology.cau.edu.cn/PlantGSEAv2/.
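The abstract does not say which statistic the SEA tools use. As a hedged illustration only, the sketch below computes a one-sided hypergeometric over-representation p-value, a common choice for this kind of gene set analysis. The function name and arguments are hypothetical and are not PlantGSAD's interface.

```python
from math import comb


def hypergeom_enrichment_p(query_size, set_size, overlap, background_size):
    """P(X >= overlap) when query_size genes are drawn without replacement
    from a background of background_size genes, set_size of which belong
    to the annotated gene set."""
    upper = min(query_size, set_size)
    total = comb(background_size, query_size)
    # math.comb returns 0 when k > n, so impossible terms vanish.
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

For example, drawing 5 genes from a background of 10 and finding that all 5 fall in a 5-gene set has probability 1/252 under this null, while an overlap threshold of 0 always gives 1.0.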
Affiliation(s)
- Xuelian Ma
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Hengyu Yan
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Jiaotong Yang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Yue Liu
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Zhongqiu Li
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Minghao Sheng
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Yaxin Cao
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xinyue Yu
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xin Yi
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Wenying Xu
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Zhen Su
- State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
4. Yue Z, Yan D, Guo G, Chen JY. Biological Network Mining. Methods Mol Biol 2021; 2328:139-151. [PMID: 34251623 DOI: 10.1007/978-1-0716-1534-8_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
In this book chapter, we introduce a pipeline to mine significant biomedical entities (or bioentities) in biological networks. Our focus is on prioritizing both bioentities themselves and the associations between bioentities in order to reveal their biological functions. We will introduce three tools, BEERE, WIPER, and PAGER 2.0, that can be used together for network analysis and function interpretation: (1) BEERE is a network analysis tool for "Biomedical Entity Expansion, Ranking and Explorations," (2) WIPER is an entity-to-entity association ranking tool, and (3) PAGER 2.0 is a service for gene enrichment analysis.
Affiliation(s)
- Zongliang Yue
- The University of Alabama at Birmingham, Birmingham, AL, USA
- Da Yan
- The University of Alabama at Birmingham, Birmingham, AL, USA
- Guimu Guo
- The University of Alabama at Birmingham, Birmingham, AL, USA
- Jake Y Chen
- The University of Alabama at Birmingham, Birmingham, AL, USA
5. Powers RK, Goodspeed A, Pielke-Lombardo H, Tan AC, Costello JC. GSEA-InContext: identifying novel and common patterns in expression experiments. Bioinformatics 2019; 34:i555-i564. [PMID: 29950010 PMCID: PMC6022535 DOI: 10.1093/bioinformatics/bty271] [Citation(s) in RCA: 139] [Impact Index Per Article: 27.8] [Indexed: 12/20/2022]
Abstract
Motivation: Gene Set Enrichment Analysis (GSEA) is routinely used to analyze and interpret coordinate pathway-level changes in transcriptomics experiments. For an experiment where fewer than seven samples per condition are compared, GSEA employs a competitive null hypothesis to test significance. A gene set enrichment score is tested against a null distribution of enrichment scores generated from permuted gene sets, where genes are randomly selected from the input experiment. Looking across a variety of biological conditions, however, genes are not randomly distributed, with many showing consistent patterns of up- or down-regulation. As a result, common patterns of positively and negatively enriched gene sets are observed across experiments. Placing a single experiment into the context of a relevant set of background experiments allows us to identify both the common and experiment-specific patterns of gene set enrichment. Results: We compiled a compendium of 442 small molecule transcriptomic experiments and used GSEA to characterize common patterns of positively and negatively enriched gene sets. To identify experiment-specific gene set enrichment, we developed the GSEA-InContext method that accounts for gene expression patterns within a background set of experiments to identify statistically significantly enriched gene sets. We evaluated GSEA-InContext on experiments using small molecules with known targets to show that it successfully prioritizes gene sets that are specific to each experiment, thus providing valuable insights that complement standard GSEA analysis. Availability and implementation: GSEA-InContext, implemented in Python, supplementary results and the background expression compendium are available at: https://github.com/CostelloLab/GSEA-InContext.
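The gene set permutation test described in the abstract can be sketched roughly as follows. This is a simplified illustration, not the GSEA or GSEA-InContext implementation: it uses an unweighted running-sum enrichment score (standard GSEA weights hits by their ranking metric), draws null gene sets uniformly from the input ranking, and does not model the background-experiment sampling that GSEA-InContext adds. All function names and gene identifiers are hypothetical.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum statistic: step up at each gene in the set,
    step down otherwise, and return the largest deviation from zero."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def gene_set_permutation_p(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Compare the observed score with scores of random gene sets of the
    same size drawn from the ranked list (the competitive null)."""
    rng = random.Random(seed)
    members = [g for g in ranked_genes if g in gene_set]
    observed = enrichment_score(ranked_genes, set(members))
    extreme = 0
    for _ in range(n_perm):
        null_set = set(rng.sample(ranked_genes, len(members)))
        null_score = enrichment_score(ranked_genes, null_set)
        # One-sided comparison in the direction of the observed score.
        if (observed >= 0 and null_score >= observed) or \
           (observed < 0 and null_score <= observed):
            extreme += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)
```

A gene set whose members all sit at the top of the ranking reaches the maximum score, and random sets of the same size rarely match it, so its permutation p-value is small.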
Affiliation(s)
- Rani K Powers
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; Department of Pharmacology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Andrew Goodspeed
- Department of Pharmacology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Harrison Pielke-Lombardo
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; Department of Pharmacology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Aik-Choon Tan
- Department of Medical Oncology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- James C Costello
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; Department of Pharmacology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
6. Barradas-Bautista D, Rosell M, Pallara C, Fernández-Recio J. Structural Prediction of Protein–Protein Interactions by Docking: Application to Biomedical Problems. PROTEIN-PROTEIN INTERACTIONS IN HUMAN DISEASE, PART A 2018; 110:203-249. [DOI: 10.1016/bs.apcsb.2017.06.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/31/2022]
7. Chen JY, Pandey R, Nguyen TM. HAPPI-2: a Comprehensive and High-quality Map of Human Annotated and Predicted Protein Interactions. BMC Genomics 2017; 18:182. [PMID: 28212602 PMCID: PMC5314692 DOI: 10.1186/s12864-017-3512-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 05/08/2015] [Accepted: 01/24/2017] [Indexed: 01/07/2023]
Abstract
BACKGROUND Human protein-protein interaction (PPI) data is essential to network and systems biology studies. PPI data can help biochemists hypothesize how proteins form complexes by binding to each other, how extracellular signals propagate through post-translational modification of de-activated signaling molecules, and how chemical reactions are coupled by enzymes involved in a complex biological process. Our capability to develop good public database resources for human PPI data has a direct impact on the quality of future research on genome biology and medicine. RESULTS The database of Human Annotated and Predicted Protein Interactions (HAPPI) version 2.0 is a major update to the original HAPPI 1.0 database. It contains 2,922,202 unique protein-protein interactions (PPI) linked by 23,060 human proteins, making it the most comprehensive database covering human PPI data today. These PPIs contain both physical/direct interactions and high-quality functional/indirect interactions. Compared with the HAPPI 1.0 database release, HAPPI database version 2.0 (HAPPI-2) represents a 485% increase in human PPI data coverage and a 73% increase in protein coverage. The revamped HAPPI web portal provides users with a friendly search, curation, and data retrieval interface, allowing them to retrieve human PPIs and available annotation information on the interaction type, interaction quality, interacting partner drug targeting data, and disease information. The updated HAPPI-2 can be freely accessed by academic users at http://discovery.informatics.uab.edu/HAPPI. CONCLUSIONS While the underlying data for HAPPI-2 are integrated from diverse data sources, the new HAPPI-2 release represents a good balance between data coverage and data quality of human PPIs, making it ideally suited for network biology.
Affiliation(s)
- Jake Y Chen
- Wenzhou Medical University First Affiliate Hospital, Wenzhou, Zhejiang Province, China; Medeolinx, LLC, Indianapolis, IN, 46280, USA; The Informatics Institute, University of Alabama at Birmingham School of Medicine, Birmingham, AL, 35294, USA; Indiana Center for Systems Biology and Personalized Medicine, Indiana University School of Informatics and Computing, Indianapolis, IN, 46202, USA
- Thanh M Nguyen
- Indiana Center for Systems Biology and Personalized Medicine, Indiana University School of Informatics and Computing, Indianapolis, IN, 46202, USA
8. Pers TH. Gene set analysis for interpreting genetic studies. Hum Mol Genet 2016; 25:R133-R140. [PMID: 27511725 DOI: 10.1093/hmg/ddw249] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 05/31/2016] [Accepted: 07/18/2016] [Indexed: 02/03/2023]
Abstract
Interpretation of genome-wide association study (GWAS) results is lagging behind the discovery of new genetic associations. Consequently, there is an urgent need for data-driven methods for interpreting genetic association studies. Gene set analysis (GSA) can identify aetiologic pathways and functional annotations and may hence point towards novel biological insights. However, despite the growing availability of GSA tools, the sizeable number of variants identified for a vast number of complex traits, and many irrefutably trait-associated gene sets, the gap between discovery and interpretation remains. More efficient interpretation requires more complete and consistent gene set representations of biological pathways, phenotypes and functional annotations. In this review, I examine different types of gene sets, discuss how inconsistencies in gene set definitions impact GSA, describe how GSA has helped to elucidate biology and outline potential future directions.
Affiliation(s)
- Tune H Pers
- Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark; Novo Nordisk Foundation Centre for Basic Metabolic Research, Section of Metabolic Genetics, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
9. Yue Z, Kshirsagar MM, Nguyen T, Suphavilai C, Neylon MT, Zhu L, Ratliff T, Chen JY. PAGER: constructing PAGs and new PAG-PAG relationships for network biology. Bioinformatics 2015; 31:i250-7. [PMID: 26072489 PMCID: PMC4553834 DOI: 10.1093/bioinformatics/btv265] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 01/02/2023]
Abstract
In this article, we described a new database framework to perform integrative “gene-set, network, and pathway analysis” (GNPA). In this framework, we integrated heterogeneous data on pathways, annotated lists, and gene-sets (PAGs) into a PAG electronic repository (PAGER). PAGs in the PAGER database are organized into P-type, A-type and G-type PAGs with a three-letter-code standard naming convention. The PAGER database currently compiles 44 313 genes from 5 species including human, 38 663 PAGs, 324 830 gene–gene relationships and 3 174 323 PAG–PAG regulatory relationships of two types (co-membership based and regulatory relationship based). To help users assess each PAG’s biological relevance, we developed a cohesion measure called Cohesion Coefficient (CoCo), which is capable of disambiguating between biologically significant PAGs and random PAGs with an area-under-curve performance of 0.98. The PAGER database was set up to help users search and retrieve PAGs from its online web interface. PAGER enables advanced users to build PAG–PAG regulatory networks that provide complementary biological insights not found in gene set analysis or individual gene network analysis. We provide a case study using cancer functional genomics data sets to demonstrate how integrative GNPA helps improve network biology data coverage and therefore biological interpretability. The PAGER database can be accessed openly at http://discovery.informatics.iupui.edu/PAGER/. Contact: jakechen@iupui.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zongliang Yue
- Indiana University School of Informatics and Computing, Department of Computer and Information Science, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, Purdue University Center for Cancer Research, West Lafayette, IN 47906 and Institute of Biopharmaceutical Informatics and Technology, Wenzhou Medical University, WenZhou, Zhe Jiang Province, China
- Madhura M Kshirsagar
- Indiana University School of Informatics and Computing, Department of Computer and Information Science, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, Purdue University Center for Cancer Research, West Lafayette, IN 47906 and Institute of Biopharmaceutical Informatics and Technology, Wenzhou Medical University, WenZhou, Zhe Jiang Province, China
- Thanh Nguyen
- Indiana University School of Informatics and Computing, Department of Computer and Information Science, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, Purdue University Center for Cancer Research, West Lafayette, IN 47906 and Institute of Biopharmaceutical Informatics and Technology, Wenzhou Medical University, WenZhou, Zhe Jiang Province, China
- Chayaporn Suphavilai
- Indiana University School of Informatics and Computing, Department of Computer and Information Science, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, Purdue University Center for Cancer Research, West Lafayette, IN 47906 and Institute of Biopharmaceutical Informatics and Technology, Wenzhou Medical University, WenZhou, Zhe Jiang Province, China
- Michael T Neylon
- Indiana University School of Informatics and Computing, Department of Computer and Information Science, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, Purdue University Center for Cancer Research, West Lafayette, IN 47906 and Institute of Biopharmaceutical Informatics and Technology, Wenzhou Medical University, WenZhou, Zhe Jiang Province, China
- Liugen Zhu
- Indiana University School of Informatics and Computing, Department of Computer and Information Science, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, Purdue University Center for Cancer Research, West Lafayette, IN 47906 and Institute of Biopharmaceutical Informatics and Technology, Wenzhou Medical University, WenZhou, Zhe Jiang Province, China
- Timothy Ratliff
- Indiana University School of Informatics and Computing, Department of Computer and Information Science, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, Purdue University Center for Cancer Research, West Lafayette, IN 47906 and Institute of Biopharmaceutical Informatics and Technology, Wenzhou Medical University, WenZhou, Zhe Jiang Province, China
- Jake Y Chen
- Indiana University School of Informatics and Computing, Department of Computer and Information Science, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, Purdue University Center for Cancer Research, West Lafayette, IN 47906 and Institute of Biopharmaceutical Informatics and Technology, Wenzhou Medical University, WenZhou, Zhe Jiang Province, China
10. Suphavilai C, Zhu L, Chen JY. A method for developing regulatory gene set networks to characterize complex biological systems. BMC Genomics 2015; 16 Suppl 11:S4. [PMID: 26576648 PMCID: PMC4652563 DOI: 10.1186/1471-2164-16-s11-s4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/18/2023]
Abstract
Background: Traditional approaches to studying molecular networks are based on linking genes or proteins. Higher-level networks linking gene sets or pathways have been proposed recently. Several types of gene set networks have been used to study complex molecular networks, such as co-membership gene set networks (M-GSNs) and co-enrichment gene set networks (E-GSNs). Gene set networks are useful for studying the biological mechanisms of diseases and drug perturbations. Results: In this study, we proposed a new approach for constructing directed, regulatory gene set networks (R-GSNs) to reveal novel relationships among gene sets or pathways. We collected several gene set collections and high-quality gene regulation data in order to construct R-GSNs in a comparative study with co-membership gene set networks (M-GSNs). We described a method for constructing both global and disease-specific R-GSNs and determining their significance. To demonstrate the potential applications to disease biology studies, we constructed and analysed an R-GSN specifically built for Alzheimer's disease. Conclusions: R-GSNs can provide new biological insights complementary to those derived at the protein regulatory network level or M-GSNs. When integrated properly with functional genomics data, R-GSNs can help enable future research on systems biology and translational bioinformatics.
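As a rough illustration of a co-membership gene set network, the sketch below links two gene sets whenever they share enough member genes. The abstract does not state how M-GSN edges are scored, so the Jaccard index, the `min_jaccard` threshold, the function name, and the toy gene sets are all assumptions made for illustration. It does not attempt the directed, regulation-based R-GSN construction.

```python
from itertools import combinations


def co_membership_network(gene_sets, min_jaccard=0.2):
    """Return {(set_a, set_b): jaccard} for every pair of gene sets whose
    shared membership meets the (assumed) Jaccard threshold."""
    edges = {}
    for (a, genes_a), (b, genes_b) in combinations(sorted(gene_sets.items()), 2):
        shared = len(genes_a & genes_b)
        union = len(genes_a | genes_b)
        if union and shared / union >= min_jaccard:
            edges[(a, b)] = shared / union
    return edges
```

With the hypothetical input `{"P1": {"A", "B", "C"}, "P2": {"B", "C", "D"}, "P3": {"X", "Y"}}`, only `P1` and `P2` are connected, with a Jaccard score of 0.5.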
11. Karimpour-Fard A, Epperson LE, Hunter LE. A survey of computational tools for downstream analysis of proteomic and other omic datasets. Hum Genomics 2015; 9:28. [PMID: 26510531 PMCID: PMC4624643 DOI: 10.1186/s40246-015-0050-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 07/28/2015] [Accepted: 10/06/2015] [Indexed: 12/19/2022]
Abstract
Proteomics is an expanding area of research into biological systems with significance for biomedical and therapeutic applications ranging from understanding the molecular basis of diseases to testing new treatments, studying the toxicity of drugs, or biotechnological improvements in agriculture. Progress in proteomic technologies and growing interest has resulted in rapid accumulation of proteomic data, and consequently, a great number of tools have become available. In this paper, we review the well-known and ready-to-use tools for classification, clustering and validation, interpretation, and generation of biological information from experimental data. We suggest some rules of thumb for the reader on choosing the best suitable learning method for a particular dataset and conclude with pathway and functional analysis and then provide information about submitting final results to a repository.
Affiliation(s)
- Anis Karimpour-Fard
- Department of Pharmacology, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- L Elaine Epperson
- Integrated Center for Genes, Environment, and Health, National Jewish Health, Denver, CO, 80206, USA
- Lawrence E Hunter
- Department of Pharmacology, University of Colorado School of Medicine, Aurora, CO, 80045, USA
12. Bhat A, Dakna M, Mischak H. Integrating proteomics profiling data sets: a network perspective. Methods Mol Biol 2015; 1243:237-53. [PMID: 25384750 DOI: 10.1007/978-1-4939-1872-0_14] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/23/2022]
Abstract
Understanding disease mechanisms often requires complex and accurate integration of cellular pathways and molecular networks. Systems biology offers the possibility to provide a comprehensive map of the cell's intricate wiring network, which can ultimately help decipher the disease phenotype. Here, we describe what biological pathways are, how they function in normal and abnormal cellular systems, and the limitations faced by databases for integrating data, and we highlight how network models are emerging as a powerful integrative framework to understand and interpret the roles of proteins and peptides in diseases.
Affiliation(s)
- Akshay Bhat
- Mosaiques-Diagnostics GmbH, Mellendorfer Straße 7-9, D-30625, Hannover, Germany
13. Titz B, Elamin A, Martin F, Schneider T, Dijon S, Ivanov NV, Hoeng J, Peitsch MC. Proteomics for systems toxicology. Comput Struct Biotechnol J 2014; 11:73-90. [PMID: 25379146 PMCID: PMC4212285 DOI: 10.1016/j.csbj.2014.08.004] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Indexed: 12/26/2022]
Abstract
Current toxicology studies frequently lack measurements at molecular resolution to enable a more mechanism-based and predictive toxicological assessment. Recently, a systems toxicology assessment framework has been proposed, which combines conventional toxicological assessment strategies with system-wide measurement methods and computational analysis approaches from the field of systems biology. Proteomic measurements are an integral component of this integrative strategy because protein alterations closely mirror biological effects, such as biological stress responses or global tissue alterations. Here, we provide an overview of the technical foundations and highlight select applications of proteomics for systems toxicology studies. With a focus on mass spectrometry-based proteomics, we summarize the experimental methods for quantitative proteomics and describe the computational approaches used to derive biological/mechanistic insights from these datasets. To illustrate how proteomics has been successfully employed to address mechanistic questions in toxicology, we summarize several case studies. Overall, we provide the technical and conceptual foundation for the integration of proteomic measurements in a more comprehensive systems toxicology assessment framework. We conclude that, owing to the critical importance of protein-level measurements and recent technological advances, proteomics will be an integral part of integrative systems toxicology approaches in the future.
14. Pathway and network analysis in proteomics. J Theor Biol 2014; 362:44-52. [PMID: 24911777 DOI: 10.1016/j.jtbi.2014.05.031] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Received: 01/01/2014] [Revised: 05/15/2014] [Accepted: 05/21/2014] [Indexed: 12/14/2022]
Abstract
Proteomics is inherently a systems science that studies not only measured proteins and their expression in a cell, but also the interplay of proteins, protein complexes, signaling pathways, and network modules. There has been a rapid accumulation of proteomics data in recent years. However, proteomics data are highly variable, with results sensitive to data preparation methods, sample condition, instrument types, and analytical methods. To address the challenge in proteomics data analysis, we review current tools being developed to incorporate biological function and network topological information. We categorize these tools into four types: tools with basic functional information and few topological features (e.g., GO category analysis), tools with rich functional information and few topological features (e.g., GSEA), tools with basic functional information and rich topological features (e.g., Cytoscape), and tools with rich functional information and rich topological features (e.g., PathwayExpress). We first review the potential application of these tools to proteomics; then we review tools that can achieve automated learning of pathway modules and features, and tools that help perform integrated network visual analytics.
Collapse
|
15
|
GuhaThakurta D, Sheikh NA, Meagher TC, Letarte S, Trager JB. Applications of systems biology in cancer immunotherapy: from target discovery to biomarkers of clinical outcome. Expert Rev Clin Pharmacol 2014; 6:387-401. [DOI: 10.1586/17512433.2013.811814] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 12/31/2022]
|
16
|
Csermely P, Korcsmáros T, Kiss HJM, London G, Nussinov R. Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacol Ther 2013; 138:333-408. [PMID: 23384594 PMCID: PMC3647006 DOI: 10.1016/j.pharmthera.2013.01.016] [Citation(s) in RCA: 512] [Impact Index Per Article: 46.5] [Received: 01/22/2013] [Accepted: 01/22/2013] [Indexed: 02/02/2023]
Abstract
Despite considerable progress in genome- and proteome-based high-throughput screening methods and in rational drug design, the increase in approved drugs in the past decade did not match the increase of drug development costs. Network description and analysis not only give a systems-level understanding of drug action and disease complexity, but can also help to improve the efficiency of drug design. We give a comprehensive assessment of the analytical tools of network topology and dynamics. The state-of-the-art use of chemical similarity, protein structure, protein-protein interaction, signaling, genetic interaction and metabolic networks in the discovery of drug targets is summarized. We propose that network targeting follows two basic strategies. The "central hit strategy" selectively targets central nodes/edges of the flexible networks of infectious agents or cancer cells to kill them. The "network influence strategy" works against other diseases, where an efficient reconfiguration of rigid networks needs to be achieved by targeting the neighbors of central nodes/edges. It is shown how network techniques can help in the identification of single-target, edgetic, multi-target and allo-network drug target candidates. We review the recent boom in network methods helping hit identification, lead selection optimizing drug efficacy, as well as minimizing side-effects and drug toxicity. Successful network-based drug development strategies are shown through the examples of infections, cancer, metabolic diseases, neurodegenerative diseases and aging. Summarizing >1200 references we suggest an optimized protocol of network-aided drug development, and provide a list of systems-level hallmarks of drug quality. Finally, we highlight network-related drug development trends helping to achieve these hallmarks by a cohesive, global approach.
Affiliation(s)
- Peter Csermely
- Department of Medical Chemistry, Semmelweis University, P.O. Box 260, H-1444 Budapest 8, Hungary.
|
17
|
Wren JD, Dozmorov MG, Burian D, Kaundal R, Bridges S, Kupfer DM. Proceedings of the 2012 MidSouth computational biology and bioinformatics society (MCBIOS) conference. BMC Bioinformatics 2012; 13 Suppl 15:S1. [PMID: 23046182 PMCID: PMC3439718 DOI: 10.1186/1471-2105-13-s15-s1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/17/2022] Open
|