1
Abstract
A major endeavor of systems biology is the construction of graphical and computational models of biological pathways as a means to better understand their structure and function. Here, we present a protocol for a biologist-friendly graphical modeling scheme that facilitates the construction of detailed network diagrams, summarizing the components of a biological pathway (such as proteins and biochemicals) and illustrating how they interact. These diagrams can then be used to simulate activity flow through a pathway, thereby modeling its dynamic behavior. The protocol is divided into four sections: (i) assembly of network diagrams using the modified Edinburgh Pathway Notation (mEPN) scheme and yEd network editing software with pathway information obtained from published literature and databases of molecular interaction data; (ii) parameterization of the pathway model within yEd through the placement of 'tokens' on the basis of the known or imputed amount or activity of a component; (iii) model testing through visualization and quantitative analysis of the movement of tokens through the pathway, using the network analysis tool Graphia Professional and (iv) optimization of model parameterization and experimentation. This is the first modeling approach that combines a sophisticated notation scheme for depicting biological events at the molecular level with a Petri net-based flow simulation algorithm and a powerful visualization engine with which to observe the dynamics of the system being modeled. Unlike many mathematical approaches to modeling pathways, it does not require the construction of a series of equations or rate constants for model parameterization. Depending on a model's complexity and the availability of information, its construction can take days to months, and, with refinement, possibly years. However, once assembled and parameterized, a simulation run, even on a large model, typically takes only seconds. 
Models constructed using this approach provide a means of knowledge management, information exchange and, through the computational simulation of their dynamic activity, generation and testing of hypotheses, as well as prediction of a system's behavior when perturbed.
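The token-based idea in this abstract can be illustrated with a generic Petri net. The following is a minimal sketch under stated assumptions, not the flow simulation algorithm used by the protocol or by Graphia Professional: the firing rule (each enabled transition fires once per step, in list order), the `Transition` class, and the toy pathway with its place names are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """A Petri net transition: consumes one token from each input place
    and produces one token in each output place."""
    name: str
    inputs: tuple
    outputs: tuple


def step(marking, transitions):
    """Fire each enabled transition once, in list order (an assumed,
    simplified firing rule), and return the new marking."""
    new = dict(marking)
    for t in transitions:
        # A transition is enabled when every input place holds a token.
        if all(new.get(p, 0) > 0 for p in t.inputs):
            for p in t.inputs:
                new[p] -= 1
            for p in t.outputs:
                new[p] = new.get(p, 0) + 1
    return new


def simulate(marking, transitions, steps):
    """Return the list of markings, starting with the initial one."""
    history = [dict(marking)]
    for _ in range(steps):
        marking = step(marking, transitions)
        history.append(marking)
    return history


# Hypothetical toy pathway: ligand + receptor -> complex -> active_kinase
transitions = [
    Transition("binding", ("ligand", "receptor"), ("complex",)),
    Transition("activation", ("complex",), ("active_kinase",)),
]
# Parameterization: tokens stand in for the known or imputed amount
# of each component.
initial = {"ligand": 3, "receptor": 2, "complex": 0, "active_kinase": 0}

history = simulate(initial, transitions, steps=4)
for i, m in enumerate(history):
    print(i, m)
```

Changing the initial token counts and re-running corresponds loosely to the parameterization and testing stages of the protocol; no rate equations are involved, only token counts and the network structure.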
2
Computational Approaches for Predicting Binding Partners, Interface Residues, and Binding Affinity of Protein-Protein Complexes. Methods Mol Biol 2017; 1484:237-253. [PMID: 27787830 DOI: 10.1007/978-1-4939-6406-2_16] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/20/2022]
Abstract
Studying protein-protein interactions leads to a better understanding of the underlying principles of several biological pathways. Cost and labor-intensive experimental techniques suggest the need for computational methods to complement them. Several such state-of-the-art methods have been reported for analyzing diverse aspects such as predicting binding partners, interface residues, and binding affinity for protein-protein complexes with reliable performance. However, there are specific drawbacks for different methods that indicate the need for their improvement. This review highlights various available computational algorithms for analyzing diverse aspects of protein-protein interactions and endorses the necessity for developing new robust methods for gaining deep insights about protein-protein interactions.
3
Wang J, Zuo Y, Man YG, Avital I, Stojadinovic A, Liu M, Yang X, Varghese RS, Tadesse MG, Ressom HW. Pathway and network approaches for identification of cancer signature markers from omics data. J Cancer 2015; 6:54-65. [PMID: 25553089 PMCID: PMC4278915 DOI: 10.7150/jca.10631] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 09/24/2014] [Accepted: 11/14/2014] [Indexed: 12/12/2022]
Abstract
The advancement of high-throughput omic technologies during the past few years has made it possible to perform many complex assays in a much shorter time than traditional approaches. The rapid accumulation and wide availability of omic data generated by these technologies offer great opportunities to unravel disease mechanisms, but also present significant challenges in extracting knowledge from such massive data and evaluating the findings. To address these challenges, a number of pathway- and network-based approaches have been introduced. This review article evaluates these methods and discusses their application in cancer biomarker discovery using hepatocellular carcinoma (HCC) as an example.
Affiliation(s)
- Jinlian Wang
  - 1. Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC, USA
  - 7. Genetics and Genomics Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Yiming Zuo
  - 1. Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC, USA
  - 6. Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, USA
- Yan-gao Man
  - 2. Bon Secours Cancer Institute, Richmond VA, USA
- Alexander Stojadinovic
  - 2. Bon Secours Cancer Institute, Richmond VA, USA
  - 3. Division of Surgical Oncology, Walter Reed National Military Medical Center, Bethesda, MD, USA
- Meng Liu
  - 4. Department of Public Health School of Hunter College, City University of New York, NYC, USA
- Xiaowei Yang
  - 4. Department of Public Health School of Hunter College, City University of New York, NYC, USA
- Rency S. Varghese
  - 1. Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC, USA
- Mahlet G Tadesse
  - 5. Department of Mathematics and Statistics, Georgetown University, Washington DC, USA
- Habtom W Ressom
  - 1. Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC, USA
4
Yugandhar K, Gromiha MM. Feature selection and classification of protein-protein complexes based on their binding affinities using machine learning approaches. Proteins 2014; 82:2088-96. [PMID: 24648146 DOI: 10.1002/prot.24564] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 01/13/2014] [Accepted: 03/14/2014] [Indexed: 12/16/2022]
Abstract
Protein-protein interactions are intrinsic to virtually every cellular process. Predicting the binding affinity of protein-protein complexes is one of the challenging problems in computational and molecular biology. In this work, we related sequence features of protein-protein complexes with their binding affinities using machine learning approaches. We set up a database of 185 protein-protein complexes for which the interacting pairs are heterodimers and their experimental binding affinities are available. We also developed a set of 610 features from the sequences of protein complexes and used the Ranker search method, which combines an attribute evaluator with the Ranker method, to select specific features. We analyzed several machine learning algorithms to discriminate protein-protein complexes into high- and low-affinity groups based on their Kd values. Our results showed a 10-fold cross-validation accuracy of 76.1% with the combination of nine features using support vector machines. Further, we observed an accuracy of 83.3% on an independent test set of 30 complexes. We suggest that our method would serve as an effective tool for identifying the interacting partners in protein-protein interaction networks and human-pathogen interactions based on the strength of interactions.
Affiliation(s)
- K Yugandhar
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, 600036, Tamil Nadu, India
5
Perez-Riverol Y, Wang R, Hermjakob H, Müller M, Vesada V, Vizcaíno JA. Open source libraries and frameworks for mass spectrometry based proteomics: a developer's perspective. Biochimica et Biophysica Acta 2014; 1844:63-76. [PMID: 23467006 PMCID: PMC3898926 DOI: 10.1016/j.bbapap.2013.02.032] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Received: 10/01/2012] [Revised: 02/05/2013] [Accepted: 02/22/2013] [Indexed: 12/23/2022]
Abstract
Data processing, management and visualization are central and critical components of a state-of-the-art high-throughput mass spectrometry (MS)-based proteomics experiment, and are often some of the most time-consuming steps, especially for labs without much bioinformatics support. The growing interest in the field of proteomics has triggered an increase in the development of new software libraries, including freely available and open-source software. From database search analysis to post-processing of the identification results, even though the objectives of these libraries and packages can vary significantly, they usually share a number of features. Common use cases include the handling of protein and peptide sequences, the parsing of results from the output files of various proteomics search engines, and the visualization of MS-related information (including mass spectra and chromatograms). In this review, we provide an overview of the existing software libraries and open-source frameworks, and describe some of the freely available applications that make use of them. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.
Affiliation(s)
- Yasset Perez-Riverol
  - EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
  - Department of Proteomics, Center for Genetic Engineering and Biotechnology, Ciudad de la Habana, Cuba
- Rui Wang
  - EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Henning Hermjakob
  - EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Markus Müller
  - Proteome Informatics Group, Swiss Institute of Bioinformatics, CMU - 1, rue Michel Servet CH-1211 Geneva, Switzerland
- Vladimir Vesada
  - Department of Proteomics, Center for Genetic Engineering and Biotechnology, Ciudad de la Habana, Cuba
- Juan Antonio Vizcaíno
  - EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
6
Wilmarth P, Short K, Fiehn O, Lutsenko S, David L, Burkhead JL. A systems approach implicates nuclear receptor targeting in the Atp7b(-/-) mouse model of Wilson's disease. Metallomics 2012; 4:660-8. [PMID: 22565294 PMCID: PMC3695828 DOI: 10.1039/c2mt20017a] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 01/20/2023]
Abstract
Wilson's disease (WD) is an inherited disorder of copper metabolism characterized by liver disease and/or neurologic and psychiatric pathology. The disease is a result of mutation in ATP7B, which encodes the ATP7B copper transporting ATPase. Loss of copper transport function by ATP7B results in copper accumulation primarily in the liver, but also in other organs including the brain. Studies in the Atp7b(-/-) mouse model of WD revealed specific transcript and metabolic changes that precede development of liver pathology, most notably downregulation of transcripts in the cholesterol biosynthetic pathway. In order to gain insight into the molecular mechanisms of transcriptomic and metabolic changes, we used a systems approach analysing the pre-symptomatic hepatic nuclear proteome and liver metabolites. We found that ligand-activated nuclear receptors FXR/NR1H4 and GR/NR3C1 and nuclear receptor interacting partners are less abundant in Atp7b(-/-) hepatocyte nuclei, while DNA repair machinery and the nucleus-localized glutathione peroxidase, SelH, are more abundant. Analysis of metabolites revealed an increase in polyol sugar alcohols, indicating a change in osmotic potential that precedes hepatocyte swelling observed later in disease. This work is the first application of quantitative Multidimensional Protein Identification Technology (MuDPIT) to a model of WD to investigate protein-level mechanisms of WD pathology. The systems approach using "shotgun" proteomics and metabolomics in the context of previous transcriptomic data reveals molecular-level mechanisms of WD development and facilitates targeted analysis of hepatocellular copper toxicity.
Affiliation(s)
- Phillip Wilmarth
  - Dept. Biochemistry and Molecular Biology, Oregon Health & Science University, 3181 SW Sam Jackson Park Rd., Portland, OR 97239
- Kristopher Short
  - Dept. Biological Sciences, University of Alaska Anchorage, 3211 Providence Dr., Anchorage, AK 99508. Fax: 01 907 7864607; Tel: 01 907 7864765
- Oliver Fiehn
  - University of California Davis Genome Center, Davis, California 95616
- Svetlana Lutsenko
  - Dept. Physiology, The Johns Hopkins University, Baltimore, MD, 21205
- Larry David
  - Dept. Biochemistry and Molecular Biology, Oregon Health & Science University, 3181 SW Sam Jackson Park Rd., Portland, OR 97239
- Jason L. Burkhead
  - Dept. Biological Sciences, University of Alaska Anchorage, 3211 Providence Dr., Anchorage, AK 99508. Fax: 01 907 7864607; Tel: 01 907 7864765
7
Brusniak MYK, Chu CS, Kusebauch U, Sartain MJ, Watts JD, Moritz RL. An assessment of current bioinformatic solutions for analyzing LC-MS data acquired by selected reaction monitoring technology. Proteomics 2012; 12:1176-84. [PMID: 22577019 PMCID: PMC3857306 DOI: 10.1002/pmic.201100571] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 11/01/2011] [Accepted: 01/10/2012] [Indexed: 12/18/2022]
Abstract
Selected reaction monitoring (SRM) is an accurate quantitative technique, typically used for small-molecule mass spectrometry (MS). SRM has emerged as an important technique for targeted and hypothesis-driven proteomic research, and is becoming the reference method for protein quantification in complex biological samples. SRM offers high selectivity, a lower limit of detection and improved reproducibility, compared to conventional shot-gun-based tandem MS (LC-MS/MS) methods. Unlike LC-MS/MS, which requires computationally intensive informatic postanalysis, SRM requires preacquisition bioinformatic analysis to determine proteotypic peptides and optimal transitions to uniquely identify and to accurately quantitate proteins of interest. Extensive arrays of bioinformatics software tools, both web-based and stand-alone, have been published to assist researchers to determine optimal peptides and transition sets. The transitions are oftentimes selected based on preferred precursor charge state, peptide molecular weight, hydrophobicity, fragmentation pattern at a given collision energy (CE), and instrumentation chosen. Validation of the selected transitions for each peptide is critical since peptide performance varies depending on the mass spectrometer used. In this review, we provide an overview of open source and commercial bioinformatic tools for analyzing LC-MS data acquired by SRM.
Affiliation(s)
- Caroline S. Chu
  - Institute for Systems Biology, 401 Terry Avenue N, Seattle, WA, 98109 USA
- Ulrike Kusebauch
  - Institute for Systems Biology, 401 Terry Avenue N, Seattle, WA, 98109 USA
- Mark J. Sartain
  - Institute for Systems Biology, 401 Terry Avenue N, Seattle, WA, 98109 USA
- Julian D. Watts
  - Institute for Systems Biology, 401 Terry Avenue N, Seattle, WA, 98109 USA
- Robert L. Moritz
  - Institute for Systems Biology, 401 Terry Avenue N, Seattle, WA, 98109 USA
8
Brusniak MYK, Kwok ST, Christiansen M, Campbell D, Reiter L, Picotti P, Kusebauch U, Ramos H, Deutsch EW, Chen J, Moritz RL, Aebersold R. ATAQS: A computational software tool for high throughput transition optimization and validation for selected reaction monitoring mass spectrometry. BMC Bioinformatics 2011; 12:78. [PMID: 21414234 PMCID: PMC3213215 DOI: 10.1186/1471-2105-12-78] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Received: 09/29/2010] [Accepted: 03/18/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND Since its inception, proteomics has essentially operated in a discovery mode with the goal of identifying and quantifying the maximal number of proteins in a sample. Increasingly, proteomic measurements are also supporting hypothesis-driven studies, in which a predetermined set of proteins is consistently detected and quantified in multiple samples. Selected reaction monitoring (SRM) is a targeted mass spectrometric technique that supports the detection and quantification of specific proteins in complex samples at high sensitivity and reproducibility. Here, we describe ATAQS, an integrated software platform that supports all stages of targeted, SRM-based proteomics experiments including target selection, transition optimization and post-acquisition data analysis. This software will significantly facilitate the use of targeted proteomic techniques and contribute to the generation of highly sensitive, reproducible and complete datasets that are particularly critical for the discovery and validation of targets in hypothesis-driven studies in systems biology. RESULT We introduce a new open source software pipeline, ATAQS (Automated and Targeted Analysis with Quantitative SRM), which consists of a number of modules that collectively support the SRM assay development workflow for targeted proteomic experiments (project management and generation of protein, peptide and transitions and the validation of peptide detection by SRM). ATAQS provides a flexible pipeline for end-users by allowing the workflow to start or end at any point of the pipeline, and for computational biologists, by enabling the easy extension of Java algorithm classes for their own algorithm plug-in or connection via an external web site. This integrated system supports all steps in an SRM-based experiment and provides a user-friendly GUI that can be run on any operating system that allows the installation of the Mozilla Firefox web browser.
CONCLUSIONS Targeted proteomics via SRM is a powerful new technique that enables the reproducible and accurate identification and quantification of sets of proteins of interest. ATAQS is the first open-source software that supports all steps of the targeted proteomics workflow. ATAQS also provides software API (Application Program Interface) documentation that enables the addition of new algorithms to each of the workflow steps. The software, installation guide and sample dataset can be found at http://tools.proteomecenter.org/ATAQS/ATAQS.html.