1
Palmer BJ, Almgren AS, Johnson CGM, Myers AT, Cannon WR. BMX: Biological modelling and interface exchange. Sci Rep 2023; 13:12235. [PMID: 37507417 PMCID: PMC10382537 DOI: 10.1038/s41598-023-39150-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/16/2023] [Accepted: 07/20/2023] [Indexed: 07/30/2023]
Abstract
High performance computing has great potential to provide a range of significant benefits for investigating biological systems. These systems often present large modelling problems with many coupled subsystems, such as when studying colonies of bacterial cells. The aim to understand cell colonies has generated substantial interest, as they can have strong economic and societal impacts through their roles in industrial bioreactors and in complex community structures, called biofilms, found in clinical settings. Investigating these communities through realistic models can rapidly exceed the capabilities of current serial software. Here, we introduce BMX, a software system developed for the high performance modelling of large cell communities by utilising GPU acceleration. BMX builds upon the AMReX adaptive mesh refinement package to efficiently model cell colony formation under realistic laboratory conditions. Using simple test scenarios with varying nutrient availability, we show that BMX is capable of correctly reproducing observed behavior of bacterial colonies on realistic time scales, demonstrating a potential application of high performance computing to colony modelling. The open source software is available from the Zenodo repository https://doi.org/10.5281/zenodo.8084270 under the BSD-2-Clause licence.
Affiliation(s)
- Bruce J Palmer
- Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Washington, USA
- Ann S Almgren
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Connah G M Johnson
- Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Washington, USA
- Andrew T Myers
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- William R Cannon
- Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Washington, USA
2
Cannon WR, Raff LM. The formulation of chemical potentials and free energy changes in biochemical reactions. Phys Chem Chem Phys 2021; 23:14783-14795. [PMID: 34196644 DOI: 10.1039/d1cp02045e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
In 1994, an IUBMB-IUPAC joint committee recommended a revised formulation for standard chemical potentials and reaction free energies motivated by the fact that, in biochemistry, the reactants and products often exist in multiple charge states depending on the pH and pMg of the solution environment. The recommendation involved both (1) the use of a mathematical transform with the intent to hold the pH constant, and (2) the formulation of reference chemical potentials of ionized isomeric species based on the log sum of the individual standard chemical potentials of each isomeric species. Recently, several reports, including a 2020 IUPAC report, have appeared that challenged the need for such summary formulations, arguing that the standard chemical potentials were sufficient with full accounting of each of the different charge state isomers involved in a biochemical reaction. This work critically evaluates both the use of thermodynamic transforms and the different chemical potential formulations. It is shown that (1) transforms are not necessary to hold the pH constant and (2) the two chemical potential formulations are not equivalent. Which formulation is appropriate depends on what species are measured experimentally or whether an assumption of equilibrium among the charge state isomers is reasonable and desirable.
Affiliation(s)
- William R Cannon
- Pacific Northwest National Laboratory, Richland, USA, and Interdisciplinary Center for Quantitative Modeling in Biology, University of California, Riverside, USA
- Lionel M Raff
- Department of Chemistry, Oklahoma State University, Stillwater, USA
3
Peng GCY, Alber M, Tepole AB, Cannon WR, De S, Dura-Bernal S, Garikipati K, Karniadakis G, Lytton WW, Perdikaris P, Petzold L, Kuhl E. Multiscale modeling meets machine learning: What can we learn? Arch Comput Methods Eng 2021; 28:1017-1037. [PMID: 34093005 PMCID: PMC8172124 DOI: 10.1007/s11831-020-09405-5] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Received: 12/30/2019] [Accepted: 02/09/2020] [Indexed: 05/10/2023]
Abstract
Machine learning is increasingly recognized as a promising technology in the biological, biomedical, and behavioral sciences. There can be no argument that this technique is incredibly successful in image recognition with immediate applications in diagnostics including electrophysiology, radiology, or pathology, where we have access to massive amounts of annotated data. However, machine learning often performs poorly in prognosis, especially when dealing with sparse data. This is a field where classical physics-based simulation seems to remain irreplaceable. In this review, we identify areas in the biomedical sciences where machine learning and multiscale modeling can mutually benefit from one another: Machine learning can integrate physics-based knowledge in the form of governing equations, boundary conditions, or constraints to manage ill-posed problems and robustly handle sparse and noisy data; multiscale modeling can integrate machine learning to create surrogate models, identify system dynamics and parameters, analyze sensitivities, and quantify uncertainty to bridge the scales and understand the emergence of function. With a view towards applications in the life sciences, we discuss the state of the art of combining machine learning and multiscale modeling, identify applications and opportunities, raise open questions, and address potential challenges and limitations. We anticipate that it will stimulate discussion within the community of computational mechanics and reach out to other disciplines including mathematics, statistics, computer science, artificial intelligence, biomedicine, systems biology, and precision medicine to join forces towards creating robust and efficient models for biological systems.
Affiliation(s)
- Mark Alber
- University of California, Riverside, USA
- William R Cannon
- Pacific Northwest National Laboratory, Richland, Washington, USA
- Suvranu De
- Rensselaer Polytechnic Institute, Troy, New York, USA
- Linda Petzold
- University of California, Santa Barbara, California, USA
- Ellen Kuhl
- Stanford University, Stanford, California, USA
4
Britton J, Ramezani A, Pelletier D, Alber M, Cannon WR. A Multiscale Model of Fungal Impact on Chemotactic Behavior of Mycorrhizal Helper Bacteria. Biophys J 2021. [DOI: 10.1016/j.bpj.2020.11.639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
5
Britton S, Alber M, Cannon WR. Enzyme activities predicted by metabolite concentrations and solvent capacity in the cell. J R Soc Interface 2020; 17:20200656. [PMID: 33050777 PMCID: PMC7653389 DOI: 10.1098/rsif.2020.0656] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/13/2020] [Accepted: 09/17/2020] [Indexed: 12/23/2022]
Abstract
Obtaining experimental measurements or computational model predictions of the post-translational regulation of enzymes needed in a metabolic pathway is a difficult problem. Consequently, regulation is mostly known only for well-studied reactions of central metabolism in various model organisms. In this study, we use two approaches to predict enzyme regulation policies and investigate the hypothesis that regulation is driven by the need to maintain the solvent capacity in the cell. The first predictive method uses a statistical thermodynamics and metabolic control theory framework while the second method is performed using a hybrid optimization-reinforcement learning approach. Efficient regulation schemes were learned from experimental data that either agree with theoretical calculations or result in a higher cell fitness using maximum useful work as a metric. As previously hypothesized, regulation is herein shown to control the concentrations of both immediate and downstream products at physiological levels. Model predictions provide the following two novel general principles: (1) the regulation itself causes the reactions to be much further from equilibrium instead of the common assumption that highly non-equilibrium reactions are the targets for regulation; and (2) the minimal regulation needed to maintain metabolite levels at physiological concentrations maximizes the free energy dissipation rate instead of preserving a specific energy charge. The resulting energy dissipation rate is an emergent property of regulation which may be represented by a high value of the adenylate energy charge. In addition, the predictions demonstrate that the amount of regulation needed can be minimized if it is applied at the beginning or branch point of a pathway, in agreement with common notions. The approach is demonstrated for three pathways in the central metabolism of E. coli (gluconeogenesis, glycolysis-tricarboxylic acid (TCA) and pentose phosphate-TCA) that each require different regulation schemes. It is shown quantitatively that hexokinase, glucose 6-phosphate dehydrogenase and glyceraldehyde phosphate dehydrogenase, all branch points of pathways, play the largest roles in regulating central metabolism.
Affiliation(s)
- Samuel Britton
- Department of Mathematics, University of California Riverside, Riverside, CA 92505, USA
- Center for Quantitative Modeling in Biology, University of California Riverside, Riverside, CA 92505, USA
- Mark Alber
- Department of Mathematics, University of California Riverside, Riverside, CA 92505, USA
- Center for Quantitative Modeling in Biology, University of California Riverside, Riverside, CA 92505, USA
- William R. Cannon
- Department of Mathematics, University of California Riverside, Riverside, CA 92505, USA
- Center for Quantitative Modeling in Biology, University of California Riverside, Riverside, CA 92505, USA
- Physical and Computational Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
6
North JA, Narrowe AB, Xiong W, Byerly KM, Zhao G, Young SJ, Murali S, Wildenthal JA, Cannon WR, Wrighton KC, Hettich RL, Tabita FR. A nitrogenase-like enzyme system catalyzes methionine, ethylene, and methane biogenesis. Science 2020; 369:1094-1098. [PMID: 32855335 DOI: 10.1126/science.abb6310] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 03/07/2020] [Accepted: 06/12/2020] [Indexed: 12/16/2022]
Abstract
Bacterial production of gaseous hydrocarbons such as ethylene and methane affects soil environments and atmospheric climate. We demonstrate that biogenic methane and ethylene from terrestrial and freshwater bacteria are directly produced by a previously unknown methionine biosynthesis pathway. This pathway, present in numerous species, uses a nitrogenase-like reductase that is distinct from known nitrogenases and nitrogenase-like reductases and specifically functions in C-S bond breakage to reduce ubiquitous and appreciable volatile organic sulfur compounds such as dimethyl sulfide and (2-methylthio)ethanol. Liberated methanethiol serves as the immediate precursor to methionine, while ethylene or methane is released into the environment. Anaerobic ethylene production by this pathway apparently explains the long-standing observation of ethylene accumulation in oxygen-depleted soils. Methane production reveals an additional bacterial pathway distinct from archaeal methanogenesis.
Affiliation(s)
- Justin A North
- Department of Microbiology, The Ohio State University, Columbus, OH 43210, USA
- Adrienne B Narrowe
- Department of Soil and Crop Sciences, Colorado State University, Fort Collins, CO 80523, USA
- Weili Xiong
- Chemical Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
- Kathryn M Byerly
- Department of Microbiology, The Ohio State University, Columbus, OH 43210, USA
- Guanqi Zhao
- Department of Microbiology, The Ohio State University, Columbus, OH 43210, USA
- Sarah J Young
- Department of Microbiology, The Ohio State University, Columbus, OH 43210, USA
- Srividya Murali
- Department of Microbiology, The Ohio State University, Columbus, OH 43210, USA
- John A Wildenthal
- Department of Microbiology, The Ohio State University, Columbus, OH 43210, USA
- William R Cannon
- Pacific Northwest National Laboratory, Richland, WA 99352, USA; Department of Mathematics, University of California, Riverside, Riverside, CA 92507, USA
- Kelly C Wrighton
- Department of Soil and Crop Sciences, Colorado State University, Fort Collins, CO 80523, USA
- Robert L Hettich
- Chemical Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
- F Robert Tabita
- Department of Microbiology, The Ohio State University, Columbus, OH 43210, USA
7
Cannon WR, Britton SR, Alber M. Learning Regulation and Optimal Control of Enzyme Activities. Biophys J 2020. [DOI: 10.1016/j.bpj.2019.11.864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
8
Alber M, Buganza Tepole A, Cannon WR, De S, Dura-Bernal S, Garikipati K, Karniadakis G, Lytton WW, Perdikaris P, Petzold L, Kuhl E. Integrating machine learning and multiscale modeling-perspectives, challenges, and opportunities in the biological, biomedical, and behavioral sciences. NPJ Digit Med 2019; 2:115. [PMID: 31799423 PMCID: PMC6877584 DOI: 10.1038/s41746-019-0193-y] [Citation(s) in RCA: 148] [Impact Index Per Article: 29.6] [Received: 09/10/2019] [Accepted: 11/01/2019] [Indexed: 12/12/2022]
Abstract
Fueled by breakthrough technology developments, the biological, biomedical, and behavioral sciences are now collecting more data than ever before. There is a critical need for time- and cost-efficient strategies to analyze and interpret these data to advance human health. The recent rise of machine learning as a powerful technique to integrate multimodality, multifidelity data, and reveal correlations between intertwined phenomena presents a special opportunity in this regard. However, machine learning alone ignores the fundamental laws of physics and can result in ill-posed problems or non-physical solutions. Multiscale modeling is a successful strategy to integrate multiscale, multiphysics data and uncover mechanisms that explain the emergence of function. However, multiscale modeling alone often fails to efficiently combine large datasets from different sources and different levels of resolution. Here we demonstrate that machine learning and multiscale modeling can naturally complement each other to create robust predictive models that integrate the underlying physics to manage ill-posed problems and explore massive design spaces. We review the current literature, highlight applications and opportunities, address open questions, and discuss potential challenges and limitations in four overarching topical areas: ordinary differential equations, partial differential equations, data-driven approaches, and theory-driven approaches. Towards these goals, we leverage expertise in applied mathematics, computer science, computational biology, biophysics, biomechanics, engineering mechanics, experimentation, and medicine. Our multidisciplinary perspective suggests that integrating machine learning and multiscale modeling can provide new insights into disease mechanisms, help identify new targets and treatment strategies, and inform decision making for the benefit of human health.
Affiliation(s)
- Mark Alber
- Department of Mathematics, University of California, Riverside, CA, USA
- William R. Cannon
- Computational Biology Group, Pacific Northwest National Laboratory, Richland, WA, USA
- Suvranu De
- Department of Mechanical, Aerospace and Nuclear Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
- Krishna Garikipati
- Departments of Mechanical Engineering and Mathematics, University of Michigan, Ann Arbor, MI, USA
- William W. Lytton
- SUNY Downstate Medical Center and Kings County Hospital, Brooklyn, NY, USA
- Paris Perdikaris
- Department of Mechanical Engineering, University of Pennsylvania, Philadelphia, PA, USA
- Linda Petzold
- Department of Computer Science and Mechanical Engineering, University of California, Santa Barbara, CA, USA
- Ellen Kuhl
- Departments of Mechanical Engineering and Bioengineering, Stanford University, Stanford, CA, USA
9
Cannon WR, Zucker JD, Baxter DJ, Kumar N, Baker SE, Hurley JM, Dunlap JC. Prediction of Metabolite Concentrations, Rate Constants and Post-Translational Regulation Using Maximum Entropy-Based Simulations with Application to Central Metabolism of Neurospora crassa. Processes (Basel) 2018; 6. [PMID: 33824861 PMCID: PMC8020867 DOI: 10.3390/pr6060063] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 12/16/2022]
Abstract
We report the application of a recently proposed approach for modeling biological systems using a maximum entropy production rate principle in lieu of having in vivo rate constants. The method is applied in four steps: (1) a new ordinary differential equation (ODE) based optimization approach based on Marcelin’s 1910 mass action equation is used to obtain the maximum entropy distribution; (2) the predicted metabolite concentrations are compared to those generally expected from experiments using a loss function from which post-translational regulation of enzymes is inferred; (3) the system is re-optimized with the inferred regulation from which rate constants are determined from the metabolite concentrations and reaction fluxes; and finally (4) a full ODE-based, mass action simulation with rate parameters and allosteric regulation is obtained. From the last step, the power characteristics and resistance of each reaction can be determined. The method is applied to the central metabolism of Neurospora crassa, and the flow of material through the three competing pathways of upper glycolysis, the non-oxidative pentose phosphate pathway, and the oxidative pentose phosphate pathway is evaluated as a function of the NADP/NADPH ratio. It is predicted that regulation of phosphofructokinase (PFK) and flow through the pentose phosphate pathway are essential for preventing an extreme level of fructose 1,6-bisphosphate accumulation. Such an extreme level of fructose 1,6-bisphosphate would otherwise result in a glassy cytoplasm with limited diffusion, dramatically decreasing the entropy and energy production rate and, consequently, biological competitiveness.
Affiliation(s)
- William R. Cannon
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Correspondence: Tel.: +1-509-375-6732
- Jeremy D. Zucker
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Douglas J. Baxter
- Research Computing Group, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Neeraj Kumar
- Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Scott E. Baker
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Jennifer M. Hurley
- Department of Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- Jay C. Dunlap
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH 03755, USA
10
Abstract
Comprehensive and predictive simulation of coupled reaction networks has long been a goal of biology and other fields. Currently, metabolic network models that utilize enzyme mass action kinetics have predictive power but are limited in scope and application by the fact that the determination of enzyme rate constants is laborious and low throughput. We present a statistical thermodynamic formulation of the law of mass action for coupled reactions at both steady states and non-stationary states. The formulation uses chemical potentials instead of rate constants. When used to model deterministic systems, the method corresponds to a rescaling of the time dependent reactions in such a way that steady states can be reached on the same time scale but with significantly fewer computational steps. The relationships between reaction affinities, free energy changes and generalized detailed balance are central to the discussion. The significance for applications in systems biology is discussed, as is the concept and assumption of maximum entropy production rate as a biological principle that links thermodynamics to natural selection.
Affiliation(s)
- William R Cannon
- Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99352, United States of America
11
Clancy CE, An G, Cannon WR, Liu Y, May EE, Ortoleva P, Popel AS, Sluka JP, Su J, Vicini P, Zhou X, Eckmann DM. Multiscale Modeling in the Clinic: Drug Design and Development. Ann Biomed Eng 2016; 44:2591-610. [PMID: 26885640 PMCID: PMC4983472 DOI: 10.1007/s10439-016-1563-0] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 11/02/2015] [Accepted: 02/02/2016] [Indexed: 01/30/2023]
Abstract
A wide range of length and time scales are relevant to pharmacology, especially in drug development, drug design and drug delivery. Therefore, multiscale computational modeling and simulation methods and paradigms that advance the linkage of phenomena occurring at these multiple scales have become increasingly important. Multiscale approaches present in silico opportunities to advance laboratory research to bedside clinical applications in pharmaceuticals research. This is achievable through the capability of modeling to reveal phenomena occurring across multiple spatial and temporal scales, which are not otherwise readily accessible to experimentation. The resultant models, when validated, are capable of making testable predictions to guide drug design and delivery. In this review we describe the goals, methods, and opportunities of multiscale modeling in drug design and development. We demonstrate the impact of multiple scales of modeling in this field. We indicate the common mathematical and computational techniques employed for multiscale modeling approaches used in pharmacometric and systems pharmacology models in drug development and present several examples illustrating the current state-of-the-art models for (1) excitable systems and applications in cardiac disease; (2) stem cell driven complex biosystems; (3) nanoparticle delivery, with applications to angiogenesis and cancer therapy; (4) host-pathogen interactions and their use in metabolic disorders, inflammation and sepsis; and (5) computer-aided design of nanomedical systems. We conclude with a focus on barriers to successful clinical translation of drug development, drug design and drug delivery multiscale models.
Affiliation(s)
- Colleen E Clancy
- Department of Pharmacology, University of California, Davis, CA, USA
- Gary An
- Department of Surgery, University of Chicago, Chicago, IL, USA
- William R Cannon
- Computational Biology Group, Pacific Northwest National Laboratory, Richland, WA, USA
- Yaling Liu
- Department of Mechanical Engineering and Mechanics, Bioengineering Program, Lehigh University, Bethlehem, PA, USA
- Elebeoba E May
- Department of Biomedical Engineering, University of Houston, Houston, TX, USA
- Peter Ortoleva
- Department of Chemistry, Indiana University, Bloomington, IN, USA
- Aleksander S Popel
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- James P Sluka
- Biocomplexity Institute, Indiana University, Bloomington, IN, USA
- Jing Su
- Department of Radiology, Wake Forest University, Winston-Salem, NC, USA
- Paolo Vicini
- Clinical Pharmacology and DMPK, MedImmune, Cambridge, UK
- Xiaobo Zhou
- Department of Radiology, Wake Forest University, Winston-Salem, NC, USA
- David M Eckmann
- Department of Anesthesiology and Critical Care, University of Pennsylvania, Philadelphia, PA, USA
12
Thomas DG, Jaramillo-Riveri S, Baxter DJ, Cannon WR. Comparison of Optimal Thermodynamic Models of the Tricarboxylic Acid Cycle from Heterotrophs, Cyanobacteria, and Green Sulfur Bacteria. J Phys Chem B 2014; 118:14745-60. [PMID: 25495377 DOI: 10.1021/jp5075913] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
Abstract
We have applied a new stochastic simulation approach to predict the metabolite levels, material flux, and thermodynamic profiles of the oxidative TCA cycles found in E. coli and Synechococcus sp. PCC 7002, and in the reductive TCA cycle typical of chemolithoautotrophs and phototrophic green sulfur bacteria such as Chlorobaculum tepidum. The simulation approach is based on modeling states using statistical thermodynamics and employs an assumption similar to that used in transition state theory. The ability to evaluate the thermodynamics of metabolic pathways allows one to understand the relationship between coupling of energy and material gradients in the environment and the self-organization of stable biological systems, and it is shown that each cycle operates in the direction expected due to its environmental niche. The simulations predict changes in metabolite levels and flux in response to changes in cofactor concentrations that would be hard to predict without an elaborate model based on the law of mass action. In fact, we show that a thermodynamically unfavorable reaction can still have flux in the forward direction when it is part of a reaction network. The ability to predict metabolite levels, energy flow, and material flux should be significant for understanding the dynamics of natural systems and for understanding principles for engineering organisms for production of specialty chemicals.
Affiliation(s)
- Dennis G Thomas
- Knowledge Discovery and Informatics Group, National Security Directorate, ‡Computational Biology and Bioinformatics Group, Fundamental and Computational Sciences Directorate, and §Molecular Sciences Computing Division, Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Sebastian Jaramillo-Riveri
- Knowledge Discovery and Informatics Group, National Security Directorate, ‡Computational Biology and Bioinformatics Group, Fundamental and Computational Sciences Directorate, and §Molecular Sciences Computing Division, Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Douglas J Baxter
- Knowledge Discovery and Informatics Group, National Security Directorate, ‡Computational Biology and Bioinformatics Group, Fundamental and Computational Sciences Directorate, and §Molecular Sciences Computing Division, Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- William R Cannon
- Knowledge Discovery and Informatics Group, National Security Directorate, ‡Computational Biology and Bioinformatics Group, Fundamental and Computational Sciences Directorate, and §Molecular Sciences Computing Division, Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
13
Abstract
The modeling of the chemical reactions involved in metabolism is a daunting task. Ideally, the modeling of metabolism would use kinetic simulations, but these simulations require knowledge of the thousands of rate constants involved in the reactions. The measurement of rate constants is very labor intensive, and hence rate constants for most enzymatic reactions are not available. Consequently, constraint-based flux modeling has been the method of choice because it does not require the use of the rate constants of the law of mass action. However, this convenience also limits the predictive power of constraint-based approaches in that the law of mass action is used only as a constraint, making it difficult to predict metabolite levels or energy requirements of pathways. An alternative to both of these approaches is to model metabolism using simulations of states rather than simulations of reactions, in which the state is defined as the set of all metabolite counts or concentrations. While kinetic simulations model reactions based on the likelihood of the reaction derived from the law of mass action, states are modeled based on likelihood ratios of mass action. Both approaches provide information on the energy requirements of metabolic reactions and pathways. However, modeling states rather than reactions has the advantage that the parameters needed to model states (chemical potentials) are much easier to determine than the parameters needed to model reactions (rate constants). Herein, we discuss recent results, assumptions, and issues in using simulations of state to model metabolism.
Affiliation(s)
- William R Cannon
- Computational Biology and Bioinformatics Group, Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
14
Abstract
New methods are needed for large scale modeling of metabolism that predict metabolite levels and characterize the thermodynamics of individual reactions and pathways. Current approaches use either kinetic simulations, which are difficult to extend to large networks of reactions because of the need for rate constants, or flux-based methods, which have a large number of feasible solutions because they are unconstrained by the law of mass action. This report presents an alternative modeling approach based on statistical thermodynamics. The principles of this approach are demonstrated using a simple set of coupled reactions, and then the system is characterized with respect to the changes in energy, entropy, free energy, and entropy production. Finally, the physical and biochemical insights that this approach can provide for metabolism are demonstrated by application to the tricarboxylic acid (TCA) cycle of Escherichia coli. The reaction and pathway thermodynamics are evaluated and predictions are made regarding changes in concentration of TCA cycle intermediates due to 10- and 100-fold changes in the ratio of NAD+:NADH concentrations. Finally, the assumptions and caveats regarding the use of statistical thermodynamics to model non-equilibrium reactions are discussed.
Affiliation(s)
- William R. Cannon
- Computational Biology and Bioinformatics Group, Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, United States of America
|
15
|
Peterson ES, McCue LA, Schrimpe-Rutledge AC, Jensen JL, Walker H, Kobold MA, Webb SR, Payne SH, Ansong C, Adkins JN, Cannon WR, Webb-Robertson BJM. VESPA: software to facilitate genomic annotation of prokaryotic organisms through integration of proteomic and transcriptomic data. BMC Genomics 2012; 13:131. [PMID: 22480257 PMCID: PMC3364912 DOI: 10.1186/1471-2164-13-131] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Received: 12/27/2011] [Accepted: 04/05/2012] [Indexed: 11/10/2022] Open
Abstract
Background The procedural aspects of genome sequencing and assembly have become relatively inexpensive, yet the full, accurate structural annotation of these genomes remains a challenge. Next-generation sequencing transcriptomics (RNA-Seq), global microarrays, and tandem mass spectrometry (MS/MS)-based proteomics have demonstrated immense value to genome curators as individual sources of information; however, integrating these data types to validate and improve structural annotation remains a major challenge. Current visual and statistical analytic tools are focused on a single data type, or existing software tools are retrofitted to analyze new data forms. We present Visual Exploration and Statistics to Promote Annotation (VESPA), a new interactive visual analysis software tool focused on assisting scientists with the annotation of prokaryotic genomes through the integration of proteomics and transcriptomics data with current genome location coordinates. Results VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics (probe or RNA-Seq) data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data is interrogated via searches linked to the genome visualizations to find regions with high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA, or potential coding regions can be analyzed concurrently with the software through interaction with BLAST. VESPA is applied to two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) to demonstrate the rapid manner in which mis-annotations can be found and explored in VESPA using either proteomics data alone, or in combination with transcriptomic data.
Conclusions VESPA is an interactive visual analytics tool that integrates high-throughput data into a genomic context to facilitate the discovery of structural mis-annotations in prokaryotic genomes. Data is evaluated via visual analysis across multiple levels of genomic resolution, linked searches and interaction with existing bioinformatics tools. We highlight the novel functionality of VESPA and core programming requirements for visualization of these large heterogeneous datasets for a client-side application. The software is freely available at https://www.biopilot.org/docs/Software/Vespa.php.
Affiliation(s)
- Elena S Peterson
- Scientific Data Management, Pacific Northwest National Laboratory, Richland, WA, USA
|
16
|
Hugo A, Baxter DJ, Cannon WR, Kalyanaraman A, Kulkarni G, Callister SJ. Proteotyping of microbial communities by optimization of tandem mass spectrometry data interpretation. Pac Symp Biocomput 2012:225-234. [PMID: 22174278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
We report the development of a novel high performance computing method for the identification of proteins from unknown (environmental) samples. The method uses computational optimization to provide an effective way to control the false discovery rate for environmental samples and complements de novo peptide sequencing. Furthermore, the method provides information based on the expressed protein in a microbial community, and thus complements DNA-based identification methods. Testing on blind samples demonstrates that the method provides 79-95% overlap with analogous results from searches involving only the correct genomes. We provide scaling and performance evaluations for the software that demonstrate the ability to carry out large-scale optimizations on 1258 genomes containing 4.2M proteins.
Affiliation(s)
- Alys Hugo
- Computational Biology and Bioinformatics Group, Pacific Northwest National Laboratory, Richland, WA 99352, USA
|
17
|
Kalyanaraman A, Cannon WR, Latt B, Baxter DJ. MapReduce implementation of a hybrid spectral library-database search method for large-scale peptide identification. Bioinformatics 2011; 27:3072-3. [PMID: 21926122 PMCID: PMC3198583 DOI: 10.1093/bioinformatics/btr523] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022]
Abstract
SUMMARY A MapReduce-based implementation called MR-MSPolygraph for parallelizing peptide identification from mass spectrometry data is presented. The underlying serial method, MSPolygraph, uses a novel hybrid approach to match an experimental spectrum against a combination of a protein sequence database and a spectral library. Our MapReduce implementation can run on any Hadoop cluster environment. Experimental results demonstrate that, relative to the serial version, MR-MSPolygraph reduces the time to solution from weeks to hours, for processing tens of thousands of experimental spectra. Speedup and other related performance studies are also reported on a 400-core Hadoop cluster using spectral datasets from environmental microbial communities as inputs. AVAILABILITY The source code along with user documentation are available on http://compbio.eecs.wsu.edu/MR-MSPolygraph. CONTACT ananth@eecs.wsu.edu; william.cannon@pnnl.gov. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ananth Kalyanaraman
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164-2752, USA.
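The map and reduce roles described in the abstract above can be sketched in a few lines of plain Python. This is a toy illustration only: it runs in a single process without Hadoop, and the shared-peak count used as a score is a hypothetical stand-in, not the MSPolygraph scoring function. The peptide names, peak lists, and tolerance are invented for the example.

```python
from collections import defaultdict

def shared_peaks(observed, model, tol=0.5):
    """Count model peaks that have an observed peak within tol (toy score)."""
    return sum(any(abs(o - m) <= tol for o in observed) for m in model)

def map_phase(spectrum_id, peaks, candidates):
    """Map step: emit (spectrum_id, (score, peptide)) for every candidate."""
    for peptide, model_peaks in candidates.items():
        yield spectrum_id, (shared_peaks(peaks, model_peaks), peptide)

def reduce_phase(pairs):
    """Reduce step: group by spectrum and keep the best-scoring candidate."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: max(values) for key, values in grouped.items()}

# Hypothetical candidate model spectra and experimental spectra (m/z values).
candidates = {"PEPA": [100.0, 200.0, 300.0], "PEPB": [100.0, 250.0, 400.0]}
spectra = {"s1": [100.1, 199.8, 300.2], "s2": [99.9, 250.3, 401.0]}

pairs = [kv for sid, peaks in spectra.items()
         for kv in map_phase(sid, peaks, candidates)]
best = reduce_phase(pairs)
print(best)
```

Because each spectrum is scored independently in the map step, the work partitions naturally across nodes, which is consistent with the reduction in time to solution that the abstract reports for the real implementation.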
|
18
|
Deutsch SI, Burket JA, Cannon WR, Jacome LF. Selective mGluR5 antagonism attenuates the stress-induced reduction of MK-801's antiseizure potency in the genetically inbred Balb/c mouse. Epilepsy Behav 2011; 21:352-5. [PMID: 21683659 DOI: 10.1016/j.yebeh.2011.03.026] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 02/09/2011] [Revised: 03/15/2011] [Accepted: 03/19/2011] [Indexed: 11/27/2022]
Abstract
The ability of MK-801 (dizocilpine), a noncompetitive N-methyl-D-aspartate (NMDA) antagonist, to antagonize electrical seizures is reduced in stressed mice. Stress-associated alterations in seizure susceptibility and diminished efficacy of antiseizure medications in humans have been reported [Joëls, 2009; Haut et al., 2007; Moshe et al., 2008]; thus, these experimental observations implicate altered endogenous tone of NMDA receptor-mediated neurotransmission in clinically adverse effects of stress on seizure proneness and treatment. The current exploratory experiment examined the effect of 2-methyl-6-(phenylethynyl)-pyridine (MPEP), an antagonist of mGluR5, administered prior to stress on the stress-induced reduction of MK-801's antiseizure effect in Swiss-Webster and Balb/c mice; the Balb/c mouse is behaviorally hypersensitive to MK-801. Interestingly, the data suggest that MPEP can attenuate the severity of the stress-induced reduction of MK-801's antiseizure effect in the Balb/c strain. Thus, mGluR5 could serve as a target for strategies for adjuvant treatment of seizures exacerbated by stress.
Affiliation(s)
- Stephen I Deutsch
- Department of Psychiatry and Behavioral Sciences, Eastern Virginia Medical School, Norfolk, VA 23507–1912, USA.
|
19
|
Cannon WR, Rawlins MM, Baxter DJ, Callister SJ, Lipton MS, Bryant DA. Large improvements in MS/MS-based peptide identification rates using a hybrid analysis. J Proteome Res 2011; 10:2306-17. [PMID: 21391700 DOI: 10.1021/pr101130b] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 01/16/2023]
Abstract
We report a hybrid search method combining database and spectral library searches that allows for a straightforward approach to characterizing the error rates from the combined data. Using these methods, we demonstrate significantly increased sensitivity and specificity in matching peptides to tandem mass spectra. The hybrid search method increased the number of spectra that can be assigned to a peptide in a global proteomics study by 57-147% at an estimated false discovery rate of 5%, with clear room for even greater improvements. The approach combines the general utility of using consensus model spectra typical of database search methods with the accuracy of the intensity information contained in spectral libraries. A common scoring metric based on recent developments linking data analysis and statistical thermodynamics is used, which allows the use of a conservative estimate of error rates for the combined data. We applied this approach to proteomics analysis of Synechococcus sp. PCC 7002, a cyanobacterium that is a model organism for studies of photosynthetic carbon fixation and biofuels development. The increased specificity and sensitivity of this approach allowed us to identify many more peptides involved in the processes important for photoautotrophic growth.
Affiliation(s)
- William R Cannon
- Computational Biology and Bioinformatics Group, Pacific Northwest National Laboratory, Richland, Washington 99352, United States.
|
20
|
Jacome LF, Burket JA, Herndon AL, Cannon WR, Deutsch SI. D-serine improves dimensions of the sociability deficit of the genetically-inbred Balb/c mouse strain. Brain Res Bull 2010; 84:12-6. [PMID: 21056638 DOI: 10.1016/j.brainresbull.2010.10.010] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Received: 08/12/2010] [Revised: 10/26/2010] [Accepted: 10/28/2010] [Indexed: 11/26/2022]
Abstract
The Balb/c mouse strain shows quantitative deficits of sociability and is behaviorally-hypersensitive to MK-801 (dizocilpine), a noncompetitive NMDA receptor antagonist. D-Serine (560mg/kg, intraperitoneally), a full agonist for the obligatory glycine co-agonist binding site on the NMDA receptor, increased the amount of time Balb/c mice spend in a compartment containing the enclosed social stimulus mouse and the amount of time Balb/c mice spend exploring (sniffing) an inverted cup containing the enclosed social stimulus mouse in a standard sociability apparatus. These effects of D-serine on the impaired sociability of the Balb/c mouse strain were not due to a "nonspecific" effect on locomotor activity; importantly, the locomotor activity of the Balb/c mouse strain decreases in the presence of an enclosed or freely-moving social stimulus mouse. The data suggest that dimensions of the impaired sociability of the Balb/c mouse strain may be improved by targeted NMDA receptor agonist interventions.
Affiliation(s)
- Luis F Jacome
- Department of Psychiatry and Behavioral Sciences, Eastern Virginia Medical School, Norfolk, VA 23507-1912, United States
|
21
|
Deutsch SI, Burket JA, Jacome LF, Cannon WR, Herndon AL. D-Cycloserine improves the impaired sociability of the Balb/c mouse. Brain Res Bull 2010; 84:8-11. [PMID: 20970484 DOI: 10.1016/j.brainresbull.2010.10.006] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 10/12/2010] [Accepted: 10/13/2010] [Indexed: 11/30/2022]
Abstract
The genetically inbred Balb/c mouse strain shows evidence of impaired sociability in a standard paradigm. For example, relative to 8-week-old male outbred Swiss-Webster mice, 8 week-old male Balb/c mice spend less time sniffing and in the vicinity of an enclosed 4 week-old male ICR stimulus mouse and, when allowed to interact freely with the stimulus mouse for five minutes, make fewer discrete episodes of social approach and show suppression of locomotor activity. We explored the effect of D-cycloserine (320mg/kg, intraperitoneally), a partial glycine agonist that binds to the obligatory co-agonist glycine binding site on the NMDA receptor, on the sociability of the Balb/c and Swiss-Webster mouse strains in a standard paradigm. The results show that treatment with D-cycloserine increased the locomotor activity of the Balb/c mouse strain in the presence of an enclosed social stimulus mouse and when these mice were allowed to interact freely with each other. Also, D-cycloserine increased the number of discrete episodes of social approach when Balb/c mice were allowed to interact freely with social stimulus mice. However, D-cycloserine had similar effects on measures of sociability in the Swiss-Webster mouse, raising the possibility that the positive effects on the sociability of the Balb/c mouse strain may be mediated by indirect effects on locomotion, arousal, and anxiety.
Affiliation(s)
- Stephen I Deutsch
- Department of Psychiatry and Behavioral Sciences, Eastern Virginia Medical School, Norfolk, VA 23507-1912, United States.
|
22
|
Webb-Robertson BJM, Cannon WR, Oehmen CS, Shah AR, Gurumoorthi V, Lipton MS, Waters KM. A support vector machine model for the prediction of proteotypic peptides for accurate mass and time proteomics. Bioinformatics 2010; 26:1677-83. [PMID: 20568665 DOI: 10.1093/bioinformatics/btq251] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION The standard approach to identifying peptides based on accurate mass and elution time (AMT) compares profiles obtained from a high resolution mass spectrometer to a database of peptides previously identified from tandem mass spectrometry (MS/MS) studies. It would be advantageous, with respect to both accuracy and cost, to only search for those peptides that are detectable by MS (proteotypic). RESULTS We present a support vector machine (SVM) model that uses a simple descriptor space based on 35 properties of amino acid content, charge, hydrophilicity and polarity for the quantitative prediction of proteotypic peptides. Using three independently derived AMT databases (Shewanella oneidensis, Salmonella typhimurium, Yersinia pestis) for training and validation within and across species, the SVM resulted in an average accuracy measure of approximately 0.83 with an SD of <0.038. Furthermore, we demonstrate that these results are achievable with a small set of 13 variables and can achieve high proteome coverage. AVAILABILITY http://omics.pnl.gov/software/STEPP.php CONTACT bj@pnl.gov SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
|
23
|
Taylor RC, Singhal M, Daly DS, Gilmore J, Cannon WR, Domico K, White AM, Auberry DL, Auberry KJ, Hooker BS, Hurst G, McDermott JE, McDonald WH, Pelletier DA, Schmoyer D, Wiley HS. An analysis pipeline for the inference of protein-protein interaction networks. INT J DATA MIN BIOIN 2010; 3:409-30. [PMID: 20052905 DOI: 10.1504/ijdmb.2009.029204] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022]
Abstract
We present a platform for the reconstruction of protein-protein interaction networks inferred from Mass Spectrometry (MS) bait-prey data. The Software Environment for Biological Network Inference (SEBINI), an environment for the deployment of network inference algorithms that use high-throughput data, forms the platform core. Among the many algorithms available in SEBINI is the Bayesian Estimator of Probabilities of Protein-Protein Associations (BEPro3) algorithm, which is used to infer interaction networks from such MS affinity isolation data. Also, the pipeline incorporates the Collective Analysis of Biological Interaction Networks (CABIN) software. We have thus created a structured workflow for protein-protein network inference and supplemental analysis.
Affiliation(s)
- Ronald C Taylor
- Computational Sciences and Mathematics Division, Pacific Northwest National Laboratory (US Department of Energy), Richland, WA 99352, USA.
|
24
|
Talley ND, Danzig BA, Cannon WR, Martinez J, Shreve AP, MacDonald G. Concentration and Ion Induced Effects on Nucleotide Binding, Aggregation and Thermal Unfolding Transitions of RecA. Biophys J 2010. [DOI: 10.1016/j.bpj.2009.12.171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022] Open
|
25
|
Sharp JL, Borkowski JJ, Schmoyer D, Daly DS, Purvine S, Cannon WR, Hurst GB. Statistically appraising process quality of affinity isolation experiments. Comput Stat Data Anal 2009. [DOI: 10.1016/j.csda.2008.05.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
26
|
|
27
|
Pelletier DA, Hurst GB, Foote LJ, Lankford PK, McKeown CK, Lu TY, Schmoyer DD, Shah MB, Hervey WJ, McDonald WH, Hooker BS, Cannon WR, Daly DS, Gilmore JM, Wiley HS, Auberry DL, Wang Y, Larimer FW, Kennel SJ, Doktycz MJ, Morrell-Falvey JL, Owens ET, Buchanan MV. A general system for studying protein-protein interactions in Gram-negative bacteria. J Proteome Res 2008; 7:3319-28. [PMID: 18590317 DOI: 10.1021/pr8001832] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 01/01/2023]
Abstract
One of the most promising methods for large-scale studies of protein interactions is isolation of an affinity-tagged protein with its in vivo interaction partners, followed by mass spectrometric identification of the copurified proteins. Previous studies have generated affinity-tagged proteins using genetic tools or cloning systems that are specific to a particular organism. To enable protein-protein interaction studies across a wider range of Gram-negative bacteria, we have developed a methodology based on expression of affinity-tagged "bait" proteins from a medium copy-number plasmid. This construct is based on a broad-host-range vector backbone (pBBR1MCS5). The vector has been modified to incorporate the Gateway DEST vector recombination region, to facilitate cloning and expression of fusion proteins bearing a variety of affinity, fluorescent, or other tags. We demonstrate this methodology by characterizing interactions among subunits of the DNA-dependent RNA polymerase complex in two metabolically versatile Gram-negative microbial species of environmental interest, Rhodopseudomonas palustris CGA010 and Shewanella oneidensis MR-1. Results compared favorably with those for both plasmid and chromosomally encoded affinity-tagged fusion proteins expressed in a model organism, Escherichia coli.
Affiliation(s)
- Dale A Pelletier
- Biosciences Division, Chemical Sciences Division, Computer Science and Mathematics Division, and Physical Sciences Directorate, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA.
|
28
|
Webb-Robertson BJM, Cannon WR, Oehmen CS, Shah AR, Gurumoorthi V, Lipton MS, Waters KM. A support vector machine model for the prediction of proteotypic peptides for accurate mass and time proteomics. Bioinformatics 2008; 24:1503-9. [PMID: 18453551 DOI: 10.1093/bioinformatics/btn218] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Indexed: 11/14/2022]
Abstract
MOTIVATION The standard approach to identifying peptides based on accurate mass and elution time (AMT) compares profiles obtained from a high resolution mass spectrometer to a database of peptides previously identified from tandem mass spectrometry (MS/MS) studies. It would be advantageous, with respect to both accuracy and cost, to only search for those peptides that are detectable by MS (proteotypic). RESULTS We present a support vector machine (SVM) model that uses a simple descriptor space based on 35 properties of amino acid content, charge, hydrophilicity and polarity for the quantitative prediction of proteotypic peptides. Using three independently derived AMT databases (Shewanella oneidensis, Salmonella typhimurium, Yersinia pestis) for training and validation within and across species, the SVM resulted in an average accuracy measure of 0.8 with a SD of <0.025. Furthermore, we demonstrate that these results are achievable with a small set of 12 variables and can achieve high proteome coverage. AVAILABILITY http://omics.pnl.gov/software/STEPP.php. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
|
29
|
Cannon WR, Taasevigen D, Baxter DJ, Laskin J. Evaluation of the influence of amino acid composition on the propensity for collision-induced dissociation of model peptides using molecular dynamics simulations. J Am Soc Mass Spectrom 2007; 18:1625-37. [PMID: 17651984 DOI: 10.1016/j.jasms.2007.06.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/03/2007] [Revised: 06/13/2007] [Accepted: 06/14/2007] [Indexed: 05/16/2023]
Abstract
The dynamical behavior of model peptides was evaluated with respect to their ability to form internal proton donor-acceptor pairs using molecular dynamics simulations. The proton donor-acceptor pairs are postulated to be prerequisites for peptide bond cleavage resulting in formation of b and y ions during low-energy collision-induced dissociation in tandem mass spectrometry (MS/MS). The simulations for the polyalanine pentamer Ala(5)H(+) were compared with experimental data from energy-resolved surface induced dissociation (SID) studies. The simulation results provide insight into the events that likely lead up to the fragmentation of peptides. Nine-mer polyalanine-based model peptides were used to examine the dynamical effect of each of the 20 common amino acids on the probability of forming donor-acceptor pairs at labile peptide bonds. A range of probabilities was observed as a function of the substituted amino acid. However, the location of the peptide bond involved in the donor-acceptor pair plays a critical role in the dynamical behavior. This influence of position on the probability of forming a donor-acceptor pair would be hard to predict from statistical analyses on experimental spectra of aggregate, diverse peptides. In addition, the inclusion of basic side chains in the model peptides alters the probability of forming donor-acceptor pairs across the entire backbone. In this case, there are still more ionizing protons than basic residues, but the side chains of the basic amino acids form stable hydrogen bond networks with the peptide carbonyl oxygens and thus act to prevent free access of "mobile protons" to labile peptide bonds. It is clear from this work that the identification of peptides from low-energy CID using automated computational methods should consider the location of the fragmenting bond as well as the amino acid composition.
Affiliation(s)
- William R Cannon
- Computational Biology and Bioinformatics Group, Computational and Information Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, USA.
|
30
|
Sharp JL, Anderson KK, Hurst GB, Daly DS, Pelletier DA, Cannon WR, Auberry DL, Schmoyer DD, McDonald WH, White AM, Hooker BS, Victry KD, Buchanan MV, Kery V, Wiley HS. Statistically inferring protein-protein associations with affinity isolation LC-MS/MS assays. J Proteome Res 2007; 6:3788-95. [PMID: 17691832 DOI: 10.1021/pr0701106] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
Affinity isolation of protein complexes followed by protein identification by LC-MS/MS is an increasingly popular approach for mapping protein interactions. However, systematic and random assay errors from multiple sources must be considered to confidently infer authentic protein-protein interactions. To address this issue, we developed a general, robust statistical method for inferring authentic interactions from protein prey-by-bait frequency tables using a binomial-based likelihood ratio test (LRT) coupled with Bayes' Odds estimation. We then applied our LRT-Bayes' algorithm experimentally using data from protein complexes isolated from Rhodopseudomonas palustris. Our algorithm, in conjunction with the experimental protocol, inferred with high confidence authentic interacting proteins from abundant, stable complexes, but few or no authentic interactions for lower-abundance complexes. The algorithm can discriminate against a background of prey proteins that are detected in association with a large number of baits as an artifact of the measurement. We conclude that the experimental protocol including the LRT-Bayes' algorithm produces results with high confidence but moderate sensitivity. We also found that Monte Carlo simulation is a feasible tool for checking modeling assumptions, estimating parameters, and evaluating the significance of results in protein association studies.
Affiliation(s)
- Julia L Sharp
- Clemson University, 237 Barre Hall, Clemson, South Carolina 29634-0313, Pacific Northwest National Laboratory, P.O. Box 999, Richland, Washington 99352, USA.
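The combination of a binomial likelihood ratio test with Bayes' odds named in the abstract above can be sketched generically. The code below is a textbook form under stated assumptions, not the authors' exact model: a prey is taken to be detected k times in n isolations of a given bait, `p0` is an assumed background detection rate, and the prior odds and the rate under the interaction hypothesis are hypothetical inputs.

```python
import math

def binom_loglik(k, n, p):
    """Binomial log-likelihood of k successes in n trials (constant term omitted)."""
    if p <= 0.0:
        return 0.0 if k == 0 else -math.inf
    if p >= 1.0:
        return 0.0 if k == n else -math.inf
    return k * math.log(p) + (n - k) * math.log(1.0 - p)

def lrt_statistic(k, n, p0):
    """Likelihood ratio statistic 2*[logL(p_hat) - logL(p0)], with p_hat = k/n."""
    p_hat = k / n
    return 2.0 * (binom_loglik(k, n, p_hat) - binom_loglik(k, n, p0))

def posterior_odds(k, n, p_bg, p_int, prior_odds):
    """Bayes' odds: prior odds times the likelihood ratio of interaction vs background."""
    log_lr = binom_loglik(k, n, p_int) - binom_loglik(k, n, p_bg)
    return prior_odds * math.exp(log_lr)

# Hypothetical counts: prey seen in 8 of 10 isolations, background rate 0.1.
stat = lrt_statistic(k=8, n=10, p0=0.1)
odds = posterior_odds(k=8, n=10, p_bg=0.1, p_int=0.8, prior_odds=0.01)
print(f"LRT statistic = {stat:.2f}, posterior odds = {odds:.1f}")
```

Under the usual large-sample approximation the statistic is compared with a chi-square distribution with one degree of freedom (5% critical value about 3.84). A prey that appears with many baits has a high background rate, which shrinks the statistic and mirrors the discrimination against promiscuous prey described in the abstract.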
|
31
|
Abstract
Mass spectrometry offers a high-throughput approach to quantifying the proteome associated with a biological sample and hence has become the primary approach of proteomic analyses. Computation is tightly coupled to this advanced technological platform as a required component of not only peptide and protein identification, but quantification and functional inference, such as protein modifications and interactions. Proteomics faces several key computational challenges such as identification of proteins and peptides from tandem mass spectra as well as their quantitation. In addition, the application of proteomics to systems biology requires understanding the functional proteome, including how the dynamics of the cell change in response to protein modifications and complex interactions between biomolecules. This review presents an overview of recently developed methods and their impact on these core computational challenges currently facing proteomics.
Affiliation(s)
- Bobbie-Jo M Webb-Robertson
- Department of Computational Biology & Bioinformatics, Pacific Northwest National Laboratory, P.O. BOX 999, Richland, WA 99352, USA.
|
32
|
Cannon WR, Jarman KH, Webb-Robertson BJM, Baxter DJ, Oehmen CS, Jarman KD, Heredia-Langner A, Auberry KJ, Anderson GA. Comparison of probability and likelihood models for peptide identification from tandem mass spectrometry data. J Proteome Res 2006; 4:1687-98. [PMID: 16212422 DOI: 10.1021/pr050147v] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
Abstract
We evaluate statistical models used in two-hypothesis tests for identifying peptides from tandem mass spectrometry data. The null hypothesis H(0), that a peptide matches a spectrum by chance, requires information on the probability of by-chance matches between peptide fragments and peaks in the spectrum. Likewise, the alternate hypothesis H(A), that the spectrum is due to a particular peptide, requires probabilities that the peptide fragments would indeed be observed if it was the causative agent. We compare models for these probabilities by determining the identification rates produced by the models using an independent data set. The initial models use different probabilities depending on fragment ion type, but uniform probabilities for each ion type across all of the labile bonds along the backbone. More sophisticated models for probabilities under both H(A) and H(0) are introduced that do not assume uniform probabilities for each ion type. In addition, the performance of these models using a standard likelihood model is compared to an information theory approach derived from the likelihood model. Also, a simple but effective model for incorporating peak intensities is described. Finally, a support-vector machine is used to discriminate between correct and incorrect identifications based on multiple characteristics of the scoring functions. The results are shown to reduce the misidentification rate significantly when compared to a benchmark cross-correlation based approach.
Affiliation(s)
- William R Cannon
- Computational Biology and Bioinformatics Group, Computational and Information Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA 99352, USA.
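The two-hypothesis test described in the abstract above can be illustrated with a minimal log-likelihood ratio score. The sketch follows the simplest case mentioned there, with one probability per fragment ion type applied uniformly along the backbone. The probability values in `P_A` and `P_0` are hypothetical, as are the example fragment lists; the article's fitted models are not reproduced here.

```python
import math

# Hypothetical per-ion-type probabilities (illustrative values only).
P_A = {"b": 0.6, "y": 0.8}    # P(fragment observed | peptide produced the spectrum)
P_0 = {"b": 0.05, "y": 0.05}  # P(fragment matched by chance)

def log_likelihood_ratio(fragments):
    """Sum of log[P(data | H_A) / P(data | H_0)] over predicted fragments.

    fragments: iterable of (ion_type, observed) pairs, where observed is True
    when a peak matching the predicted fragment is present in the spectrum.
    """
    score = 0.0
    for ion_type, observed in fragments:
        pa, p0 = P_A[ion_type], P_0[ion_type]
        if observed:
            score += math.log(pa / p0)
        else:
            score += math.log((1.0 - pa) / (1.0 - p0))
    return score

# Hypothetical matches for two candidate peptides against one spectrum.
good = [("b", True), ("b", True), ("y", True), ("y", True)]
poor = [("b", False), ("b", False), ("y", True), ("y", False)]
print(log_likelihood_ratio(good), log_likelihood_ratio(poor))
```

A positive score favors the hypothesis that the peptide produced the spectrum, and a negative score favors a by-chance match. Position-dependent probabilities, as in the more sophisticated models the abstract mentions, would replace the two dictionaries with tables indexed by ion type and bond position.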
|
33
|
Heredia-Langner A, Cannon WR, Jarman KD, Jarman KH. Sequence optimization as an alternative to de novo analysis of tandem mass spectrometry data. Bioinformatics 2004; 20:2296-304. [PMID: 15087321 DOI: 10.1093/bioinformatics/bth242] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Peptide identification following tandem mass spectrometry (MS/MS) is usually achieved by searching for the best match between the mass spectrum of an unidentified peptide and model spectra generated from peptides in a sequence database. This methodology will be successful only if the peptide under investigation belongs to an available database. Our objective is to develop and test the performance of a heuristic optimization algorithm capable of dealing with some features commonly found in actual MS/MS spectra that tend to stop simpler deterministic solution approaches.
RESULTS: We present the implementation of a Genetic Algorithm (GA) in the reconstruction of amino acid sequences using only spectral features, discuss some of the problems associated with this approach and compare its performance to a de novo sequencing method. The GA can potentially overcome some of the most problematic aspects associated with de novo analysis of real MS/MS data such as missing or unclearly defined peaks and may prove to be a valuable tool in the proteomics field. We assess the performance of our algorithm under conditions of perfect spectral information, in situations where key spectral features are missing, and using real MS/MS spectral data.
|
34
|
|
35
|
Cannon WR, Jarman KD. Improved peptide sequencing using isotope information inherent in tandem mass spectra. Rapid Commun Mass Spectrom 2003; 17:1793-1801. [PMID: 12872285 DOI: 10.1002/rcm.1119] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 05/24/2023]
Abstract
We demonstrate here the use of natural isotopic 'labels' in peptides to aid in the identification of peptides with a de novo algorithm. Using data from ion trap tandem mass spectrometric (MS/MS) analysis of 102 tryptic peptides, we have analyzed multiple series of peaks within LCQ MS/MS spectra that 'spell' peptide sequences. Isotopic peaks from naturally abundant isotopes are particularly prominent even after peak centroiding on y- and b-series ions and lead to increased confidence in the identification of the precursor peptides. Sequence analysis of the MS/MS data is accomplished by finding sequences and subsequences in a hierarchical manner within the spectra.
Affiliation(s)
- William R Cannon
- Computational Biosciences, Pacific Northwest National Laboratory, Richland, WA 99352, USA.
|
36
|
|
37
|
|
38
|
Raj PM, Dunn SM, Cannon WR. Edge Sharpening for Unbiased Edge Detection in Field Emission Scanning Electron Microscope Images. Microsc Microanal 1999; 5:136-146. [PMID: 10341013 DOI: 10.1017/s1431927699000100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/23/2023]
Abstract
We report here a specific type of edge strength anisotropy observed in field emission scanning electron microscope (FESEM) images. The images show weaker edge gradients in the scanning direction and hence these edges frequently go undetected. Direct application of edge detection algorithms to images with nondistinct edges, such as powder particles, shows a strong bias toward edges perpendicular to the scanning direction. Edge orientation polarograms obtained from these images always show strong fictitious particle orientation in the scanning direction. In this work, we discuss an edge-sharpening algorithm that corrects for this bias and results in relatively more accurate and consistent edge orientation information.
Affiliation(s)
- PM Raj
- Center for Ceramic Research, Rutgers, The State University of New Jersey, 607 Taylor Road, Piscataway, NJ 08854-8065
|
39
|
Affiliation(s)
- W R Cannon
- Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania 16802, USA
|
40
|
Abstract
Poisson-Boltzmann calculations were used to determine the pKa of protein functional groups in the unliganded dihydrofolate reductase enzyme, and the pKa of protein and ligand groups in methotrexate-enzyme complexes. The results reported here are in conflict with two fundamental tenets of dihydrofolate reductase inhibition by methotrexate: (1) Asp27 is not expected to be protonated near pH 6.5 in the apoenzyme as previously proposed based on fitting of empirical equations to binding data, and (2) the calculated pKa for the pteridine N1 of the inhibitor while bound to the protein is significantly lower than that estimated for this group from interpretation of NMR data (>10). In fact, the electrostatic calculations and complementary quantum chemical calculations indicate that Asp27 is likely protonated when methotrexate is bound, resulting in a neutral dipole-dipole interaction rather than a salt-bridge between the enzyme and the inhibitor. Reasons for this discrepancy with the experimental data are discussed. Furthermore, His45 and Glu17 in the Escherichia coli enzyme are proposed to be in part responsible for the pH dependence of the conformational degeneracy in the inhibitor-enzyme complex.
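To make the link between a pKa and a protonation state concrete, the sketch below applies the Henderson-Hasselbalch relation to a single titratable group. It does not reproduce the Poisson-Boltzmann calculations of the paper; the function name and the pKa values are hypothetical and serve only to show why a group with a pKa well below 6.5 is expected to be deprotonated near pH 6.5, while one with a pKa above 10 is expected to be protonated.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch fraction of an acid in its protonated form.

    Treats the site as a single, independent titratable group, which is
    a simplification relative to a coupled electrostatic calculation.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

# Hypothetical pKa values, for illustration only.
for pka in (4.0, 6.5, 10.0):
    frac = fraction_protonated(pka, ph=6.5)
    print(f"pKa {pka:4.1f} at pH 6.5: {frac:.3f} protonated")
```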
Affiliation(s)
- W R Cannon
- Department of Chemistry 152 Davey Laboratory, Pennsylvania State University, University Park, PA 16802, USA
|
41
|
Cannon WR, Garrison BJ, Benkovic SJ. Electrostatic Characterization of Enzyme Complexes: Evaluation of the Mechanism of Catalysis of Dihydrofolate Reductase. J Am Chem Soc 1997. [DOI: 10.1021/ja962621r] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Affiliation(s)
- William R. Cannon
- Contribution from 152 Davey Laboratory, Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania 16802
- Barbara J. Garrison
- Contribution from 152 Davey Laboratory, Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania 16802
- Stephen J. Benkovic
- Contribution from 152 Davey Laboratory, Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania 16802
|
42
|
Abstract
We have analysed enzyme catalysis through a re-examination of the reaction coordinate. The ground state of the enzyme-substrate complex is shown to be related to the transition state through the mean force acting along the reaction path; as such, catalytic strategies cannot be resolved into ground state destabilization versus transition state stabilization. We compare the role of active-site residues in the chemical step with the analogous role played by solvent molecules in the environment of the noncatalysed reaction. We conclude that enzyme catalysis is significantly enhanced by the ability of the enzyme to preorganize the reaction environment. This complementation of the enzyme to the substrate's transition state geometry acts to eliminate the slow components of solvent reorganization required for reactions in aqueous solution. Dramatically strong binding of the transition state geometry is not required.
Affiliation(s)
- W R Cannon
- Department of Chemistry, Pennsylvania State University, University Park 16802, USA
|
43
|
Cannon WR, Briggs JM, Shen J, McCammon JA, Quiocho FA. Conservative and nonconservative mutations in proteins: anomalous mutations in a transport receptor analyzed by free energy and quantum chemical calculations. Protein Sci 1995; 4:387-93. [PMID: 7795522 PMCID: PMC2143071 DOI: 10.1002/pro.5560040305] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 01/27/2023]
Abstract
Experimental studies on a bacterial sulfate receptor have indicated anomalous relative binding affinities for the mutations Ser130→Cys, Ser130→Gly, and Ser130→Ala. The loss of affinity for sulfate in the former mutation was previously attributed to a greater steric effect on the part of the Cys side chain relative to the Ser side chain, whereas the relatively small loss of binding affinity for the latter two mutations was attributed to the loss of a single hydrogen bond. In this report we present quantum chemical and statistical thermodynamic studies of these mutations. Qualitative results from these studies indicate that for the Ser130→Cys mutation the large decrease in binding affinity is in part caused by steric effects, but also significantly by the differential work required to polarize the Cys thiol group relative to the Ser hydroxyl group. The Gly mutant cobinds a water molecule in the same location as the Ser side chain resulting in a relatively small decrease in binding affinity. Results for the Ala mutant are in disagreement with experimental results but are likely to be limited by insufficient sampling of configuration space due to physical constraints applied during the simulation.
Affiliation(s)
- W R Cannon
- Department of Chemistry, University of Houston, Texas 77204-5641, USA
|
44
|
Staley DH, Holt WR, Cannon WR. Necessary and sufficient condition for a model insect population to go to extinction. Bull Math Biol 1974; 36:527-33. [PMID: 4457197 DOI: 10.1007/bf02463264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/10/2023]
|