1. Ahmad N, Singh A, Gupta A, Pant P, Singh TP, Sharma S, Sharma P. Discovery of the Lead Molecules Targeting the First Step of the Histidine Biosynthesis Pathway of Acinetobacter baumannii. J Chem Inf Model 2022; 62:1744-1759. [PMID: 35333517 DOI: 10.1021/acs.jcim.1c01421] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Acinetobacter baumannii is a multidrug-resistant, opportunistic, nosocomial pathogen for which a new line of treatments is urgently needed. We targeted the enzyme that catalyzes the first step of the histidine biosynthesis pathway, ATP-phosphoribosyltransferase (ATP-PRT). The three-dimensional structure of ATP-PRT was predicted by homology modeling, using the known structure of ATP-PRT from Psychrobacter arcticus (PaATPPRT) as the template. High-throughput virtual screening (HTVS) of the antibacterial library of Life Chemicals Inc., Ontario, Canada, was carried out, followed by molecular dynamics simulations of the top hit compounds. The in silico results were then validated biochemically using surface plasmon resonance spectroscopy. Two compounds, F0843-0019 and F0608-0626, bound with micromolar affinities to the ATP-phosphoribosyltransferase from Acinetobacter baumannii (AbATPPRT). Both compounds bound in the same way as AMP binds in PaATPPRT, and the important active-site residues Val4, Ser72, Thr76, Tyr77, Glu95, Lys134, Val136, and Tyr156 also interacted with them via hydrogen bonds. The calculated binding energies of these compounds were -10.5 kcal/mol and -11.1 kcal/mol, respectively. These two compounds are potential lead molecules for designing antibacterial compounds, and this information should help drug discovery programs against Acinetobacter worldwide.
Affiliation(s)
- Nabeel Ahmad
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
- Anamika Singh
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
- Akshita Gupta
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
- Pradeep Pant
  - Department of Chemistry, Indian Institute of Technology, Delhi 110016, India
- Tej P Singh
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
- Sujata Sharma
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India
- Pradeep Sharma
  - Department of Biophysics, All India Institute of Medical Sciences, New Delhi 110029, India

2. Shah HA, Liu J, Yang Z, Feng J. Review of Machine Learning Methods for the Prediction and Reconstruction of Metabolic Pathways. Front Mol Biosci 2021; 8:634141. [PMID: 34222327 PMCID: PMC8247443 DOI: 10.3389/fmolb.2021.634141] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/27/2020] [Accepted: 06/01/2021] [Indexed: 11/13/2022] Open
Abstract
Prediction and reconstruction of metabolic pathways play significant roles in many fields, such as genetic engineering, metabolic engineering, and drug discovery, and are becoming some of the most active research topics in synthetic biology. With the growth of related data and the development of machine learning techniques, many machine learning based methods have been proposed for the prediction or reconstruction of metabolic pathways. Machine learning techniques are showing state-of-the-art performance in handling the rapidly increasing volume of data in synthetic biology. To support researchers in this field, we briefly review the research progress of metabolic pathway reconstruction and prediction based on machine learning. Some challenging issues in the reconstruction of metabolic pathways are also discussed in this paper.
Affiliation(s)
- Hayat Ali Shah
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, China
- Juan Liu
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, China
- Zhihui Yang
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, China
- Jing Feng
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, China

3. Modelling Cell Metabolism: A Review on Constraint-Based Steady-State and Kinetic Approaches. Processes (Basel) 2021. [DOI: 10.3390/pr9020322] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 12/11/2022] Open
Abstract
Studying cell metabolism serves many objectives, such as enhancing bioprocess performance and advancing the understanding of cell biology, drug target discovery, and metabolic therapy. Remarkable successes in these fields emerged from heuristic approaches, for instance, with the introduction of effective strategies for genetic modification, drug development, and optimization of bioprocess management. However, heuristic approaches have shown significant shortcomings, such as in describing the regulation of metabolic pathways and in extrapolating to other experimental conditions. In the specific case of bioprocess management, such shortcomings limit their capacity to increase product quality while maintaining desirable productivity and reproducibility levels. For instance, since heuristic approaches cannot predict cellular functions under varying experimental conditions, they may lead to sub-optimal processes. Also, such approaches, when used for bioprocess control, often fail to regulate a process under unexpected variations of external conditions. Therefore, methodologies inspired by the systematic mathematical formulation of cell metabolism have been used to address such drawbacks and achieve robust, reproducible results. Mathematical modelling approaches are effective for both the characterization of cell physiology and the estimation of metabolic pathway utilization, thus allowing the metabolic behavior of a cell population to be characterized. In this article, we present a review of the methodology used and of promising mathematical modelling approaches, focusing primarily on the investigation of metabolic events and regulation. Proceeding from a topological representation of the metabolic networks, we first present the metabolic modelling approaches that investigate cell metabolism at steady state, complying with the constraints imposed by the mass conservation law and the thermodynamics of reaction reversibility. Constraint-based models (CBMs) are reviewed, highlighting the set of assumed optimality functions for reaction pathways. We explore models simulating cell growth dynamics by expanding flux balance models developed at steady state. Then, discussing a change of metabolic modelling paradigm, we describe dynamic kinetic models that are based on the mathematical representation of the mechanistic description of nonlinear enzyme activities. In such approaches, metabolic pathway regulation is considered explicitly as a function of the activity of other components of metabolic networks, possibly far from the metabolic steady state. We have also assessed the significance of metabolic model parameterization in kinetic models, summarizing a standard parameter estimation procedure frequently employed in the kinetic metabolic modelling literature. Finally, some optimization practices used for parameter estimation are reviewed.
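As a hedged illustration of the steady-state, constraint-based formulation this abstract refers to, flux balance analysis is commonly written as the linear program below. The symbols follow common usage and are assumptions here, not notation taken from the review: S is the m-by-n stoichiometric matrix, v the vector of n reaction fluxes, c the weights of the assumed optimality function, and v^min, v^max the flux bounds.

```latex
\[
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j}, \qquad j = 1,\dots,n .
\end{aligned}
\]
```

In this sketch, S v = 0 expresses mass conservation at steady state, and setting v^min_j = 0 for a reaction j encodes its thermodynamic irreversibility.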

4. Pathway Tools Visualization of Organism-Scale Metabolic Networks. Metabolites 2021; 11:metabo11020064. [PMID: 33499002 PMCID: PMC7911265 DOI: 10.3390/metabo11020064] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/24/2020] [Revised: 01/12/2021] [Accepted: 01/12/2021] [Indexed: 12/20/2022] Open
Abstract
Metabolomics, synthetic biology, and microbiome research demand information about organism-scale metabolic networks. The convergence of genome sequencing and computational inference of metabolic networks has enabled great progress toward satisfying that demand by generating metabolic reconstructions from the genomes of thousands of sequenced organisms. Visualization of whole metabolic networks is critical for aiding researchers in understanding, analyzing, and exploiting those reconstructions. We have developed bioinformatics software tools that automatically generate a full metabolic-network diagram for an organism, and that enable searching and analyses of the network. The software generates metabolic-network diagrams for unicellular organisms, for multi-cellular organisms, and for pan-genomes and organism communities. Search tools enable users to find genes, metabolites, enzymes, reactions, and pathways within a diagram. The diagrams are zoomable to enable researchers to study local neighborhoods in detail and to see the big picture. The diagrams also serve as tools for comparison of metabolic networks and for interpreting high-throughput datasets, including transcriptomics, metabolomics, and reaction fluxes computed by metabolic models. These data can be overlaid on the metabolic charts to produce animated zoomable displays of metabolic flux and metabolite abundance. The BioCyc.org website contains whole-network diagrams for more than 18,000 sequenced organisms. The ready availability of organism-specific metabolic network diagrams and associated tools for almost any sequenced organism is useful for researchers working to better understand the metabolism of their organism and to interpret high-throughput datasets in a metabolic context.

5. Sona P, Hong JH, Lee S, Kim BJ, Hong WY, Jung J, Kim HN, Kim HL, Christopher D, Herviou L, Im YH, Lee KY, Kim TS, Jung J. Integrated genome sizing (IGS) approach for the parallelization of whole genome analysis. BMC Bioinformatics 2018; 19:462. [PMID: 30509173 PMCID: PMC6276166 DOI: 10.1186/s12859-018-2499-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2018] [Accepted: 11/16/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The use of whole genome sequencing has increased recently with the rapid progression of next-generation sequencing (NGS) technologies. However, storing raw sequence reads to perform large-scale genome analysis poses hardware challenges. Despite advances in genome analytic platforms, efficient approaches remain relevant, especially as applied to the human genome. In this study, an Integrated Genome Sizing (IGS) approach is adopted to speed up multiple whole genome analyses in a high-performance computing (HPC) environment. The approach splits a genome (GRCh37) into 630 chunks (fragments), so that multiple chunks can be processed in parallel for sequence analyses across cohorts. RESULTS: IGS was integrated on the Maha-Fs (HPC) system to provide the parallelization required to analyze 2504 whole genomes. Using a single reference pilot genome, NA12878, we compared the NGS process time between Maha-Fs (NFS SATA hard disk drive) and SGI-UV300 (solid state drive memory). SGI-UV300 was faster, with a process time of 32.5 min, compared with 55.2 min for Maha-Fs. CONCLUSIONS: The implementation of IGS can leverage the ability of HPC systems to analyze multiple genomes simultaneously. We believe this approach will accelerate research advancement in personalized genomic medicine. Our method is comparable to the fastest methods for sequence alignment.
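The chunking idea in this abstract can be sketched roughly as follows. The abstract does not say how the 630 chunks are defined, so the fixed-size, non-overlapping windows used here are an assumption, as are the function name `chunk_genome`, the 1-based inclusive coordinates, and the toy chromosome lengths (which are not GRCh37 values).

```python
from typing import Dict, Iterator, Tuple

# (chromosome, start, end) with 1-based, inclusive coordinates (an assumption).
Region = Tuple[str, int, int]


def chunk_genome(chrom_lengths: Dict[str, int], chunk_size: int) -> Iterator[Region]:
    """Yield fixed-size, non-overlapping regions that cover every chromosome."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for chrom, length in chrom_lengths.items():
        for start in range(1, length + 1, chunk_size):
            # The last chunk of a chromosome may be shorter than chunk_size.
            yield chrom, start, min(start + chunk_size - 1, length)


# Toy lengths for illustration only; these are not real chromosome sizes.
toy_lengths = {"chrA": 2_500, "chrB": 1_000}
chunks = list(chunk_genome(toy_lengths, chunk_size=1_000))
print(chunks)
```

Each region could then be handed to an independent worker (for example, one job per chunk on an HPC scheduler), which is the kind of parallelism the abstract describes.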
Affiliation(s)
- Peter Sona
  - Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
- Jong Hui Hong
  - Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
- Sunho Lee
  - Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
- Byong Joon Kim
  - Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
- Woon-Young Hong
  - Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
- Jongcheol Jung
  - Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
- Han-Na Kim
  - PGM21 (Personalized Genomic Medicine 21), Ewha Womans University Medical Center, 1071, Anyang Cheon-ro, Yangcheon-gu, Seoul, 158-710, Korea
- Hyung-Lae Kim
  - PGM21 (Personalized Genomic Medicine 21), Ewha Womans University Medical Center, 1071, Anyang Cheon-ro, Yangcheon-gu, Seoul, 158-710, Korea
- David Christopher
  - Bioinformatics Solutions, 900 N McCarthy Blvd., Milpitas, CA, 95035, USA
- Laurent Herviou
  - Bioinformatics Solutions, 900 N McCarthy Blvd., Milpitas, CA, 95035, USA
- Young Hwan Im
  - Bioinformatics Solutions, 900 N McCarthy Blvd., Milpitas, CA, 95035, USA
- Kwee-Yum Lee
  - Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
  - Faculty of Medicine, University of Queensland, QLD, Brisbane, 4072, Australia
- Tae Soon Kim
  - Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025
  - Department of Clinical Medical Sciences, Seoul National University College of Medicine, 71 Ihwajang-gil, Jongno-gu, Seoul, 03087, South Korea
- Jongsun Jung
  - Genome Data Integration Center, Syntekabio Incorporated, Techno-2ro B-512, Yuseong-gu, Daejeon, Republic of Korea, 34025

6. Land M, Hauser L, Jun SR, Nookaew I, Leuze MR, Ahn TH, Karpinets T, Lund O, Kora G, Wassenaar T, Poudel S, Ussery DW. Insights from 20 years of bacterial genome sequencing. Funct Integr Genomics 2015; 15:141-61. [PMID: 25722247 PMCID: PMC4361730 DOI: 10.1007/s10142-015-0433-4] [Citation(s) in RCA: 405] [Impact Index Per Article: 45.0] [Received: 01/19/2015] [Revised: 02/11/2015] [Accepted: 02/12/2015] [Indexed: 12/18/2022]
Abstract
Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genomes is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive, as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance, bioforensics, and infectious disease epidemiology. In the near future, high-throughput sequencing of patient metagenomic samples could revolutionize medicine in terms of speed and accuracy of finding pathogens and knowing how to treat them.
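The pan- and core-genome calculation mentioned in this abstract can be illustrated with a minimal sketch, assuming each genome has already been reduced to a set of gene-family identifiers. The function name `pan_and_core`, the strain names, and the family labels are hypothetical; real analyses first need an orthology or clustering step that is not shown here.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan_genome, core_genome) for gene-family sets keyed by genome."""
    if not genomes:
        return set(), set()
    families = list(genomes.values())
    # Pan genome: every gene family seen in at least one genome.
    pan = set().union(*families)
    # Core genome: gene families present in every genome.
    core = set(families[0]).intersection(*families[1:])
    return pan, core


# Toy input for illustration only; not real E. coli data.
toy = {
    "strain_1": {"famA", "famB", "famC"},
    "strain_2": {"famA", "famB", "famD"},
    "strain_3": {"famA", "famB", "famE"},
}
pan, core = pan_and_core(toy)
print(len(pan), sorted(core))
```

With this definition the core can only shrink and the pan genome can only grow as genomes are added, which is consistent with the abstract's contrast between about 3100 core families and about 89,000 total families across more than 2000 genomes.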
Affiliation(s)
- Miriam Land
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Loren Hauser
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
  - Joint Institute for Biological Sciences, University of Tennessee, Knoxville, TN 37996 USA
  - Department of Microbiology, University of Tennessee, Knoxville, TN 37996 USA
- Se-Ran Jun
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Intawat Nookaew
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Michael R. Leuze
  - Computer Science and Mathematics Division, Computer Science Research Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Tae-Hyuk Ahn
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
  - Computer Science and Mathematics Division, Computer Science Research Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Tatiana Karpinets
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Ole Lund
  - Center for Biological Sequence Analysis, Department of Systems Biology, The Technical University of Denmark, Kgs. Lyngby, 2800 Denmark
- Guruprased Kora
  - Computer Science and Mathematics Division, Computer Science Research Group, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
- Trudy Wassenaar
  - Molecular Microbiology and Genomics Consultants, Tannenstr 7, 55576 Zotzenheim, Germany
- Suresh Poudel
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
  - Genome Science and Technology, University of Tennessee, Knoxville, TN 37996 USA
- David W. Ussery
  - Comparative Genomics Group, Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831 USA
  - Joint Institute for Biological Sciences, University of Tennessee, Knoxville, TN 37996 USA
  - Center for Biological Sequence Analysis, Department of Systems Biology, The Technical University of Denmark, Kgs. Lyngby, 2800 Denmark
  - Genome Science and Technology, University of Tennessee, Knoxville, TN 37996 USA

7. Loira N, Zhukova A, Sherman DJ. Pantograph: A template-based method for genome-scale metabolic model reconstruction. J Bioinform Comput Biol 2015; 13:1550006. [PMID: 25572717 DOI: 10.1142/s0219720015500067] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 12/25/2022]
Abstract
Genome-scale metabolic models are a powerful tool to study the inner workings of biological systems and to guide applications. The advent of cheap sequencing has brought the opportunity to create metabolic maps of biotechnologically interesting organisms. While this drives the development of new methods and automatic tools, network reconstruction remains a time-consuming process where extensive manual curation is required. This curation introduces specific knowledge about the modeled organism, either explicitly in the form of molecular processes, or indirectly in the form of annotations of the model elements. Paradoxically, this knowledge is usually lost when reconstruction of a different organism is started. We introduce the Pantograph method for metabolic model reconstruction. This method combines a template reaction knowledge base, orthology mappings between two organisms, and experimental phenotypic evidence, to build a genome-scale metabolic model for a target organism. Our method infers implicit knowledge from annotations in the template, and rewrites these inferences to include them in the resulting model of the target organism. The generated model is well suited for manual curation. Scripts for evaluating the model with respect to experimental data are automatically generated, to aid curators in iterative improvement. We present an implementation of the Pantograph method, as a toolbox for genome-scale model reconstruction, curation and validation. This open source package can be obtained from: http://pathtastic.gforge.inria.fr.
Affiliation(s)
- Nicolas Loira
  - Center for Mathematical Modeling and Center for Genome Regulation, Universidad de Chile, Beauchef 851, Piso7, Santiago, Chile

8. Azam SS, Shamim A. An insight into the exploration of druggable genome of Streptococcus gordonii for the identification of novel therapeutic candidates. Genomics 2014; 104:203-14. [DOI: 10.1016/j.ygeno.2014.07.007] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 01/30/2014] [Revised: 07/02/2014] [Accepted: 07/17/2014] [Indexed: 01/17/2023]

9. Shanmugasundram A, Gonzalez-Galarza FF, Wastling JM, Vasieva O, Jones AR. An integrated approach to understand apicomplexan metabolism from their genomes. BMC Bioinformatics 2014. [PMCID: PMC4071867 DOI: 10.1186/1471-2105-15-s3-a3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open

10. D'Eustachio P. Pathway databases: making chemical and biological sense of the genomic data flood. Chem Biol 2013; 20:629-35. [PMID: 23706629 DOI: 10.1016/j.chembiol.2013.03.018] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 01/28/2013] [Revised: 03/16/2013] [Accepted: 03/22/2013] [Indexed: 01/16/2023]
Abstract
Pathway databases are a means to systematically associate proteins with their functions and link them into networks that describe the reaction space of an organism. Here, the Reactome Knowledgebase provides a convenient example to illustrate strategies used to assemble such a reaction space based on manually curated experimental data, approaches to semiautomated extension of these manual annotations to infer annotations for a large fraction of a species' proteins, and the use of networks of functional annotations to infer pathway relationships among variant proteins that have been associated with disease risk through genome-wide surveys and resequencing studies of tumors.
Affiliation(s)
- Peter D'Eustachio
  - Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, 550 First Avenue, MSB 390, New York, NY 10016, USA

11. Croft D, Mundo AF, Haw R, Milacic M, Weiser J, Wu G, Caudy M, Garapati P, Gillespie M, Kamdar MR, Jassal B, Jupe S, Matthews L, May B, Palatnik S, Rothfels K, Shamovsky V, Song H, Williams M, Birney E, Hermjakob H, Stein L, D'Eustachio P. The Reactome pathway knowledgebase. Nucleic Acids Res 2013; 42:D472-7. [PMID: 24243840 PMCID: PMC3965010 DOI: 10.1093/nar/gkt1102] [Citation(s) in RCA: 1136] [Impact Index Per Article: 103.3] [Indexed: 02/07/2023] Open
Abstract
Reactome (http://www.reactome.org) is a manually curated open-source open-data resource of human pathways and reactions. The current version 46 describes 7088 human proteins (34% of the predicted human proteome), participating in 6744 reactions based on data extracted from 15 107 research publications with PubMed links. The Reactome Web site and analysis tool set have been completely redesigned to increase speed, flexibility and user friendliness. The data model has been extended to support annotation of disease processes due to infectious agents and to mutation.
Affiliation(s)
- David Croft
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada, College of Pharmacy and Health Sciences, St. John's University, Queens, NY 11439, USA, NYU School of Medicine, New York, NY 10016, USA, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA and Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A1, Canada

12. Shanmugasundram A, Gonzalez-Galarza FF, Wastling JM, Vasieva O, Jones AR. Library of Apicomplexan Metabolic Pathways: a manually curated database for metabolic pathways of apicomplexan parasites. Nucleic Acids Res 2012. [PMID: 23193253 PMCID: PMC3531055 DOI: 10.1093/nar/gks1139] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Indexed: 11/14/2022] Open
Abstract
The Library of Apicomplexan Metabolic Pathways (LAMP, http://www.llamp.net) is a web database that provides a near-complete mapping from genes to the central metabolic functions for some of the prominent intracellular parasites of the phylum Apicomplexa. This phylum includes the causative agents of malaria, toxoplasmosis and theileriosis, diseases with a huge economic and social impact. A number of apicomplexan genomes have been sequenced, but the accurate annotation of gene function remains challenging. We have adopted an approach called metabolic reconstruction, in which genes are systematically assigned to functions within pathways/networks for Toxoplasma gondii, Neospora caninum, Cryptosporidium and Theileria species, and Babesia bovis. Several functions missing from pathways have been identified, where the corresponding gene for an essential process appears to be absent from the current genome annotation. For each species, LAMP contains interactive diagrams of each pathway, hyperlinked to external resources and annotated with detailed information, including the sources of evidence used. We have also developed a section to highlight the overall metabolic capabilities of each species, such as the ability to synthesize a particular metabolite or the dependence on the host for it. We expect this new database will become a valuable resource for fundamental and applied research on the Apicomplexa.
Affiliation(s)
- Achchuthan Shanmugasundram
  - Department of Functional and Comparative Genomics, Institute of Integrative Biology, University of Liverpool, Biosciences Building, Crown Street, Liverpool L69 7ZB, UK

13. Gomes MR, Guimarães ACR, de Miranda AB. Specific and nonhomologous isofunctional enzymes of the genetic information processing pathways as potential therapeutical targets for tritryps. Enzyme Res 2011; 2011:543912. [PMID: 21808726 PMCID: PMC3145330 DOI: 10.4061/2011/543912] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 01/15/2011] [Revised: 03/22/2011] [Accepted: 05/05/2011] [Indexed: 12/03/2022] Open
Abstract
Leishmania major, Trypanosoma brucei, and Trypanosoma cruzi (Tritryps) are unicellular protozoa that cause leishmaniasis, sleeping sickness and Chagas' disease, respectively. Most drugs against them were discovered through the screening of large numbers of compounds against whole parasites. Nonhomologous isofunctional enzymes (NISEs) may present good opportunities for the identification of new putative drug targets because, though sharing the same enzymatic activity, they possess different three-dimensional structures thus allowing the development of molecules against one or other isoform. From public data of the Tritryps' genomes, we reconstructed the Genetic Information Processing Pathways (GIPPs). We then used AnEnPi to look for the presence of these enzymes between Homo sapiens and Tritryps, as well as specific enzymes of the parasites. We identified three candidates (ECs 3.1.11.2 and 6.1.1.-) in these pathways that may be further studied as new therapeutic targets for drug development against these parasites.
Affiliation(s)
- Monete Rajão Gomes
  - Laboratório de Biologia Computacional e Sistemas, Instituto Oswaldo Cruz/FIOCRUZ, 21045-900 Rio de Janeiro, RJ, Brazil

14. Boutte CC, Crosson S. The complex logic of stringent response regulation in Caulobacter crescentus: starvation signalling in an oligotrophic environment. Mol Microbiol 2011; 80:695-714. [PMID: 21338423 DOI: 10.1111/j.1365-2958.2011.07602.x] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Indexed: 10/18/2022]
Abstract
Bacteria rapidly adapt to nutritional changes via the stringent response, which entails starvation-induced synthesis of the small molecule, ppGpp, by RelA/SpoT homologue (Rsh) enzymes. Binding of ppGpp to RNA polymerase modulates the transcription of hundreds of genes and remodels the physiology of the cell. Studies of the stringent response have primarily focused on copiotrophic bacteria such as Escherichia coli; little is known about how stringent signalling is regulated in species that live in consistently nutrient-limited (i.e. oligotrophic) environments. Here we define the input logic and transcriptional output of the stringent response in the oligotroph, Caulobacter crescentus. The sole Rsh protein, SpoT(CC), binds to and is regulated by the ribosome, and exhibits AND-type control logic in which amino acid starvation is a necessary but insufficient signal for activation of ppGpp synthesis. While both glucose and ammonium starvation upregulate the synthesis of ppGpp, SpoT(CC) detects these starvation signals by two independent mechanisms. Although the logic of stringent response control in C. crescentus differs from E. coli, the global transcriptional effects of elevated ppGpp are similar, with the exception of 16S rRNA transcription, which is controlled independently of spoT(CC). This study highlights how the regulatory logic controlling the stringent response may be adapted to the nutritional niche of a bacterial species.
Affiliation(s)
- Cara C Boutte
  - Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, IL, USA

15. Capriles PVSZ, Guimarães ACR, Otto TD, Miranda AB, Dardenne LE, Degrave WM. Structural modelling and comparative analysis of homologous, analogous and specific proteins from Trypanosoma cruzi versus Homo sapiens: putative drug targets for chagas' disease treatment. BMC Genomics 2010; 11:610. [PMID: 21034488 PMCID: PMC3091751 DOI: 10.1186/1471-2164-11-610] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 05/05/2010] [Accepted: 10/29/2010] [Indexed: 11/25/2022] Open
Abstract
Background: Trypanosoma cruzi is the etiological agent of Chagas' disease, an endemic infection that causes thousands of deaths every year in Latin America. Therapeutic options remain inefficient, demanding the search for new drugs and/or new molecular targets. Such efforts can focus on proteins that are specific to the parasite, but analogous enzymes and enzymes with a three-dimensional (3D) structure sufficiently different from the corresponding host proteins may represent equally interesting targets. In order to find these targets we used the workflows MHOLline and AnEnΠ obtaining 3D models from homologous, analogous and specific proteins of Trypanosoma cruzi versus Homo sapiens. Results: We applied genome wide comparative modelling techniques to obtain 3D models for 3,286 predicted proteins of T. cruzi. In combination with comparative genome analysis to Homo sapiens, we were able to identify a subset of 397 enzyme sequences, of which 356 are homologous, 3 analogous and 38 specific to the parasite. Conclusions: In this work, we present a set of 397 enzyme models of T. cruzi that can constitute potential structure-based drug targets to be investigated for the development of new strategies to fight Chagas' disease. The strategies presented here support the concept of structural analysis in conjunction with protein functional analysis as an interesting computational methodology to detect potential targets for structure-based rational drug design. For example, 2,4-dienoyl-CoA reductase (EC 1.3.1.34) and triacylglycerol lipase (EC 3.1.1.3), classified as analogous proteins in relation to H. sapiens enzymes, were identified as new potential molecular targets.
Affiliation(s)
- Priscila V S Z Capriles
- Grupo de Modelagem Molecular de Sistemas Biológicos, Laboratório Nacional de Computação Científica, LNCC/MCT, Petrópolis, CEP 25651-075, Brazil.
|
16
|
Karp PD, Paley SM, Krummenacker M, Latendresse M, Dale JM, Lee TJ, Kaipa P, Gilham F, Spaulding A, Popescu L, Altman T, Paulsen I, Keseler IM, Caspi R. Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology. Brief Bioinform 2009; 11:40-79. [PMID: 19955237 DOI: 10.1093/bib/bbp043] [Citation(s) in RCA: 325] [Impact Index Per Article: 21.7] [Indexed: 01/12/2023] Open
Abstract
Pathway Tools is a production-quality software environment for creating a type of model-organism database called a Pathway/Genome Database (PGDB). A PGDB such as EcoCyc integrates the evolving understanding of the genes, proteins, metabolic network and regulatory network of an organism. This article provides an overview of Pathway Tools capabilities. The software performs multiple computational inferences including prediction of metabolic pathways, prediction of metabolic pathway hole fillers and prediction of operons. It enables interactive editing of PGDBs by DB curators. It supports web publishing of PGDBs, and provides a large number of query and visualization tools. The software also supports comparative analyses of PGDBs, and provides several systems biology analyses of PGDBs including reachability analysis of metabolic networks, and interactive tracing of metabolites through a metabolic network. More than 800 PGDBs have been created using Pathway Tools by scientists around the world, many of which are curated DBs for important model organisms. Those PGDBs can be exchanged using a peer-to-peer DB sharing system called the PGDB Registry.
Affiliation(s)
- Peter D Karp
- Artificial Intelligence Center, SRI International, 333 Ravenswood Ave, AE206, Menlo Park, CA 94025, USA.
|
17
|
Alves-Ferreira M, Guimarães ACR, Capriles PVDSZ, Dardenne LE, Degrave WM. A new approach for potential drug target discovery through in silico metabolic pathway analysis using Trypanosoma cruzi genome information. Mem Inst Oswaldo Cruz 2009; 104:1100-10. [DOI: 10.1590/s0074-02762009000800006] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 06/10/2009] [Accepted: 10/28/2009] [Indexed: 11/22/2022] Open
|
18
|
Evsikov AV, Dolan ME, Genrich MP, Patek E, Bult CJ. MouseCyc: a curated biochemical pathways database for the laboratory mouse. Genome Biol 2009; 10:R84. [PMID: 19682380 PMCID: PMC2745765 DOI: 10.1186/gb-2009-10-8-r84] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Received: 05/22/2009] [Revised: 07/17/2009] [Accepted: 08/14/2009] [Indexed: 11/10/2022] Open
Abstract
Linking biochemical genetic data to the reference genome for the laboratory mouse is important for comparative physiology and for developing mouse models of human biology and disease. We describe here a new database of curated metabolic pathways for the laboratory mouse called MouseCyc http://mousecyc.jax.org. MouseCyc has been integrated with genetic and genomic data for the laboratory mouse available from the Mouse Genome Informatics database and with pathway data from other organisms, including human.
|
19
|
Go EP. Database Resources in Metabolomics: An Overview. J Neuroimmune Pharmacol 2009; 5:18-30. [DOI: 10.1007/s11481-009-9157-3] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Received: 01/15/2009] [Accepted: 04/15/2009] [Indexed: 12/22/2022]
|
20
|
Otto TD, Guimarães ACR, Degrave WM, de Miranda AB. AnEnPi: identification and annotation of analogous enzymes. BMC Bioinformatics 2008; 9:544. [PMID: 19091081 PMCID: PMC2628392 DOI: 10.1186/1471-2105-9-544] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Received: 05/12/2008] [Accepted: 12/17/2008] [Indexed: 11/10/2022] Open
Abstract
Background: Enzymes are responsible for the catalysis of the biochemical reactions in metabolic pathways. Analogous enzymes are able to catalyze the same reactions, but they present no significant sequence similarity at the primary level, and possibly different tertiary structures as well. They are thought to have arisen as the result of independent evolutionary events. A detailed study of analogous enzymes may reveal new catalytic mechanisms, add information about the origin and evolution of biochemical pathways and disclose potential targets for drug development. Results: In this work, we have constructed and implemented a new approach, AnEnPi (the Analogous Enzyme Pipeline), using a combination of bioinformatics tools like BLAST, HMMer, and in-house scripts, to assist in the identification, annotation, comparison and study of analogous and homologous enzymes. The algorithm for the detection of analogy is based i) on the construction of groups of homologous enzymes and ii) on the identification of cases where a given enzymatic activity is performed by two or more proteins without significant similarity between their primary structures. We applied this approach to a dataset obtained from KEGG comprising all annotated enzymes, which resulted in the identification of 986 EC classes where putative analogy was detected (40.5% of all EC classes). AnEnPi is of considerable value in the construction of initial datasets that can be further curated, particularly in gene and genome annotation, in studies involving molecular evolution and metabolism and in the identification of new potential drug targets. Conclusion: AnEnPi is an efficient tool for detection and annotation of analogous enzymes and other enzymes in whole genomes. It is available for academic use at:
Affiliation(s)
- Thomas D Otto
- Laboratory for Functional Genomics and Bioinformatics, Oswaldo Cruz Institute, Fiocruz, Rio de Janeiro, Brazil.
|
21
|
Zhao J, Tao L, Yu H, Luo J, Cao Z, Li Y. Bow-tie topological features of metabolic networks and the functional significance. Chin Sci Bull 2008. [DOI: 10.1007/s11434-007-0143-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 10/23/2022]
|
22
|
Perumal D, Lim CS, Chow VTK, Sakharkar KR, Sakharkar MK. A combined computational-experimental analyses of selected metabolic enzymes in Pseudomonas species. Int J Biol Sci 2008; 4:309-17. [PMID: 18802474 PMCID: PMC2536706 DOI: 10.7150/ijbs.4.309] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 07/14/2008] [Accepted: 09/06/2008] [Indexed: 11/09/2022] Open
Abstract
Comparative genomic analysis has revolutionized our ability to predict the metabolic subsystems that occur in newly sequenced genomes, and to explore the functional roles of the set of genes within each subsystem. These computational predictions can considerably reduce the volume of experimental studies required to assess basic metabolic properties of multiple bacterial species. However, experimental validations are still required to resolve the apparent inconsistencies in the predictions by multiple resources. Here, we present combined computational-experimental analyses on eight completely sequenced Pseudomonas species. Comparative pathway analyses reveal that several pathways within the Pseudomonas species show high plasticity and versatility. Potential bypasses in 11 metabolic pathways were identified. We further confirmed the presence of the enzyme O-acetyl homoserine (thiol) lyase (EC: 2.5.1.49) in P. syringae pv. tomato that revealed inconsistent annotations in KEGG and in the recently published SYSTOMONAS database. These analyses connect and integrate systematic data generation, computational data interpretation, and experimental validation and represent a synergistic and powerful means for conducting biological research.
Affiliation(s)
- Deepak Perumal
- Advanced Design and Modeling Lab, Nanyang Technological University, Singapore
|
23
|
Planes FJ, Beasley JE. A critical examination of stoichiometric and path-finding approaches to metabolic pathways. Brief Bioinform 2008; 9:422-36. [DOI: 10.1093/bib/bbn018] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Indexed: 11/13/2022] Open
|
24
|
Bacon J, Dover LG, Hatch KA, Zhang Y, Gomes JM, Kendall S, Wernisch L, Stoker NG, Butcher PD, Besra GS, Marsh PD. Lipid composition and transcriptional response of Mycobacterium tuberculosis grown under iron-limitation in continuous culture: identification of a novel wax ester. MICROBIOLOGY-SGM 2007; 153:1435-1444. [PMID: 17464057 PMCID: PMC3123377 DOI: 10.1099/mic.0.2006/004317-0] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.6] [Indexed: 11/18/2022]
Abstract
The low level of available iron in vivo is a major obstacle for microbial pathogens and is a stimulus for the expression of virulence genes. In this study, Mycobacterium tuberculosis H37Rv was grown aerobically in the presence of limited iron availability in chemostat culture to determine the physiological response of the organism to iron-limitation. A previously unidentified wax ester accumulated under iron-limited growth, and changes in the abundance of triacylglycerol and menaquinone were also observed between iron-replete and iron-limited chemostat cultures. DNA microarray analysis revealed differential expression of genes involved in glycerolipid metabolism and isoprenoid quinone biosynthesis, providing some insight into the underlying genetic changes that correlate with cell-wall lipid profiles of M. tuberculosis growing in an iron-limited environment.
Affiliation(s)
- Joanna Bacon
- TB Research group, Health Protection Agency, Centre for Emergency Preparedness and Response, Porton Down, Salisbury, Wiltshire SP4 0JG, UK
- Lynn G. Dover
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
- Kim A. Hatch
- TB Research group, Health Protection Agency, Centre for Emergency Preparedness and Response, Porton Down, Salisbury, Wiltshire SP4 0JG, UK
- Yi Zhang
- School of Crystallography, Birkbeck College, University of London, Malet Street, London, WC1E 7HX, UK
- Jessica M. Gomes
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
- Sharon Kendall
- Department of Pathology and Infectious Diseases, Royal Veterinary College, Royal College Street, London NW1 0TU, UK
- Lorenz Wernisch
- School of Crystallography, Birkbeck College, University of London, Malet Street, London, WC1E 7HX, UK
- Neil G. Stoker
- Department of Pathology and Infectious Diseases, Royal Veterinary College, Royal College Street, London NW1 0TU, UK
- Philip D. Butcher
- Bacterial Microarray Group, Department of Cellular and Molecular Medicine, St George’s Hospital Medical School, Cranmer Terrace, London SW17 0RE, UK
- Gurdyal S. Besra
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
- Philip D. Marsh
- TB Research group, Health Protection Agency, Centre for Emergency Preparedness and Response, Porton Down, Salisbury, Wiltshire SP4 0JG, UK
|
25
|
Koschützki D, Schwöbbermeyer H, Schreiber F. Ranking of network elements based on functional substructures. J Theor Biol 2007; 248:471-9. [PMID: 17644116 DOI: 10.1016/j.jtbi.2007.05.038] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Received: 12/18/2006] [Revised: 04/25/2007] [Accepted: 05/31/2007] [Indexed: 11/29/2022]
Abstract
Centrality analysis has been shown to be a valuable method for the structural analysis of biological networks. It is used to identify key elements within networks and to rank network elements such that experiments can be tailored to interesting candidates. Several centrality measures have been studied, in particular for gene regulatory, metabolic and protein interaction networks. However, these centralities have been developed in other fields of science and are not adapted to biological networks. In particular, they ignore functional building blocks within biological networks and therefore do not consider specific network substructures of interest. We incorporate functional substructures (motifs) into network centrality analysis and present a new approach to rank vertices of networks. A method for motif-based centrality analysis is presented and two extensions are discussed which broaden the idea of motif-based centrality to specific functions of particular motif elements, and to the consideration of classes of related motifs. The presented method is applied to the gene regulatory network of Escherichia coli, where it yields interesting results about key regulators.
Affiliation(s)
- Dirk Koschützki
- Leibniz Institute of Plant Genetics and Crop Plant Research, 06466 Gatersleben, Germany.
|
26
|
Peters LL, Robledo RF, Bult CJ, Churchill GA, Paigen BJ, Svenson KL. The mouse as a model for human biology: a resource guide for complex trait analysis. Nat Rev Genet 2007; 8:58-69. [PMID: 17173058 DOI: 10.1038/nrg2025] [Citation(s) in RCA: 239] [Impact Index Per Article: 14.1] [Indexed: 12/24/2022]
Abstract
The mouse has been a powerful force in elucidating the genetic basis of human physiology and pathophysiology. From its beginnings as the model organism for cancer research and transplantation biology to the present, when dissection of the genetic basis of complex disease is at the forefront of genomics research, an enormous and remarkable mouse resource infrastructure has accumulated. This review summarizes those resources and provides practical guidelines for their use, particularly in the analysis of quantitative traits.
Affiliation(s)
- Luanne L Peters
- The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine 04609, USA.
|
27
|
Tulipano PK, Tao Y, Millar WS, Zanzonico P, Kolbert K, Xu H, Yu H, Chen L, Lussier YA, Friedman C. Natural language processing and visualization in the molecular imaging domain. J Biomed Inform 2006; 40:270-81. [PMID: 17084109 DOI: 10.1016/j.jbi.2006.08.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 12/14/2005] [Revised: 08/25/2006] [Accepted: 08/29/2006] [Indexed: 11/16/2022]
Abstract
Molecular imaging is at the crossroads of genomic sciences and medical imaging. Information within the molecular imaging literature could be used to link to genomic and imaging information resources and to organize and index images in a way that is potentially useful to researchers. A number of natural language processing (NLP) systems are available to automatically extract information from genomic literature. One existing NLP system, known as BioMedLEE, automatically extracts biological information consisting of biomolecular substances and phenotypic data. This paper focuses on the adaptation, evaluation, and application of BioMedLEE to the molecular imaging domain. In order to adapt BioMedLEE for this domain, we extend an existing molecular imaging terminology and incorporate it into BioMedLEE. BioMedLEE's performance is assessed with a formal evaluation study. The system's performance, measured as recall and precision, is 0.74 (95% CI: [.70-.76]) and 0.70 (95% CI [.63-.76]), respectively. We adapt a JAVA viewer known as PGviewer for the simultaneous visualization of images with NLP extracted information.
Affiliation(s)
- P Karina Tulipano
- Department of Biomedical Informatics, Columbia University, 622 West 168th Street, Vanderbilt Clinic Floor 5, NY 10032, USA
|
28
|
Abstract
The Pathway Tools cellular overview diagram is a visual representation of the biochemical network of an organism. The overview is automatically created from a Pathway/Genome Database describing that organism. The cellular overview includes metabolic, transport and signaling pathways, and other membrane and periplasmic proteins. Pathway Tools supports interrogation and exploration of cellular biochemical networks through the overview diagram. Furthermore, a software component called the Omics Viewer provides visual analysis of whole-organism datasets using the overview diagram as an organizing framework. For example, gene expression and metabolomics measurements, alone or in combination, can be painted onto the overview, as can computed whole-organism datasets, such as predicted reaction-flux values. The cellular overview and Omics Viewer provide a mechanism whereby biologists can apply the pattern-recognition capabilities of the human visual system to analyze large-scale datasets in a biologically meaningful context. SRI's BioCyc.org website provides overview diagrams for more than 200 organisms. This article describes enhancements to the overview made since a 1999 publication, including the automatic layout capability, expansion of the cellular machinery that it includes, new semantic zooming and poster-generating capabilities, and extension of the Omics Viewer to support painting of metabolites, animations and zooming to individual pathway diagrams.
Affiliation(s)
- Suzanne M. Paley
- To whom correspondence should be addressed. Tel: +1 650 859 5904; Fax: +1 650 859 3735;
- Peter D. Karp
- Correspondence may also be addressed to Peter D. Karp. Tel: +1 650 859 4358; Fax: +1 650 859 3735;
|
29
|
Green ML, Karp PD. The outcomes of pathway database computations depend on pathway ontology. Nucleic Acids Res 2006; 34:3687-97. [PMID: 16893953 PMCID: PMC1540720 DOI: 10.1093/nar/gkl438] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.7] [Indexed: 11/14/2022] Open
Abstract
Different biological notions of pathways are used in different pathway databases. Those pathway ontologies significantly impact pathway computations. Computational users of pathway databases will obtain different results depending on the pathway ontology used by the databases they employ, and different pathway ontologies are preferable for different end uses. We explore differences in pathway ontologies by comparing the BioCyc and KEGG ontologies. The BioCyc ontology defines a pathway as a conserved, atomic module of the metabolic network of a single organism, i.e. often regulated as a unit, whose boundaries are defined at high-connectivity stable metabolites. KEGG pathways are on average 4.2 times larger than BioCyc pathways, and combine multiple biological processes from different organisms to produce a substrate-centered reaction mosaic. We compared KEGG and BioCyc pathways using genome context methods, which determine the functional relatedness of pairs of genes. For each method we employed, a pair of genes randomly selected from a BioCyc pathway is more likely to be related by that method than is a pair of genes randomly selected from a KEGG pathway, supporting the conclusion that the BioCyc pathway conceptualization is closer to a single conserved biological process than is that of KEGG.
Affiliation(s)
- M. L. Green
- Correspondence may also be addressed to M. L. Green. Tel: +1 650 859 5669; Fax: +1 650 859 3735;
- P. D. Karp
- To whom correspondence should be addressed. Tel: +1 650 859 4358; Fax: +1 650 859 3735;
|
30
|
Zhao J, Yu H, Luo J, Cao ZW, Li Y. Complex networks theory for analyzing metabolic networks. Chin Sci Bull 2006. [DOI: 10.1007/s11434-006-2015-2] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Indexed: 11/29/2022]
|
31
|
Shi J, Romero PR, Schoolnik GK, Spormann AM, Karp PD. Evidence supporting predicted metabolic pathways for Vibrio cholerae: gene expression data and clinical tests. Nucleic Acids Res 2006; 34:2438-44. [PMID: 16682451 PMCID: PMC1458520 DOI: 10.1093/nar/gkl310] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022] Open
Abstract
Vibrio cholerae, the etiological agent of the diarrheal illness cholera, can kill an infected adult in 24 h. V. cholerae lives as an autochthonous microbe in estuaries, rivers and coastal waters. A better understanding of its metabolic pathways will assist the development of more effective treatments and will provide a deeper understanding of how this bacterium persists in natural aquatic habitats. Using the completed V. cholerae genome sequence and PathoLogic software, we created VchoCyc, a pathway-genome database that predicted 171 likely metabolic pathways in the bacterium. We report here experimental evidence supporting the computationally predicted pathways. The evidence comes from microarray gene expression studies of V. cholerae in the stools of three cholera patients [D. S. Merrell, S. M. Butler, F. Qadri, N. A. Dolganov, A. Alam, M. B. Cohen, S. B. Calderwood, G. K. Schoolnik and A. Camilli (2002) Nature, 417, 642–645.], from gene expression studies in minimal growth conditions and LB rich medium, and from clinical tests that identify V. cholerae. Expression data provide evidence supporting 92 (53%) of the 171 pathways. The clinical tests provide evidence supporting seven pathways, with six pathways supported by both methods. VchoCyc provides biologists with a useful tool for analyzing this organism's metabolic and genomic information, which could lead to potential insights into new anti-bacterial agents. VchoCyc is available in the BioCyc database collection.
Affiliation(s)
- Jing Shi
- Biomedical Informatics Program, MC 5429, Stanford University, Stanford, CA 94305, USA.
|
32
|
Abstract
Our information about the gene content of organisms continues to grow as more genomes are sequenced and gene products are characterized. Sequence-based annotation efforts have led to a list of cellular components, which can be thought of as a one-dimensional annotation. With growing information about component interactions, facilitated by the advancement of various high-throughput technologies, systemic, or two-dimensional, annotations can be generated. Knowledge about the physical arrangement of chromosomes will lead to a three-dimensional spatial annotation of the genome and a fourth dimension of annotation will arise from the study of changes in genome sequences that occur during adaptive evolution. Here we discuss all four levels of genome annotation, with specific emphasis on two-dimensional annotation methods.
Affiliation(s)
- Jennifer L Reed
- Department of Bioengineering, University of California, San Diego, La Jolla, California, 92093, USA
|
33
|
Mombach JCM, Lemke N, da Silva NM, Ferreira RA, Isaia E, Barcellos CK. Bioinformatics analysis of mycoplasma metabolism: Important enzymes, metabolic similarities, and redundancy. Comput Biol Med 2006; 36:542-52. [PMID: 15913593 DOI: 10.1016/j.compbiomed.2005.03.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 11/01/2004] [Revised: 03/09/2005] [Accepted: 03/09/2005] [Indexed: 11/26/2022]
Abstract
In this work we apply a bioinformatics approach to determine the most important enzymes of the metabolic network of mycoplasmas. The genomes of several mycoplasmas shared predicted important enzymes. Our method allows us to determine both enzymes that are isolated from the metabolic network of the organism and those that are redundant. We also compare the similarities of the mycoplasmas' metabolic networks with the phylogenetic relationships predicted from their 16S rRNA sequences.
Affiliation(s)
- José C M Mombach
- Laboratório de Bioinformática e Biologia Computacional, Universidade do Vale do Rio dos Sinos, 93022-000 São Leopoldo, RS, Brazil.
|
34
|
Hasegawa Y, Seki M, Mochizuki Y, Heida N, Hirosawa K, Okamoto N, Sakurai T, Satou M, Akiyama K, Iida K, Lee K, Kanaya S, Demura T, Shinozaki K, Konagaya A, Toyoda T. A flexible representation of omic knowledge for thorough analysis of microarray data. PLANT METHODS 2006; 2:5. [PMID: 16509996 PMCID: PMC1421397 DOI: 10.1186/1746-4811-2-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 09/21/2005] [Accepted: 03/02/2006] [Indexed: 05/06/2023]
Abstract
BACKGROUND: In order to understand microarray data reasonably in the context of other existing biological knowledge, it is necessary to conduct a thorough examination of the data utilizing every aspect of available omic knowledge libraries. So far, a number of bioinformatics tools have been developed. However, each of them is restricted to deal with one type of omic knowledge, e.g., pathways, interactions or gene ontology. Now that the varieties of omic knowledge are expanding, analysis tools need a way to deal with any type of omic knowledge. Hence, we have designed the Omic Space Markup Language (OSML) that can represent a wide range of omic knowledge, and also, we have developed a tool named GSCope3, which can statistically analyze microarray data in comparison with the OSML-formatted omic knowledge data. RESULTS: In order to test the applicability of OSML to represent a variety of omic knowledge specifically useful for analysis of Arabidopsis thaliana microarray data, we have constructed a Biological Knowledge Library (BiKLi) by converting eight different types of omic knowledge into OSML-formatted datasets. We applied GSCope3 and BiKLi to previously reported A. thaliana microarray data, so as to extract any additional insights from the data. As a result, we have discovered a new insight that lignin formation resists drought stress and activates transcription of many water channel genes to oppose drought stress; and most of the 20S proteasome subunit genes show similar expression profiles under drought stress. In addition to this novel discovery, similar findings previously reported were also quickly confirmed using GSCope3 and BiKLi. CONCLUSION: GSCope3 can statistically analyze microarray data in the context of any OSML-represented omic knowledge. OSML is not restricted to a specific data type structure, but it can represent a wide range of omic knowledge. It allows us to convert new types of omic knowledge into datasets that can be used for microarray data analysis with GSCope3. In addition to BiKLi, by collecting various types of omic knowledge as OSML libraries, it becomes possible for us to conduct detailed thorough analysis from various biological viewpoints. GSCope3 and BiKLi are available for academic users at our web site http://omicspace.riken.jp.
Affiliation(s)
- Yoshikazu Hasegawa
- Phenome Informatics Team, Functional Genomics Research Group, Genomic Sciences Center, RIKEN, Suehiro, Tsurumi, Yokohama, Kanagawa, Japan
- Motoaki Seki
- Plant Functional Genomics Research Team, Functional Genomics Research Group, Genomic Sciences Center, RIKEN, Suehiro, Tsurumi, Yokohama, Kanagawa, Japan
- Yoshiki Mochizuki
- Phenome Informatics Team, Functional Genomics Research Group, Genomic Sciences Center, RIKEN, Suehiro, Tsurumi, Yokohama, Kanagawa, Japan
- Naohiko Heida
- Phenome Informatics Team, Functional Genomics Research Group, Genomic Sciences Center, RIKEN, Suehiro, Tsurumi, Yokohama, Kanagawa, Japan
- Katsura Hirosawa
- Phenome Informatics Team, Functional Genomics Research Group, Genomic Sciences Center, RIKEN, Suehiro, Tsurumi, Yokohama, Kanagawa, Japan
- Naoki Okamoto
- NEC Infomatec Systems Ltd, Sakato, Takatsu, Kawasaki, Kanagawa, Japan
- Tetsuya Sakurai
- Integrated Genome Informatics Research Unit, Metabolomics Group, Plant Science Center, RIKEN, Suehiro, Tsurumi, Yokohama, Kanagawa, Japan
- Masakazu Satou
- Integrated Genome Informatics Research Unit, Metabolomics Group, Plant Science Center, RIKEN, Suehiro, Tsurumi, Yokohama, Kanagawa, Japan
- Kenji Akiyama
- Integrated Genome Informatics Research Unit, Metabolomics Group, Plant Science Center, RIKEN, Suehiro, Tsurumi, Yokohama, Kanagawa, Japan
- Kei Iida
- Faculty of Bio-Science, Nagahama Institute of Bio-Science and Technology, Tamura, Nagahama, Shiga, Japan
- Kisik Lee
- IT technology research institute, Taehung Telcom co., Ltd., Dangsan-dong 3-ga 402, Youngdungpo-gu, Seoul, South Korea
- Shigehiko Kanaya
- Laboratory of Comparative Genomics, Department of Bioinformatics and Genomics, Graduate School of Information Science, NARA Institute of Science and Technology, Takayama, Ikoma, Nara, Japan
- Taku Demura
- Morphoregulation Research Team, Plant Productivity Systems Research Group, Plant Science Center, RIKEN, Suehiro, Tsurumi, Yokohama, Kanagawa, Japan
- Kazuo Shinozaki
- Plant Science Center, RIKEN, Suehiro, Tsurumi, Yokohama, Kanagawa, Japan
- Akihiko Konagaya
- Advanced Genome Information Technology Research Group, Genomic Sciences Center, RIKEN Suehiro, Tsurumi, Yokohama, Kanagawa, Japan
- Tetsuro Toyoda
- Phenome Informatics Team, Functional Genomics Research Group, Genomic Sciences Center, RIKEN, Suehiro, Tsurumi, Yokohama, Kanagawa, Japan
|
35
|
Yi M, Horton JD, Cohen JC, Hobbs HH, Stephens RM. WholePathwayScope: a comprehensive pathway-based analysis tool for high-throughput data. BMC Bioinformatics 2006; 7:30. [PMID: 16423281 PMCID: PMC1388242 DOI: 10.1186/1471-2105-7-30] [Citation(s) in RCA: 177] [Impact Index Per Article: 9.8] [Received: 05/09/2005] [Accepted: 01/19/2006] [Indexed: 12/18/2022] Open
Abstract
Background: Analysis of High Throughput (HTP) Data such as microarray and proteomics data has provided a powerful methodology to study patterns of gene regulation at genome scale. A major unresolved problem in the post-genomic era is to assemble the large amounts of data generated into a meaningful biological context. We have developed a comprehensive software tool, WholePathwayScope (WPS), for deriving biological insights from analysis of HTP data. Result: WPS extracts gene lists with shared biological themes through color cue templates. WPS statistically evaluates global functional category enrichment of gene lists and pathway-level pattern enrichment of data. WPS incorporates well-known biological pathways from KEGG (Kyoto Encyclopedia of Genes and Genomes) and Biocarta, GO (Gene Ontology) terms as well as user-defined pathways or relevant gene clusters or groups, and explores gene-term relationships within the derived gene-term association networks (GTANs). WPS simultaneously compares multiple datasets within biological contexts either as pathways or as association networks. WPS also integrates Genetic Association Database and Partial MedGene Database for disease-association information. We have used this program to analyze and compare microarray and proteomics datasets derived from a variety of biological systems. Application examples demonstrated the capacity of WPS to significantly facilitate the analysis of HTP data for integrative discovery. Conclusion: This tool represents a pathway-based platform for discovery integration to maximize analysis power. The tool is freely available at .
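The abstract does not say which statistic WPS uses to evaluate functional category enrichment. As a rough illustration only, a one-sided hypergeometric tail probability is a common choice for this kind of test. The function and parameter names below are hypothetical and are not taken from WPS.

```python
from math import comb


def enrichment_p_value(total_genes, category_genes, list_size, overlap):
    """One-sided hypergeometric tail: P(X >= overlap).

    total_genes    -- number of genes in the background set
    category_genes -- background genes annotated to the category
    list_size      -- number of genes in the list under test
    overlap        -- genes in the list that belong to the category
    (All names are illustrative assumptions, not the WPS interface.)
    """
    max_overlap = min(category_genes, list_size)
    # Number of ways to draw list_size genes from the background.
    denom = comb(total_genes, list_size)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(category_genes, k)
        * comb(total_genes - category_genes, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denom


# Toy numbers: 10 background genes, 5 in the category, a 2-gene list, both hits.
p = enrichment_p_value(10, 5, 2, 2)
```

A small p-value here would suggest that the category is over-represented in the gene list relative to chance; a real analysis would also correct for testing many categories.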
Affiliation(s)
- Ming Yi
- Advanced Biomedical Computing Center, National Cancer Institute-Frederick/SAIC-Frederick Inc., Frederick, MD 21702, USA
- Jay D Horton
- McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center at Dallas, TX 75390-9046, USA
- Jonathan C Cohen
- McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center at Dallas, TX 75390-9046, USA
- Departments of Internal Medicine and Molecular Genetics, University of Texas Southwestern Medical Center at Dallas, TX 75390-9046, USA
- Helen H Hobbs
- McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center at Dallas, TX 75390-9046, USA
- Departments of Internal Medicine and Molecular Genetics, University of Texas Southwestern Medical Center at Dallas, TX 75390-9046, USA
- The Howard Hughes Medical Institute, University of Texas Southwestern Medical Center at Dallas, TX 75390-9046, USA
- Robert M Stephens
- Advanced Biomedical Computing Center, National Cancer Institute-Frederick/SAIC-Frederick Inc., Frederick, MD 21702, USA
|
36
|
Karp PD, Ouzounis CA, Moore-Kochlacs C, Goldovsky L, Kaipa P, Ahrén D, Tsoka S, Darzentas N, Kunin V, López-Bigas N. Expansion of the BioCyc collection of pathway/genome databases to 160 genomes. Nucleic Acids Res 2005; 33:6083-9. [PMID: 16246909 PMCID: PMC1266070 DOI: 10.1093/nar/gki892] [Citation(s) in RCA: 395] [Impact Index Per Article: 20.8] [Indexed: 12/02/2022] Open
Abstract
The BioCyc database collection is a set of 160 pathway/genome databases (PGDBs) for most eukaryotic and prokaryotic species whose genomes have been completely sequenced to date. Each PGDB in the BioCyc collection describes the genome and predicted metabolic network of a single organism, inferred from the MetaCyc database, which is a reference source on metabolic pathways from multiple organisms. In addition, each bacterial PGDB includes predicted operons for the corresponding species. The BioCyc collection provides a unique resource for computational systems biology, namely global and comparative analyses of genomes and metabolic networks, and a supplement to the BioCyc resource of curated PGDBs. The Omics viewer available through the BioCyc website allows scientists to visualize combinations of gene expression, proteomics and metabolomics data on the metabolic maps of these organisms. This paper discusses the computational methodology by which the BioCyc collection has been expanded, and presents an aggregate analysis of the collection that includes the range of number of pathways present in these organisms, and the most frequently observed pathways. We seek scientists to adopt and curate individual PGDBs within the BioCyc collection. Only by harnessing the expertise of many scientists can we hope to produce biological databases that accurately reflect the depth and breadth of knowledge that the biomedical research community is producing.
Affiliation(s)
- Peter D Karp
- Bioinformatics Research Group, SRI International EK207, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA.
|
37
|
Yoo C, Cooper GF, Schmidt M. A control study to evaluate a computer-based microarray experiment design recommendation system for gene-regulation pathways discovery. J Biomed Inform 2005; 39:126-46. [PMID: 16203178 DOI: 10.1016/j.jbi.2005.05.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 01/12/2005] [Revised: 04/22/2005] [Accepted: 05/27/2005] [Indexed: 11/22/2022]
Abstract
The main topic of this paper is evaluating a system that uses the expected value of experimentation for discovering causal pathways in gene expression data. By experimentation we mean both interventions (e.g., a gene knock-out experiment) and observations (e.g., passively observing the expression level of a "wild-type" gene). We introduce a system called GEEVE (causal discovery in Gene Expression data using Expected Value of Experimentation), which implements expected value of experimentation in discovering causal pathways using gene expression data. GEEVE provides the following assistance, which is intended to help biologists in their quest to discover gene-regulation pathways: recommending which experiments to perform (with a focus on "knock-out" experiments) using an expected value of experimentation (EVE) method; recommending the number of measurements (observational and experimental) to include in the experimental design, again using an EVE method; and providing a Bayesian analysis that combines prior knowledge with recent microarray experimental results to derive posterior probabilities of gene regulation relationships. In recommending which experiments to perform (and how many times to repeat them), the EVE approach considers the biologist's preferences for which genes to focus the discovery process on. Also, since exact EVE calculations are exponential in time, GEEVE incorporates approximation methods. GEEVE is able to combine data from knock-out experiments with data from wild-type experiments to suggest additional experiments to perform and then to analyze the results of those microarray experiments. It models the possibility that unmeasured (latent) variables may be responsible for some of the statistical associations among the expression levels of the genes under study. To evaluate the GEEVE system, we used a gene expression simulator to generate data from specified models of gene regulation.
Using the simulator, we evaluated the GEEVE system using a randomized control study that involved 10 biologists, some of whom used GEEVE and some of whom did not. The results show that biologists who used GEEVE reached correct causal assessments about gene regulation more often than did those biologists who did not use GEEVE. The GEEVE users also reached their assessments in a more cost-effective manner.
Affiliation(s)
- Changwon Yoo
- Department of Computer Science, University of Montana, 420 Social Sciences, University of Montana, Missoula, MT 59803, USA.
|
38
|
Laghaee A, Malcolm C, Hallam J, Ghazal P. Artificial intelligence and robotics in high throughput post-genomics. Drug Discov Today 2005; 10:1253-9. [PMID: 16213418 DOI: 10.1016/s1359-6446(05)03581-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/19/2022]
Abstract
The shift of post-genomics towards a systems approach has offered an ever-increasing role for artificial intelligence (AI) and robotics. Many disciplines (e.g. engineering, robotics, computer science) bear on the problem of automating the different stages involved in post-genomic research with a view to developing quality assured high-dimensional data. We review some of the latest contributions of AI and robotics to this end and note the limitations arising from the current independent, exploratory way in which specific solutions are being presented for specific problems without regard to how these could be eventually integrated into one comprehensible integrated intelligent system.
Affiliation(s)
- Aroosha Laghaee
- Institute for Perception, Action and Behaviour (IPAB), School of Informatics, James Clerk Maxwell Building, University of Edinburgh, Mayfield Road, Edinburgh EH9 3JZ, UK.
|
39
|
Lee DY, Fan LT, Park S, Lee SY, Shafie S, Bertók B, Friedler F. Complementary identification of multiple flux distributions and multiple metabolic pathways. Metab Eng 2005; 7:182-200. [PMID: 15885617 DOI: 10.1016/j.ymben.2005.02.002] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Received: 09/17/2004] [Revised: 12/07/2004] [Accepted: 02/08/2005] [Indexed: 11/27/2022]
Abstract
Cell robustness and complexity have been recognized as unique features of biological systems. Such robustness and complexity of metabolic-reaction systems can be explored by discovering, or identifying, the multiple flux distributions (MFD) and redundant pathways that lead to a given external state; however, this is exceedingly cumbersome to accomplish. It is, therefore, highly desirable to establish an effective computational method for their identification, which, in turn, gives rise to a novel insight into the cellular function. An effective approach is proposed for complementarily identifying MFD in metabolic flux analysis and multiple metabolic pathways (MMP) in structural pathway analysis. This approach judiciously integrates flux balance analysis (FBA) based on linear programming and the graph-theoretic method for determining reaction pathways. A single metabolic pathway, with the concomitant flux distribution and the overall reaction manifesting itself as the desired phenotype under some environmental conditions, is determined by FBA from the initial candidate sequence of metabolic reactions. Subsequently, the graph-theoretic method recovers all feasible MMP and the corresponding MFD. The approach's efficacy is demonstrated by applying it to the in silico Escherichia coli model under various culture conditions. The resultant MMP and MFD attaining a unique external state reveal the surprising adaptability and robustness of the intricate cellular network as a key to cell survival against environmental or genetic changes. These results indicate that the proposed approach would be useful in facilitating drug discovery.
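For orientation, flux balance analysis based on linear programming is usually written in the generic textbook form below. The symbols are assumptions introduced here and do not come from the abstract: S is the stoichiometric matrix, v the vector of reaction fluxes, c the objective weights (for example, a growth or product-formation flux), and v^min, v^max the flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}
\end{aligned}
```

The optimum of such a program need not be unique, which is consistent with the abstract's interest in identifying multiple flux distributions that lead to the same external state.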
Affiliation(s)
- Dong-Yup Lee
- Metabolic and Biomolecular Engineering National Research Laboratory, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong, Yuseong-gu, Daejeon 305-701, Korea
|
40
|
Abstract
Bioinformatics is playing an increasingly important role in nearly all aspects of drug discovery, drug assessment, and drug development. This growing importance lies not only in the role that bioinformatics plays in handling large volumes of data, but also in the utility of bioinformatics tools to predict, analyze, or help interpret clinical and preclinical findings. This review focuses on describing and evaluating some of the newer or more important bioinformatics resources (i.e., databases and software) that are of growing importance to understanding or predicting drug metabolism, especially with respect to the absorption, distribution, metabolism, excretion, (ADME), and toxicity (T) of both existing drugs and potential drug leads. Detailed descriptions and critical assessments of a number of potentially useful bioinformatics/cheminformatics databases and predictive ADMET software tools are provided. Additionally, several pharmaceutically important applications of both the databases and software are highlighted. Given the rapid growth in this area and the rapid changes that are taking place, a special emphasis is placed on freely available or Web-accessible resources.
Affiliation(s)
- David S Wishart
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada.
|
41
|
Mombach JC, Lemke N, Silva NMD, Ferreira RA, Isaia Filho E, Barcellos CK, Ormazabal RJ. Using the FORESTS and KEGG databases to investigate the metabolic network of Eucalyptus. Genet Mol Biol 2005. [DOI: 10.1590/s1415-47572005000400018] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022] Open
Affiliation(s)
- Ney Lemke
- Universidade do Vale do Rio dos Sinos, Brazil
|
42
|
Yoo C, Cooper GF. An evaluation of a system that recommends microarray experiments to perform to discover gene-regulation pathways. Artif Intell Med 2004; 31:169-82. [PMID: 15219293 DOI: 10.1016/j.artmed.2004.01.018] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 03/10/2003] [Revised: 04/14/2003] [Accepted: 01/16/2004] [Indexed: 11/23/2022]
Abstract
The main topic of this paper is modeling the expected value of experimentation (EVE) for discovering causal pathways in gene expression data. By experimentation we mean both interventions (e.g., a gene knockout experiment) and observations (e.g., passively observing the expression level of a "wild-type" gene). We introduce a system called GEEVE (causal discovery in Gene Expression data using Expected Value of Experimentation), which implements expected value of experimentation in discovering causal pathways using gene expression data. GEEVE provides the following assistance, which is intended to help biologists in their quest to discover gene-regulation pathways: recommending which experiments to perform (with a focus on "knockout" experiments) using an expected value of experimentation method; recommending the number of measurements (observational and experimental) to include in the experimental design, again using an EVE method; and providing a Bayesian analysis that combines prior knowledge with recent microarray experimental results to derive posterior probabilities of gene regulation relationships. In recommending which experiments to perform (and how many times to repeat them), the EVE approach considers the biologist's preferences for which genes to focus the discovery process on. Also, since exact EVE calculations are exponential in time, GEEVE incorporates approximation methods. GEEVE is able to combine data from knockout experiments with data from wild-type experiments to suggest additional experiments to perform and then to analyze the results of those microarray experiments. It models the possibility that unmeasured (latent) variables may be responsible for some of the statistical associations among the expression levels of the genes under study. To evaluate the GEEVE system, we used a gene expression simulator to generate data from specified models of gene regulation.
The results show that the GEEVE system gives better results than two recently published approaches (1) in learning the generating models of gene regulation and (2) in recommending experiments to perform.
Affiliation(s)
- Changwon Yoo
- 420 Social Science, University of Montana, Missoula, MT 59812, USA.
|
43
|
Tsoka S, Ouzounis CA. Metabolic database systems for the analysis of genome-wide function. Biotechnol Bioeng 2004; 84:750-5. [PMID: 14708115 DOI: 10.1002/bit.10881] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
Abstract
Genome sequencing projects provide an inventory of molecular components for a wide variety of organisms. Metabolic databases integrate these functional descriptions of individual modules into a higher-level characterization of cellular metabolism. This article reviews efforts related to the development of metabolic databases and discusses how such systems have aided the delineation of genome properties. We illustrate the design features of metabolic databases and discuss the challenges facing metabolic databases as well as databases of other functional types.
Affiliation(s)
- Sophia Tsoka
- Computational Genomics Group, The European Bioinformatics Institute, EMBL Cambridge Outstation, Cambridge CB10 1SD, UK.
|
44
|
Ueda HR, Hayashi S, Matsuyama S, Yomo T, Hashimoto S, Kay SA, Hogenesch JB, Iino M. Universality and flexibility in gene expression from bacteria to human. Proc Natl Acad Sci U S A 2004; 101:3765-9. [PMID: 14999098 PMCID: PMC374318 DOI: 10.1073/pnas.0306244101] [Citation(s) in RCA: 113] [Impact Index Per Article: 5.7] [Indexed: 11/18/2022] Open
Abstract
Highly parallel experimental biology is offering opportunities to not just accomplish work more easily, but to explore for underlying governing principles. Recent analysis of the large-scale organization of gene expression has revealed its complex and dynamic nature. However, the underlying dynamics that generate complex gene expression and cellular organization are not yet understood. To comprehensively and quantitatively elucidate these underlying gene expression dynamics, we have analyzed genome-wide gene expression in many experimental conditions in Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, Drosophila melanogaster, Mus musculus, and Homo sapiens. Here we demonstrate that the gene expression dynamics follows the same and surprisingly simple principle from E. coli to human, where gene expression changes are proportional to their expression levels, and show that this "proportional" dynamics or "rich-travel-more" mechanism can regenerate the observed complex and dynamic organization of the transcriptome. These findings provide a universal principle in the regulation of gene expression, show how complex and dynamic organization can emerge from simple underlying dynamics, and demonstrate the flexibility of transcription across a wide range of expression levels.
Affiliation(s)
- Hiroki R Ueda
- Laboratory for Systems Biology, Center for Developmental Biology, RIKEN, 2-2-3 Minatojima-minamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan.
|
45
|
|
46
|
McShan DC, Updadhayaya M, Shah I. Symbolic inference of xenobiotic metabolism. Pac Symp Biocomput 2004:545-56. [PMID: 14992532 PMCID: PMC2709528 DOI: 10.1142/9789812704856_0051] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022]
Abstract
We present a new symbolic computational approach to elucidate the biochemical networks of living systems de novo and we apply it to an important biomedical problem: xenobiotic metabolism. A crucial issue in analyzing and modeling a living organism is understanding its biochemical network beyond what is already known. Our objective is to use the available metabolic information in a representational framework that enables the inference of novel biochemical knowledge and whose results can be validated experimentally. We describe a symbolic computational approach consisting of two parts. First, biotransformation rules are inferred from the molecular graphs of compounds in enzyme-catalyzed reactions. Second, these rules are recursively applied to different compounds to generate novel metabolic networks, containing new biotransformations and new metabolites. Using data for 456 generic reactions and 825 generic compounds from KEGG we were able to extract 110 biotransformation rules, which generalize a subset of known biocatalytic functions. We tested our approach by applying these rules to ethanol, a common substance of abuse and to furfuryl alcohol, a xenobiotic organic solvent, which is absent in metabolic databases. In both cases our predictions on the fate of ethanol and furfuryl alcohol are consistent with the literature on the metabolism of these compounds.
Affiliation(s)
- D C McShan
- School of Medicine, University of Colorado, 4200 East 9th Avenue, B-119, Denver, CO 80262, USA.
|
47
|
Wiback SJ, Mahadevan R, Palsson BØ. Reconstructing metabolic flux vectors from extreme pathways: defining the alpha-spectrum. J Theor Biol 2003; 224:313-24. [PMID: 12941590 DOI: 10.1016/s0022-5193(03)00168-1] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.8] [Indexed: 11/17/2022]
Abstract
The move towards genome-scale analysis of cellular functions has necessitated the development of analytical (in silico) methods to understand such large and complex biochemical reaction networks. One such method is extreme pathway analysis that uses stoichiometry and thermodynamic irreversibility to define mathematically unique, systemic metabolic pathways. These extreme pathways form the edges of a high-dimensional convex cone in the flux space that contains all the attainable steady state solutions, or flux distributions, for the metabolic network. By definition, any steady state flux distribution can be described as a nonnegative linear combination of the extreme pathways. To date, much effort has been focused on calculating, defining, and understanding these extreme pathways. However, little work has been performed to determine how these extreme pathways contribute to a given steady state flux distribution. This study represents an initial effort aimed at defining how physiological steady state solutions can be reconstructed from a network's extreme pathways. In general, there is not a unique set of nonnegative weightings on the extreme pathways that produce a given steady state flux distribution but rather a range of possible values. This range can be determined using linear optimization to maximize and minimize the weightings of a particular extreme pathway in the reconstruction, resulting in what we have termed the alpha-spectrum. The alpha-spectrum defines which extreme pathways can and cannot be included in the reconstruction of a given steady state flux distribution and to what extent they individually contribute to the reconstruction. It is shown that accounting for transcriptional regulatory constraints can considerably shrink the alpha-spectrum.
The alpha-spectrum is computed and interpreted for two cases; first, optimal states of a skeleton representation of core metabolism that include transcriptional regulation, and second for human red blood cell metabolism under various physiological, non-optimal conditions.
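The construction described in this abstract can be sketched in symbols. The notation is an assumption introduced here for illustration: v is the given steady state flux distribution, p_i are the extreme pathways (the columns of a matrix P), and alpha_i are the nonnegative weightings. Any normalization or upper bound on the weightings used in the paper itself is not stated in the abstract and is omitted.

```latex
\begin{aligned}
v &= \sum_{i} \alpha_i\, p_i = P\,\alpha, \qquad \alpha_i \ge 0, \\[4pt]
\alpha_j^{\max} &= \max_{\alpha}\ \alpha_j
  \quad \text{subject to} \quad P\,\alpha = v,\ \alpha \ge 0, \\
\alpha_j^{\min} &= \min_{\alpha}\ \alpha_j
  \quad \text{subject to} \quad P\,\alpha = v,\ \alpha \ge 0 .
\end{aligned}
```

Under this reading, the alpha-spectrum is the collection of intervals from alpha_j^min to alpha_j^max over all pathways j, obtained by solving two linear programs per pathway. A pathway with alpha_j^max = 0 cannot take part in reconstructing v, and one with alpha_j^min > 0 must.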
Affiliation(s)
- Sharon J Wiback
- Department of Bioengineering, University of California, 9500 Gilman Drive EBU 1 Room 6607, San Diego, La Jolla, CA 92093, USA
|
48
|
Berka RM, Cui X, Yanofsky C. Genomewide transcriptional changes associated with genetic alterations and nutritional supplementation affecting tryptophan metabolism in Bacillus subtilis. Proc Natl Acad Sci U S A 2003; 100:5682-7. [PMID: 12719520 PMCID: PMC156261 DOI: 10.1073/pnas.1031606100] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022] Open
Abstract
DNA microarrays comprising approximately 95% of the Bacillus subtilis annotated protein coding ORFs were deployed to generate a series of snapshots of genomewide transcriptional changes that occur when cells are grown under various conditions that are expected to increase or decrease transcription of the trp operon segment of the aromatic supraoperon. Comparisons of global expression patterns were made between cells grown in the presence of indole acrylic acid, a specific inhibitor of tRNA(Trp) charging; cells deficient in expression of the mtrB gene, which encodes the tryptophan-activated negative regulatory protein, TRAP; WT cells grown in the presence or absence of two or three of the aromatic amino acids; and cells harboring a tryptophanyl tRNA synthetase mutation conferring temperature-sensitive tryptophan-dependent growth. Our findings validate expected responses of the tryptophan biosynthetic genes and presumed regulatory interrelationships between genes in the different aromatic amino acid pathways and the histidine biosynthetic pathway. Using a combination of supervised and unsupervised statistical methods we identified approximately 100 genes whose expression profiles were closely correlated with those of the genes in the trp operon. This finding suggests that expression of these genes is influenced directly or indirectly by regulatory events that affect or are a consequence of altered tryptophan metabolism.
|
49
|
Abstract
Metabolic pathways are a central paradigm in biology. Historically, they have been defined on the basis of their step-by-step discovery. However, the genome-scale metabolic networks now being reconstructed from annotation of genome sequences demand new network-based definitions of pathways to facilitate analysis of their capabilities and functions, such as metabolic versatility and robustness, and optimal growth rates. This demand has led to the development of a new mathematically based analysis of complex, metabolic networks that enumerates all their unique pathways that take into account all requirements for cofactors and byproducts. Applications include the design of engineered biological systems, the generation of testable hypotheses regarding network structure and function, and the elucidation of properties that can not be described by simple descriptions of individual components (such as product yield, network robustness, correlated reactions and predictions of minimal media). Recently, these properties have also been studied in genome-scale networks. Thus, network-based pathways are emerging as an important paradigm for analysis of biological systems.
Affiliation(s)
- Jason A Papin
- Department of Bioengineering, University of California, San Diego, La Jolla 92093-0412, USA
|
50
|
Palsson BO, Price ND, Papin JA. Development of network-based pathway definitions: the need to analyze real metabolic networks. Trends Biotechnol 2003; 21:195-8. [PMID: 12727379 DOI: 10.1016/s0167-7799(03)00080-5] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.0] [Indexed: 10/27/2022]
|