1
Pretorius E, Kell DB. A Perspective on How Fibrinaloid Microclots and Platelet Pathology May be Applied in Clinical Investigations. Semin Thromb Hemost 2024; 50:537-551. [PMID: 37748515 PMCID: PMC11105946 DOI: 10.1055/s-0043-1774796] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/27/2023]
Abstract
Microscopy imaging has enabled us to establish the presence of fibrin(ogen) amyloid (fibrinaloid) microclots in a range of chronic, inflammatory diseases. Microclots may also be induced by a variety of purified substances, often at very low concentrations. These molecules include bacterial inflammagens, serum amyloid A, and the S1 spike protein of severe acute respiratory syndrome coronavirus 2. Here, we explore which of the properties of these microclots might be used to contribute to differential clinical diagnoses and prognoses of the various diseases with which they may be associated. Such properties include distributions in their size and number before and after the addition of exogenous thrombin, their spectral properties, the diameter of the fibers of which they are made, their resistance to proteolysis by various proteases, their cross-seeding ability, and the concentration dependence of their ability to bind small molecules including fluorogenic amyloid stains. Measuring these microclot parameters, together with microscopy imaging itself, along with methodologies like proteomics and imaging flow cytometry, as well as more conventional assays such as those for cytokines, might open up the possibility of a much finer use of these microclot properties in generative methods for a future where personalized medicine will be standard procedures in all clotting pathology disease diagnoses.
Affiliation(s)
- Etheresia Pretorius
- Department of Physiological Sciences, Faculty of Science, Stellenbosch University, Stellenbosch, Matieland, South Africa
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Liverpool, United Kingdom
- Douglas B. Kell
- Department of Physiological Sciences, Faculty of Science, Stellenbosch University, Stellenbosch, Matieland, South Africa
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Liverpool, United Kingdom
- The Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Lyngby, Denmark
2
Singh DP, Kaushik B. A systematic literature review for the prediction of anticancer drug response using various machine-learning and deep-learning techniques. Chem Biol Drug Des 2023; 101:175-194. [PMID: 36303299 DOI: 10.1111/cbdd.14164] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/02/2022] [Revised: 10/13/2022] [Accepted: 10/24/2022] [Indexed: 12/24/2022]
Abstract
Computational methods have gained prominence in healthcare research. The accessibility of healthcare data has greatly encouraged academicians and researchers to develop implementations that help in the prognosis of cancer drug response. Among various computational methods, machine-learning (ML) and deep-learning (DL) methods provide the most consistent and effectual approaches to handle the serious aftermaths of the deadly disease and the drugs administered to the patients. Hence, this systematic literature review has reviewed studies that have investigated drug discovery and prognosis of anticancer drug response using ML and DL algorithms. For this purpose, PRISMA guidelines have been followed to choose research papers from the Google Scholar, PubMed, and ScienceDirect websites. A total of 105 papers that align with the context of this review were chosen. Further, the review also presents the accuracy of the existing ML and DL methods in the prediction of anticancer drug response. It has been found from the review that, despite the availability of various studies, there are certain challenges associated with each method. Thus, future researchers can consider these limitations and challenges to develop a prominent anticancer drug response prediction method, and it would be greatly beneficial to medical professionals in administering non-invasive treatment to patients.
Affiliation(s)
- Davinder Paul Singh
- School of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra, Jammu and Kashmir, India
- Baijnath Kaushik
- School of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra, Jammu and Kashmir, India
3
Kell DB. A protet-based, protonic charge transfer model of energy coupling in oxidative and photosynthetic phosphorylation. Adv Microb Physiol 2021; 78:1-177. [PMID: 34147184 DOI: 10.1016/bs.ampbs.2021.01.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/30/2022]
Abstract
Textbooks of biochemistry will explain that the otherwise endergonic reactions of ATP synthesis can be driven by the exergonic reactions of respiratory electron transport, and that these two half-reactions are catalyzed by protein complexes embedded in the same, closed membrane. These views are correct. The textbooks also state that, according to the chemiosmotic coupling hypothesis, a (or the) kinetically and thermodynamically competent intermediate linking the two half-reactions is the electrochemical difference of protons that is in equilibrium with that between the two bulk phases that the coupling membrane serves to separate. This gradient consists of a membrane potential term Δψ and a pH gradient term ΔpH, and is known colloquially as the protonmotive force or pmf. Artificial imposition of a pmf can drive phosphorylation, but only if the pmf exceeds some 150-170 mV; to achieve in vivo rates the imposed pmf must reach 200 mV. The key question then is 'does the pmf generated by electron transport exceed 200 mV, or even 170 mV?' The possibly surprising answer, from a great many kinds of experiment and sources of evidence, including direct measurements with microelectrodes, is that it does not. Observable pH changes driven by electron transport are real, and they control various processes; however, compensating ion movements restrict the Δψ component to low values. A protet-based model, that I outline here, can account for all the necessary observations, including all of those inconsistent with chemiosmotic coupling, and provides for a variety of testable hypotheses by which it might be refined.
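The two-term decomposition of the pmf mentioned in this abstract can be illustrated numerically. The sketch below is not taken from the cited paper. It assumes that Δψ and ΔpH are supplied as magnitudes acting in the same direction, and it uses the standard conversion factor 2.303RT/F (about 59 mV per pH unit at 25 °C). The function name and the input values are hypothetical.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def pmf_mV(delta_psi_mV, delta_pH, temp_K=298.15):
    """Return the protonmotive force in mV.

    Assumption: both inputs are magnitudes that act in the same
    direction, so the two terms simply add.
    """
    # 2.303 * R * T / F, converted from volts to millivolts
    z_mV = 1000.0 * math.log(10) * R * temp_K / F
    return delta_psi_mV + z_mV * delta_pH


# Hypothetical inputs: 120 mV membrane potential, 0.5 pH unit gradient
total = pmf_mV(120.0, 0.5)
print(f"pmf = {total:.1f} mV; exceeds 170 mV: {total > 170.0}")
```

With these illustrative inputs the total is about 149.6 mV, which falls below the 150-170 mV threshold quoted in the abstract.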
Affiliation(s)
- Douglas B Kell
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom; The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark.
4
Triba MN, Le Moyec L, Amathieu R, Goossens C, Bouchemal N, Nahon P, Rutledge DN, Savarin P. PLS/OPLS models in metabolomics: the impact of permutation of dataset rows on the K-fold cross-validation quality parameters. MOLECULAR BIOSYSTEMS 2014; 11:13-9. [PMID: 25382277 DOI: 10.1039/c4mb00414k] [Citation(s) in RCA: 370] [Impact Index Per Article: 37.0] [Indexed: 02/04/2023]
Abstract
Among all the software packages available for discriminant analyses based on projection to latent structures (PLS-DA) or orthogonal projection to latent structures (OPLS-DA), SIMCA (Umetrics, Umeå, Sweden) is the most widely used in the metabolomics field. SIMCA offers many parameters or tests to assess the quality of the computed model (the number of significant components, R2, Q2, pCV-ANOVA, and the permutation test). Significance thresholds for these parameters are strongly application-dependent. Concerning the Q2 parameter, a significance threshold of 0.5 is generally accepted. However, during the last few years, many PLS-DA/OPLS-DA models built using SIMCA have been published with Q2 values lower than 0.5. The purpose of this opinion note is to point out that, in some circumstances frequently encountered in metabolomics, the values of these parameters strongly depend on the individuals that constitute the validation subsets. As a result of the way in which the software selects members of the calibration and validation subsets, a simple permutation of dataset rows can, in several cases, lead to contradictory conclusions about the significance of the models when a K-fold cross-validation is used. We believe that, when Q2 values lower than 0.5 are obtained, SIMCA users should at least verify that the quality parameters are stable towards permutation of the rows in their dataset.
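The row-permutation check recommended in this abstract can be sketched with a toy model. The sketch below is an illustration only: it swaps in a one-variable least-squares line for PLS-DA/OPLS-DA, and it assumes that folds are assigned from row position (row i goes to fold i mod k). That assignment rule is a stand-in, not SIMCA's documented behaviour. All function names are hypothetical. Q2 is computed as 1 - PRESS/TSS.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def q2_kfold(rows, k=7):
    """Q2 = 1 - PRESS/TSS, with folds assigned by row position (assumption)."""
    ys = [y for _, y in rows]
    my = sum(ys) / len(ys)
    tss = sum((y - my) ** 2 for y in ys)
    press = 0.0
    for fold in range(k):
        train = [r for i, r in enumerate(rows) if i % k != fold]
        held_out = [r for i, r in enumerate(rows) if i % k == fold]
        slope, icpt = fit_line([x for x, _ in train], [y for _, y in train])
        press += sum((y - (slope * x + icpt)) ** 2 for x, y in held_out)
    return 1.0 - press / tss


def q2_spread(rows, k=7, n_perm=50, seed=0):
    """Smallest and largest Q2 seen over random permutations of the rows."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_perm):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        values.append(q2_kfold(shuffled, k))
    return min(values), max(values)


# Hypothetical weak-signal data: small slope buried in unit-variance noise
data_rng = random.Random(1)
noisy = [(float(i), 0.1 * i + data_rng.gauss(0.0, 1.0)) for i in range(20)]
low, high = q2_spread(noisy)
print(f"Q2 ranges from {low:.3f} to {high:.3f} across row permutations")
```

A wide gap between the two printed values would suggest that the reported Q2 depends on row order rather than on the model alone, which is the instability the authors warn about.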
Affiliation(s)
- Mohamed N Triba
- Université Paris 13, Sorbonne Paris Cité, Laboratoire Chimie, Structures, Propriétés de Biomatériaux et d'Agents Thérapeutiques (CSPBAT), Unité Mixte de Recherche (UMR) 7244, Centre National de Recherche Scientifique (CNRS), Equipe Spectroscopie des Biomolécules et des Milieux Biologiques (SBMB), 74 rue Marcel Cachin, 93037, Bobigny, France.
5
Abstract
Capturing the dynamism that pervades biological systems requires a computational approach that can accommodate both the continuous features of the system environment as well as the flexible and heterogeneous nature of component interactions. This presents a serious challenge for the more traditional mathematical approaches that assume component homogeneity to relate system observables using mathematical equations. While the homogeneity condition does not lead to loss of accuracy while simulating various continua, it fails to offer detailed solutions when applied to systems with dynamically interacting heterogeneous components. As the functionality and architecture of most biological systems is a product of multi-faceted individual interactions at the sub-system level, continuum models rarely offer much beyond qualitative similarity. Agent-based modelling is a class of algorithmic computational approaches that rely on interactions between Turing-complete finite-state machines--or agents--to simulate, from the bottom-up, macroscopic properties of a system. In recognizing the heterogeneity condition, they offer suitable ontologies to the system components being modelled, thereby succeeding where their continuum counterparts tend to struggle. Furthermore, being inherently hierarchical, they are quite amenable to coupling with other computational paradigms. The integration of any agent-based framework with continuum models is arguably the most elegant and precise way of representing biological systems. Although in its nascence, agent-based modelling has been utilized to model biological complexity across a broad range of biological scales (from cells to societies). In this article, we explore the reasons that make agent-based modelling the most precise approach to model biological systems that tend to be non-linear and complex.
6
Kell DB. Scientific discovery as a combinatorial optimisation problem: how best to navigate the landscape of possible experiments? Bioessays 2012; 34:236-44. [PMID: 22252984 PMCID: PMC3321226 DOI: 10.1002/bies.201100144] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Indexed: 11/30/2022]
Abstract
A considerable number of areas of bioscience, including gene and drug discovery, metabolic engineering for the biotechnological improvement of organisms, and the processes of natural and directed evolution, are best viewed in terms of a ‘landscape’ representing a large search space of possible solutions or experiments populated by a considerably smaller number of actual solutions that then emerge. This is what makes these problems ‘hard’, but as such these are to be seen as combinatorial optimisation problems that are best attacked by heuristic methods known from that field. Such landscapes, which may also represent or include multiple objectives, are effectively modelled in silico, with modern active learning algorithms such as those based on Darwinian evolution providing guidance, using existing knowledge, as to what is the ‘best’ experiment to do next. An awareness, and the application, of these methods can thereby enhance the scientific discovery process considerably. This analysis fits comfortably with an emerging epistemology that sees scientific reasoning, the search for solutions, and scientific discovery as Bayesian processes.
Affiliation(s)
- Douglas B Kell
- School of Chemistry and Manchester Interdisciplinary Biocentre, The University of Manchester, Manchester, Lancs, UK.
7
Patel Y, Heyward CA, White MRH, Kell DB. Predicting the points of interaction of small molecules in the NF-κB pathway. BMC SYSTEMS BIOLOGY 2011; 5:32. [PMID: 21342508 PMCID: PMC3050742 DOI: 10.1186/1752-0509-5-32] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2010] [Accepted: 02/22/2011] [Indexed: 12/03/2022]
Abstract
BACKGROUND The similarity property principle has been used extensively in drug discovery to identify small compounds that interact with specific drug targets. Here we show it can be applied to identify the interactions of small molecules within the NF-κB signalling pathway. RESULTS Clusters that contain compounds with a predominant interaction within the pathway were created, which were then used to predict the interaction of compounds not included in the clustering analysis. CONCLUSIONS The technique successfully predicted the points of interactions of compounds that are known to interact with the NF-κB pathway. The method was also shown to be successful when compounds for which the interaction points were unknown were included in the clustering analysis.
Affiliation(s)
- Yogendra Patel
- Manchester Interdisciplinary Biocentre, University of Manchester, Manchester, 131 Princess Street, M1 7DN, UK
- Catherine A Heyward
- Institute of Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK
- Michael RH White
- Institute of Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK
- Faculty of Life Sciences, Michael Smith Building, Oxford Road, University of Manchester, Manchester, M13 9PT, UK
- Douglas B Kell
- Manchester Interdisciplinary Biocentre, University of Manchester, Manchester, 131 Princess Street, M1 7DN, UK
8
Kell DB. Towards a unifying, systems biology understanding of large-scale cellular death and destruction caused by poorly liganded iron: Parkinson's, Huntington's, Alzheimer's, prions, bactericides, chemical toxicology and others as examples. Arch Toxicol 2010; 84:825-89. [PMID: 20967426 PMCID: PMC2988997 DOI: 10.1007/s00204-010-0577-x] [Citation(s) in RCA: 286] [Impact Index Per Article: 20.4] [Received: 06/14/2010] [Accepted: 07/14/2010] [Indexed: 12/11/2022]
Abstract
Exposure to a variety of toxins and/or infectious agents leads to disease, degeneration and death, often characterised by circumstances in which cells or tissues do not merely die and cease to function but may be more or less entirely obliterated. It is then legitimate to ask the question as to whether, despite the many kinds of agent involved, there may be at least some unifying mechanisms of such cell death and destruction. I summarise the evidence that in a great many cases, one underlying mechanism, providing major stresses of this type, entails continuing and autocatalytic production (based on positive feedback mechanisms) of hydroxyl radicals via Fenton chemistry involving poorly liganded iron, leading to cell death via apoptosis (probably including via pathways induced by changes in the NF-κB system). While every pathway is in some sense connected to every other one, I highlight the literature evidence suggesting that the degenerative effects of many diseases and toxicological insults converge on iron dysregulation. This highlights specifically the role of iron metabolism, and the detailed speciation of iron, in chemical and other toxicology, and has significant implications for the use of iron chelating substances (probably in partnership with appropriate anti-oxidants) as nutritional or therapeutic agents in inhibiting both the progression of these mainly degenerative diseases and the sequelae of both chronic and acute toxin exposure. The complexity of biochemical networks, especially those involving autocatalytic behaviour and positive feedbacks, means that multiple interventions (e.g. of iron chelators plus antioxidants) are likely to prove most effective. A variety of systems biology approaches, that I summarise, can predict both the mechanisms involved in these cell death pathways and the optimal sites of action for nutritional or pharmacological interventions.
Affiliation(s)
- Douglas B Kell
- School of Chemistry and the Manchester Interdisciplinary Biocentre, The University of Manchester, Manchester M1 7DN, UK.
9
Kaderbhai NN, Broadhurst DI, Ellis DI, Goodacre R, Kell DB. Functional genomics via metabolic footprinting: monitoring metabolite secretion by Escherichia coli tryptophan metabolism mutants using FT-IR and direct injection electrospray mass spectrometry. Comp Funct Genomics 2010; 4:376-91. [PMID: 18629082 PMCID: PMC2447367 DOI: 10.1002/cfg.302] [Citation(s) in RCA: 101] [Impact Index Per Article: 7.2] [Received: 01/15/2003] [Revised: 04/23/2003] [Accepted: 05/22/2003] [Indexed: 12/14/2022]
Abstract
We sought to test the hypothesis that mutant bacterial strains could be discriminated from each other on the basis of the metabolites they secrete into the medium (their ‘metabolic footprint’), using two methods of ‘global’ metabolite analysis (FT–IR and direct injection electrospray mass spectrometry). The biological system used was based on a published study of Escherichia coli tryptophan mutants that had been analysed and discriminated by Yanofsky and colleagues using transcriptome analysis. Wild-type strains supplemented with tryptophan or analogues could be discriminated from controls using FT–IR of 24 h broths, as could each of the mutant strains in both minimal and supplemented media. Direct injection electrospray mass spectrometry with unit mass resolution could also be used to discriminate the strains from each other, and had the advantage that the discrimination required the use of just two or three masses in each case. These were determined via a genetic algorithm. Both methods are rapid, reagentless, reproducible and cheap, and might beneficially be extended to the analysis of gene knockout libraries.
Affiliation(s)
- Naheed N Kaderbhai
- Institute of Biological Sciences, University of Wales, Aberystwyth, Wales Ceredigion SY23 3DD, UK
10
Shimokawa K, Okamura-Oho Y, Kurita T, Frith MC, Kawai J, Carninci P, Hayashizaki Y. Large-scale clustering of CAGE tag expression data. BMC Bioinformatics 2007; 8:161. [PMID: 17517134 PMCID: PMC1890301 DOI: 10.1186/1471-2105-8-161] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 08/30/2006] [Accepted: 05/21/2007] [Indexed: 01/23/2023]
Abstract
BACKGROUND Recent analyses have suggested that many genes possess multiple transcription start sites (TSSs) that are differentially utilized in different tissues and cell lines. We have identified a huge number of TSSs mapped onto the mouse genome using the cap analysis of gene expression (CAGE) method. The standard hierarchical clustering algorithm, which gives us easily understandable graphical tree images, has difficulties in processing such huge amounts of TSS data and a better method to calculate and display the results is needed. RESULTS We use a combination of hierarchical and non-hierarchical clustering to cluster expression profiles of TSSs based on a large amount of CAGE data to profit from the best of both methods. We processed the genome-wide expression data, including 159,075 TSSs derived from 127 RNA samples of various organs of mouse, and succeeded in categorizing them into 70-100 clusters. The clusters exhibited intriguing biological features: a cluster supergroup with a ubiquitous expression profile, tissue-specific patterns, a distinct distribution of non-coding RNA and functional TSS groups. CONCLUSION Our approach succeeded in greatly reducing the calculation cost, and is an appropriate solution for analyzing large-scale TSS usage data.
Affiliation(s)
- Kazuro Shimokawa
- Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Yuko Okamura-Oho
- Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Takio Kurita
- National Institute of Advanced Industrial Science and Technology, Tsukuba, Ibaraki 305-8568, Japan
- Martin C Frith
- Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Qld 4072, Australia
- Jun Kawai
- Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Piero Carninci
- Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Yoshihide Hayashizaki
- Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
11
German JB, Schanbacher FL, Lönnerdal B, Medrano JF, McGuire MA, McManaman JL, Rocke DM, Smith TP, Neville MC, Donnelly P, Lange M, Ward R. International milk genomics consortium. Trends Food Sci Technol 2006. [DOI: 10.1016/j.tifs.2006.07.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
12
Kell DB. Systems biology, metabolic modelling and metabolomics in drug discovery and development. Drug Discov Today 2006; 11:1085-92. [PMID: 17129827 DOI: 10.1016/j.drudis.2006.10.004] [Citation(s) in RCA: 219] [Impact Index Per Article: 12.2] [Received: 07/18/2006] [Revised: 09/25/2006] [Accepted: 10/09/2006] [Indexed: 01/03/2023]
Abstract
Unlike signalling pathways, metabolic networks are subject to strict stoichiometric constraints. Metabolomics amplifies changes in the proteome, and represents more closely the phenotype of an organism. Recent advances enable the production (and computer-readable encoding as SBML) of metabolic network models reconstructed from genome sequences, as well as experimental measurements of much of the metabolome. There is increasing convergence between the number of human metabolites estimated via genomics (approximately 3000) and the number measured experimentally. It is thus both timely, and now possible, to bring these two approaches together as an integrated (if distributed) whole to help understand the genesis of metabolic biomarkers, the progress of disease, and the modes of action, efficacy, off-target effects and toxicity of pharmaceutical drugs.
Affiliation(s)
- Douglas B Kell
- School of Chemistry, Faraday Building, The University of Manchester. PO Box 88, Manchester, M60 1QD, UK.
13
Spasić I, Dunn WB, Velarde G, Tseng A, Jenkins H, Hardy N, Oliver SG, Kell DB. MeMo: a hybrid SQL/XML approach to metabolomic data management for functional genomics. BMC Bioinformatics 2006; 7:281. [PMID: 16753052 PMCID: PMC1522028 DOI: 10.1186/1471-2105-7-281] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Received: 11/28/2005] [Accepted: 06/05/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND The genome sequencing projects have shown our limited knowledge regarding gene function, e.g. S. cerevisiae has 5-6,000 genes of which nearly 1,000 have an uncertain function. Their gross influence on the behaviour of the cell can be observed using large-scale metabolomic studies. The metabolomic data produced need to be structured and annotated in a machine-usable form to facilitate the exploration of the hidden links between the genes and their functions. DESCRIPTION MeMo is a formal model for representing metabolomic data and the associated metadata. Two predominant platforms (SQL and XML) are used to encode the model. MeMo has been implemented as a relational database using a hybrid approach combining the advantages of the two technologies. It represents a practical solution for handling the sheer volume and complexity of the metabolomic data effectively and efficiently. The MeMo model and the associated software are available at http://dbkgroup.org/memo/. CONCLUSION The maturity of relational database technology is used to support efficient data processing. The scalability and self-descriptiveness of XML are used to simplify the relational schema and facilitate the extensibility of the model necessitated by the creation of new experimental techniques. Special consideration is given to data integration issues as part of the systems biology agenda. MeMo has been physically integrated and cross-linked to related metabolomic and genomic databases. Semantic integration with other relevant databases has been supported through ontological annotation. Compatibility with other data formats is supported by automatic conversion.
Affiliation(s)
- Irena Spasić
- School of Chemistry, Faraday Building, The University of Manchester, Manchester, M60 1QD, UK
- Warwick B Dunn
- School of Chemistry, Faraday Building, The University of Manchester, Manchester, M60 1QD, UK
- Giles Velarde
- School of Chemistry, Faraday Building, The University of Manchester, Manchester, M60 1QD, UK
- Andy Tseng
- School of Chemistry, Faraday Building, The University of Manchester, Manchester, M60 1QD, UK
- Helen Jenkins
- Department of Computer Science, The University of Wales, Aberystwyth, SY23 3DB, UK
- Nigel Hardy
- Department of Computer Science, The University of Wales, Aberystwyth, SY23 3DB, UK
- Stephen G Oliver
- Faculty of Life Sciences, Michael Smith Building, The University of Manchester, Manchester, M13 9PT, UK
- Douglas B Kell
- School of Chemistry, Faraday Building, The University of Manchester, Manchester, M60 1QD, UK
14
Kell DB. Theodor Bücher Lecture. Metabolomics, modelling and machine learning in systems biology - towards an understanding of the languages of cells. Delivered on 3 July 2005 at the 30th FEBS Congress and the 9th IUBMB conference in Budapest. FEBS J 2006; 273:873-94. [PMID: 16478464 DOI: 10.1111/j.1742-4658.2006.05136.x] [Citation(s) in RCA: 130] [Impact Index Per Article: 7.2] [Indexed: 11/28/2022]
Abstract
The newly emerging field of systems biology involves a judicious interplay between high-throughput 'wet' experimentation, computational modelling and technology development, coupled to the world of ideas and theory. This interplay involves iterative cycles, such that systems biology is not at all confined to hypothesis-dependent studies, with intelligent, principled, hypothesis-generating studies being of high importance and consequently very far from aimless fishing expeditions. I seek to illustrate each of these facets. Novel technology development in metabolomics can increase substantially the dynamic range and number of metabolites that one can detect, and these can be exploited as disease markers and in the consequent and principled generation of hypotheses that are consistent with the data and achieve this in a value-free manner. Much of classical biochemistry and signalling pathway analysis has concentrated on the analyses of changes in the concentrations of intermediates, with 'local' equations - such as that of Michaelis and Menten, $v = V_{\max} S / (S + K_m)$ - that describe individual steps being based solely on the instantaneous values of these concentrations. Recent work using single cells (that are not subject to the intellectually unsupportable averaging of the variable displayed by heterogeneous cells possessing nonlinear kinetics) has led to the recognition that some protein signalling pathways may encode their signals not (just) as concentrations (AM or amplitude-modulated in a radio analogy) but via changes in the dynamics of those concentrations (the signals are FM or frequency-modulated). This contributes in principle to a straightforward solution of the crosstalk problem, leads to a profound reassessment of how to understand the downstream effects of dynamic changes in the concentrations of elements in these pathways, and stresses the role of signal processing (and not merely the intermediates) in biological signalling. It is this signal processing that lies at the heart of understanding the languages of cells. The resolution of many of the modern and postgenomic problems of biochemistry requires the development of a myriad of new technologies (and maybe a new culture), and thus regular input from the physical sciences, engineering, mathematics and computer science. One solution, that we are adopting in the Manchester Interdisciplinary Biocentre (http://www.mib.ac.uk/) and the Manchester Centre for Integrative Systems Biology (http://www.mcisb.org/), is thus to colocate individuals with the necessary combinations of skills. Novel disciplines that require such an integrative approach continue to emerge. These include fields such as chemical genomics, synthetic biology, distributed computational environments for biological data and modelling, single cell diagnostics/bionanotechnology, and computational linguistics/text mining.
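The Michaelis-Menten rate law quoted in this abstract is a 'local' equation in the sense described: the rate depends only on the instantaneous substrate concentration. A minimal sketch follows. The function name, argument names and numerical values are hypothetical and are not drawn from the lecture.

```python
def michaelis_menten(s, vmax, km):
    """Reaction rate v = vmax * s / (s + km) at substrate concentration s."""
    if s < 0 or vmax < 0 or km <= 0:
        raise ValueError("expected s >= 0, vmax >= 0 and km > 0")
    return vmax * s / (s + km)


# Hypothetical values: vmax = 10 (rate units), km = 2 (concentration units)
for s in (0.0, 2.0, 20.0):
    print(s, michaelis_menten(s, vmax=10.0, km=2.0))
```

When the substrate concentration equals km the rate is half of vmax, and the rate approaches vmax as the substrate concentration grows large.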
Affiliation(s)
- Douglas B Kell
- School of Chemistry, Faraday Building, The University of Manchester, UK.
|
15
|
Abstract
MOTIVATION Arabidopsis thaliana has the best-understood plant genome, yet approximately one-third of its genes still have no functional annotation at all from either MIPS or TAIR. We have applied our Data Mining Prediction (DMP) method to the problem of predicting the functional classes of these protein sequences. This method uses a hybrid machine-learning/data-mining approach to identify patterns in the bioinformatic data about sequences that are predictive of function. We use data about sequence, predicted secondary structure, predicted structural domain, InterPro patterns, sequence similarity profile and expression data. RESULTS We predicted the functional class of a high percentage of the Arabidopsis genes with currently unknown function. These predictions are interpretable and have good test accuracies. We describe in detail seven of the rules produced.
Affiliation(s)
- A Clare
- Department of Computer Science, University of Wales Aberystwyth SY23 3DB, UK.
|
16
|
Yang J, Xu G, Zheng Y, Kong H, Wang C, Zhao X, Pang T. Strategy for metabonomics research based on high-performance liquid chromatography and liquid chromatography coupled with tandem mass spectrometry. J Chromatogr A 2005; 1084:214-21. [PMID: 16114257 DOI: 10.1016/j.chroma.2004.10.100] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.0] [Indexed: 10/26/2022]
Abstract
Metabonomics, the study of metabolites and their roles in various disease states, is a novel methodology arising from the post-genomics era. This methodology has been applied in many fields. Current metabonomic practice has relied on mass spectrometry (MS), gas chromatography-mass spectrometry (GC-MS), and nuclear magnetic resonance (NMR) to analyze metabolites. In this study, a strategy was developed for applying high-performance liquid chromatography (HPLC) and LC-MS-MS to metabonomics research. One of the key problems to be solved in this strategy is matching the peaks between chromatograms. A peak alignment algorithm has been developed to match the chromatograms before pattern recognition. As an application example, the strategy described above was applied to metabonomics research on liver diseases, and the false-positive rate of liver cancer diagnosis among hepatocirrhosis and hepatitis cases was effectively reduced to 7.40%. Based on the pattern recognition, several potential biomarkers were found and further identified by subsequent LC-MS-MS experiments. The structures of eight potential biomarkers were given for distinguishing liver cancer from hepatocirrhosis and hepatitis.
Affiliation(s)
- Jun Yang
- National Chromatographic R&A Center, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, 116011 Dalian, China
|
17
|
Kell DB, Brown M, Davey HM, Dunn WB, Spasic I, Oliver SG. Metabolic footprinting and systems biology: the medium is the message. Nat Rev Microbiol 2005; 3:557-65. [PMID: 15953932 DOI: 10.1038/nrmicro1177] [Citation(s) in RCA: 261] [Impact Index Per Article: 13.7] [Indexed: 11/09/2022]
Abstract
One element of classical systems analysis treats a system as a black or grey box, the inner structure and behaviour of which can be analysed and modelled by varying an internal or external condition, probing it from outside and studying the effect of the variation on the external observables. The result is an understanding of the inner make-up and workings of the system. The equivalent of this in biology is to observe what a cell or system excretes under controlled conditions - the 'metabolic footprint' or exometabolome - as this is readily and accurately measurable. Here, we review the principles, experimental approaches and scientific outcomes that have been obtained with this useful and convenient strategy.
Affiliation(s)
- Douglas B Kell
- School of Chemistry, University of Manchester, Faraday Building, PO Box 88, Sackville Street, Manchester M60 1QD, UK.
|
18
|
Kell DB. Metabolomics, machine learning and modelling: towards an understanding of the language of cells. Biochem Soc Trans 2005; 33:520-4. [PMID: 15916555 DOI: 10.1042/bst0330520] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.1] [Indexed: 11/17/2022]
Abstract
In answering the question ‘Systems Biology – will it work?’ (which it self-evidently has already), it is appropriate to highlight advances in philosophy, in new technique development and in novel findings. In terms of philosophy, we see that systems biology involves an iterative interplay between linked activities – for instance, between theory and experiment, between induction and deduction and between measurements of parameters and variables – with more emphasis than has perhaps been common now being focused on the first in each of these pairs. In technique development, we highlight closed loop machine learning and its use in the optimization of scientific instrumentation, and the ability to effect high-quality and quasi-continuous optical images of cells. This leads to many important and novel findings. In the first case, these may involve new biomarkers for disease, whereas in the second case, we have determined that many biological signals may be frequency- rather than amplitude-encoded. This leads to a very different view of how signalling ‘works’ (equations such as that of Michaelis and Menten, which use only amplitudes, i.e. concentrations, are inadequate descriptors), lays emphasis on the signal processing network elements that lie ‘downstream’ of what are traditionally considered the signals, and allows one simply to understand how cross-talk may be avoided between pathways which nevertheless use common signalling elements. The language of cells is much richer than we had supposed, and we are now well placed to decode it.
Affiliation(s)
- D B Kell
- School of Chemistry, The University of Manchester, Faraday Building, Sackville Street, P.O. Box 88, Manchester M60 1QD, UK.
|
19
|
Goodacre R, Vaidyanathan S, Dunn WB, Harrigan GG, Kell DB. Metabolomics by numbers: acquiring and understanding global metabolite data. Trends Biotechnol 2004; 22:245-52. [PMID: 15109811 DOI: 10.1016/j.tibtech.2004.03.007] [Citation(s) in RCA: 781] [Impact Index Per Article: 41.1] [Indexed: 02/07/2023]
Affiliation(s)
- Royston Goodacre
- Department of Chemistry, UMIST, P.O. Box 88, Sackville Street, Manchester M60 1QD, UK.
|
20
|
Kikugawa S, Takehara H, Kuhara S, Kimura M. A Novel Model for Prediction of RNA binding Proteins. CHEM-BIO INFORMATICS JOURNAL 2005. [DOI: 10.1273/cbij.5.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Affiliation(s)
- Shingo Kikugawa
- Laboratory of Biochemistry, Department of Bioscience and Biotechnology, Faculty of Agriculture, Graduate School, Kyushu University
- Hideki Takehara
- Laboratory of Molecular Gene Technics, Faculty of Agriculture, Graduate School, Kyushu University
- Satoru Kuhara
- Laboratory of Molecular Gene Technics, Faculty of Agriculture, Graduate School, Kyushu University
- Makoto Kimura
- Laboratory of Biochemistry, Department of Bioscience and Biotechnology, Faculty of Agriculture, Graduate School, Kyushu University
|
21
|
German J, Watkins S. Metabolic assessment – a key to nutritional strategies for health. Trends Food Sci Technol 2004. [DOI: 10.1016/j.tifs.2004.01.009] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/15/2022]
|
22
|
Deng M, Zhang K, Mehta S, Chen T, Sun F. Prediction of protein function using protein-protein interaction data. J Comput Biol 2004; 10:947-60. [PMID: 14980019 DOI: 10.1089/106652703322756168] [Citation(s) in RCA: 238] [Impact Index Per Article: 11.9] [Indexed: 11/12/2022] Open
Abstract
Assigning functions to novel proteins is one of the most important problems in the postgenomic era. Several approaches have been applied to this problem, including the analysis of gene expression patterns, phylogenetic profiles, protein fusions, and protein-protein interactions. In this paper, we develop a novel approach that employs the theory of Markov random fields to infer a protein's functions using protein-protein interaction data and the functional annotations of protein's interaction partners. For each function of interest and protein, we predict the probability that the protein has such function using Bayesian approaches. Unlike other available approaches for protein annotation in which a protein has or does not have a function of interest, we give a probability for having the function. This probability indicates how confident we are about the prediction. We employ our method to predict protein functions based on "biochemical function," "subcellular location," and "cellular role" for yeast proteins defined in the Yeast Proteome Database (YPD, www.incyte.com), using the protein-protein interaction data from the Munich Information Center for Protein Sequences (MIPS, mips.gsf.de). We show that our approach outperforms other available methods for function prediction based on protein interaction data. The supplementary data is available at www-hto.usc.edu/~msms/ProteinFunction.
Affiliation(s)
- Minghua Deng
- Program in Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089-1113, USA
|
23
|
Deng M, Chen T, Sun F. An integrated probabilistic model for functional prediction of proteins. J Comput Biol 2004; 11:463-75. [PMID: 15285902 DOI: 10.1089/1066527041410346] [Citation(s) in RCA: 111] [Impact Index Per Article: 5.6] [Indexed: 11/12/2022] Open
Abstract
We develop an integrated probabilistic model to combine protein physical interactions, genetic interactions, highly correlated gene expression networks, protein complex data, and domain structures of individual proteins to predict protein functions. The model is an extension of our previous model for protein function prediction based on Markovian random field theory. The model is flexible in that other protein pairwise relationship information and features of individual proteins can be easily incorporated. Two features distinguish the integrated approach from other available methods for protein function prediction. One is that the integrated approach uses all available sources of information with different weights for different sources of data. It is a global approach that takes the whole network into consideration. The second feature is that the posterior probability that a protein has the function of interest is assigned. The posterior probability indicates how confident we are about assigning the function to the protein. We apply our integrated approach to predict functions of yeast proteins based upon MIPS protein function classifications and upon the interaction networks based on MIPS physical and genetic interactions, gene expression profiles, tandem affinity purification (TAP) protein complex data, and protein domain information. We study the recall and precision of the integrated approach using different sources of information by the leave-one-out approach. In contrast to using MIPS physical interactions only, the integrated approach combining all of the information increases the recall from 57% to 87% when the precision is set at 57%, an increase of 30 percentage points.
Affiliation(s)
- Minghua Deng
- Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, 1042 West 36th Place, Los Angeles, CA 90089-1113, USA
|
24
|
Kell DB, Oliver SG. Here is the evidence, now what is the hypothesis? The complementary roles of inductive and hypothesis-driven science in the post-genomic era. Bioessays 2004; 26:99-105. [PMID: 14696046 DOI: 10.1002/bies.10385] [Citation(s) in RCA: 279] [Impact Index Per Article: 14.0] [Indexed: 11/10/2022]
Abstract
It is considered in some quarters that hypothesis-driven methods are the only valuable, reliable or significant means of scientific advance. Data-driven or 'inductive' advances in scientific knowledge are then seen as marginal, irrelevant, insecure or wrong-headed, while the development of technology--which is not of itself 'hypothesis-led' (beyond the recognition that such tools might be of value)--must be seen as equally irrelevant to the hypothetico-deductive scientific agenda. We argue here that data- and technology-driven programmes are not alternatives to hypothesis-led studies in scientific knowledge discovery but are complementary and iterative partners with them. Many fields are data-rich but hypothesis-poor. Here, computational methods of data analysis, which may be automated, provide the means of generating novel hypotheses, especially in the post-genomic era.
|
25
|
Castrillo JI, Oliver SG. Yeast as a Touchstone in Post-genomic Research: Strategies for Integrative Analysis in Functional Genomics. BMB Rep 2004; 37:93-106. [PMID: 14761307 DOI: 10.5483/bmbrep.2004.37.1.093] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.3] [Indexed: 01/06/2023] Open
Abstract
The new complexity arising from the genome sequencing projects requires new comprehensive post-genomic strategies: advanced studies in regulatory mechanisms, application of new high-throughput technologies at a genome-wide scale, at the different levels of cellular complexity (genome, transcriptome, proteome and metabolome), efficient analysis of the results, and application of new bioinformatic methods in an integrative or systems biology perspective. This can be accomplished in studies with model organisms under controlled conditions. This review presents a perspective on the favourable characteristics of yeast as a touchstone model in post-genomic research. The state of the art, latest advances in the field and bottlenecks, new strategies, new regulatory mechanisms, applications (patents) and high-throughput technologies, most of them being developed and validated in yeast, are presented. The optimal characteristics of yeast as a well-defined system for comprehensive studies under controlled conditions make it a perfect model to be used in integrative, "systems biology" studies to get new insights into the mechanisms of regulation (regulatory networks) responsible for specific phenotypes under particular environmental conditions, to be applied to more complex organisms (e.g. plants, human).
Affiliation(s)
- Juan I Castrillo
- School of Biological Sciences, University of Manchester, 2.205 Stopford Building, Oxford Road, Manchester M13 9PT, UK.
|
26
|
Kell DB. Metabolomics and machine learning: explanatory analysis of complex metabolome data using genetic programming to produce simple, robust rules. Mol Biol Rep 2003; 29:237-41. [PMID: 12241064 DOI: 10.1023/a:1020342216314] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.2] [Indexed: 12/29/2022]
Affiliation(s)
- Douglas B Kell
- Institute of Biological Sciences, University of Wales, Aberystwyth, UK.
|
27
|
|
28
|
Allen J, Davey HM, Broadhurst D, Heald JK, Rowland JJ, Oliver SG, Kell DB. High-throughput classification of yeast mutants for functional genomics using metabolic footprinting. Nat Biotechnol 2003; 21:692-6. [PMID: 12740584 DOI: 10.1038/nbt823] [Citation(s) in RCA: 361] [Impact Index Per Article: 17.2] [Received: 11/06/2002] [Accepted: 02/28/2003] [Indexed: 11/08/2022]
Abstract
Many technologies have been developed to help explain the function of genes discovered by systematic genome sequencing. At present, transcriptome and proteome studies dominate large-scale functional analysis strategies. Yet the metabolome, because it is 'downstream', should show greater effects of genetic or physiological changes and thus should be much closer to the phenotype of the organism. We earlier presented a functional analysis strategy that used metabolic fingerprinting to reveal the phenotype of silent mutations of yeast genes. However, this is difficult to scale up for high-throughput screening. Here we present an alternative that has the required throughput (2 min per sample). This 'metabolic footprinting' approach recognizes the significance of 'overflow metabolism' in appropriate media. Measuring intracellular metabolites is time-consuming and subject to technical difficulties caused by the rapid turnover of intracellular metabolites and the need to quench metabolism and separate metabolites from the extracellular space. We therefore focused instead on direct, noninvasive, mass spectrometric monitoring of extracellular metabolites in spent culture medium. Metabolic footprinting can distinguish between different physiological states of wild-type yeast and between yeast single-gene deletion mutants even from related areas of metabolism. By using appropriate clustering and machine learning techniques, the latter based on genetic programming, we show that metabolic footprinting is an effective method to classify 'unknown' mutants by genetic defect.
Affiliation(s)
- Jess Allen
- Institute of Biological Sciences, Cledwyn Building, University of Wales, Aberystwyth, Aberystwyth SY23 3DD, UK
|
29
|
Iizuka N, Oka M, Yamada-Okabe H, Nishida M, Maeda Y, Mori N, Takao T, Tamesa T, Tangoku A, Tabuchi H, Hamada K, Nakayama H, Ishitsuka H, Miyamoto T, Hirabayashi A, Uchimura S, Hamamoto Y. Oligonucleotide microarray for prediction of early intrahepatic recurrence of hepatocellular carcinoma after curative resection. Lancet 2003; 361:923-9. [PMID: 12648972 DOI: 10.1016/s0140-6736(03)12775-4] [Citation(s) in RCA: 406] [Impact Index Per Article: 19.3] [Indexed: 01/08/2023]
Abstract
BACKGROUND Hepatocellular carcinoma has a poor prognosis because of the high intrahepatic recurrence rate. There are technological limitations to traditional methods such as TNM staging for accurate prediction of recurrence, suggesting that new techniques are needed. METHODS We investigated mRNA expression profiles in tissue specimens from a training set, comprising 33 patients with hepatocellular carcinoma, with high-density oligonucleotide microarrays representing about 6000 genes. We used this training set in a supervised learning manner to construct a predictive system, consisting of 12 genes, with the Fisher linear classifier. We then compared the predictive performance of our system with that of a predictive system with a support vector machine (SVM-based system) on a blinded set of samples from 27 newly enrolled patients. FINDINGS Early intrahepatic recurrence within 1 year after curative surgery occurred in 12 (36%) and eight (30%) patients in the training and blinded sets, respectively. Our system correctly predicted early intrahepatic recurrence or non-recurrence in 25 (93%) of 27 samples in the blinded set and had a positive predictive value of 88% and a negative predictive value of 95%. By contrast, the SVM-based system predicted early intrahepatic recurrence or non-recurrence correctly in only 16 (60%) individuals in the blinded set, and the result yielded a positive predictive value of only 38% and a negative predictive value of 79%. INTERPRETATION Our system predicted early intrahepatic recurrence or non-recurrence for patients with hepatocellular carcinoma much more accurately than the SVM-based system, suggesting that our system could serve as a new method for characterising the metastatic potential of hepatocellular carcinoma.
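The accuracy and predictive values quoted in this abstract follow from a 2x2 confusion table. The sketch below shows the arithmetic. The abstract does not report the underlying counts, so the counts used here are an assumption: they are one set of integers consistent with the reported figures (27 blinded samples, eight recurrences, 25 and 16 correct calls, and the stated predictive values). The class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary test; 'positive' means predicted recurrence."""
    tp: int  # predicted recurrence, recurrence occurred
    fp: int  # predicted recurrence, no recurrence
    fn: int  # predicted no recurrence, recurrence occurred
    tn: int  # predicted no recurrence, no recurrence

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    def ppv(self) -> float:
        """Positive predictive value: TP / (TP + FP)."""
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        """Negative predictive value: TN / (TN + FN)."""
        return self.tn / (self.tn + self.fn)


# ASSUMED counts, reconstructed to match the percentages in the abstract.
fisher_system = ConfusionCounts(tp=7, fp=1, fn=1, tn=18)
svm_system = ConfusionCounts(tp=5, fp=8, fn=3, tn=11)

for name, c in [("Fisher", fisher_system), ("SVM", svm_system)]:
    print(f"{name}: accuracy={c.accuracy():.1%} PPV={c.ppv():.1%} NPV={c.npv():.1%}")
```

With these assumed counts the SVM accuracy is 16/27, about 59%, which the abstract reports as 60%.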
Affiliation(s)
- Norio Iizuka
- Department of Bioregulatory Function, Yamaguchi University School of Medicine, Ube, Yamaguchi, Japan
|
30
|
Castrillo JI, Hayes A, Mohammed S, Gaskell SJ, Oliver SG. An optimized protocol for metabolome analysis in yeast using direct infusion electrospray mass spectrometry. PHYTOCHEMISTRY 2003; 62:929-37. [PMID: 12590120 DOI: 10.1016/s0031-9422(02)00713-6] [Citation(s) in RCA: 139] [Impact Index Per Article: 6.6] [Indexed: 05/24/2023]
Abstract
A method for the global analysis of yeast intracellular metabolites, based on electrospray mass spectrometry (ES-MS), has been developed. This has involved the optimization of methods for quenching metabolism in Saccharomyces cerevisiae and extracting the metabolites for analysis by positive-ion electrospray mass spectrometry. The influence of cultivation conditions, sampling, quenching and extraction conditions, concentration step, and storage have all been studied and adapted to allow direct infusion of samples into the mass spectrometer and the acquisition of metabolic profiles with simultaneous detection of more than 25 intracellular metabolites. The method, which can be applied to other micro-organisms and biological systems, may be used for comparative analysis and screening of metabolite profiles of yeast strains and mutants under controlled conditions in order to elucidate gene function via metabolomics. Examples of the application of this analytical strategy to specific yeast strains and single-ORF yeast deletion mutants generated through the EUROFAN programme are presented.
Affiliation(s)
- Juan I Castrillo
- School of Biological Sciences, University of Manchester, 2.205 Stopford Building, Oxford Road, Manchester M13 9PT, UK
|
31
|
Abstract
An increasingly popular model of regulation is to represent networks of genes as if they directly affect each other. Although such gene networks are phenomenological because they do not explicitly represent the proteins and metabolites that mediate cell interactions, they are a logical way of describing phenomena observed with transcription profiling, such as those that occur with popular microarray technology. The ability to create gene networks from experimental data and use them to reason about their dynamics and design principles will increase our understanding of cellular function. We propose that gene networks are also a good way to describe function unequivocally, and that they could be used for genome functional annotation. Here, we review some of the concepts and methods associated with gene networks, with emphasis on their construction based on experimental data.
Affiliation(s)
- Paul Brazhnik
- Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
|
32
|
Abstract
The effects of genes on phenotype are mediated by processes that are typically unknown but whose determination is desirable. The conversion from gene to phenotype is not a simple function of individual genes, but involves the complex interactions of many genes; it is what is known as a nonlinear mapping problem. A computational method called genetic programming allows the representation of candidate nonlinear mappings in several possible trees. To find the best model, the trees are 'evolved' by processes akin to mutation and recombination, and the trees that more closely represent the actual data are preferentially selected. The result is an improved tree of rules that represent the nonlinear mapping directly. In this way, the encoding of cellular and higher-order activities by genes is seen as directly analogous to computer programs. This analogy is of utility in biological genetics and in problems of genotype-phenotype mapping.
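The tree-evolution idea described in this abstract can be sketched in a few lines. This is a toy illustration only, not the authors' implementation: the operator set, population size, depth cap, selection scheme and target function are all assumptions, and every function name is hypothetical. Trees are nested tuples `(op, left, right)` with leaves that are either the variable `"x"` or an integer constant.

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
MAX_DEPTH = 6  # assumed cap so that trees cannot grow without bound


def random_tree(rng, depth):
    """Grow a random expression tree over x and small integer constants."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(["x", rng.randint(-2, 2)])
    return (rng.choice(list(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def depth(tree):
    return 1 + max(depth(tree[1]), depth(tree[2])) if isinstance(tree, tuple) else 0


def error(tree, data):
    """Sum of squared errors of the tree's predictions on (x, y) pairs."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in data)


def paths(tree, path=()):
    """Yield the index path to every node in the tree."""
    yield path
    if isinstance(tree, tuple):
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))


def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)


def crossover(rng, a, b):
    """Recombination: graft a random subtree of b onto a random node of a."""
    return replace(a, rng.choice(list(paths(a))), get(b, rng.choice(list(paths(b)))))


def mutate(rng, tree):
    """Mutation: replace a random node with a freshly grown subtree."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))


def evolve(data, generations=30, size=60, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda t: error(t, data))
        history.append(error(population[0], data))
        parents = population[: size // 4]   # keep the better-fitting quarter
        children = [population[0]]          # elitism: best tree always survives
        while len(children) < size:
            a, b = rng.choice(parents), rng.choice(parents)
            child = crossover(rng, a, b)
            if rng.random() < 0.3:
                child = mutate(rng, child)
            children.append(child if depth(child) <= MAX_DEPTH else a)
        population = children
    best = min(population, key=lambda t: error(t, data))
    history.append(error(best, data))
    return best, history


# Toy nonlinear mapping to recover: y = x*x + x.
data = [(x, x * x + x) for x in range(-3, 4)]
best, history = evolve(data)
print("best tree:", best, "error:", history[-1])
```

Because the best tree is copied unchanged into each new generation, the recorded best error can never increase from one generation to the next; whether it reaches zero depends on the random seed and settings.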
|
33
|
Ricciardi-Castagnoli P, Granucci F. Opinion: Interpretation of the complexity of innate immune responses by functional genomics. Nat Rev Immunol 2002; 2:881-9. [PMID: 12415311 DOI: 10.1038/nri936] [Citation(s) in RCA: 77] [Impact Index Per Article: 3.5] [Indexed: 01/05/2023]
Abstract
Understanding how the immune system is regulated and responds to pathogens will require whole-system approaches, because the study of single immunological parameters has, so far, been unable to unlock immune-system complexity. Global transcription analysis using microarray technologies provides a new approach to the description of complex biological phenomena. Here, we discuss insights into innate immunity that have been provided by genome-wide approaches and their impact on the interpretation of immune-system complexity.
Affiliation(s)
- Paola Ricciardi-Castagnoli
- Department of Biotechnology and Bioscience, University of Milano-Bicocca, Piazza della Scienza 2, Milan, Italy.
|
34
|
Lin K, Kuang Y, Joseph JS, Kolatkar PR. Conserved codon composition of ribosomal protein coding genes in Escherichia coli, Mycobacterium tuberculosis and Saccharomyces cerevisiae: lessons from supervised machine learning in functional genomics. Nucleic Acids Res 2002; 30:2599-607. [PMID: 12034849 PMCID: PMC117187 DOI: 10.1093/nar/30.11.2599] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.1] [Indexed: 11/14/2022] Open
Abstract
Genomics projects have resulted in a flood of sequence data. Functional annotation currently relies almost exclusively on inter-species sequence comparison and is restricted in cases of limited data from related species and widely divergent sequences with no known homologs. Here, we demonstrate that codon composition, a fusion of codon usage bias and amino acid composition signals, can accurately discriminate, in the absence of sequence homology information, cytoplasmic ribosomal protein genes from all other genes of known function in Saccharomyces cerevisiae, Escherichia coli and Mycobacterium tuberculosis using an implementation of support vector machines, SVM(light). Analysis of these codon composition signals is instructive in determining features that confer individuality to ribosomal protein genes. Each of the sets of positively charged, negatively charged and small hydrophobic residues, as well as codon bias, contribute to their distinctive codon composition profile. The representation of all these signals is sensitively detected, combined and augmented by the SVMs to perform an accurate classification. Of special mention is an obvious outlier, yeast gene RPL22B, highly homologous to RPL22A but employing very different codon usage, perhaps indicating a non-ribosomal function. Finally, we propose that codon composition be used in combination with other attributes in gene/protein classification by supervised machine learning algorithms.
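A codon composition feature vector of the kind this abstract describes can be computed directly from a coding sequence. The sketch below is an assumption about one plausible encoding (relative frequency of each of the 64 codons, in a fixed order), not the authors' exact feature definition. The function name is hypothetical, and the classifier step (SVM training) is deliberately left out.

```python
from collections import Counter
from itertools import product

# Fixed ordering of the 64 codons so every gene maps to a comparable vector.
CODONS = ["".join(bases) for bases in product("ACGT", repeat=3)]


def codon_composition(cds):
    """Return the relative frequency of each of the 64 codons in a coding sequence.

    The sequence is read in frame from its first base; a trailing partial
    codon and any codon containing a non-ACGT symbol are ignored.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[codon] for codon in CODONS)
    if total == 0:
        raise ValueError("sequence contains no complete unambiguous codon")
    return [counts[codon] / total for codon in CODONS]


# Toy coding sequence: ATG GCT GCT TAA
features = codon_composition("ATGGCTGCTTAA")
print(len(features), features[CODONS.index("GCT")])
```

Vectors built this way, one per gene, could then be passed to any supervised classifier together with known labels such as ribosomal versus non-ribosomal.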
Affiliation(s)
- Kui Lin
- IMCB-BIC, Institute of Molecular and Cell Biology, 30 Medical Drive, 117609 Singapore
|
35
|
Chapter One: Bioinformatics and computational biology for plant functional genomics. 2002. [DOI: 10.1016/s0079-9920(02)80017-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
|
36
|
Werf M, Schuren F, Bijlsma S, Tas A, Ommen BV. Nutrigenomics: Application of Genomics Technologies in Nutritional Sciences and Food Technology. J Food Sci 2001. [DOI: 10.1111/j.1365-2621.2001.tb15171.x] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.4] [Indexed: 11/29/2022]
|
37
|
ter Kuile BH, Westerhoff HV. Transcriptome meets metabolome: hierarchical and metabolic regulation of the glycolytic pathway. FEBS Lett 2001; 500:169-71. [PMID: 11445079 DOI: 10.1016/s0014-5793(01)02613-8] [Citation(s) in RCA: 272] [Impact Index Per Article: 11.8] [Indexed: 11/29/2022]
Abstract
The fact that information flows from DNA to RNA to protein to function suggests that regulation is 'hierarchical', i.e. dominated by regulation of gene expression. In the case of dominant regulation at the metabolic level, however, there is no quantitative relationship between mRNA levels and function. We here develop a method to quantitate the relative contributions of metabolic and hierarchical regulation. Applying this method to the glycolytic flux in three species of parasitic protists, we conclude that it is rarely regulated by gene expression alone. This casts strong doubts on whether transcriptome and proteome analysis suffices to assess biological function.
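The abstract does not state the formula it uses. One common way to write such a decomposition, given here as a sketch under stated assumptions and not as a quotation from the paper, assumes that the rate through an enzyme factorises into a term $f(e)$ that depends only on enzyme concentration and a term $g(X)$ that depends only on metabolite concentrations. The symbols $\rho_h$ and $\rho_m$ are labels introduced for this illustration.

```latex
% Assumed rate law: v = f(e) g(X), with v equal to the pathway flux J at steady state.
\begin{align}
  \ln J &= \ln f(e) + \ln g(X) \\
  1 &= \underbrace{\frac{\Delta \ln f(e)}{\Delta \ln J}}_{\rho_h}
     + \underbrace{\frac{\Delta \ln g(X)}{\Delta \ln J}}_{\rho_m}
\end{align}
```

Under these assumptions, $\rho_h = 1$ would correspond to a flux change explained entirely by a change in enzyme level (hierarchical regulation), and $\rho_h = 0$ to one explained entirely by metabolite effects (metabolic regulation).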
|
38
|
Kell DB, Darby RM, Draper J. Genomic computing. Explanatory analysis of plant expression profiling data using machine learning. PLANT PHYSIOLOGY 2001; 126:943-951. [PMID: 11457944 PMCID: PMC1540126 DOI: 10.1104/pp.126.3.943] [Citation(s) in RCA: 44] [Impact Index Per Article: 1.9] [Indexed: 05/23/2023]
Affiliation(s)
- D B Kell
- Institute of Biological Sciences, University of Wales, Aberystwyth SY23 3DD, United Kingdom
|
39
|
Brown AJ, Planta RJ, Restuhadi F, Bailey DA, Butler PR, Cadahia JL, Cerdan M, De Jonge M, Gardner DC, Gent ME, Hayes A, Kolen CP, Lombardia LJ, Murad AMA, Oliver RA, Sefton M, Thevelein JM, Tournu H, van Delft YJ, Verbart DJ, Winderickx J, Oliver SG. Transcript analysis of 1003 novel yeast genes using high-throughput northern hybridizations. EMBO J 2001; 20:3177-86. [PMID: 11406594 PMCID: PMC150198 DOI: 10.1093/emboj/20.12.3177] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022] Open
Abstract
The expression of 1008 open reading frames (ORFs) from the yeast Saccharomyces cerevisiae has been examined under eight different physiological conditions, using classical northern analysis. These northern data have been compared with publicly available data from a microarray analysis of the diauxic transition in S.cerevisiae. The results demonstrate the importance of comparing biologically equivalent situations and of the standardization of data normalization procedures. We have also used our northern data to identify co-regulated gene clusters and define the putative target sites of transcriptional activators responsible for their control. Clusters containing genes of known function identify target sites of known activators. In contrast, clusters comprised solely of genes of unknown function usually define novel putative target sites. Finally, we have examined possible global controls on gene expression. It was discovered that ORFs that are highly expressed following a nutritional upshift tend to employ favoured codons, whereas those overexpressed in starvation conditions do not. These results are interpreted in terms of a model in which competition between mRNA molecules for translational capacity selects for codons translated by abundant tRNAs.
Affiliation(s)
- Rudi J. Planta, Fajar Restuhadi, Philip R. Butler, Jose L. Cadahia, M. Esperanza Cerdan, Martine De Jonge, David C.J. Gardner, Manda E. Gent, Andrew Hayes, Carin P.A.M. Kolen, Luis J. Lombardia, Rachel A. Oliver, Mark Sefton, Johan M. Thevelein, Yvon J. van Delft, Dennis J. Verbart, Joris Winderickx, Stephen G. Oliver
- Department of Molecular and Cell Biology, University of Aberdeen, Institute of Medical Sciences, Foresterhill, Aberdeen AB25 2ZD; Department of Biomolecular Sciences, UMIST, PO Box 88, Sackville St, Manchester M60 1QD; School of Biological Sciences, University of Manchester, 2.205 Stopford Building, Oxford Road, Manchester M13 9PT, UK; Department of Biochemistry and Molecular Biology, Vrije Universiteit, de Boelelaan 1083, 1081 HV Amsterdam, The Netherlands; Departamento de Biologia Celular y Molecular, Facultad de Ciencias, Universidad de la Coruna, Campus de la Zapateira s/n, E-15071 La Coruna, Spain; and Laboratory of Molecular Cell Biology, Katholieke Universiteit Leuven, Kardinaal Mercierlaan 92, B-3001 Leuven-Heverlee, Belgium
|
40
|
Lucchini S, Thompson A, Hinton JCD. Microarrays for microbiologists. MICROBIOLOGY (READING, ENGLAND) 2001; 147:1403-1414. [PMID: 11390672 DOI: 10.1099/00221287-147-6-1403] [Citation(s) in RCA: 89] [Impact Index Per Article: 3.9] [Indexed: 12/26/2022]
Affiliation(s)
- S Lucchini, A Thompson, J C D Hinton
- Molecular Microbiology, Institute of Food Research, Norwich Research Park, Norwich NR4 7UA, UK
|
41
|
Raychaudhuri S, Sutphin PD, Chang JT, Altman RB. Basic microarray analysis: grouping and feature reduction. Trends Biotechnol 2001; 19:189-93. [PMID: 11301132 DOI: 10.1016/s0167-7799(01)01599-2] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.2] [Indexed: 10/18/2022]
Abstract
DNA microarray technologies are useful for addressing a broad range of biological problems - including the measurement of mRNA expression levels in target cells. These studies typically produce large data sets that contain measurements on thousands of genes under hundreds of conditions. There is a critical need to summarize this data and to pick out the important details. The most common activities, therefore, are to group together microarray data and to reduce the number of features. Both of these activities can be done using only the raw microarray data (unsupervised methods) or using external information that provides labels for the microarray data (supervised methods). We briefly review supervised and unsupervised methods for grouping and reducing data in the context of a publicly available suite of tools called CLEAVER, and illustrate their application on a representative data set collected to study lymphoma.
Affiliation(s)
- S Raychaudhuri
- Stanford Medical Informatics, Department of Medicine, Stanford University, 251 Campus Drive, MSOB X-215, Stanford, CA 94305-5479, USA
|
42
|
Delneri D, Brancia FL, Oliver SG. Towards a truly integrative biology through the functional genomics of yeast. Curr Opin Biotechnol 2001; 12:87-91. [PMID: 11167079 DOI: 10.1016/s0958-1669(00)00179-8] [Citation(s) in RCA: 59] [Impact Index Per Article: 2.6] [Indexed: 12/20/2022]
Abstract
A complete library of mutant Saccharomyces cerevisiae strains, each deleted for a single representative of yeast's 6000 protein-encoding genes, has been constructed. This represents a major biological resource for the study of eukaryotic functional genomics. However, yeast is also being used as a test-bed for the development of functional genomic technologies at all levels of analysis, including the transcriptome, proteome and metabolome.
Affiliation(s)
- D Delneri
- School of Biological Science, University of Manchester, 2.205 Stopford Building, Manchester Oxford Road, M13 9PT, UK.
|
43
|
Knowledge Discovery in Multi-label Phenotype Data. PRINCIPLES OF DATA MINING AND KNOWLEDGE DISCOVERY 2001. [DOI: 10.1007/3-540-44794-6_4] [Citation(s) in RCA: 333] [Impact Index Per Article: 14.5] [Indexed: 12/02/2022]
|
44
|
King RD, Karwath A, Clare A, Dehaspe L. Accurate prediction of protein functional class from sequence in the Mycobacterium tuberculosis and Escherichia coli genomes using data mining. Yeast 2000; 17:283-93. [PMID: 11119305 PMCID: PMC2448385 DOI: 10.1002/1097-0061(200012)17:4<283::aid-yea52>3.0.co;2-f] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.3] [Indexed: 11/07/2022] Open
Abstract
The analysis of genomics data needs to become as automated as its generation. Here we present a novel data-mining approach to predicting protein functional class from sequence. This method is based on a combination of inductive logic programming clustering and rule learning. We demonstrate the effectiveness of this approach on the M. tuberculosis and E. coli genomes, and identify biologically interpretable rules which predict protein functional class from information only available from the sequence. These rules predict 65% of the ORFs with no assigned function in M. tuberculosis and 24% of those in E. coli, with an estimated accuracy of 60-80% (depending on the level of functional assignment). The rules are founded on a combination of detection of remote homology, convergent evolution and horizontal gene transfer. We identify rules that predict protein functional class even in the absence of detectable sequence or structural homology. These rules give insight into the evolutionary history of M. tuberculosis and E. coli.
Affiliation(s)
- R D King
- Department of Computer Science, University of Wales, Aberystwyth, Penglais, Aberystwyth, Ceredigion SY23 3DB, UK
|
45
|
Current awareness on comparative and functional genomics. Yeast 2000; 17:339-46. [PMID: 11119313 PMCID: PMC2448380 DOI: 10.1002/1097-0061(200012)17:4<339::aid-yea10>3.0.co;2-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022] Open
|
46
|
|