1
Lo-Thong O, Charton P, Cadet XF, Grondin-Perez B, Saavedra E, Damour C, Cadet F. Identification of flux checkpoints in a metabolic pathway through white-box, grey-box and black-box modeling approaches. Sci Rep 2020; 10:13446. [PMID: 32778715 PMCID: PMC7417601 DOI: 10.1038/s41598-020-70295-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/11/2019] [Accepted: 07/27/2020] [Indexed: 11/29/2022]
Abstract
Metabolic pathway modeling plays an increasing role in drug design by allowing better understanding of the underlying regulation and controlling networks in the metabolism of living organisms. However, despite rapid progress in this area, pathway modeling can become a real nightmare for researchers, notably when few experimental data are available or when the pathway is highly complex. Here, three different approaches were developed to model the second part of glycolysis of E. histolytica as an application example, and have succeeded in predicting the final pathway flux: one including detailed kinetic information (white-box), another with an added adjustment term (grey-box) and the last one using an artificial neural network method (black-box). Afterwards, each model was used for metabolic control analysis and flux control coefficient determination. The first two enzymes of this pathway are identified as the key enzymes playing a role in flux control. This study revealed the significance of the three methods for building suitable models adjusted to the available data in the field of metabolic pathway modeling, and could be useful to biologists and modelers.
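For readers unfamiliar with metabolic control analysis, the flux control coefficients mentioned in this abstract are conventionally defined as below. This is the standard textbook definition, not notation taken from the paper; the symbols J (steady-state pathway flux) and E_i (activity or concentration of enzyme i) are assumptions made here for illustration.

```latex
% Flux control coefficient of enzyme i on the steady-state flux J
C^{J}_{E_i} \;=\; \frac{\partial J}{\partial E_i}\cdot\frac{E_i}{J}
          \;=\; \frac{\partial \ln J}{\partial \ln E_i},
\qquad
% Summation theorem: the coefficients of all n enzymes add up to one
\sum_{i=1}^{n} C^{J}_{E_i} \;=\; 1 .
```

Under this definition, an enzyme identified as a key control point is one whose coefficient is large relative to the others in the pathway.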
Affiliation(s)
- Ophélie Lo-Thong
- University of Paris, UMR_S1134, BIGR, Inserm, 75015, Paris, France; DSIMB, UMR_S1134, BIGR, Inserm, Laboratory of Excellence GR-Ex, Faculty of Sciences and Technology, University of La Reunion, 97715, Saint-Denis, France
- Philippe Charton
- University of Paris, UMR_S1134, BIGR, Inserm, 75015, Paris, France; DSIMB, UMR_S1134, BIGR, Inserm, Laboratory of Excellence GR-Ex, Faculty of Sciences and Technology, University of La Reunion, 97715, Saint-Denis, France
- Xavier F Cadet
- PEACCEL, Artificial Intelligence Department, 6 square Albin Cachot, box 42, 75013, Paris, France
- Brigitte Grondin-Perez
- LE2P, Laboratory of Energy, Electronics and Processes EA 4079, Faculty of Sciences and Technology, University of La Reunion, 97444, St Denis cedex, France
- Emma Saavedra
- Departamento de Bioquímica, Instituto Nacional de Cardiología Ignacio Chávez, 14080, Mexico City, Mexico
- Cédric Damour
- LE2P, Laboratory of Energy, Electronics and Processes EA 4079, Faculty of Sciences and Technology, University of La Reunion, 97444, St Denis cedex, France
- Frédéric Cadet
- University of Paris, UMR_S1134, BIGR, Inserm, 75015, Paris, France; DSIMB, UMR_S1134, BIGR, Inserm, Laboratory of Excellence GR-Ex, Faculty of Sciences and Technology, University of La Reunion, 97715, Saint-Denis, France
2
Tenori L, Oakman C, Claudino WM, Bernini P, Cappadona S, Nepi S, Biganzoli L, Arbushites MC, Luchinat C, Bertini I, Di Leo A. Exploration of serum metabolomic profiles and outcomes in women with metastatic breast cancer: a pilot study. Mol Oncol 2012; 6:437-44. [PMID: 22687601 DOI: 10.1016/j.molonc.2012.05.003] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Received: 03/17/2012] [Revised: 04/26/2012] [Accepted: 05/18/2012] [Indexed: 11/25/2022]
Abstract
BACKGROUND Metabolomics, a global study of metabolites and small molecules, is a novel expanding field. In this pilot study, metabolomics has been applied to serum samples from women with metastatic breast cancer to explore outcomes and response to treatment. PATIENTS AND METHODS Pre-treatment and serial on-treatment serum samples were available from an international clinical trial in which 579 women with metastatic breast cancer were randomized to paclitaxel plus either a targeted anti-HER2 treatment (lapatinib) or placebo. Serum metabolomic profiles were obtained using 600 MHz nuclear magnetic resonance spectroscopy. Profiles were compared with time to progression, overall survival and treatment toxicity. RESULTS Pre- and on-treatment serum samples were assessed for over 500 patients. Unbiased metabolomic profiles in the biologically unselected overall trial population did not correlate with outcome or toxicity. In a subgroup of patients with HER2-positive disease treated with paclitaxel plus lapatinib, metabolomic profiles from patients in the upper and lower thirds of the dataset showed significant differences for time to progression (N = 22, predictive accuracy = 89.6%) and overall survival (N = 16, predictive accuracy = 78.0%). CONCLUSIONS In metastatic breast cancer, metabolomics may play a role in sub selecting patients with HER2 positive disease with greater sensitivity to paclitaxel plus lapatinib.
Affiliation(s)
- Leonardo Tenori
- Magnetic Resonance Center (CERM), University of Florence, Via L. Sacconi 6, 50019 Sesto Fiorentino, Italy
3
Schmidt MD, Vallabhajosyula RR, Jenkins JW, Hood JE, Soni AS, Wikswo JP, Lipson H. Automated refinement and inference of analytical models for metabolic networks. Phys Biol 2011; 8:055011. [PMID: 21832805 DOI: 10.1088/1478-3975/8/5/055011] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Indexed: 01/25/2023]
Abstract
The reverse engineering of metabolic networks from experimental data is traditionally a labor-intensive task requiring a priori systems knowledge. Using a proven model as a test system, we demonstrate an automated method to simplify this process by modifying an existing or related model--suggesting nonlinear terms and structural modifications--or even constructing a new model that agrees with the system's time series observations. In certain cases, this method can identify the full dynamical model from scratch without prior knowledge or structural assumptions. The algorithm selects between multiple candidate models by designing experiments to make their predictions disagree. We performed computational experiments to analyze a nonlinear seven-dimensional model of yeast glycolytic oscillations. This approach corrected mistakes reliably in both approximated and overspecified models. The method performed well to high levels of noise for most states, could identify the correct model de novo, and make better predictions than ordinary parametric regression and neural network models. We identified an invariant quantity in the model, which accurately derived kinetics and the numerical sensitivity coefficients of the system. Finally, we compared the system to dynamic flux estimation and discussed the scaling and application of this methodology to automated experiment design and control in biological systems in real time.
Affiliation(s)
- Michael D Schmidt
- Cornell Computational Systems Laboratory, Cornell University, Ithaca, NY, USA
4
Kell DB. Towards a unifying, systems biology understanding of large-scale cellular death and destruction caused by poorly liganded iron: Parkinson's, Huntington's, Alzheimer's, prions, bactericides, chemical toxicology and others as examples. Arch Toxicol 2010; 84:825-89. [PMID: 20967426 PMCID: PMC2988997 DOI: 10.1007/s00204-010-0577-x] [Citation(s) in RCA: 286] [Impact Index Per Article: 20.4] [Received: 06/14/2010] [Accepted: 07/14/2010] [Indexed: 12/11/2022]
Abstract
Exposure to a variety of toxins and/or infectious agents leads to disease, degeneration and death, often characterised by circumstances in which cells or tissues do not merely die and cease to function but may be more or less entirely obliterated. It is then legitimate to ask the question as to whether, despite the many kinds of agent involved, there may be at least some unifying mechanisms of such cell death and destruction. I summarise the evidence that in a great many cases, one underlying mechanism, providing major stresses of this type, entails continuing and autocatalytic production (based on positive feedback mechanisms) of hydroxyl radicals via Fenton chemistry involving poorly liganded iron, leading to cell death via apoptosis (probably including via pathways induced by changes in the NF-κB system). While every pathway is in some sense connected to every other one, I highlight the literature evidence suggesting that the degenerative effects of many diseases and toxicological insults converge on iron dysregulation. This highlights specifically the role of iron metabolism, and the detailed speciation of iron, in chemical and other toxicology, and has significant implications for the use of iron chelating substances (probably in partnership with appropriate anti-oxidants) as nutritional or therapeutic agents in inhibiting both the progression of these mainly degenerative diseases and the sequelae of both chronic and acute toxin exposure. The complexity of biochemical networks, especially those involving autocatalytic behaviour and positive feedbacks, means that multiple interventions (e.g. of iron chelators plus antioxidants) are likely to prove most effective. A variety of systems biology approaches, that I summarise, can predict both the mechanisms involved in these cell death pathways and the optimal sites of action for nutritional or pharmacological interventions.
Affiliation(s)
- Douglas B Kell
- School of Chemistry and the Manchester Interdisciplinary Biocentre, The University of Manchester, Manchester M1 7DN, UK.
5
Chou IC, Voit EO. Recent developments in parameter estimation and structure identification of biochemical and genomic systems. Math Biosci 2009; 219:57-83. [PMID: 19327372 PMCID: PMC2693292 DOI: 10.1016/j.mbs.2009.03.002] [Citation(s) in RCA: 298] [Impact Index Per Article: 19.9] [Received: 10/02/2008] [Revised: 03/06/2009] [Accepted: 03/15/2009] [Indexed: 01/16/2023]
Abstract
The organization, regulation and dynamical responses of biological systems are in many cases too complex to allow intuitive predictions and require the support of mathematical modeling for quantitative assessments and a reliable understanding of system functioning. All steps of constructing mathematical models for biological systems are challenging, but arguably the most difficult task among them is the estimation of model parameters and the identification of the structure and regulation of the underlying biological networks. Recent advancements in modern high-throughput techniques have been allowing the generation of time series data that characterize the dynamics of genomic, proteomic, metabolic, and physiological responses and enable us, at least in principle, to tackle estimation and identification tasks using 'top-down' or 'inverse' approaches. While the rewards of a successful inverse estimation or identification are great, the process of extracting structural and regulatory information is technically difficult. The challenges can generally be categorized into four areas, namely, issues related to the data, the model, the mathematical structure of the system, and the optimization and support algorithms. Many recent articles have addressed inverse problems within the modeling framework of Biochemical Systems Theory (BST). BST was chosen for these tasks because of its unique structural flexibility and the fact that the structure and regulation of a biological system are mapped essentially one-to-one onto the parameters of the describing model. The proposed methods mainly focused on various optimization algorithms, but also on support techniques, including methods for circumventing the time consuming numerical integration of systems of differential equations, smoothing overly noisy data, estimating slopes of time series, reducing the complexity of the inference task, and constraining the parameter search space. 
Other methods targeted issues of data preprocessing, detection and amelioration of model redundancy, and model-free or model-based structure identification. The total number of proposed methods and their applications has by now exceeded one hundred, which makes it difficult for the newcomer, as well as the expert, to gain a comprehensive overview of available algorithmic options and limitations. To facilitate the entry into the field of inverse modeling within BST and related modeling areas, the article presented here reviews the field and proposes an operational 'work-flow' that guides the user through the estimation process, identifies possibly problematic steps, and suggests corresponding solutions based on the specific characteristics of the various available algorithms. The article concludes with a discussion of the present state of the art and with a description of open questions.
Affiliation(s)
- I-Chun Chou
- Integrative BioSystems Institute and The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, 313 Ferst Drive, Atlanta, GA 30332, USA.
6
Abstract
Mathematical modelling has great potential in biochemical network analysis because, in contrast with the unaided human mind, mathematics has no problems keeping track of hundreds of interacting variables that affect each other in intricate ways. The scalability of mathematical models, together with their ability to capture all imaginable non-linear responses, allows us to explore the dynamics of complicated pathway systems, to study what happens if a metabolite, gene or enzyme is altered, and to optimize biochemical systems, for instance toward the goal of increased yield of some desired organic compound. Before we can utilize models for such purposes, we must define their mathematical structure and identify suitable parameter values. Because nature has not provided us with guidelines for selecting the best model design, the choice of the most useful model is not trivial. In the present chapter I show that power-law modelling within BST (Biochemical Systems Theory) offers guidance for model selection, construction and analysis that is otherwise difficult to find.
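As a point of reference for the power-law modelling named in this abstract, the S-system form commonly used within BST can be written as follows. This is the general textbook form rather than an equation quoted from the chapter, and the symbol names (X_i for pool sizes, α_i and β_i for rate constants, g_ij and h_ij for kinetic orders) are assumptions made here.

```latex
% S-system: one aggregated production term and one aggregated degradation term per pool
\frac{dX_i}{dt} \;=\; \alpha_i \prod_{j=1}^{n} X_j^{\,g_{ij}}
               \;-\; \beta_i \prod_{j=1}^{n} X_j^{\,h_{ij}},
\qquad i = 1,\dots,n .
```

A kinetic order of zero means that X_j has no direct effect on the corresponding term, which is why network structure and parameter values are so closely tied in this framework.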
7
Ressom HW, Zhang Y, Xuan J, Wang Y, Clarke R. Inferring network interactions using recurrent neural networks and swarm intelligence. Conference Proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference 2008; 2006:4241-4. [PMID: 17946231 DOI: 10.1109/iembs.2006.259812] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]
Abstract
We present a novel algorithm combining artificial neural networks and swarm intelligence (SI) methods to infer network interactions. The algorithm uses ant colony optimization (ACO) to identify the optimal architecture of a recurrent neural network (RNN), while the weights of the RNN are optimized using particle swarm optimization (PSO). Our goal is to construct an RNN that mimics the true structure of an unknown network and the time-series data that the network generated. We applied the proposed hybrid SI-RNN algorithm to infer a simulated genetic network. The results indicate that the algorithm has a promising potential to infer complex interactions such as gene regulatory networks from time-series gene expression data.
Affiliation(s)
- Habtom W Ressom
- Dept. of Biostat., Bioinf., & Biomath., Georgetown Univ., Washington, DC 20057, USA
8
9
Kell DB. Theodor Bücher Lecture. Metabolomics, modelling and machine learning in systems biology - towards an understanding of the languages of cells. Delivered on 3 July 2005 at the 30th FEBS Congress and the 9th IUBMB conference in Budapest. FEBS J 2006; 273:873-94. [PMID: 16478464 DOI: 10.1111/j.1742-4658.2006.05136.x] [Citation(s) in RCA: 130] [Impact Index Per Article: 7.2] [Indexed: 11/28/2022]
Abstract
The newly emerging field of systems biology involves a judicious interplay between high-throughput 'wet' experimentation, computational modelling and technology development, coupled to the world of ideas and theory. This interplay involves iterative cycles, such that systems biology is not at all confined to hypothesis-dependent studies, with intelligent, principled, hypothesis-generating studies being of high importance and consequently very far from aimless fishing expeditions. I seek to illustrate each of these facets. Novel technology development in metabolomics can increase substantially the dynamic range and number of metabolites that one can detect, and these can be exploited as disease markers and in the consequent and principled generation of hypotheses that are consistent with the data and achieve this in a value-free manner. Much of classical biochemistry and signalling pathway analysis has concentrated on the analyses of changes in the concentrations of intermediates, with 'local' equations - such as that of Michaelis and Menten, v = (Vmax × S)/(S + Km) - that describe individual steps being based solely on the instantaneous values of these concentrations. Recent work using single cells (that are not subject to the intellectually unsupportable averaging of the variable displayed by heterogeneous cells possessing nonlinear kinetics) has led to the recognition that some protein signalling pathways may encode their signals not (just) as concentrations (AM or amplitude-modulated in a radio analogy) but via changes in the dynamics of those concentrations (the signals are FM or frequency-modulated). This contributes in principle to a straightforward solution of the crosstalk problem, leads to a profound reassessment of how to understand the downstream effects of dynamic changes in the concentrations of elements in these pathways, and stresses the role of signal processing (and not merely the intermediates) in biological signalling.
It is this signal processing that lies at the heart of understanding the languages of cells. The resolution of many of the modern and postgenomic problems of biochemistry requires the development of a myriad of new technologies (and maybe a new culture), and thus regular input from the physical sciences, engineering, mathematics and computer science. One solution, that we are adopting in the Manchester Interdisciplinary Biocentre (http://www.mib.ac.uk/) and the Manchester Centre for Integrative Systems Biology (http://www.mcisb.org/), is thus to colocate individuals with the necessary combinations of skills. Novel disciplines that require such an integrative approach continue to emerge. These include fields such as chemical genomics, synthetic biology, distributed computational environments for biological data and modelling, single cell diagnostics/bionanotechnology, and computational linguistics/text mining.
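The Michaelis-Menten rate law quoted in this abstract can be typeset as below. The symbols follow the abstract (v for the reaction rate, S for the substrate concentration, V_max and K_m for the usual kinetic constants); the half-saturation remark is a standard property of the equation and is added here only for orientation.

```latex
% Michaelis-Menten rate law: the rate depends only on the instantaneous substrate concentration S
v \;=\; \frac{V_{\max}\, S}{S + K_m},
\qquad
% At S = K_m the rate is half-maximal
v\big|_{S = K_m} \;=\; \frac{V_{\max}}{2} .
```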
Affiliation(s)
- Douglas B Kell
- School of Chemistry, Faraday Building, The University of Manchester, UK.
10
Antoniewicz MR, Stephanopoulos G, Kelleher JK. Evaluation of regression models in metabolic physiology: predicting fluxes from isotopic data without knowledge of the pathway. Metabolomics 2006; 2:41-52. [PMID: 17066125 PMCID: PMC1622920 DOI: 10.1007/s11306-006-0018-2] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Received: 09/01/2005] [Accepted: 02/06/2006] [Indexed: 10/27/2022]
Abstract
This study explores the ability of regression models, with no knowledge of the underlying physiology, to estimate physiological parameters relevant for metabolism and endocrinology. Four regression models were compared: multiple linear regression (MLR), principal component regression (PCR), partial least-squares regression (PLS) and regression using artificial neural networks (ANN). The pathway of mammalian gluconeogenesis was analyzed using [U-(13)C]glucose as tracer. A set of data was simulated by randomly selecting physiologically appropriate metabolic fluxes for the 9 steps of this pathway as independent variables. The isotope labeling patterns of key intermediates in the pathway were then calculated for each set of fluxes, yielding 29 dependent variables. Two thousand sets were created, allowing independent training and test data. Regression models were asked to predict the nine fluxes, given only the 29 isotopomers. For large training sets (>50) the artificial neural network model was superior, capturing 95% of the variability in the gluconeogenic flux, whereas the three linear models captured only 75%. This reflects the ability of neural networks to capture the inherent non-linearities of the metabolic system. The effect of error in the variables and the addition of random variables to the data set was considered. Model sensitivities were used to find the isotopomers that most influenced the predicted flux values. These studies provide the first test of multivariate regression models for the analysis of isotopomer flux data. They provide insight for metabolomics and the future of isotopic tracers in metabolic research where the underlying physiology is complex or unknown.
Affiliation(s)
- Maciek R. Antoniewicz
- Department of Chemical Engineering, Bioinformatics and Metabolic Engineering Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139 USA
- Gregory Stephanopoulos
- Department of Chemical Engineering, Bioinformatics and Metabolic Engineering Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139 USA
- Joanne K. Kelleher
- Department of Chemical Engineering, Bioinformatics and Metabolic Engineering Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139 USA
11
Kell DB. Metabolomics, machine learning and modelling: towards an understanding of the language of cells. Biochem Soc Trans 2005; 33:520-4. [PMID: 15916555 DOI: 10.1042/bst0330520] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.1] [Indexed: 11/17/2022]
Abstract
In answering the question ‘Systems Biology – will it work?’ (which it self-evidently has already), it is appropriate to highlight advances in philosophy, in new technique development and in novel findings. In terms of philosophy, we see that systems biology involves an iterative interplay between linked activities – for instance, between theory and experiment, between induction and deduction and between measurements of parameters and variables – with more emphasis than has perhaps been common now being focused on the first in each of these pairs. In technique development, we highlight closed loop machine learning and its use in the optimization of scientific instrumentation, and the ability to effect high-quality and quasi-continuous optical images of cells. This leads to many important and novel findings. In the first case, these may involve new biomarkers for disease, whereas in the second case, we have determined that many biological signals may be frequency- rather than amplitude-encoded. This leads to a very different view of how signalling ‘works’ (equations such as that of Michaelis and Menten which use only amplitudes, i.e. concentrations, are inadequate descriptors), lays emphasis on the signal processing network elements that lie ‘downstream’ of what are traditionally considered the signals, and allows one simply to understand how cross-talk may be avoided between pathways which nevertheless use common signalling elements. The language of cells is much richer than we had supposed, and we are now well placed to decode it.
Affiliation(s)
- D B Kell
- School of Chemistry, The University of Manchester, Faraday Building, Sackville Street, P.O. Box 88, Manchester M60 1QD, UK.
12
Veflingstad SR, Almeida J, Voit EO. Priming nonlinear searches for pathway identification. Theor Biol Med Model 2004; 1:8. [PMID: 15367330 PMCID: PMC522751 DOI: 10.1186/1742-4682-1-8] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.6] [Received: 08/12/2004] [Accepted: 09/14/2004] [Indexed: 11/21/2022]
Abstract
BACKGROUND Dense time series of metabolite concentrations or of the expression patterns of proteins may be available in the near future as a result of the rapid development of novel, high-throughput experimental techniques. Such time series implicitly contain valuable information about the connectivity and regulatory structure of the underlying metabolic or proteomic networks. The extraction of this information is a challenging task because it usually requires nonlinear estimation methods that involve iterative search algorithms. Priming these algorithms with high-quality initial guesses can greatly accelerate the search process. In this article, we propose to obtain such guesses by preprocessing the temporal profile data and fitting them preliminarily by multivariate linear regression. RESULTS The results of a small-scale analysis indicate that the regression coefficients reflect the connectivity of the network quite well. Using the mathematical modeling framework of Biochemical Systems Theory (BST), we also show that the regression coefficients may be translated into constraints on the parameter values of the nonlinear BST model, thereby reducing the parameter search space considerably. CONCLUSION The proposed method provides a good approach for obtaining a preliminary network structure from dense time series. This will be more valuable as the systems become larger, because preprocessing and effective priming can significantly limit the search space of parameters defining the network connectivity, thereby facilitating the nonlinear estimation task.
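The priming idea in this abstract can be illustrated with a minimal sketch: estimate slopes from dense time series, regress them linearly on the measured concentrations, and read the coefficient pattern as a first guess at connectivity. This is not the authors' implementation. The function name `estimate_interactions`, the toy three-variable cascade and the use of finite differences in place of real data preprocessing are all assumptions made for illustration.

```python
import numpy as np

def estimate_interactions(times, X):
    """Least-squares fit of dX/dt ~ X @ A.T; returns the estimated matrix A."""
    # Finite-difference slopes of each profile (a stand-in for the smoothing
    # and slope estimation that real, noisy data would need).
    slopes = np.gradient(X, times, axis=0, edge_order=2)
    # Solve X @ B ~ slopes in the least-squares sense; B is A transposed.
    B, *_ = np.linalg.lstsq(X, slopes, rcond=None)
    return B.T

# Hypothetical linear cascade X1 -> X2 -> X3 with first-order losses.
A_true = np.array([[-1.0,  0.0,  0.0],
                   [ 1.0, -0.5,  0.0],
                   [ 0.0,  0.5, -0.2]])
x0 = np.array([1.0, 0.0, 0.0])
times = np.linspace(0.0, 10.0, 1001)

# Exact solution of dx/dt = A x via the eigendecomposition of A.
w, V = np.linalg.eig(A_true)
c = np.linalg.solve(V, x0)
X = (np.exp(np.outer(times, w)) * c) @ V.T

A_est = estimate_interactions(times, X)
print(np.round(A_est, 2))
```

In this toy setting the near-zero entries of `A_est` mark variable pairs with no direct interaction, which is the kind of information that could be used to constrain a subsequent nonlinear parameter search.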
Affiliation(s)
- Siren R Veflingstad
- Department of Chemistry, Biotechnology and Food Science, Agricultural University of Norway, N-1432 Ås, Norway
- Center for Integrative Genetics (Cigene), Agricultural University of Norway, N-1432 Ås, Norway
- Jonas Almeida
- Department of Biostatistics, Bioinformatics and Epidemiology, Medical University of South Carolina, 303K Cannon Place, 135 Cannon Street, Charleston, SC 29425, USA
- Eberhard O Voit
- Department of Biostatistics, Bioinformatics and Epidemiology, Medical University of South Carolina, 303K Cannon Place, 135 Cannon Street, Charleston, SC 29425, USA
- Department of Biochemistry and Molecular Biology, Medical University of South Carolina, 303K Cannon Place, 171 Ashley Avenue, Charleston, SC 29425, USA
13
Kell DB, King RD. On the optimization of classes for the assignment of unidentified reading frames in functional genomics programmes: the need for machine learning. Trends Biotechnol 2000; 18:93-8. [PMID: 10675895 DOI: 10.1016/s0167-7799(99)01407-9] [Citation(s) in RCA: 62] [Impact Index Per Article: 2.6] [Indexed: 11/29/2022]
Abstract
At present, the assignment of function to novel genes uncovered by the systematic genome-sequencing programmes is a problem. Many studies anticipate that this can be achieved by analysing patterns of gene expression via the transcriptome, proteome and metabolome. Thus, functional genomics is, in part, an exercise in pattern classification. Because many genes have known functional classes, the problem of predicting their functional class is a supervised learning problem. However, most pattern classification methods that have been applied to the problem have been unsupervised clustering methods. Consequently, the best classification tools have not always been used. Furthermore, the present functional classes are suboptimal and new unsupervised clustering methods are needed to improve them. Better-structured functional classes will facilitate the prediction of biochemically testable functions.
Affiliation(s)
- D B Kell
- Institute of Biological Sciences, University of Wales, Aberystwyth, UK SY23 3DD.
14
Shaw AD, di Camillo A, Vlahov G, Jones A, Bianchi G, Rowland J, Kell DB. Discrimination of the variety and region of origin of extra virgin olive oils using 13C NMR and multivariate calibration with variable reduction. Anal Chim Acta 1997. [DOI: 10.1016/s0003-2670(97)00037-8] [Citation(s) in RCA: 51] [Impact Index Per Article: 1.9] [Indexed: 10/27/2022]