1
Joshi PB. Navigating with chemometrics and machine learning in chemistry. Artif Intell Rev 2023; 56:1-26. [PMID: 36714038 PMCID: PMC9870782 DOI: 10.1007/s10462-023-10391-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/09/2023] [Indexed: 01/25/2023]
Abstract
Chemometrics and machine learning are artificial intelligence-based methods stirring a transformative change in chemistry. Organic synthesis, drug discovery and analytical techniques are incorporating machine learning techniques at an accelerated pace. However, machine-assisted chemistry faces challenges while solving critical problems in chemistry due to complex relationships in data sets. Even with increasing publishing volumes on machine learning, its application in areas of chemistry is not a straightforward endeavour. A particular concern in applying machine learning in chemistry is data availability and reproducibility. The present review article discusses the various chemometric methods, expert systems, and machine learning techniques developed for solving problems of organic synthesis and drug discovery with selected examples. Further, a concise discussion on chemometrics and ML deployed in analytical techniques such as spectroscopy, microscopy and chromatography is presented. Finally, the review reflects the challenges, opportunities and future perspectives on machine learning and automation in chemistry. The review concludes by pondering on some tough questions on applying machine learning and their possibility of navigation in the different terrains of chemistry.
Affiliation(s)
- Payal B. Joshi
- Operations and Method Development, Shefali Research Laboratories, Ambernath (East), Thane, Maharashtra 421501 India
2
Williams W, Zeng L, Gensch T, Sigman MS, Doyle AG, Anslyn EV. The Evolution of Data-Driven Modeling in Organic Chemistry. ACS Cent Sci 2021; 7:1622-1637. [PMID: 34729406 PMCID: PMC8554870 DOI: 10.1021/acscentsci.1c00535] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 05/03/2021] [Indexed: 05/14/2023]
Abstract
Organic chemistry is replete with complex relationships: for example, how a reactant's structure relates to the resulting product formed; how reaction conditions relate to yield; how a catalyst's structure relates to enantioselectivity. Questions like these are at the foundation of understanding reactivity and developing novel and improved reactions. An approach to probing these questions that is both longstanding and contemporary is data-driven modeling. Here, we provide a synopsis of the history of data-driven modeling in organic chemistry and the terms used to describe these endeavors. We include a timeline of the steps that led to its current state. The case studies included highlight how, as a community, we have advanced physical organic chemistry tools with the aid of computers and data to augment the intuition of expert chemists and to facilitate the prediction of structure-activity and structure-property relationships.
Affiliation(s)
- Wendy L. Williams
- Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90095, United States
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Lingyu Zeng
- Department of Chemistry, The University of Texas at Austin, Austin, Texas 78712, United States
- Tobias Gensch
- Department of Chemistry, TU Berlin, Straße des 17. Juni 135, Sekr. C2, 10623 Berlin, Germany
- Matthew S. Sigman
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Abigail G. Doyle
- Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90095, United States
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
- Eric V. Anslyn
- Department of Chemistry, The University of Texas at Austin, Austin, Texas 78712, United States
3
Abstract
Two-dimensional gel electrophoresis has been instrumental in the development of proteomics. Although it is no longer the exclusive scheme used for proteomics, its unique features make it a still highly valuable tool, especially when multiple quantitative comparisons of samples must be made, and even for large samples series. However, quantitative proteomics using two-dimensional gels is critically dependent on the performances of the protein detection methods used after the electrophoretic separations. This chapter therefore examines critically the various detection methods (radioactivity, dyes, fluorescence, and silver) as well as the data analysis issues that must be taken into account when quantitative comparative analysis of two-dimensional gels is performed.
4
Molina-Mora JA, Chinchilla-Montero D, Castro-Peña C, García F. Two-dimensional gel electrophoresis (2D-GE) image analysis based on CellProfiler: Pseudomonas aeruginosa AG1 as model. Medicine (Baltimore) 2020; 99:e23373. [PMID: 33285719 PMCID: PMC7717798 DOI: 10.1097/md.0000000000023373] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
Two-dimensional gel electrophoresis (2D-GE) is an indispensable technique for the study of proteomes of biological systems, providing an assessment of changes in protein abundance under various experimental conditions. However, due to the complexity of 2D-GE gels, there is no systematic, automatic, and reproducible protocol for image analysis and specific implementations are required for each context. In addition, practically all available solutions are commercial, which implies high cost and little flexibility to modulate the parameters of the algorithms. Using the bacterial strain, Pseudomonas aeruginosa AG1 as a model, we obtained images from 2D-GE of periplasmic protein profiles when the strain was exposed to multiple conditions, including antibiotics. Then, we proceeded to implement and evaluate an image analysis protocol with open-source software, CellProfiler. First, a preprocessing step included a bUnwarpJ-Image pipeline for aligning 2D-GE images. Then, using CellProfiler, we standardized two pipelines for spots identification. Total spots recognition was achieved using segmentation by intensity, whose performance was evaluated when compared with a reference protocol. In a second pipeline with the same program, differential identification of spots was addressed when comparing pairs of protein profiles. Due to the characteristics of the programs used, our workflow can automatically analyze a large number of images and it is parallelizable, which is an advantage with respect to other implementations. Finally, we compared six experimental conditions of bacterial strain in the presence or absence of antibiotics, determining protein profiles relationships by applying clustering algorithms PCA (Principal Components Analysis) and HC (Hierarchical Clustering).
5
Santucci L, Bruschi M, Ghiggeri GM, Candiano G. The latest advancements in proteomic two-dimensional gel electrophoresis analysis applied to biological samples. Methods Mol Biol 2015; 1243:103-125. [PMID: 25384742 DOI: 10.1007/978-1-4939-1872-0_6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 06/04/2023]
Abstract
Two-dimensional gel electrophoresis (2DE) is one of the fundamental approaches in proteomics for the separation and visualization of complex protein mixtures. Proteins can be analyzed by 2DE using isoelectric focusing (IEF) in the first dimension, combined with sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) in the second dimension, gel staining (silver and Coomassie), image analysis, and 2DE gel database. High-resolution 2DE can resolve up to 5,000 different proteins simultaneously (∼2,000 proteins routinely), and detect and quantify <1 ng of protein per spot. Here, we describe the latest developments for a more complete analysis of biological fluids.
Affiliation(s)
- Laura Santucci
- Laboratory on Pathophysiology of Uremia, Istituto Giannina Gaslini, Largo G. Gaslini 5, Genoa, Italy
6
Paleoproteomics explained to youngsters: how did the wedding of two-dimensional electrophoresis and protein sequencing spark proteomics on: let there be light. J Proteomics 2014; 107:5-12. [PMID: 24657497 DOI: 10.1016/j.jprot.2014.03.011] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 01/06/2014] [Revised: 02/26/2014] [Accepted: 03/04/2014] [Indexed: 11/22/2022]
Abstract
Taking the opportunity of the 20th anniversary of the word "proteomics", this young adult age is a good time to remember how proteomics came from enormous progress in protein separation and protein microanalysis techniques, and from the conjugation of these advances into a high performance and streamlined working setup. However, in the history of the almost three decades that encompass the first attempts to perform large scale analysis of proteins to the current high throughput proteomics that we can enjoy now, it is also interesting to underline and to recall how difficult the first decade was. Indeed when the word was cast, the battle was already won. This recollection is mostly devoted to the almost forgotten period where proteomics was being conceived and put to birth, as this collective scientific work will never appear when searched through the keyword "proteomics". BIOLOGICAL SIGNIFICANCE: The significance of this manuscript is to recall and review the two decades that separated the first attempts of performing large scale analysis of proteins from the solid technical corpus that existed when the word "proteomics" was coined twenty years ago. This recollection is made within the scientific historical context of this decade, which also saw the blossoming of DNA cloning and sequencing. This article is part of a Special Issue entitled: 20 years of Proteomics in memory of Vitaliano Pallini. Guest Editors: Luca Bini, Juan J. Calvete, Natacha Turck, Denis Hochstrasser and Jean-Charles Sanchez.
7
Affiliation(s)
- Dirk Benndorf
- Department of Bioprocess Engineering; Otto von Guericke University Magdeburg; Magdeburg Germany
- Udo Reichl
- Department of Bioprocess Engineering; Otto von Guericke University Magdeburg; Magdeburg Germany
- Department of Bioprocess Engineering; Max Planck Institute for Dynamics of Complex Technical Systems; Magdeburg Germany
8
Abstract
Two-dimensional gel electrophoresis has been instrumental in the development of proteomics. Although it is no longer the exclusive scheme used for proteomics, its unique features make it a still highly valuable tool, especially when multiple quantitative comparisons of samples must be made, and even for large samples series. However, quantitative proteomics using 2D gels is critically dependent on the performances of the protein detection methods used after the electrophoretic separations. This chapter therefore examines critically the various detection methods (radioactivity, dyes, fluorescence, and silver) as well as the data analysis issues that must be taken into account when quantitative comparative analysis of 2D gels is performed.
Affiliation(s)
- Thierry Rabilloud
- CEA-DSV-iRTSV/CBM and UMR CNRS-UJF 5249, CEA Grenoble, Grenoble, France.
9
Two-dimensional gel electrophoresis in proteomics: a tutorial. J Proteomics 2011; 74:1829-41. [PMID: 21669304 DOI: 10.1016/j.jprot.2011.05.040] [Citation(s) in RCA: 169] [Impact Index Per Article: 13.0] [Received: 02/23/2011] [Revised: 05/23/2011] [Accepted: 05/26/2011] [Indexed: 12/12/2022]
Abstract
Two-dimensional electrophoresis of proteins has preceded, and accompanied, the birth of proteomics. Although it is no longer the only experimental scheme used in modern proteomics, it still has distinct features and advantages. The purpose of this tutorial paper is to guide the reader through the history of the field, then through the main steps of the process, from sample preparation to in-gel detection of proteins, commenting the constraints and caveats of the technique. Then the limitations and positive features of two-dimensional electrophoresis are discussed (e.g. its unique ability to separate complete proteins and its easy interfacing with immunoblotting techniques), so that the optimal type of applications of this technique in current and future proteomics can be perceived. This is illustrated by a detailed example taken from the literature and commented in detail. This Tutorial is part of the International Proteomics Tutorial Programme (IPTP 2).
10
Two-dimensional gel electrophoresis in proteomics: Past, present and future. J Proteomics 2010; 73:2064-77. [PMID: 20685252 DOI: 10.1016/j.jprot.2010.05.016] [Citation(s) in RCA: 288] [Impact Index Per Article: 20.6] [Received: 02/26/2010] [Revised: 05/20/2010] [Accepted: 05/25/2010] [Indexed: 12/14/2022]
Abstract
Two-dimensional gel electrophoresis has been instrumental in the birth and developments of proteomics, although it is no longer the exclusive separation tool used in the field of proteomics. In this review, a historical perspective is made, starting from the days where two-dimensional gels were used and the word proteomics did not even exist. The events that have led to the birth of proteomics are also recalled, ending with a description of the now well-known limitations of two-dimensional gels in proteomics. However, the often-underestimated advantages of two-dimensional gels are also underlined, leading to a description of how and when to use two-dimensional gels for the best in a proteomics approach. Taking support of these advantages (robustness, resolution, and ability to separate entire, intact proteins), possible future applications of this technique in proteomics are also mentioned.
11
Apweiler R, Aslanidis C, Deufel T, Gerstner A, Hansen J, Hochstrasser D, Kellner R, Kubicek M, Lottspeich F, Maser E, Mewes HW, Meyer HE, Müllner S, Mutter W, Neumaier M, Nollau P, Nothwang HG, Ponten F, Radbruch A, Reinert K, Rothe G, Stockinger H, Tárnok A, Taussig MJ, Thiel A, Thiery J, Ueffing M, Valet G, Vandekerckhove J, Wagener C, Wagner O, Schmitz G. Approaching clinical proteomics: Current state and future fields of application in cellular proteomics. Cytometry A 2009; 75:816-32. [DOI: 10.1002/cyto.a.20779] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Indexed: 01/24/2023]
12
Rabilloud T, Vaezzadeh AR, Potier N, Lelong C, Leize-Wagner E, Chevallet M. Power and limitations of electrophoretic separations in proteomics strategies. Mass Spectrom Rev 2009; 28:816-843. [PMID: 19072760 DOI: 10.1002/mas.20204] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.2] [Indexed: 05/27/2023]
Abstract
Proteomics can be defined as the large-scale analysis of proteins. Due to the complexity of biological systems, it is required to concatenate various separation techniques prior to mass spectrometry. These techniques, dealing with proteins or peptides, can rely on chromatography or electrophoresis. In this review, the electrophoretic techniques are under scrutiny. Their principles are recalled, and their applications for peptide and protein separations are presented and critically discussed. In addition, the features that are specific to gel electrophoresis and that interplay with mass spectrometry (i.e., protein detection after electrophoresis, and the process leading from a gel piece to a solution of peptides) are also discussed.
13
Apweiler R, Aslanidis C, Deufel T, Gerstner A, Hansen J, Hochstrasser D, Kellner R, Kubicek M, Lottspeich F, Maser E, Mewes HW, Meyer HE, Müllner S, Mutter W, Neumaier M, Nollau P, Nothwang HG, Ponten F, Radbruch A, Reinert K, Rothe G, Stockinger H, Tarnok A, Taussig MJ, Thiel A, Thiery J, Ueffing M, Valet G, Vandekerckhove J, Verhuven W, Wagener C, Wagner O, Schmitz G. Approaching clinical proteomics: current state and future fields of application in fluid proteomics. Clin Chem Lab Med 2009; 47:724-44. [DOI: 10.1515/cclm.2009.167] [Citation(s) in RCA: 98] [Impact Index Per Article: 6.5] [Indexed: 12/31/2022]
14
Gonzalez E, Neuhaus T, Kemper MJ, Girardin E. Proteomic analysis of mononuclear cells of patients with minimal-change nephrotic syndrome of childhood. Nephrol Dial Transplant 2008; 24:149-55. [DOI: 10.1093/ndt/gfn459] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]
15
Berth M, Moser FM, Kolbe M, Bernhardt J. The state of the art in the analysis of two-dimensional gel electrophoresis images. Appl Microbiol Biotechnol 2007; 76:1223-43. [PMID: 17713763 PMCID: PMC2279157 DOI: 10.1007/s00253-007-1128-0] [Citation(s) in RCA: 140] [Impact Index Per Article: 8.2] [Received: 05/10/2007] [Revised: 07/13/2007] [Accepted: 07/14/2007] [Indexed: 11/21/2022]
Abstract
Software-based image analysis is a crucial step in the biological interpretation of two-dimensional gel electrophoresis experiments. Recent significant advances in image processing methods combined with powerful computing hardware have enabled the routine analysis of large experiments. We cover the process starting with the imaging of 2-D gels, quantitation of spots, creation of expression profiles to statistical expression analysis followed by the presentation of results. Challenges for analysis software as well as good practices are highlighted. We emphasize image warping and related methods that are able to overcome the difficulties that are due to varying migration positions of spots between gels. Spot detection, quantitation, normalization, and the creation of expression profiles are described in detail. The recent development of consensus spot patterns and complete expression profiles enables one to take full advantage of statistical methods for expression analysis that are well established for the analysis of DNA microarray experiments. We close with an overview of visualization and presentation methods (proteome maps) and current challenges in the field.
Affiliation(s)
- Matthias Berth
- DECODON GmbH, Rathenau-Strasse 49a, 17489 Greifswald, Germany
- Markus Kolbe
- DECODON GmbH, Rathenau-Strasse 49a, 17489 Greifswald, Germany
- Jörg Bernhardt
- DECODON GmbH, Rathenau-Strasse 49a, 17489 Greifswald, Germany
- Institute of Microbiology, Greifswald University, Jahnstrasse 15, 17487 Greifswald, Germany
16
Matthiesen R. Methods, algorithms and tools in computational proteomics: A practical point of view. Proteomics 2007; 7:2815-32. [PMID: 17703506 DOI: 10.1002/pmic.200700116] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.2] [Indexed: 11/08/2022]
Abstract
Computational MS-based proteomics is an emerging field arising from the demand of high throughput analysis in numerous large-scale experimental proteomics projects. The review provides a broad overview of a number of computational tools available for data analysis of MS-based proteomics data and gives appropriate literature references to detailed description of algorithms. The review provides, to some extent, discussion of algorithms and methods for peptide and protein identification using MS data, quantitative proteomics, and data storage. The hope is that it will stimulate discussion and further development in computational proteomics. Computational proteomics deserves more scientific attention. There are far fewer computational tools and methods available for proteomics compared to the number of microarray tools, despite the fact that data analysis in proteomics is much more complex than microarray analysis.
Affiliation(s)
- Rune Matthiesen
- Bioinformatics Group, CIC bioGUNE, CIBER-HEPAD, Technology Park of Bizkaia, Derio, Bizkaia, Spain.
17
Fung ET, Weinberger SR, Gavin E, Zhang F. Bioinformatics approaches in clinical proteomics. Expert Rev Proteomics 2007; 2:847-62. [PMID: 16307515 DOI: 10.1586/14789450.2.6.847] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022]
Abstract
Protein expression profiling is increasingly being used to discover, validate and characterize biomarkers that can potentially be used for diagnostic purposes and to aid in pharmaceutical development. Correct analysis of data obtained from these experiments requires an understanding of the underlying analytic procedures used to obtain the data, statistical principles underlying high-dimensional data and clinical statistical tools used to determine the utility of the interpreted data. This review summarizes each of these steps, with the goal of providing the nonstatistician proteomics researcher with a working understanding of the various approaches that may be used by statisticians. Emphasis is placed on the process of mining high-dimensional data to identify a specific set of biomarkers that may be used in a diagnostic or other assay setting.
Affiliation(s)
- Eric T Fung
- Ciphergen Biosystems, Inc., 6611 Dumbarton Circle, Fremont, CA 94555, USA.
18
Meunier B, Dumas E, Piec I, Béchet D, Hébraud M, Hocquette JF. Assessment of Hierarchical Clustering Methodologies for Proteomic Data Mining. J Proteome Res 2006; 6:358-66. [PMID: 17203979 DOI: 10.1021/pr060343h] [Citation(s) in RCA: 119] [Impact Index Per Article: 6.6] [Indexed: 11/28/2022]
Abstract
Hierarchical clustering methodology is a powerful data mining approach for a first exploration of proteomic data. It enables samples or proteins to be grouped blindly according to their expression profiles. Nevertheless, the clustering results depend on parameters such as data preprocessing, between-profile similarity measurement, and the dendrogram construction procedure. We assessed several clustering strategies by calculating the F-measure, a widely used quality metric. The combination, on logged matrix, of Pearson correlation and Ward's methods for data aggregation is among the best clustering strategies, at least with the data sets we studied. This study was carried out using PermutMatrix, a freely available software derived from transcriptomics.
Affiliation(s)
- Bruno Meunier
- UR 1213, Unité de Recherches sur les Herbivores, Equipe Croissance et Métabolisme du Muscle, INRA de Clermont-Ferrand/Theix, F-63122 Saint-Genès Champanelle, France.
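The clustering strategy summarized in the abstract above (logged matrix, Pearson correlation as the similarity measure, Ward's aggregation) can be sketched as follows. This is a minimal illustration with NumPy and SciPy, not the PermutMatrix implementation; the function name `cluster_profiles`, the `log2(x + 1)` offset, and the toy spot volumes are assumptions made for the example. Because SciPy's Ward linkage assumes Euclidean geometry, each profile is centered and scaled to unit length first, so that the squared Euclidean distance between two rows equals 2(1 - r), where r is their Pearson correlation.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_profiles(intensities, n_clusters):
    """Group expression profiles (rows) by Pearson similarity with Ward linkage."""
    # Log-transform the raw spot volumes; the +1 offset is an assumption to avoid log(0).
    logged = np.log2(np.asarray(intensities, dtype=float) + 1.0)
    # Center each row and scale it to unit length (rows must not be constant).
    centered = logged - logged.mean(axis=1, keepdims=True)
    unit = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    # Ward linkage on these rows uses Euclidean distance, with d^2 = 2 * (1 - r).
    tree = linkage(unit, method="ward")
    # Cut the dendrogram into at most n_clusters flat groups.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Hypothetical spot volumes: three rising and three falling profiles.
spots = [
    [1, 2, 4, 8],
    [10, 20, 40, 80],
    [3, 6, 12, 24],
    [8, 4, 2, 1],
    [80, 40, 20, 10],
    [24, 12, 6, 3],
]
labels = cluster_profiles(spots, n_clusters=2)
```

With this toy input the rising and falling profiles should fall into separate groups regardless of their absolute abundance, which is the point of combining a log transform with a correlation-based distance.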
19
Biron DG, Ponton F, Marché L, Galeotti N, Renault L, Demey-Thomas E, Poncet J, Brown SP, Jouin P, Thomas F. 'Suicide' of crickets harbouring hairworms: a proteomics investigation. Insect Mol Biol 2006; 15:731-42. [PMID: 17201766 DOI: 10.1111/j.1365-2583.2006.00671.x] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.9] [Indexed: 05/13/2023]
Abstract
Despite increasing evidence of host phenotypic manipulation by parasites, the underlying mechanisms causing infected hosts to act in ways that benefit the parasite remain enigmatic in most cases. Here, we used proteomics tools to identify the biochemical alterations that occur in the head of the cricket Nemobius sylvestris when it is driven to water by the hairworm Paragordius tricuspidatus. We characterized host and parasite proteomes during the expression of the water-seeking behaviour. We found that the parasite produces molecules from the Wnt family that may act directly on the development of the central nervous system (CNS). In the head of manipulated cricket, we found differential expression of proteins specifically linked to neurogenesis, circadian rhythm and neurotransmitter activities. We also detected proteins for which the function(s) are still unknown. This proteomics study on the biochemical pathways altered by hairworms has also allowed us to tackle questions of physiological and molecular convergence in the mechanism(s) causing the alteration of orthoptera behaviour. The two hairworm species produce effective molecules acting directly on the CNS of their orthoptera hosts.
Affiliation(s)
- D G Biron
- GEMI, UMR CNRS/IRD 2724, IRD, 911 av. Agropolis BP 64501, Montpellier cedex 5, France.
20
Biron DG, Brun C, Lefevre T, Lebarbenchon C, Loxdale HD, Chevenet F, Brizard JP, Thomas F. The pitfalls of proteomics experiments without the correct use of bioinformatics tools. Proteomics 2006; 6:5577-96. [PMID: 16991202 DOI: 10.1002/pmic.200600223] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.8] [Indexed: 02/04/2023]
Abstract
The elucidation of the entire genomic sequence of various organisms, from viruses to complex metazoans, most recently man, is undoubtedly the greatest triumph of molecular biology since the discovery of the DNA double helix. Over the past two decades, the focus of molecular biology has gradually moved from genomes to proteomes, the intention being to discover the functions of the genes themselves. The postgenomic era stimulated the development of new techniques (e.g. 2-DE and MS) and bioinformatics tools to identify the functions, reactions, interactions and location of the gene products in tissues and/or cells of living organisms. Both 2-DE and MS have been very successfully employed to identify proteins involved in biological phenomena (e.g. immunity, cancer, host-parasite interactions, etc.), although recently, several papers have emphasised the pitfalls of 2-DE experiments, especially in relation to experimental design, poor statistical treatment and the high rate of 'false positive' results with regard to protein identification. In the light of these perceived problems, we review the advantages and misuses of bioinformatics tools - from realisation of 2-DE gels to the identification of candidate protein spots - and suggest some useful avenues to improve the quality of 2-DE experiments. In addition, we present key steps which, in our view, need to be taken into consideration during such analyses. Lastly, we present novel biological entities named 'interactomes', and the bioinformatics tools developed to analyse the large protein-protein interaction networks they form, along with several new perspectives of the field.
Affiliation(s)
- David G Biron
- GEMI, UMR CNRS/IRD 2724, Centre IRD, Montpellier, France.
21
Carrette O, Burkhard PR, Sanchez JC, Hochstrasser DF. State-of-the-art two-dimensional gel electrophoresis: a key tool of proteomics research. Nat Protoc 2006; 1:812-23. [PMID: 17406312 DOI: 10.1038/nprot.2006.104] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.3] [Indexed: 11/09/2022]
Abstract
Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) is the most popular and versatile method of protein separation among a rapidly growing array of proteomics technologies. Based on two distinct procedures, it combines isoelectric focusing (IEF), which separates proteins according to their isoelectric point (pI), and SDS-PAGE, which separates them further according to their molecular mass. At present, 2D-PAGE is capable of simultaneously detecting and quantifying up to several thousand protein spots in the same gel image. Here we provide comprehensive step-by-step instructions for the application of a standardized 2D-PAGE protocol to a sample of human plasma or cerebrospinal fluid (CSF). The method can be easily adapted to any type of sample. This four-day protocol provides detailed information on how to apply complex biological fluids to an immobilized dry strip gel, cast home-made gradient acrylamide gels, run the gels, and perform standard staining methods. A troubleshooting guide is also included.
Affiliation(s)
- Odile Carrette
- Biomedical Proteomics Research Group, Department of Structural Biology and Bioinformatics, Faculty of Medicine, Geneva University, 1 rue Michel Servet CH-1211 Geneva 4, Switzerland
22
Vohradsky J, Thompson CJ. Systems level analysis of protein synthesis patterns associated with bacterial growth and metabolic transitions. Proteomics 2006; 6:785-93. [PMID: 16400688 DOI: 10.1002/pmic.200500206] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/11/2022]
Abstract
Gene expression databases, acquired by proteomics and transcriptomics, describe physiological and developmental programs at the systems level. Here we analyze proteosynthetic profiles in a bacterium undergoing defined metabolic changes. Streptomyces coelicolor cultured in a defined liquid medium displays four distinct patterns of gene expression associated with growth on glutamate, diauxic transition, and growth on maltose and ammonia that terminates by starvation for nitrogen and entry into stationary phase. Principal component and fuzzy cluster analyses of the proteome database of 935 protein spot profiles revealed principal kinetic patterns. Online linkage of the proteome database (SWICZ) to a protein-function database (KEGG) revealed limited correlations between expression profiles and metabolic pathway activities. Proteins belonging to principal metabolic pathways defined characteristic kinetic profiles correlated with the physiological state of the culture. These analyses supported the concept that metabolic flux was regulated not by individual enzymes but rather by groups of enzymes whose synthesis responded to changes in nutritional conditions. Higher-level regulation is reflected by the distribution of all kinetic profiles into only nine groups. The observation that enzymes representing principal metabolic pathways displayed their own distinctive average kinetic profiles suggested that expression of a "high-flux backbone" may dominate regulation of metabolic flux.
Affiliation(s)
- Jiri Vohradsky
- Institute of Microbiology, Czech Academy of Sciences, Prague, Czech Republic.
|
23
|
Biron DG, Marché L, Ponton F, Loxdale HD, Galéotti N, Renault L, Joly C, Thomas F. Behavioural manipulation in a grasshopper harbouring hairworm: a proteomics approach. Proc Biol Sci 2006; 272:2117-26. [PMID: 16191624 PMCID: PMC1559948 DOI: 10.1098/rspb.2005.3213] [Citation(s) in RCA: 105] [Impact Index Per Article: 5.8] [Indexed: 11/12/2022]
Abstract
The parasitic Nematomorph hairworm, Spinochordodes tellinii (Camerano) develops inside the terrestrial grasshopper, Meconema thalassinum (De Geer) (Orthoptera: Tettigoniidae), changing the insect's responses to water. The resulting aberrant behaviour makes infected insects more likely to jump into an aquatic environment where the adult parasite reproduces. We used proteomics tools (i.e. two-dimensional gel electrophoresis (2-DE), computer assisted comparative analysis of host and parasite protein spots and MALDI-TOF mass spectrometry) to identify these proteins and to explore the mechanisms underlying this subtle behavioural modification. We characterized simultaneously the host (brain) and the parasite proteomes at three stages of the manipulative process, i.e. before, during and after manipulation. For the host, there was a differential proteomic expression in relation to different effects such as the circadian cycle, the parasitic status, the manipulative period itself, and worm emergence. For the parasite, a differential proteomics expression allowed characterization of the parasitic and the free-living stages, the manipulative period and the emergence of the worm from the host. The findings suggest that the adult worm alters the normal functions of the grasshopper's central nervous system (CNS) by producing certain 'effective' molecules. In addition, in the brain of manipulated insects, there was found to be a differential expression of proteins specifically linked to neurotransmitter activities. The evidence obtained also suggested that the parasite produces molecules from the family Wnt acting directly on the development of the CNS. These proteins show important similarities with those known in other insects, suggesting a case of molecular mimicry. Finally, we found many proteins in the host's CNS as well as in the parasite for which the function(s) are still unknown in the published literature (www) protein databases. 
These results support the hypothesis that host behavioural changes are mediated by a mix of direct and indirect chemical manipulation.
Affiliation(s)
- D G Biron
- GEMI, UMR CNRS/IRD 2724, IRD, 911 av. Agropolis BP 64501, 34394 Montpellier cedex 5, France.
|
24
|
Biron DG, Agnew P, Marché L, Renault L, Sidobre C, Michalakis Y. Proteome of Aedes aegypti larvae in response to infection by the intracellular parasite Vavraia culicis. Int J Parasitol 2005; 35:1385-97. [PMID: 16102770 DOI: 10.1016/j.ijpara.2005.05.015] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Received: 02/17/2005] [Revised: 05/03/2005] [Accepted: 05/15/2005] [Indexed: 10/25/2022]
Abstract
We report on the modification of the Aedes aegypti larval proteome following infection by the microsporidian parasite Vavraia culicis. Mosquito larvae were sampled at 5 and 15 days of age to compare the effects of infection when the parasite was in two different developmental stages. Modifications of the host proteome due to the stress of infection were distinguished from those of a more general nature by treatments involving hypoxia. We found that the major reaction to stress was the suppression of particular protein spots. Older (15 days) larvae reacted more strongly to infection by V. culicis (46% of the total number of spots affected; 17% for 5 days larvae), while the strongest reaction of younger (5 days) larvae was to hypoxia for pH range 5-8 and to combined effects of infection and hypoxia for pH range 3-6. MALDI-TOF results indicate that proteins induced or suppressed by infection are involved directly or indirectly in defense against microorganisms. Finally, our MALDI-TOF results suggest that A. aegypti larvae try to control or clear V. culicis infection and also that V. culicis probably impairs the immune defense of this host via arginases-NOS competition.
Affiliation(s)
- D G Biron
- GEMI, UMR CNRS/IRD 2724, Centre IRD de Montpellier, 911 Avenue Agropolis, BP 64501, 34394 Montpellier Cedex 5, France.
|
25
|
Biron DG, Moura H, Marché L, Hughes AL, Thomas F. Towards a new conceptual approach to ‘parasitoproteomics’. Trends Parasitol 2005; 21:162-8. [PMID: 15780837 DOI: 10.1016/j.pt.2005.02.009] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]
Abstract
Many parasitologists are betting heavily on proteomic studies to explain biochemical host-parasite interactions and, thus, to contribute to disease control. However, many "parasitoproteomic" studies are performed with powerful techniques but without a conceptual approach to determine whether the host genomic responses during a parasite infection represent a nonspecific response that might be induced by any parasite or any other stress. In this article, a new conceptual approach, based on evolutionary concepts of immune responses of a host to a parasite, is suggested for parasitologists to study the host proteome reaction after parasite invasion. Also, this new conceptual approach can be used to study other host-parasite interactions such as behavioral manipulation.
Affiliation(s)
- David G Biron
- GEMI, UMR CNRS, IRD 2724, IRD, 911 Avenue Agropolis BP 64501, 34394 Montpellier Cedex 5, France.
|
26
|
Marengo E, Robotti E, Antonucci F, Cecconi D, Campostrini N, Righetti PG. Numerical approaches for quantitative analysis of two-dimensional maps: A review of commercial software and home-made systems. Proteomics 2005; 5:654-66. [PMID: 15669000 DOI: 10.1002/pmic.200401015] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.4] [Indexed: 11/09/2022]
Abstract
The present review attempts to cover a number of methods that have appeared in the last few years for performing quantitative proteome analysis. However, due to the large number of methods described for both electrophoretic and chromatographic approaches, we have limited this review to conventional two-dimensional (2-D) map analysis which couples orthogonally a charge-based step (isoelectric focusing) to a size-based separation step (sodium dodecyl sulfate-electrophoresis). The first and oldest method applied to 2-D map data reduction is based on statistical analysis performed on sets of gels via powerful software packages, such as Melanie, PDQuest, Z3 and Z4000, Phoretix and Progenesis. This method calls for separately running a number of replicas for control and treated samples. The two sets of data are then merged and compared via a number of software packages which we describe. In addition to commercially-available systems, a number of home made approaches for 2-D map comparison have been recently described and are also reviewed. They are based on fuzzyfication of the digitized 2-D gel image coupled to linear discriminant analysis, three-way principal component analysis or a combination of principal component analysis and soft-independent modeling of class analogy. These statistical tools appear to perform well in differential proteomic studies.
Affiliation(s)
- Emilio Marengo
- Department of Environmental and Life Sciences, University of Eastern Piedmont, Alessandria, Italy
|
27
|
Gottlieb DM, Schultz J, Bruun SW, Jacobsen S, Søndergaard I. Multivariate approaches in plant science. Phytochemistry 2004; 65:1531-1548. [PMID: 15276450 DOI: 10.1016/j.phytochem.2004.04.008] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 12/18/2003] [Revised: 04/01/2004] [Indexed: 05/24/2023]
Abstract
The objective of proteomics is to get an overview of the proteins expressed at a given point in time in a given tissue and to identify the connection to the biochemical status of that tissue. Therefore sample throughput and analysis time are important issues in proteomics. The concept of proteomics is to encircle the identity of proteins of interest. However, the overall relation between proteins must also be explained. Classical proteomics consist of separation and characterization, based on two-dimensional electrophoresis, trypsin digestion, mass spectrometry and database searching. Characterization includes labor intensive work in order to manage, handle and analyze data. The field of classical proteomics should therefore be extended to also include handling of large datasets in an objective way. The separation obtained by two-dimensional electrophoresis and mass spectrometry gives rise to huge amount of data. We present a multivariate approach to the handling of data in proteomics with the advantage that protein patterns can be spotted at an early stage and consequently the proteins selected for sequencing can be selected intelligently. These methods can also be applied to other data generating protein analysis methods like mass spectrometry and near infrared spectroscopy and examples of application to these techniques are also presented. Multivariate data analysis can unravel complicated data structures and may thereby relieve the characterization phase in classical proteomics. Traditionally statistical methods are not suitable for analysis of the huge amounts of data, where the number of variables exceed the number of objects. Multivariate data analysis, on the other hand, may uncover the hidden structures present in these data. This study takes its starting point in the field of classical proteomics and shows how multivariate data analysis can lead to faster ways of finding interesting proteins. 
Multivariate analysis has shown interesting results as a supplement to classical proteomics and added a new dimension to the field of proteomics.
Affiliation(s)
- David M Gottlieb
- Plasma Product Division, Statens Serum Institut, Artillerivej 5, DK-2300 Copenhagen S, Denmark
|
28
|
Caron M, Imam-Sghiouar N, Poirier F, Le Caër JP, Labas V, Joubert-Caron R. Proteomic map and database of lymphoblastoid proteins. J Chromatogr B Analyt Technol Biomed Life Sci 2002; 771:197-209. [PMID: 12015999 DOI: 10.1016/s1570-0232(02)00040-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]
Abstract
Advances in genomics have led to the accumulation of an unprecedented amount of data, giving rise to a new field in biochemistry, proteomics. We used a combination of two dimensional gel electrophoresis, analysis and annotation using third-generation software, and mass spectrometry to establish the proteome maps of lymphoblastoid B-cells, a prerequisite for analysis of drug effects and lymphocyte cell diseases. About 1200 protein spots were detected and characterised in terms of their isoelectric point, molecular mass and expression. The present status of proteomic technologies, as well as a description of the usefulness of human hematopoietic cells proteomic database are discussed.
Affiliation(s)
- Michel Caron
- Université Paris 13, UFR SMBHI Leonard de Vinci, Bobigny, France.
|
29
|
Vuadens F, Gasparini D, Déon C, Sanchez JC, Hochstrasser DF, Schneider P, Tissot JD. Identification of specific proteins in different lymphocyte populations by proteomic tools. Proteomics 2002. [DOI: 10.1002/1615-9861(200201)2:1<105::aid-prot105>3.0.co;2-f] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.8] [Indexed: 01/22/2023]
|
30
|
Abstract
The completed draft of the human genome sequence has facilitated a revolution in neuroscience research. This sequence information and the development of new technologies used to analyze gene expression on a genomic scale provides a new and powerful means to investigate brain disorders of unknown etiology and to isolate novel drug targets for these disorders. The term functional genomics broadly describes a set of technologies and strategies directed at the problem of determining the function of genes, and understanding how the genome works together to generate whole patterns of biological function. The most powerful of these functional genomics approaches, expression profiling or DNA microarrays, can be used to analyze the expression of thousands of genes simultaneously. The results to date from the application of DNA microarray methods to postmortem diseased human brain tissue, animal models and cell culture models of brain disorders provide an exciting glimpse into the future of this field.
Affiliation(s)
- Paul D Shilling
- Department of Psychiatry, University of California at San Diego, and San Diego VA Healthcare System, La Jolla, 92093, USA
|
31
|
Abstract
This review describes briefly proteome science. It explains why proteome science or proteomics emerged only recently and why a shift from genomics to proteomics is occurring. This review further illustrates that proteomics can unravel new domains in nature's complexity. Finally, it demonstrates that proteomics is offering new tools for the study of complex biological or medical problems.
Affiliation(s)
- D F Hochstrasser
- Medical Biochemistry Department, Geneva University Hospitals, Switzerland.
|
32
|
Anderson NL, Anderson NG. Proteome and proteomics: new technologies, new concepts, and new words. Electrophoresis 1998; 19:1853-61. [PMID: 9740045 DOI: 10.1002/elps.1150191103] [Citation(s) in RCA: 581] [Impact Index Per Article: 22.3] [Indexed: 01/10/2023]
Abstract
The goal of proteomics is a comprehensive, quantitative description of protein expression and its changes under the influence of biological perturbations such as disease or drug treatment. Quantitative analysis of protein expression data obtained by high-throughput methods has led us to define the concept of "regulatory homology" and use it to begin to elucidate the basic structure of gene expression control in vivo. Such investigations lay the groundwork for construction of comprehensive databases of mechanisms (cataloguing possible biological outcomes), the next logical step after the soon to be completed cataloguing of genes and gene products. Mechanism databases provide a roadmap towards effective therapeutic intervention that is more direct than that offered by conventional genomics approaches.
Affiliation(s)
- N L Anderson
- Large Scale Biology Corporation, Rockville, MD 20850-3338, USA.
|
33
|
Neumann H, Müllner S. Two replica blotting methods for fast immunological analysis of common proteins in two-dimensional electrophoresis. Electrophoresis 1998; 19:752-7. [PMID: 9629910 DOI: 10.1002/elps.1150190525] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]
Abstract
The combination of two-dimensional electrophoresis (2-DE) and subsequent Western blot analysis with antibodies directed against common cellular proteins is a straightforward and reliable method to quickly generate fix points in a protein map. In order to assure high accuracy in the allocation of protein spots, two different replica blotting methods for semidry blotting devices were established. The first of the two was described by Johansson (Electrophoresis 1987, 8, 379-383). By systematically changing the direction of the blotting current, proteins were simultaneously transferred from one gel onto two membranes placed at both sides of the gel. However, several modifications of this method were necessary in order to use a semidry blotting device. The second method described here combines the standard blotting procedure with the generation of a 'contact copy' from the gel. Both systems offer the possibility to subject one membrane to antibody-mediated imaging, while the second membrane can be stained with highly sensitive total protein detection procedures. Protein identification is then carried out by comparing the signals on both matrices.
Affiliation(s)
- H Neumann
- Hoechst Marion Roussel, DG Rheumatology, Wiesbaden, Germany
|
34
|
Appel RD, Palagi PM, Walther D, Vargas JR, Sanchez JC, Ravier F, Pasquali C, Hochstrasser DF. Melanie II--a third-generation software package for analysis of two-dimensional electrophoresis images: I. Features and user interface. Electrophoresis 1997; 18:2724-34. [PMID: 9504804 DOI: 10.1002/elps.1150181506] [Citation(s) in RCA: 104] [Impact Index Per Article: 3.9] [Indexed: 02/06/2023]
Abstract
Although two-dimensional electrophoresis (2-DE) computer analysis software packages have existed ever since 2-DE technology was developed, it is only now that the hardware and software technology allows large-scale studies to be performed on low-cost personal computers or workstations, and that setting up a 2-DE computer analysis system in a small laboratory is no longer considered a luxury. After a first attempt in the seventies and early eighties to develop 2-DE analysis software systems on hardware that had poor or even no graphical capabilities, followed in the late eighties by a wave of innovative software developments that were possible thanks to new graphical interface standards such as XWindows, a third generation of 2-DE analysis software packages has now come to maturity. It can be run on a variety of low-cost, general-purpose personal computers, thus making the purchase of a 2-DE analysis system easily attainable for even the smallest laboratory that is involved in proteome research. Melanie II 2-D PAGE, developed at the University Hospital of Geneva, is such a third-generation software system for 2-DE analysis. Based on unique image processing algorithms, this user-friendly object-oriented software package runs on multiple platforms, including Unix, MS-Windows 95 and NT, and Power Macintosh. It provides efficient spot detection and quantitation, state-of-the-art image comparison, statistical data analysis facilities, and is Internet-ready. Linked to proteome databases such as those available on the World Wide Web, it represents a valuable tool for the "Virtual Lab" of the post-genome area.
Affiliation(s)
- R D Appel
- Medical Informatics Division, Geneva University Hospital, Switzerland.
|
35
|
Vohradský J. Adaptive classification of two-dimensional gel electrophoretic spot patterns by neural networks and cluster analysis. Electrophoresis 1997; 18:2749-54. [PMID: 9504806 DOI: 10.1002/elps.1150181508] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.8] [Indexed: 02/06/2023]
Abstract
The interpretation of two-dimensional gel electrophoresis spot profiles can be facilitated by statistical and machine learning programs. Two different approaches to classification of spot profiles - cluster analysis and neural networks - are discussed. Neural networks for two different model patterns were designed and an algorithm for training of the net for the classification was developed. It was shown that the performance of neural networks is higher compared to cluster and principal component analysis. The possibility of combining both approaches into one process can increase reliability and speed of classification. Artificially created training sets with added random noise can be used for network training. The analysis was applied on the Streptomyces coelicolor developmental two-dimensional (2-D) gel database.
Affiliation(s)
- J Vohradský
- Czech Academy of Sciences, Institute of Microbiology, Prague, Czech Republic.
|
36
|
Appel RD, Vargas JR, Palagi PM, Walther D, Hochstrasser DF. Melanie II--a third-generation software package for analysis of two-dimensional electrophoresis images: II. Algorithms. Electrophoresis 1997; 18:2735-48. [PMID: 9504805 DOI: 10.1002/elps.1150181507] [Citation(s) in RCA: 126] [Impact Index Per Article: 4.7] [Indexed: 02/06/2023]
Abstract
After two generations of software systems for the analysis of two-dimensional electrophoresis (2-DE) images, a third generation of such software packages has recently emerged that combines state-of-the-art graphical user interfaces with comprehensive spot data analysis capabilities. A key characteristic common to most of these software packages is that many of their tools are implementations of algorithms that resulted from research areas such as image processing, vision, artificial intelligence or machine learning. This article presents the main algorithms implemented in the Melanie II 2-D PAGE software package. The applications of these algorithms, embodied as the feature of the program, are explained in an accompanying article (R. D. Appel et al.; Electrophoresis 1997, 18, 2724-2734).
Affiliation(s)
- R D Appel
- Medical Informatics Division, Geneva University Hospital, Switzerland.
|
37
|
|
38
|
Mulholland M, Preston P, Hibbert D, Haddad P, Compton P. Teaching a computer ion chromatography from a database of published methods. J Chromatogr A 1996. [DOI: 10.1016/0021-9673(96)00045-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 11/26/2022]
|
39
|
Wilkins MR, Sanchez JC, Williams KL, Hochstrasser DF. Current challenges and future applications for protein maps and post-translational vector maps in proteome projects. Electrophoresis 1996; 17:830-8. [PMID: 8783009 DOI: 10.1002/elps.1150170504] [Citation(s) in RCA: 131] [Impact Index Per Article: 4.7] [Indexed: 02/02/2023]
Affiliation(s)
- M R Wilkins
- Central Clinical Chemistry Laboratory, Geneva University Hospital, Switzerland.
|
40
|
Schmid HR, Schmitter D, Blum P, Miller M, Vonderschmitt D. Lung tumor cells: a multivariate approach to cell classification using two-dimensional protein pattern. Electrophoresis 1995; 16:1961-8. [PMID: 8586071 DOI: 10.1002/elps.11501601322] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.1] [Indexed: 01/31/2023]
Abstract
High resolution two-dimensional polyacrylamide gel electrophoresis (2-D PAGE) is a powerful research tool for the analytical separation of cellular proteins. The qualitative and quantitative pattern of polypeptides synthesized by a cell represents its phenotype and thus defines characteristics such as the morphology and the biological behavior of the cell. By analyzing and comparing the protein patterns of different cells it is possible to recognize the cell type and also to identify the most typical features of these cells. In applied pathology it is often difficult to identify the tissue of origin and the stage or grade of a neoplasia by cellular morphology analyzed by classical or immunostaining procedures. The protein pattern itself is the most characteristic feature of a cell and should therefore contribute to the identification of the cell type. For this reason we separated protein fractions originating from different lung tumor cell lines using 2-D PAGE and we compared the resulting patterns on a multivariate statistical level using correspondence analysis (CA) and ascendant hierarchical clustering (AHC). The results indicate that (i) protein patterns are highly typical for cells and that (ii) the comparison of the protein patterns of a set of interesting cell types allows the identification of potentially new marker proteins. 2-D PAGE is thus a unique and powerful tool for molecular cytology or histopathology, unveiling the protein expression level of tissues or cells.
Affiliation(s)
- H R Schmid
- Institute for Clinical Chemistry, University Hospital, Zurich, Switzerland
|
41
|
Abstract
Methods for protein analysis, such as chromatography, electrophoresis, enzyme tests, receptor assays and immunological tests, have always been aimed in a classical reductionistic manner at investigating single proteins isolated from the complex protein composition of biological compartments. The complexity of the protein composition in biological systems was first visualized by two-dimensional electrophoresis (2-DE). Using 2-DE like a molecular microscope, protein variations between different biological situations may be detected by subtractive 2-DE analyses. Combining 2-DE with microsequencing, amino acid analysis and mass spectrometry protein spots on 2-DE gels may be identified. The sequence information can be used to find the gene. However, by 2-DE not only single protein changes can be detected and investigated on the gene level, but also complex changes of many proteins on a genomic scale.
Affiliation(s)
- P Jungblut
- Wittmann Institute of Technology and Analysis of Biomolecules, Teltow, Germany
|
42
|
Klose J, Kobalz U. Two-dimensional electrophoresis of proteins: an updated protocol and implications for a functional analysis of the genome. Electrophoresis 1995; 16:1034-59. [PMID: 7498127 DOI: 10.1002/elps.11501601175] [Citation(s) in RCA: 548] [Impact Index Per Article: 18.9] [Indexed: 01/25/2023]
Abstract
The two-dimensional electrophoresis (2-DE) technique developed by Klose in 1975 (Humangenetik 1975, 26, 211-234), independently of the technique developed by O'Farrell (J. Biol. Chem. 1975, 250, 4007-4021), has been revised in our laboratory and an updated protocol is presented. This protocol is the result of our experience in using this method since its introduction. Many modifications and suggestions found in the literature were also tested and then integrated into our original method if advantageous. Gel and buffer composition, size of gels, use of stacking gels or not, necessity of isoelectric focusing (IEF) gel incubation, freezing of IEF gels or immediate use, carrier ampholytes versus Immobilines, regulation of electric current, conditions for staining and drying the gels - these and other problems were the subject of our concern. Among the technical details and special equipment which constitute our 2-DE method presented here, a few features are of particular significance: (i) sample loading onto the acid side of the IEF gel with the result that both acidic and basic proteins are well resolved in the same gel; (ii) use of large (46 x 30 cm) gels to achieve high resolution, but without the need of unusually large, flat gel equipment; (iii) preparation of ready-made gel solutions which can be stored frozen, a prerequisite, among others, for high reproducibility. Using the 2-DE method described we demonstrate that protein patterns revealing more than 10 000 polypeptide spots can be obtained from mouse tissues. This is by far the highest resolution so far reported in the literature for 2-DE of complex protein mixtures. The 2-DE patterns were of high quality with regard to spot shape and background. The reproducibility of the protein patterns is demonstrated and shown to be thoroughly satisfactory. 
An example is given to show how effectively 2-DE of high resolution and reproducibility can be used to study the genetic variability of proteins in an interspecific mouse backcross (Mus musculus x Mus spretus) established by the European Backcross Collaborative Group for mapping the mouse genome. We outline our opinion that the structural analysis of the human genome, currently pursued most intensively on a worldwide scale, should be accompanied by a functional analysis of the genome that starts from the proteins of the organism.
Affiliation(s)
- J Klose
- Institut für Toxikologie und Embryopharmakologie, Freie Universität Berlin, Germany
|
43
|
|
44
|
Celis JE, Rasmussen HH, Olsen E, Madsen P, Leffers H, Honoré B, Dejgaard K, Gromov P, Hoffmann HJ, Nielsen M. The human keratinocyte two-dimensional gel protein database: update 1993. Electrophoresis 1993; 14:1091-198. [PMID: 8313869 DOI: 10.1002/elps.11501401178] [Citation(s) in RCA: 58] [Impact Index Per Article: 1.9] [Indexed: 01/29/2023]
Abstract
The master two-dimensional gel database of human keratinocytes currently lists 3038 cellular proteins (2127 isoelectric focusing, IEF; and 911 nonequilibrium pH gradient electrophoresis, NEPHGE) many of which correspond to post-translational modifications. 763 proteins have been identified (protein name, organelle components, etc.) and they are listed both in alphabetical order and with increasing SSP number, together with their M(r), pI, cellular localization and credit to the investigator(s) that aided in the identification. Furthermore we have listed 176 proteins that have been microsequenced so far and that are recorded in this database. We also include synthetic images depicting some interesting sets of proteins identified so far; these include components of hnRNP's, proteasomes or prosomes, ribosomes, as well as assorted organelle markers, GTP-binding proteins, calcium binding proteins, stress proteins, autoantigens, differentiation markers and psoriasis upregulated proteins. The aim of the comprehensive database is to gather, through a systematic study of keratinocytes, qualitative and quantitative information on proteins and their genes that may allow us to identify abnormal patterns of gene expression and ultimately to pinpoint signaling pathways and components affected in various skin diseases, cancer included.
Affiliation(s)
- J E Celis
- Institute of Medical Biochemistry, Aarhus University
|
45
|
Celis JE, Rasmussen HH, Leffers H, Madsen P, Honoré B, Dejgaard K, Gromov P, Olsen E, Hoffmann HJ, Nielsen M. Human cellular protein patterns and their link to genome DNA mapping and sequencing data: towards an integrated approach to the study of gene expression. Genetic Engineering 1993; 15:21-40. [PMID: 7763841 DOI: 10.1007/978-1-4899-1666-2_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2023]
Abstract
Analysis of cellular protein patterns by computer-aided two-dimensional gel electrophoresis together with recent advances in protein sequence analysis and expression systems have made possible the establishment of comprehensive two-dimensional gel protein databases that may link protein and DNA mapping and sequence information and that offer an integrated approach to the study of gene expression. With the integrated approach offered by two-dimensional gel protein databases it is now possible to reveal phenotype-specific protein(s), to microsequence them, to search for homology with previously identified proteins, to clone the cDNAs, to assign partial protein sequences to genes for which the full DNA sequence and the chromosome location are known, and to study the regulatory properties and function of groups of proteins that are coordinately expressed in a given biological process. Comprehensive two-dimensional gel protein databases will provide an integrated picture of the expression levels and properties of the thousands of protein components of organelles, pathways, and cytoskeletal systems, both under physiological and abnormal conditions, and are expected to lead to the identification of new regulatory networks. So far, about 20% (600 out of 2,980) of the total number of proteins recorded in the human keratinocyte protein database have been identified and we are actively gathering qualitative and quantitative biological data on all resolved proteins. Given the current improvements in microsequencing as well as the availability of specific antibodies, it seems feasible to expect that most known keratinocyte proteins will be identified in the very near future. This feat will reveal a wealth of new proteins that will become amenable to experimentation both at the biochemical and molecular biology level.
Affiliation(s)
- J E Celis
- Institute of Medical Biochemistry, Aarhus University, Denmark
|
46
|
Hochstrasser DF, Frutiger S, Paquet N, Bairoch A, Ravier F, Pasquali C, Sanchez JC, Tissot JD, Bjellqvist B, Vargas R. Human liver protein map: a reference database established by microsequencing and gel comparison. Electrophoresis 1992; 13:992-1001. [PMID: 1286669 DOI: 10.1002/elps.11501301201] [Citation(s) in RCA: 113] [Impact Index Per Article: 3.5] [Indexed: 12/26/2022]
Abstract
This publication establishes a reference human liver protein map obtained with immobilized pH gradients. By microsequencing, 57 spots or 42 polypeptide chains were identified. By protein map comparison and matching (liver, red blood cell and plasma sample maps), 8 additional proteins were identified. The new polypeptides and previously known proteins are listed in a table and/or labeled on the protein map, thus providing a human liver two-dimensional gel database. This reference map can be used to identify protein spots on other samples such as rectal cancer biopsies.
|
47
|
Celis JE, Rasmussen HH, Madsen P, Leffers H, Honoré B, Dejgaard K, Gesser B, Olsen E, Gromov P, Hoffmann HJ. The human keratinocyte two-dimensional gel protein database (update 1992): towards an integrated approach to the study of cell proliferation, differentiation and skin diseases. Electrophoresis 1992; 13:893-959. [PMID: 1286666 DOI: 10.1002/elps.11501301198] [Citation(s) in RCA: 83] [Impact Index Per Article: 2.6] [Indexed: 12/26/2022]
Abstract
The master two-dimensional gel database of human keratinocytes currently lists 2980 cellular proteins (2098 isoelectric focusing, IEF; and 882 nonequilibrium pH gradient electrophoresis, NEPHGE) many of which correspond to posttranslational modifications. About 20% of all recorded proteins have been identified (protein name, organelle components, etc.) and they are listed in alphabetical order together with their M(r), pI, cellular localization and credit to the investigator(s) that aided in the identification. Also, we have listed 145 microsequenced proteins that are recorded in this database. As an aid in localizing the polypeptides we have included blow-ups of the master images (IEF, NEPHGE) displaying all the protein numbers. In the long run, the master keratinocyte database is expected to link protein and DNA sequencing and mapping information (Human Genome Program) and to provide an integrated picture of the expression levels and properties of the thousands of proteins that orchestrate various keratinocyte functions both in health and disease.
Affiliation(s)
- J E Celis
- Institute of Medical Biochemistry, Aarhus University, Denmark
|
48
|
Hughes GJ, Frutiger S, Paquet N, Ravier F, Pasquali C, Sanchez JC, James R, Tissot JD, Bjellqvist B, Hochstrasser DF. Plasma protein map: an update by microsequencing. Electrophoresis 1992; 13:707-14. [PMID: 1459097 DOI: 10.1002/elps.11501301150] [Citation(s) in RCA: 119] [Impact Index Per Article: 3.7] [Indexed: 12/27/2022]
Abstract
The reference plasma protein map, obtained with immobilized pH gradients in the first dimension of two-dimensional electrophoresis, is presented. By microsequencing, more than 40 polypeptide chains were identified. The new polypeptides and previously known proteins are listed in a table and labeled on the protein map, thus providing an update of the human plasma two-dimensional gel database.
Affiliation(s)
- G J Hughes
- Medical Biochemistry Department, Geneva University, Switzerland
|
49
|
Wirth PJ, Luo LD, Fujimoto Y, Bisgaard HC, Olson AD. The rat liver epithelial (RLE) cell protein database. Electrophoresis 1991; 12:931-54. [PMID: 1794345 DOI: 10.1002/elps.1150121112] [Citation(s) in RCA: 25] [Impact Index Per Article: 0.8] [Indexed: 12/28/2022]
Abstract
Computer databases of rat liver epithelial (RLE) cellular polypeptides have been established using high resolution two-dimensional gel electrophoresis and computer-assisted analysis. Databases have been constructed utilizing both [35S]methionine- and [32P]orthophosphate-labeled as well as silver-stained polypeptides from normal RLE cells. The RLE database, which contains both qualitative and quantitative annotations, includes experiments with normal, chemically and oncogene transformed as well as spontaneously transformed cell lines. A total of 2537 [35S]methionine-labeled polypeptides from whole cell lysates (1920 acidic and 617 basic, separated in the first dimension using isoelectric focusing and nonequilibrium pH gradient electrophoresis, respectively) were analyzed and databases constructed using the Elsie 5 gel analysis system. To increase the "viewing window" and hence the usefulness of the RLE database, subcellular fractionation of whole cell preparations was performed and high resolution two-dimensional maps of the individual subcellular components were constructed. Databases representing 1229 cytosolic, 1539 acidic and 674 basic nuclear, 1746 membrane-associated, 415 mitochondrial, 773 in vitro translated and 350 phosphoproteins were established from these maps. The RLE databases contain the Elsie 5 identification number, protein name (if known), molecular weight and pI information, quantitative and spot shape data, and specific information regarding transformation-sensitive, growth-related (exponentially proliferating versus confluent) cell populations as well as those polypeptides modulated by specific growth factors. The RLE databases represent initial efforts toward the establishment of comprehensive databases of rat liver proteins and serve as a vital resource for on-going as well as future studies regarding the regulation of growth and differentiation as well as transformation of RLE cells.
Affiliation(s)
- P J Wirth
- Laboratory of Experimental Carcinogenesis, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892
|
50
|
Appel RD, Hochstrasser DF, Funk M, Vargas JR, Pellegrini C, Muller AF, Scherrer JR. The MELANIE project: from a biopsy to automatic protein map interpretation by computer. Electrophoresis 1991; 12:722-35. [PMID: 1802690 DOI: 10.1002/elps.1150121006] [Citation(s) in RCA: 122] [Impact Index Per Article: 3.7] [Indexed: 12/28/2022]
Abstract
The goals of the MELANIE project are to determine if disease-associated patterns can be detected in high resolution two-dimensional polyacrylamide gel electrophoresis (HR 2D-PAGE) images and if a diagnosis can be established automatically by computer. The ELSIE/MELANIE system is a set of computer programs which automatically detect, quantify, and compare protein spots shown on HR 2D-PAGE images. Classification programs help the physician to find disease-associated patterns from a given set of two-dimensional gel electrophoresis images and to form diagnostic rules. Prototype expert systems that use these rules to establish a diagnosis from new HR 2D-PAGE images have been developed. They successfully diagnosed cirrhosis of the liver and were able to distinguish a variety of cancer types from biopsies known to be cancerous.
Affiliation(s)
- R D Appel
- Numeric Imaging Group, Geneva University Hospital, Switzerland
|