1
|
Two-Dimensional Gel Electrophoresis Image Analysis. Methods Mol Biol 2021; 2361:3-13. [PMID: 34236652 DOI: 10.1007/978-1-0716-1641-3_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2022]
Abstract
Gel-based proteomics is still quite widespread due to its high-resolution power; the experimental approach is based on differential analysis, where groups of samples (e.g., control vs diseased) are compared to identify panels of potential biomarkers. However, the reliability of the result of the differential analysis is deeply influenced by 2D-PAGE maps image analysis procedures. The analysis of 2D-PAGE images consists of several steps, such as image preprocessing, spot detection and quantitation, image warping and alignment, spot matching. Several approaches are present in literature, and classical or last-generation commercial software packages exploit different algorithms for each step of the analysis. Here, the most widespread approaches and a comparison of the different strategies are presented.
|
2
|
Abstract
2D-DIGE is still a very widespread technique in proteomics for the identification of panels of biomarkers, as it addresses some important drawbacks of classical two-dimensional gel electrophoresis. However, once the 2D gels are obtained, they must undergo an elaborate multistep image analysis procedure before the final differential analysis via univariate and multivariate statistical methods. Here, the main steps of image analysis software are described and the most recent procedures reported in the literature are briefly presented.
Affiliation(s)
- Elisa Robotti
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, 15121, Alessandria, Italy.
- Emilio Marengo
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, 15121, Alessandria, Italy
|
3
|
Robotti E, Marengo E, Demartini M. GENOCOP Algorithm and Hierarchical Grid Transformation for Image Warping of Two-Dimensional Gel Electrophoretic Maps. Methods Mol Biol 2016; 1384:165-84. [PMID: 26611415 DOI: 10.1007/978-1-4939-3255-9_10] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 02/09/2023]
Abstract
Hierarchical grid transformation is a powerful approach to 2-D map warping, able to model both global and local deformations. The algorithm can be stopped when a desired degree of accuracy in the image alignment is obtained. The deformed image is warped and aligned to the target image using a grid where the number of nodes increases in each step of the algorithm. The numerical optimization of the position of the nodes of the grid can be efficiently solved by genetic algorithms, ensuring the achievement of the optimal position of the nodes with a low computational cost with respect to other methods. Here, the optimization of the position of the nodes is carried out by GENOCOP (genetic algorithm for numerical optimization of constrained problems), refined by a subsequent conjugate gradient optimization step. The modeling of the warped space is then achieved by a spline model where some constraints are introduced in the choice of the nodes that are moved. The whole procedure can be regarded as an evolutionary method that models the deformation of the gel map at different levels of detail.
Affiliation(s)
- Elisa Robotti
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, Alessandria, 15121, Italy.
- Emilio Marengo
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, Alessandria, 15121, Italy.
- Marco Demartini
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, Alessandria, 15121, Italy
|
4
|
Robotti E, Marengo E, Quasso F. Image Pretreatment Tools II: Normalization Techniques for 2-DE and 2-D DIGE. Methods Mol Biol 2016; 1384:91-107. [PMID: 26611411 DOI: 10.1007/978-1-4939-3255-9_6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
Abstract
Gel electrophoresis is usually applied to identify different protein expression profiles in biological samples (e.g., control vs. pathological, control vs. treated). Information about the effect under investigation (a pathology, a drug, a ripening effect, etc.) is, however, generally confounded with experimental variability, which is quite large in 2-DE and may arise from small variations in sample preparation, reagents, sample loading, electrophoretic conditions, staining, and image acquisition. Obtaining valid quantitative estimates of protein abundances in each map, before the differential analysis, is therefore fundamental to provide robust candidate biomarkers. Normalization procedures are applied to reduce experimental noise and make the images comparable, improving the accuracy of differential analysis. Because they may strongly influence the final results, they have to be applied with care. Here, the most widespread normalization procedures are described for both 2-DE and 2-D difference gel electrophoresis (2-D DIGE) maps.
Affiliation(s)
- Elisa Robotti
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, 15121, Alessandria, Italy.
- Emilio Marengo
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, 15121, Alessandria, Italy
- Fabio Quasso
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, 15121, Alessandria, Italy
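To make the idea of normalization in the abstract above concrete, here is a minimal sketch of one common scheme, total-volume normalization, in which each spot volume is divided by the summed spot volume of its own map. The choice of this particular scheme, the function name `normalize_total_volume`, and the example values are assumptions for illustration; they are not taken from the chapter.

```python
def normalize_total_volume(spot_volumes, scale=100.0):
    """Return spot volumes rescaled so that the map sums to `scale`.

    spot_volumes: dict mapping spot id -> raw volume for ONE gel map.
    """
    total = sum(spot_volumes.values())
    if total <= 0:
        raise ValueError("total spot volume must be positive")
    return {spot: scale * v / total for spot, v in spot_volumes.items()}


# Hypothetical raw volumes for the same three spots on two gels;
# gel_b was loaded with twice as much sample as gel_a.
gel_a = {"s1": 10.0, "s2": 30.0, "s3": 60.0}
gel_b = {"s1": 20.0, "s2": 60.0, "s3": 120.0}

norm_a = normalize_total_volume(gel_a)
norm_b = normalize_total_volume(gel_b)
```

After normalization the loading difference between the two hypothetical gels disappears, so the remaining differences can be attributed to relative abundance rather than to the amount of sample loaded.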
|
5
|
Xin HM, Zhu Y. Spot Matching of 2-DE Images Using Distance, Intensity, and Pattern Information. Methods Mol Biol 2016; 1384:109-17. [PMID: 26611412 DOI: 10.1007/978-1-4939-3255-9_7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/21/2023]
Abstract
The analysis of a large number of two-dimensional gel electrophoresis (2-DE) images requires automatic methods. In such analyses, spot matching plays a fundamental role, in particular for the identification of proteins. We describe a simple method that automatically and accurately matches spots in 2-DE images. The method simultaneously exploits the distance between the spots, their intensity, and the pattern formed by their spatial configuration.
Affiliation(s)
- Hua-Mei Xin
- College of Physics & Electronics, Shandong Normal University, 250014, Shandong, China; CREATIS, CNRS UMR 5220, Inserm U1044, INSA Lyon, University of Lyon, Bâtiment Blaise Pascal, 69621, Villeurbanne cedex, France
- Yuemin Zhu
- CREATIS, CNRS UMR 5220, Inserm U1044, INSA Lyon, University of Lyon, Bâtiment Blaise Pascal, 69621, Villeurbanne cedex, France.
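The abstract above names three cues for matching: distance, intensity, and spatial pattern. The sketch below illustrates only the first two, combining them into a single cost and pairing spots greedily; it is not the authors' method, and the pattern term is deliberately left out. The function name `match_spots`, the weight `w_int`, the distance cutoff `max_dist`, and the spot lists are all hypothetical. Intensities are assumed to be strictly positive.

```python
import math


def match_spots(ref, target, max_dist=10.0, w_int=5.0):
    """Greedily pair spots by a combined distance/intensity cost.

    ref, target: lists of (x, y, intensity) tuples, intensity > 0.
    Returns a sorted list of (ref_index, target_index) pairs.
    """
    candidates = []
    for i, (xr, yr, ir) in enumerate(ref):
        for j, (xt, yt, it) in enumerate(target):
            d = math.hypot(xr - xt, yr - yt)
            if d > max_dist:
                continue  # too far apart to be the same spot
            # Penalise intensity mismatch on a log scale.
            cost = d + w_int * abs(math.log(ir / it))
            candidates.append((cost, i, j))

    pairs, used_ref, used_tgt = [], set(), set()
    for cost, i, j in sorted(candidates):  # cheapest pairs first
        if i not in used_ref and j not in used_tgt:
            pairs.append((i, j))
            used_ref.add(i)
            used_tgt.add(j)
    return sorted(pairs)


# Hypothetical spot lists: (x, y, intensity)
ref = [(10.0, 10.0, 100.0), (50.0, 50.0, 20.0)]
target = [(52.0, 49.0, 22.0), (11.0, 12.0, 95.0), (12.0, 11.0, 5.0)]
pairs = match_spots(ref, target)
```

In this example two target spots lie equally close to the first reference spot, and the intensity term is what decides between them, which is the motivation for using more than distance alone.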
|
6
|
Abstract
Software-based image analysis of 2-D PAGE maps is an important step in the investigation of the proteome. Warping algorithms, which are employed to register spots among gels, are able to overcome the difficulties due to the low reproducibility of this analytical technique. Over the years, research into new matching and warping methods has allowed the development of several easy-to-use software packages for routine applications. This chapter describes the common, basic spatial transformations used to align protein spots present in different gel maps; some recent approaches are also presented.
Affiliation(s)
- Marcello Manfredi
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale T. Michel 11, 15121, Alessandria, Italy; High Resolution Mass Spectrometry Lab, ISALIT SRL, Spin-off of University of Piemonte Orientale, Politecnico di Torino, Viale T. Michel 11, 15121, Alessandria, Italy.
- Elisa Robotti
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale T. Michel 11, 15121, Alessandria, Italy
- Emilio Marengo
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale T. Michel 11, 15121, Alessandria, Italy
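As an example of a basic spatial transformation of the kind the abstract above mentions, the following sketch fits an affine map to pairs of landmark spots by least squares and applies it to a new point. The choice of an affine model, the helper names (`solve3`, `fit_affine`, `apply_affine`), and the landmark coordinates are assumptions for illustration, not content from the chapter.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b by Gaussian elimination."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("landmarks are collinear or too few")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_affine(src, dst):
    """Least-squares affine map taking landmark points src onto dst."""
    # Normal equations (A^T A) p = A^T t, with rows of A equal to (x, y, 1).
    ata = [[0.0] * 3 for _ in range(3)]
    atx, aty = [0.0] * 3, [0.0] * 3
    for (x, y), (u, v) in zip(src, dst):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            atx[i] += row[i] * u
            aty[i] += row[i] * v
    return solve3(ata, atx), solve3(ata, aty)


def apply_affine(params, point):
    """Map one (x, y) point through the fitted affine transformation."""
    (a, b, c), (d, e, f) = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical landmarks: the second gel is the first one stretched by
# a factor of 1.1 horizontally and shifted by (5, -3).
src = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
dst = [(5.0, -3.0), (115.0, -3.0), (5.0, 97.0), (115.0, 97.0)]
params = fit_affine(src, dst)
warped = apply_affine(params, (50.0, 50.0))
```

An affine map can only capture global effects such as shift, scale, rotation, and shear; local gel distortions need the more flexible warping models discussed elsewhere in this list.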
|
7
|
Bresolin de Souza K, Jutfelt F, Kling P, Förlin L, Sturve J. Effects of increased CO2 on fish gill and plasma proteome. PLoS One 2014; 9:e102901. [PMID: 25058324 PMCID: PMC4109940 DOI: 10.1371/journal.pone.0102901] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 11/15/2013] [Accepted: 06/25/2014] [Indexed: 11/18/2022]
Abstract
Ocean acidification and warming are both primarily caused by increased levels of atmospheric CO2, and marine organisms are exposed to these two stressors simultaneously. Although the effects of temperature on fish have been investigated over the last century, the long-term effects of moderate CO2 exposure and the combination of both stressors are almost entirely unknown. A proteomics approach was used to assess the adverse physiological and biochemical changes that may occur from the exposure to these two environmental stressors. We analysed gills and blood plasma of Atlantic halibut (Hippoglossus hippoglossus) exposed to temperatures of 12°C (control) and 18°C (impaired growth) in combination with control (400 µatm) or high-CO2 water (1000 µatm) for 14 weeks. The proteomic analysis was performed using two-dimensional gel electrophoresis (2DE) followed by Nanoflow LC-MS/MS using a LTQ-Orbitrap. The high-CO2 treatment induced the up-regulation of immune system-related proteins, as indicated by the up-regulation of the plasma proteins complement component C3 and fibrinogen β chain precursor in both temperature treatments. Changes in gill proteome in the high-CO2 (18°C) group were mostly related to increased energy metabolism proteins (ATP synthase, malate dehydrogenase, malate dehydrogenase thermostable, and fructose-1,6-bisphosphate aldolase), possibly coupled to a higher energy demand. Gills from fish exposed to high-CO2 at both temperature treatments showed changes in proteins associated with increased cellular turnover and apoptosis signalling (annexin 5, eukaryotic translation elongation factor 1γ, receptor for protein kinase C, and putative ribosomal protein S27). This study indicates that moderate CO2-driven acidification, alone and combined with high temperature, can elicit biochemical changes that may affect fish health.
Affiliation(s)
- Karine Bresolin de Souza
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Fredrik Jutfelt
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Peter Kling
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Lars Förlin
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Joachim Sturve
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
|
8
|
Wheelock AM, Goto S. Effects of post-electrophoretic analysis on variance in gel-based proteomics. Expert Rev Proteomics 2014; 3:129-42. [PMID: 16445357 DOI: 10.1586/14789450.3.1.129] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 11/08/2022]
Abstract
2D electrophoresis (2DE) is a prominent separation method for complex proteomes. Although recent advances have increased the utility of this method in quantitative proteomics studies, many sources of variance still exist. This review discusses the post-electrophoretic sources of variance in current 2DE analysis. The essential improvements in protein visualization and software algorithms that have made 2DE a leading quantitative proteomics method are briefly reviewed. A number of shortcomings in the post-electrophoretic analysis of 2DE data that require further attention are highlighted. Topics discussed include protein visualization and image acquisition, internal standards and normalization methods, background subtraction algorithms, normality of distribution, and the need for standardized tests for the evaluation of 2DE analysis software packages.
Affiliation(s)
- Asa M Wheelock
- Kyoto University, Bioinformatics Center, Institute for Chemical Research, Uji, Kyoto, 611-0011, Japan.
|
9
|
Marengo E, Cocchi M, Demartini M, Robotti E, Cecconi D, Calabrese G. GENOCOP algorithm and hierarchical grid transformation for image warping of two dimensional gel electrophoretic maps. Mol Biosyst 2012; 8:975-84. [PMID: 22301843 DOI: 10.1039/c2mb05396a] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]
Abstract
Hierarchical grid transformation is a powerful approach to warping SDS 2D-PAGE maps. The hierarchy of the warping transformation is able to model both global and local deformations of the gels, and the algorithm can be stopped when a certain degree of accuracy in the image alignment is obtained. The numerical optimization of the position of the nodes of the grid that are responsible for the image warping is a multivariate task that can be solved efficiently using Genetic Algorithms. The use of Genetic Algorithms ensures that an optimal position of the nodes can be defined with a low computational cost with respect to other methods. The optimal positions of the nodes of the grid can be successfully used for defining a good warping of the gels.
Affiliation(s)
- Emilio Marengo
- Department of Science and Technological Innovation, University of Eastern Piedmont, Viale Teresa Michel 11, 15121 Alessandria, Italy.
|
10
|
Investigation of the applicability of Zernike moments to the classification of SDS 2D-PAGE maps. Anal Bioanal Chem 2011; 400:1419-31. [DOI: 10.1007/s00216-011-4851-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/14/2010] [Revised: 02/23/2011] [Accepted: 02/23/2011] [Indexed: 10/18/2022]
|
11
|
Dowsey AW, English JA, Lisacek F, Morris JS, Yang GZ, Dunn MJ. Image analysis tools and emerging algorithms for expression proteomics. Proteomics 2010; 10:4226-57. [PMID: 21046614 PMCID: PMC3257807 DOI: 10.1002/pmic.200900635] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Received: 09/01/2009] [Accepted: 08/28/2010] [Indexed: 11/11/2022]
Abstract
Since their origins in academic endeavours in the 1970s, computational analysis tools have matured into a number of established commercial packages that underpin research in expression proteomics. In this paper we describe the image analysis pipeline for the established 2-DE technique of protein separation, and by first covering signal analysis for MS, we also explain the current image analysis workflow for the emerging high-throughput 'shotgun' proteomics platform of LC coupled to MS (LC/MS). The bioinformatics challenges for both methods are illustrated and compared, whereas existing commercial and academic packages and their workflows are described from both a user's and a technical perspective. Attention is given to the importance of sound statistical treatment of the resultant quantifications in the search for differential expression. Despite wide availability of proteomics software, a number of challenges have yet to be overcome regarding algorithm accuracy, objectivity and automation, generally due to deterministic spot-centric approaches that discard information early in the pipeline, propagating errors. We review recent advances in signal and image analysis algorithms in 2-DE, MS, LC/MS and Imaging MS. Particular attention is given to wavelet techniques, automated image-based alignment and differential analysis in 2-DE, Bayesian peak mixture models, and functional mixed modelling in MS, and group-wise consensus alignment methods for LC/MS.
Affiliation(s)
- Andrew W. Dowsey
- Institute of Biomedical Engineering, Imperial College London, South Kensington, London SW7 2AZ, U.K
- Jane A. English
- Proteome Research Centre, UCD Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Belfield, Ireland
- Frederique Lisacek
- Proteome Informatics Group, Swiss Institute of Bioinformatics, CMU - 1, rue Michel Servet, CH-1211 Geneva, Switzerland
- Jeffrey S. Morris
- Department of Biostatistics, The University of Texas M.D. Anderson Cancer Center, Houston, Texas 77030-4009, U.S.A
- Guang-Zhong Yang
- Institute of Biomedical Engineering, Imperial College London, South Kensington, London SW7 2AZ, U.K
- Michael J. Dunn
- Proteome Research Centre, UCD Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Belfield, Ireland
|
12
|
Grzeskowiak JK, Tscheliessnig A, Wu MW, Toh PC, Chusainow J, Lee YY, Wong N, Jungbauer A. Two-dimensional difference fluorescence gel electrophoresis to verify the scale-up of a non-affinity-based downstream process for isolation of a therapeutic recombinant antibody. Electrophoresis 2010; 31:1862-72. [DOI: 10.1002/elps.200900781] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 12/24/2022]
|
13
|
Lasso G, Matthiesen R. Computational methods for analysis of two-dimensional gels. Methods Mol Biol 2010; 593:231-62. [PMID: 19957153 DOI: 10.1007/978-1-60327-194-3_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/11/2023]
Abstract
Two-dimensional gel electrophoresis (2D gels) is an essential quantitative proteomics technique that is frequently used to study differences between samples of clinical relevance. Although considered to have a low throughput, 2D gels can separate thousands of proteins in one gel, making it a good complementary method to MS-based protein quantification. The main drawback of the technique is the tendency of large and hydrophobic proteins such as membrane proteins to precipitate in the isoelectric focusing step. Furthermore, tests using different programs with distinct algorithms for 2D-gel analysis have shown inconsistent ratio values. The aim here is therefore to provide a discussion of algorithms described for the analysis of 2D gels.
Affiliation(s)
- Gorka Lasso
- Bioinformatics, Parque Technológico de Bizkaia, Derio, Spain
|
14
|
Liu YS, Chen SY, Liu RS, Duh DJ, Chao YT, Tsai YC, Hsieh JS. Spot detection for a 2-DE gel image using a slice tree with confidence evaluation. Math Comput Model 2009. [DOI: 10.1016/j.mcm.2008.11.017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/21/2022]
|
15
|
Identifying calpain substrates in intact S2 cells of Drosophila. Arch Biochem Biophys 2008; 481:219-25. [PMID: 19038228 DOI: 10.1016/j.abb.2008.11.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 09/08/2008] [Revised: 11/11/2008] [Accepted: 11/13/2008] [Indexed: 11/21/2022]
Abstract
Calpains are cysteine proteases involved in a number of physiological and pathological processes, yet our knowledge of substrates cleaved in vivo, in intact cells, is scarce. In this work we attempted to develop a technique for finding calpain substrates in intact Drosophila Schneider S2 cells. The procedure consists of comparative 2D gel electrophoresis in which three identical samples are treated in different ways: A, control (no addition); B, activated (Ca2+ and ionomycin added); C, inactivated (additions as in B plus a specific calpain inhibitor). The 2D gel patterns were analyzed by densitometry. Spots showing the density relation A > B << C were identified by mass spectrometry. In a typical run, 11 candidate substrates were recognized; out of these, four were randomly selected, and all four were verified to be calpain substrates by digestion of the recombinant protein with recombinant calpain.
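The selection rule in the abstract above (spots whose density satisfies A > B << C) can be sketched as a simple filter. The abstract does not say how much smaller B must be than C, so the `fold` threshold below is an assumption, as are the function name `candidate_substrates` and the densitometry values.

```python
def candidate_substrates(densities, fold=2.0):
    """Return spot ids whose densities satisfy A > B and C >= fold * B.

    densities: dict mapping spot id -> (A, B, C) densities for the
    control, activated, and inhibitor-treated samples.
    `fold` is an assumed stand-in for the "<<" in the abstract.
    """
    hits = []
    for spot, (a, b, c) in densities.items():
        if a > b and c >= fold * b:
            hits.append(spot)
    return sorted(hits)


# Hypothetical densitometry readings (arbitrary units).
densities = {
    "spot_01": (1.00, 0.30, 0.95),  # drops on activation, rescued by inhibitor
    "spot_02": (1.00, 0.90, 1.00),  # small drop, C not much larger than B
    "spot_03": (0.50, 0.60, 1.50),  # B is not below A
}
hits = candidate_substrates(densities)
```

Only the first hypothetical spot passes both conditions, which mirrors the reasoning in the abstract: a substrate should be depleted when calpain is activated and protected when calpain is inhibited.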
|
16
|
Pérès S, Molina L, Salvetat N, Granier C, Molina F. A new method for 2D gel spot alignment: application to the analysis of large sample sets in clinical proteomics. BMC Bioinformatics 2008; 9:460. [PMID: 18957120 PMCID: PMC2628390 DOI: 10.1186/1471-2105-9-460] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 06/04/2008] [Accepted: 10/28/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND: In current comparative proteomics studies, the large number of images generated by 2D gels is compared using spot matching algorithms. Unfortunately, differences in gel migration and sample variability make efficient spot alignment very difficult to obtain and, as a consequence, most software alignments return noisy gel matchings that need to be manually adjusted by the user. RESULTS: We present Sili2DGel, an algorithm for automatic spot alignment that uses data from recursive gel matching and returns meaningful Spot Alignment Positions (SAP) for a given set of gels. In the algorithm, the data are represented by a graph and SAP by specific subgraphs. The results are returned in various forms (clickable synthetic gel, text file, etc.). We have applied Sili2DGel to study the variability of the urinary proteome from 20 healthy subjects. CONCLUSION: Sili2DGel performs noiseless automatic spot alignment for variability studies (as well as classical differential expression studies) of biological samples. It is very useful for typical clinical proteomic studies with a large number of experiments.
Affiliation(s)
- Sabine Pérès
- Sysdiag CNRS FRE 3009 BIO-RAD, Cap delta/Parc Euromédecine, 1682 rue de la Valsière, CS 61003, 34184 Montpellier Cedex 4, France.
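The abstract above represents pairwise gel matches as a graph and alignment positions as subgraphs. The sketch below shows one simple way to get groups out of such a graph, using connected components computed with union-find. The abstract does not specify which subgraphs Sili2DGel uses, so connected components are an assumed simplification, and the function name `alignment_groups` and the match data are hypothetical.

```python
def alignment_groups(matches):
    """Group matched spots into connected components.

    matches: iterable of ((gel, spot), (gel, spot)) pairs produced by
    pairwise gel matching. Returns a list of sorted groups, largest first.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for node in parent:
        groups.setdefault(find(node), []).append(node)
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


# Hypothetical pairwise matches between spots on three gels.
matches = [
    (("gel1", 12), ("gel2", 7)),
    (("gel2", 7), ("gel3", 31)),
    (("gel1", 40), ("gel3", 2)),
]
groups = alignment_groups(matches)
```

Connected components will merge spots whenever a single spurious match links them, so a stricter subgraph criterion would be needed to suppress the matching noise the abstract describes.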
|
17
|
Rye MB, Faergestad EM, Alsberg BK. A new method for assigning common spot boundaries for multiple gels in two-dimensional gel electrophoresis. Electrophoresis 2008; 29:1359-68. [PMID: 18348212 DOI: 10.1002/elps.200700418] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
Abstract
The benefits of defining common spot boundaries when several gels from 2-DE are compared and analyzed have lately been stressed by both commercial software producers and users of this software. Though the importance of common spot boundaries is clearly stated, few reports exist that target this issue explicitly. In this study a method for defining common spot boundaries, called the spot density method, is developed. The method consists of the following steps: segmentation and spot identification on each individual gel, transferring the spot-center coordinates for all gels onto a single new gel, collecting spot centers clustered together in the new gel, and finally assigning pixels and new spot boundaries based on the spots in each cluster. The method is compared to a synthetic gel approach, and validated by visual inspection of three representative areas in the gels. The gel images need to be aligned prior to segmentation and spot identification, but the method can be used regardless of the choice of segmentation procedure. This makes the method an easy extension to existing methods for spot identification and matching. Conclusions based on the visual inspection are that the spot density method identifies partly overlapping spots and low-intensity spots better than the synthetic gel approach.
Affiliation(s)
- Morten Beck Rye
- Department of Chemistry, Norwegian University of Science and Technology, Trondheim, Norway.
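The middle steps of the method described above, pooling spot centers from all aligned gels onto one gel and collecting the centers that cluster together, can be sketched as follows. The greedy clustering rule, the `radius` parameter, the function name `cluster_centers`, and the coordinates are assumptions for illustration; the abstract does not state how the authors form their clusters, and the boundary-assignment step is not covered here.

```python
import math


def cluster_centers(centers_per_gel, radius=3.0):
    """Pool spot centers from aligned gels and group nearby ones.

    centers_per_gel: list (one entry per gel) of lists of (x, y) centers.
    Returns a list of clusters; each cluster is a list of (gel, x, y).
    """
    pooled = [(g, x, y)
              for g, centers in enumerate(centers_per_gel)
              for (x, y) in centers]
    clusters = []
    for g, x, y in pooled:
        for cluster in clusters:
            # Compare against the running mean position of the cluster.
            cx = sum(p[1] for p in cluster) / len(cluster)
            cy = sum(p[2] for p in cluster) / len(cluster)
            if math.hypot(x - cx, y - cy) <= radius:
                cluster.append((g, x, y))
                break
        else:
            clusters.append([(g, x, y)])
    return clusters


# Hypothetical spot centers from three aligned gels.
gels = [
    [(10.0, 10.0), (40.0, 25.0)],
    [(11.0, 9.5), (41.0, 26.0)],
    [(10.5, 10.5)],
]
clusters = cluster_centers(gels)
```

Each resulting cluster records which gels contributed a center, so a spot missing from one gel (as in the second cluster here) is still represented on the common gel.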
|
18
|
Möller B, Posch S. An integrated analysis concept for errors in image registration. Pattern Recognit Image Anal 2008. [DOI: 10.1134/s105466180802003x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]
|
19
|
Srinark T, Kambhamettu C. An image analysis suite for spot detection and spot matching in two-dimensional electrophoresis gels. Electrophoresis 2008; 29:706-15. [PMID: 18203251 DOI: 10.1002/elps.200700244] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Indexed: 11/09/2022]
Abstract
We propose a suite of novel algorithms for image analysis of protein expression images obtained from 2-D electrophoresis. The suite comprises a segmentation algorithm for protein spot identification and an algorithm for matching protein spots from two corresponding images for differential expression study. The proposed segmentation algorithm employs the watershed transformation, k-means analysis, and distance transform to locate the centroids and to extract the regions of the protein spots. The proposed spot matching algorithm is an integration of the hierarchical-based and optimization-based methods. The hierarchical method is first used to find corresponding pairs of protein spots satisfying the local cross-correlation and overlapping constraints. The matching energy function based on local structure similarity, image similarity, and spatial constraints is then formulated and optimized. Our new algorithm suite has been extensively tested on synthetic and actual 2-D gel images from various biological experiments, and in quantitative comparisons with ImageMaster2D Platinum the proposed algorithms exhibit better spot detection and spot matching.
Affiliation(s)
- Thitiwan Srinark
- Department of Computer Engineering, Faculty of Engineering, Kasetsart University, Bangkok, Thailand.
|
20
|
Dowsey AW, Dunn MJ, Yang GZ. Automated image alignment for 2D gel electrophoresis in a high-throughput proteomics pipeline. Bioinformatics 2008; 24:950-7. [PMID: 18310057 DOI: 10.1093/bioinformatics/btn059] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Indexed: 11/13/2022]
Abstract
MOTIVATION The quest for high-throughput proteomics has revealed a number of challenges in recent years. Whilst substantial improvements in automated protein separation with liquid chromatography and mass spectrometry (LC/MS), aka 'shotgun' proteomics, have been achieved, large-scale open initiatives such as the Human Proteome Organization (HUPO) Brain Proteome Project have shown that maximal proteome coverage is only possible when LC/MS is complemented by 2D gel electrophoresis (2-DE) studies. Moreover, both separation methods require automated alignment and differential analysis to relieve the bioinformatics bottleneck and so make high-throughput protein biomarker discovery a reality. The purpose of this article is to describe a fully automatic image alignment framework for the integration of 2-DE into a high-throughput differential expression proteomics pipeline. RESULTS The proposed method is based on robust automated image normalization (RAIN) to circumvent the drawbacks of traditional approaches. These use symbolic representation at the very early stages of the analysis, which introduces persistent errors due to inaccuracies in modelling and alignment. In RAIN, a third-order volume-invariant B-spline model is incorporated into a multi-resolution schema to correct for geometric and expression inhomogeneity at multiple scales. The normalized images can then be compared directly in the image domain for quantitative differential analysis. Through evaluation against an existing state-of-the-art method on real and synthetically warped 2D gels, the proposed analysis framework demonstrates substantial improvements in matching accuracy and differential sensitivity. High-throughput analysis is established through an accelerated GPGPU (general purpose computation on graphics cards) implementation. AVAILABILITY Supplementary material, software and images used in the validation are available at http://www.proteomegrid.org/rain/.
Affiliation(s)
- Andrew W Dowsey
- Institute of Biomedical Engineering, Imperial College London, United Kingdom
|
21
|
Caesar R, Palmfeldt J, Gustafsson JS, Pettersson E, Hashemi SH, Blomberg A. Comparative proteomics of industrial lager yeast reveals differential expression of the cerevisiae and non-cerevisiae parts of their genomes. Proteomics 2008; 7:4135-47. [PMID: 17994632 DOI: 10.1002/pmic.200601020] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/09/2022]
Abstract
The proteomes of three industrial lager beer strains, CMBS33, OG2252 and A15, were analysed under standardised laboratory growth conditions. Protein spots in the 2-DE pattern of the lager strains were subjected to MS/MS to identify protein variants. We found the protein composition of the three lager strains to be qualitatively rather similar, while being substantially different from the Saccharomyces cerevisiae strain BY4742. Database searches using several fully sequenced genomes from the Saccharomyces genera indicated that the non-cerevisiae proteins in the 2-D pattern of lager strains were most closely related to S. bayanus. For many proteins the regulation of the bayanus-like protein and its cerevisiae counterpart varied in a strain-dependent manner, e.g. the bayanus-like form of Tdh3p was roughly eight-fold more abundant than the cerevisiae form in the OG2252 strain. We also found differential regulation of cerevisiae- and bayanus-like proteins during various stress conditions like low temperature growth, and adaptation to high temperatures or high salinity, e.g. for Arg1p, Sti1p and Pdc1p. Our data on the differential regulation of the two genomes in these hybrid strains may have important industrial implications for strain improvement and strain protection.
Affiliation(s)
- Robert Caesar
- Department of Cell and Molecular Biology, Microbiology, Göteborg University, Göteborg, Sweden
|
22
|
Sorzano COS, Arganda-Carreras I, Thévenaz P, Beloso A, Morales G, Valdés I, Pérez-García C, Castillo C, Garrido E, Unser M. Elastic image registration of 2-D gels for differential and repeatability studies. Proteomics 2008; 8:62-5. [DOI: 10.1002/pmic.200700473] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/07/2022]
|
23
|
Berth M, Moser FM, Kolbe M, Bernhardt J. The state of the art in the analysis of two-dimensional gel electrophoresis images. Appl Microbiol Biotechnol 2007; 76:1223-43. [PMID: 17713763 PMCID: PMC2279157 DOI: 10.1007/s00253-007-1128-0] [Citation(s) in RCA: 140] [Impact Index Per Article: 8.2] [Received: 05/10/2007] [Revised: 07/13/2007] [Accepted: 07/14/2007] [Indexed: 11/21/2022]
Abstract
Software-based image analysis is a crucial step in the biological interpretation of two-dimensional gel electrophoresis experiments. Recent significant advances in image processing methods combined with powerful computing hardware have enabled the routine analysis of large experiments. We cover the process starting with the imaging of 2-D gels, quantitation of spots, creation of expression profiles to statistical expression analysis followed by the presentation of results. Challenges for analysis software as well as good practices are highlighted. We emphasize image warping and related methods that are able to overcome the difficulties that are due to varying migration positions of spots between gels. Spot detection, quantitation, normalization, and the creation of expression profiles are described in detail. The recent development of consensus spot patterns and complete expression profiles enables one to take full advantage of statistical methods for expression analysis that are well established for the analysis of DNA microarray experiments. We close with an overview of visualization and presentation methods (proteome maps) and current challenges in the field.
Affiliation(s)
- Matthias Berth
- DECODON GmbH, Rathenau-Strasse 49a, 17489 Greifswald, Germany
- Markus Kolbe
- DECODON GmbH, Rathenau-Strasse 49a, 17489 Greifswald, Germany
- Jörg Bernhardt
- DECODON GmbH, Rathenau-Strasse 49a, 17489 Greifswald, Germany
- Institute of Microbiology, Greifswald University, Jahnstrasse 15, 17487 Greifswald, Germany
|
24
|
Chich JF, David O, Villers F, Schaeffer B, Lutomski D, Huet S. Statistics for proteomics: Experimental design and 2-DE differential analysis. J Chromatogr B Analyt Technol Biomed Life Sci 2007; 849:261-72. [PMID: 17081811 DOI: 10.1016/j.jchromb.2006.09.033] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.2] [Received: 06/06/2006] [Revised: 08/25/2006] [Accepted: 09/08/2006] [Indexed: 11/24/2022]
Abstract
Proteomics relies on the separation of complex protein mixtures using bidimensional electrophoresis. This approach is largely used to detect the expression variations of proteins prepared from two or more samples. Recently, attention was drawn to the reliability of the results published in the literature. Among the critical points identified were experimental design, differential analysis and the problem of missing data, all problems where statistics can be of help. Using examples and terms understandable by biologists, we describe how a collaboration between biologists and statisticians can improve reliability of results and confidence in conclusions.
Affiliation(s)
- Jean-François Chich
- INRA, Biologie Physico-Chimique des Prions, VIM 78352 Jouy-en-Josas Cedex, France.
|
25
|
Potra FA, Liu X. Aligning families of two-dimensional gels by a combined multiresolution forward-inverse transformation approach. J Comput Biol 2006; 13:1384-95. [PMID: 17037965 DOI: 10.1089/cmb.2006.13.1384] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022] Open
Abstract
A new method for aligning families of two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) images arising in proteomics studies is presented. Forward piecewise bilinear transformations are used to determine an ideal gel and to obtain an initial alignment of the family of gels to this ideal gel. Both the ideal landmarks and the coefficients defining the transformations are obtained by solving a quadratic programming problem. The alignment is then improved by using inverse transformations on finer grids. Numerical results for a family of 123 gels are reported.
Affiliation(s)
- Florian A Potra
- Department of Mathematics and Statistics, UMBC, 1000 Hilltop Circle, Baltimore, Maryland 21250, USA.
|
26
|
Dowsey AW, English J, Pennington K, Cotter D, Stuehler K, Marcus K, Meyer HE, Dunn MJ, Yang GZ. Examination of 2-DE in the Human Proteome Organisation Brain Proteome Project pilot studies with the new RAIN gel matching technique. Proteomics 2006; 6:5030-47. [PMID: 16927431 DOI: 10.1002/pmic.200600152] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Indexed: 11/06/2022]
Abstract
The Human Proteome Organisation (HUPO) Brain Proteome Project (BPP) pilot studies have generated over 200 2-D gels from eight participating laboratories. This data includes 67 single-channel and 60 DIGE gels comparing 30 whole frozen C57/BL6 female mouse brains, ten each at embryonic day 16, postnatal day 7 (juvenile) and postnatal day 54-56 (adult); and ten single-channel and three DIGE gels comparing human epilepsy surgery of the temporal front lobe with a corresponding post-mortem specimen. The samples were generated centrally and distributed to the participating laboratories, but otherwise no restrictions were placed on sample preparation, running and staining protocols, nor on the 2-D gel analysis packages used. Spots were characterised by MS and the annotated gel images published on a ProteinScape web server. In order to examine the resultant differential expression and protein identifications, we have reprocessed a large subset of the gels using the newly developed RAIN (Robust Automated Image Normalisation) 2-D gel matching algorithm. Traditional approaches use symbolic representation of spots at the very early stages of the analysis, which introduces persistent errors due to inaccuracies in spot modelling and matching. With RAIN, image intensity distributions, rather than selected features, are used, where smooth geometric deformation and expression bias are modelled using multi-resolution image registration and bias-field correction. The method includes a new approach of volume-invariant warping which ensures the volume of protein expression under transformation is preserved. An image-based statistical expression analysis phase is then proposed, where small insignificant expression changes over one gel pair can be revealed when reinforced by the same consistent changes in others. 
Results of the proposed method as applied to the HUPO BPP data show significant intra-laboratory improvements in matching accuracy over a previous state-of-the-art technique, Multi-resolution Image Registration (MIR), and the commercial Progenesis PG240 package.
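One possible reading of the volume-invariant warping mentioned in this abstract, offered as a sketch rather than the authors' formulation: if a smooth, invertible transformation T maps coordinates of the warped gel back to the source gel I, scaling the resampled intensity by the Jacobian determinant leaves the integrated spot volume unchanged. The symbols T, J_T, I_w and Ω are assumptions introduced here for illustration.

```latex
I_w(\mathbf{x}) = I\bigl(T(\mathbf{x})\bigr)\,\bigl|\det J_T(\mathbf{x})\bigr|
\quad\Longrightarrow\quad
\int_{\Omega} I_w(\mathbf{x})\,d\mathbf{x}
  = \int_{T(\Omega)} I(\mathbf{u})\,d\mathbf{u},
\qquad \mathbf{u} = T(\mathbf{x}),\quad d\mathbf{u} = \bigl|\det J_T(\mathbf{x})\bigr|\,d\mathbf{x}
```

The equality is the ordinary change-of-variables rule, so a spot that is stretched by the warp becomes proportionally fainter and its total volume is preserved.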
Affiliation(s)
- Andrew W Dowsey
- Royal Society / Wolfson Foundation Medical Image Computing Laboratory, Department of Computing, Imperial College London, UK
|
27
|
Potra FA, Liu X, Seillier-Moiseiwitsch F, Roy A, Hang Y, Marten MR, Raman B, Whisnant C. Protein Image Alignment via Piecewise Affine Transformations. J Comput Biol 2006; 13:614-30. [PMID: 16706715 DOI: 10.1089/cmb.2006.13.614] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022] Open
Abstract
We present a new approach for aligning families of 2D gels. Instead of choosing one of the gels as reference and performing a pairwise alignment, we construct an ideal gel that is representative of the entire family and obtain a set of piecewise affine transformations that optimally align each gel of the family to the ideal gel. The coefficients defining the transformations as well as the ideal landmarks are obtained as the solution of a large-scale quadratic programming problem that can be solved efficiently by interior-point methods.
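To illustrate the basic building block of landmark-based alignment, the sketch below fits a single global affine transformation to matched landmark spots by ordinary least squares. This is a deliberate simplification and not the method of the paper, which constructs an ideal gel and solves a quadratic programming problem over piecewise affine transformations. The function names, the landmark coordinates, and the use of NumPy are all assumptions made for this example.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares affine map taking src landmarks (n, 2) onto dst landmarks (n, 2)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Homogeneous coordinates: each row is [x, y, 1].
    A = np.hstack([src, np.ones((len(src), 1))])
    # Solve A @ M ~= dst for the 3x2 parameter matrix M.
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M


def apply_affine(M, pts):
    """Apply a 3x2 affine parameter matrix to an (n, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    return np.hstack([pts, np.ones((len(pts), 1))]) @ M


# Hypothetical landmark spots on one gel, and a made-up distortion that
# generates their positions on the reference gel.
gel = np.array([[10.0, 12.0], [50.0, 15.0], [30.0, 60.0], [70.0, 80.0]])
true_M = np.array([[1.02, 0.01], [-0.03, 0.98], [2.0, -1.5]])
reference = apply_affine(true_M, gel)

M = fit_affine(gel, reference)
aligned = apply_affine(M, gel)
```

A piecewise scheme would fit one such map per grid cell and add continuity constraints between neighbouring cells, which is where the optimisation problem described in the abstract comes in.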
Affiliation(s)
- Florian A Potra
- Department of Mathematics and Statistics, University of Maryland, Baltimore, MD 21250, USA
|
28
|
Trifunović D, Radović M, Ristić Z, Guzvić M, Dimitrijević B. Analysis of electrophoretic patterns of arbitrarily primed PCR profiling. Electrophoresis 2005; 26:4277-86. [PMID: 16287184 DOI: 10.1002/elps.200500381] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]
Abstract
We present a mathematical algorithm for the analysis of electrophoretic patterns resulting from arbitrarily primed PCR profiling. The algorithm is based on the established mathematical procedures applied to the analysis of digital images of gel patterns. The algorithm includes (a) transformation of the image into a matrix form, (b) identification of every electrophoretic lane as a set of matrix columns that are further mathematically processed, (c) averaging of matrix columns corresponding to electrophoretic lanes that define lane representatives, (d) elimination of "smiling" bands, (e) solving the problem of a lane offset, and (f) removal of the background. Representation of individual electrophoretic lanes in the form of functions allows interlane comparisons and further mathematical analysis. Direct comparison of selected lanes was obtained by employing correlation analysis. Gel images were those obtained after arbitrarily primed PCR analysis of DNA that underwent damage induced by gamma radiation from a ⁶⁰Co source. The applied method proved to be useful for elimination of subjectivity of visual inspection. It offers the possibility to avoid overlooking important differences in case of suboptimal electrophoretic resolution. In addition, higher precision is achieved in the assessment of quantitative differences due to better insight into experimental artifacts. These simple mathematical methods offer an open-type algorithm, i.e., this algorithm enables easy implementation of different parameters that may be useful for other analytical needs.
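Steps (a) to (c) and (f), together with the correlation-based lane comparison, can be sketched roughly as follows. This is an illustrative approximation on a synthetic image, not the authors' implementation: the lane boundaries are assumed to be known, the background is removed by simply subtracting the profile minimum, and the "smiling" and lane-offset corrections of steps (d) and (e) are omitted. All function names are hypothetical.

```python
import numpy as np


def lane_profile(image, col_start, col_end):
    """Average the matrix columns of one lane into a 1-D lane representative."""
    return image[:, col_start:col_end].mean(axis=1)


def remove_background(profile):
    """Crude background removal: subtract the profile minimum (an assumption)."""
    return profile - profile.min()


def lane_correlation(p, q):
    """Pearson correlation between two lane profiles."""
    return float(np.corrcoef(p, q)[0, 1])


def band(rows, center, width=2.0):
    """Gaussian-shaped band centred at a given migration distance."""
    return np.exp(-0.5 * ((rows - center) / width) ** 2)


# Synthetic gel image as a matrix: rows = migration distance, columns = pixels.
rows = np.arange(100)
lane_a = band(rows, 20) + band(rows, 55) + 0.1
lane_b = band(rows, 20) + band(rows, 55) + 0.3  # same bands, higher background
lane_c = band(rows, 35) + band(rows, 80) + 0.1  # different band positions
image = np.hstack([np.tile(lane[:, None], (1, 10)) for lane in (lane_a, lane_b, lane_c)])

# Each lane occupies 10 adjacent columns in this toy image.
profiles = [remove_background(lane_profile(image, 10 * i, 10 * (i + 1))) for i in range(3)]
r_ab = lane_correlation(profiles[0], profiles[1])
r_ac = lane_correlation(profiles[0], profiles[2])
```

Lanes with the same band positions give a correlation close to 1 regardless of a constant background offset, while lanes with different band positions give a much lower value, which is the kind of objective comparison the abstract contrasts with visual inspection.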
Affiliation(s)
- Dragana Trifunović
- Laboratory for Radiobiology and Molecular Genetics, Institute of Nuclear Sciences, Vinca, Belgrade, Serbia and Montenegro
|
29
|
Barrett J, Brophy PM, Hamilton JV. Analysing proteomic data. Int J Parasitol 2005; 35:543-53. [PMID: 15826646 DOI: 10.1016/j.ijpara.2005.01.013] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.2] [Received: 08/25/2004] [Revised: 01/10/2005] [Accepted: 01/12/2005] [Indexed: 11/23/2022]
Abstract
The rapid growth of proteomics has been made possible by the development of reproducible 2D gels and biological mass spectrometry. However, despite technical improvements, 2D gels are still less than perfectly reproducible and gels have to be aligned so spots for identical proteins appear in the same place. Gels can be warped by a variety of techniques to make them concordant. When gels are manipulated to improve registration, information is lost, so direct methods for gel registration which make use of all available data for spot matching are preferable to indirect ones. In order to identify proteins from gel spots, a property or combination of properties unique to that protein is required. These can then be used to search databases for possible matches. Molecular mass, pI, amino acid composition and short sequence tags can all be used in database searches. Currently the method of choice for protein identification is mass spectrometry. Proteins are eluted from the gels and cleaved with specific endoproteases to produce a series of peptides of different molecular mass. In peptide mass fingerprinting, the peptide profile of the unknown protein is compared with theoretical peptide libraries generated from sequences in the different databases. Tandem mass spectrometry (MS/MS) generates short amino acid sequence tags for the individual peptides. These partial sequences combined with the original peptide masses are then used for database searching, greatly improving specificity. Increasingly, protein identification from MS/MS data is being fully or partially automated. When working with organisms that do not have sequenced genomes (the case with most helminths), protein identification by database searching becomes problematic. 
A number of approaches to cross species protein identification have been suggested, but if the organism being studied is only distantly related to any organism with a sequenced genome then the likelihood of protein identification remains small. The dynamic nature of the proteome means that there really is no such thing as a single representative proteome and a complete set of metadata (data about the data) is going to be required if the full potential of database mining is to be realised in the future.
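The peptide mass fingerprinting idea described in this abstract can be sketched as an in-silico digestion followed by mass matching. The example below is a toy illustration under stated assumptions: trypsin is taken as the endoprotease with the common rule "cleave after K or R unless followed by P", no missed cleavages or modifications are considered, and the two database sequences and the 0.01 Da tolerance are invented for the example. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the terminal H and OH


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, theoretical, tol=0.01):
    """Number of observed masses that fall within tol Da of a theoretical mass."""
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


# Hypothetical two-entry sequence database.
database = {"protA": "AKRPGK", "protB": "GASPVK"}

# Pretend the unknown spot is protA: its peptide masses play the role of the spectrum.
observed = [peptide_mass(p) for p in tryptic_digest(database["protA"])]

scores = {
    name: count_matches(observed, [peptide_mass(p) for p in tryptic_digest(seq)])
    for name, seq in database.items()
}
best_hit = max(scores, key=scores.get)
```

Real search engines score matches probabilistically and allow for missed cleavages and modifications; counting matches is only the simplest possible ranking.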
Affiliation(s)
- J Barrett
- Institute of Biological Sciences, University of Wales, Penglais, Aberystwyth, Ceredigion, Wales SY23 3DA, UK.
|
30
|
Sorzano COS, Thévenaz P, Unser M. Elastic registration of biological images using vector-spline regularization. IEEE Trans Biomed Eng 2005; 52:652-63. [PMID: 15825867 DOI: 10.1109/tbme.2005.844030] [Citation(s) in RCA: 131] [Impact Index Per Article: 6.9] [Indexed: 11/09/2022]
Abstract
We present an elastic registration algorithm for the alignment of biological images. Our method combines and extends some of the best techniques available in the context of medical imaging. We express the deformation field as a B-spline model, which allows us to deal with a rich variety of deformations. We solve the registration problem by minimizing a pixelwise mean-square distance measure between the target image and the warped source. The problem is further constrained by way of a vector-spline regularization which provides some control over two independent quantities that are intrinsic to the deformation: its divergence, and its curl. Our algorithm is also able to handle soft landmark constraints, which is particularly useful when parts of the images contain very little information or when its distribution is uneven. We provide an optimal analytical solution in the case when only landmarks and smoothness considerations are taken into account. We have applied our approach to perform the elastic registration of images such as electrophoretic gels and fly embryos. The validation of the results by experts has been favorable in all cases.
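The ingredients listed in this abstract can be collected into one schematic cost function. The expression below is a hedged reconstruction from the abstract alone: the notation (target image I_t, source image I_s, deformation g, spline coefficients c_k, landmark pairs (x_k, y_k), weights w_l, w_d, w_c) and the exact form of the regularization terms are assumptions, and the cubic B-spline degree is chosen only for illustration.

```latex
\mathbf{g}(\mathbf{x}) = \sum_{\mathbf{k}} \mathbf{c}_{\mathbf{k}}\,
  \beta^{3}\!\left(\frac{\mathbf{x}}{h} - \mathbf{k}\right),
\qquad
E(\mathbf{g}) =
  \underbrace{\frac{1}{|\Omega|}\sum_{\mathbf{x}\in\Omega}
    \bigl(I_t(\mathbf{x}) - I_s(\mathbf{g}(\mathbf{x}))\bigr)^{2}}_{\text{pixelwise mean-square distance}}
  + \underbrace{w_l \sum_{k} \bigl\|\mathbf{g}(\mathbf{x}_k) - \mathbf{y}_k\bigr\|^{2}}_{\text{soft landmarks}}
  + \underbrace{w_d \int \bigl\|\nabla(\operatorname{div}\mathbf{g})\bigr\|^{2}\,d\mathbf{x}
  + w_c \int \bigl\|\nabla(\operatorname{curl}\mathbf{g})\bigr\|^{2}\,d\mathbf{x}}_{\text{vector-spline regularization}}
```

Setting the image term aside leaves only the landmark and smoothness terms, which is the case for which the abstract reports an analytical solution.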
Affiliation(s)
- Carlos O S Sorzano
- Biomedical Imaging Group, Swiss Federal Institute of Technology Lausanne (EPFL), Switzerland.
|
31
|
Dowsey AW, Dunn MJ, Yang GZ. ProteomeGRID: towards a high-throughput proteomics pipeline through opportunistic cluster image computing for two-dimensional gel electrophoresis. Proteomics 2004; 4:3800-12. [PMID: 15478217 DOI: 10.1002/pmic.200300894] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 11/08/2022]
Abstract
The quest for high-throughput proteomics has revealed a number of critical issues. Whilst improved two-dimensional gel electrophoresis (2-DE) sample preparation, staining and imaging issues are being actively pursued by industry, reliable high-throughput spot matching and quantification remains a significant bottleneck in the bioinformatics pipeline, thus restricting the flow of data to mass spectrometry through robotic spot excision and protein digestion. To this end, it is important to establish a full multi-site Grid infrastructure for the processing, archival, standardisation and retrieval of proteomic data and metadata. Particular emphasis needs to be placed on large-scale image mining and statistical cross-validation for reliable, fully automated differential expression analysis, and the development of a statistical 2-DE object model and ontology that underpins the emerging HUPO PSI GPS (Human Proteome Organization Proteomics Standards Initiative General Proteomics Standards). The first step towards this goal is to overcome the computational and communications burden entailed by the image analysis of 2-DE gels with Grid-enabled cluster computing. This paper presents the proTurbo framework as part of the ProteomeGRID, which utilises Condor cluster management combined with CORBA communications and JPEG-LS lossless image compression for task farming. A novel probabilistic eager scheduler has been developed to minimise make-span, where tasks are duplicated in response to the likelihood of the Condor machines' owners evicting them. A 60-gel experiment was pair-wise image registered (3540 tasks) on a 40-machine Linux cluster. Real-world performance and network overhead were gauged, and Poisson-distributed worker evictions were simulated. Our results show a 4:1 lossless and 9:1 near-lossless image compression ratio and so network overhead did not affect other users. 
With 40 workers a 32x speed-up was seen (80% resource efficiency), and the eager scheduler reduced the impact of evictions by 58%.
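The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The interpretation that the 3540 tasks correspond to ordered gel pairs (each gel registered to every other gel, in both directions) is an inference from the numbers, not something the abstract states; the variable names are made up for the example.

```python
gels = 60

# Each gel registered against every other gel, counting both directions.
ordered_pairs = gels * (gels - 1)

# For comparison: the number of distinct unordered pairs.
unordered_pairs = ordered_pairs // 2

workers = 40
speedup = 32

# Resource efficiency = achieved speed-up divided by the number of workers.
efficiency = speedup / workers

print(ordered_pairs, unordered_pairs, efficiency)
```

The ordered-pair count reproduces the 3540 tasks and the ratio reproduces the stated 80% resource efficiency.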
Affiliation(s)
- Andrew W Dowsey
- Royal Society/Wolfson Foundation Medical Image Computing Laboratory, Imperial College London, UK
|
32
|
Gottlieb DM, Schultz J, Bruun SW, Jacobsen S, Søndergaard I. Multivariate approaches in plant science. Phytochemistry 2004; 65:1531-1548. [PMID: 15276450 DOI: 10.1016/j.phytochem.2004.04.008] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 12/18/2003] [Revised: 04/01/2004] [Indexed: 05/24/2023]
Abstract
The objective of proteomics is to get an overview of the proteins expressed at a given point in time in a given tissue and to identify the connection to the biochemical status of that tissue. Therefore sample throughput and analysis time are important issues in proteomics. The concept of proteomics is to encircle the identity of proteins of interest. However, the overall relation between proteins must also be explained. Classical proteomics consists of separation and characterization, based on two-dimensional electrophoresis, trypsin digestion, mass spectrometry and database searching. Characterization includes labor-intensive work in order to manage, handle and analyze data. The field of classical proteomics should therefore be extended to also include handling of large datasets in an objective way. The separation obtained by two-dimensional electrophoresis and mass spectrometry gives rise to huge amounts of data. We present a multivariate approach to the handling of data in proteomics with the advantage that protein patterns can be spotted at an early stage and consequently the proteins selected for sequencing can be selected intelligently. These methods can also be applied to other data-generating protein analysis methods like mass spectrometry and near infrared spectroscopy, and examples of application to these techniques are also presented. Multivariate data analysis can unravel complicated data structures and may thereby relieve the characterization phase in classical proteomics. Traditional statistical methods are not suitable for analysis of the huge amounts of data, where the number of variables exceeds the number of objects. Multivariate data analysis, on the other hand, may uncover the hidden structures present in these data. This study takes its starting point in the field of classical proteomics and shows how multivariate data analysis can lead to faster ways of finding interesting proteins. 
Multivariate analysis has shown interesting results as a supplement to classical proteomics and added a new dimension to the field of proteomics.
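As an illustration of how a multivariate method copes with more variables than objects, the sketch below runs principal component analysis (PCA) via the singular value decomposition on a simulated spot-volume table. The abstract does not name a specific method, so PCA is used here only as a representative choice; the data are randomly generated, and the group sizes, number of spots, and size of the simulated effect are arbitrary assumptions.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """PCA via SVD of the column-centred data matrix (objects x variables)."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, Vt[:n_components], explained[:n_components]


# Simulated data: 10 gels (objects) and 200 spot volumes (variables).
rng = np.random.default_rng(0)
n_per_group, n_spots = 5, 200
control = rng.normal(0.0, 1.0, size=(n_per_group, n_spots))
treated = rng.normal(0.0, 1.0, size=(n_per_group, n_spots))
treated[:, :20] += 5.0  # 20 spots are up-regulated in the treated group

X = np.vstack([control, treated])
scores, loadings, explained = pca_scores(X)

# Spots with the largest absolute loading on PC1 are candidates for identification.
top_spots = np.argsort(np.abs(loadings[0]))[::-1][:20]
```

The scores place each gel in a low-dimensional map where group structure can be seen, and the loadings point to the spots driving that structure, which is the sense in which proteins for sequencing can be chosen early.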
Affiliation(s)
- David M Gottlieb
- Plasma Product Division, Statens Serum Institut, Artillerivej 5, DK-2300 Copenhagen S, Denmark
|