1
|
Synthetic Data Generation for the Development of 2D Gel Electrophoresis Protein Spot Models. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12094393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Two-dimensional electrophoresis gels (2DE, 2DEG) result from separating a protein mixture on a gel according to two molecular properties. Similar proteins concentrate in groups after separation, and these groups appear as dark spots in the captured gel image. Gel images are analyzed to detect distinct spots and determine their peak intensity, background, integrated intensity, and other attributes of interest. One approach to parameterizing protein spots is spot modeling: the spot parameters of interest are obtained after the spot is approximated by a mathematical model. Developing the modeling algorithm requires a rich, diverse, representative dataset. The primary goal of this research is to develop a method for generating a synthetic protein spot dataset that can be used to develop 2DEG image analysis algorithms. The secondary objective is to evaluate the usefulness of the created dataset by developing a neural-network-based protein spot reconstruction algorithm that provides parameterization and denoising functionalities. In this research, a spot modeling algorithm based on autoencoders is developed using only the created synthetic dataset. The algorithm is evaluated on real and synthetic data. Evaluation results show that the created synthetic dataset is effective for the development of protein spot models. The developed algorithm outperformed all baseline algorithms in all experimental cases.
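To make the idea of a synthetic spot dataset concrete, the following is a minimal sketch and not the generator from the paper, whose details are not given in this abstract. It assumes a single axis-aligned 2D Gaussian spot on a flat background with additive Gaussian noise, and it treats the image as inverted so that spots are brighter than the background. The function names, patch size, and parameter ranges are all hypothetical.

```python
import numpy as np


def gaussian_spot(shape, center, sigma, amplitude):
    """Return an axis-aligned 2D Gaussian spot on a zero background."""
    rows, cols = np.indices(shape)
    cy, cx = center
    sy, sx = sigma
    return amplitude * np.exp(
        -((rows - cy) ** 2 / (2 * sy ** 2) + (cols - cx) ** 2 / (2 * sx ** 2))
    )


def synthetic_patch(shape=(32, 32), background=0.1, noise_std=0.02, seed=0):
    """Generate one noisy patch, its clean target, and the spot parameters."""
    rng = np.random.default_rng(seed)
    center = rng.uniform(12, 20, size=2)   # spot centre near the patch middle
    sigma = rng.uniform(2.0, 5.0, size=2)  # vertical and horizontal spread
    amplitude = rng.uniform(0.3, 0.8)      # peak intensity above background
    clean = background + gaussian_spot(shape, center, sigma, amplitude)
    noisy = clean + rng.normal(0.0, noise_std, size=shape)
    params = {"center": center, "sigma": sigma, "amplitude": amplitude}
    return noisy, clean, params


noisy, clean, params = synthetic_patch()
```

Pairs of `noisy` and `clean` patches of this kind could serve as input and target for a denoising model, with `params` as the ground-truth parameterization.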
|
2
|
Two-Dimensional Gel Electrophoresis Image Analysis. Methods Mol Biol 2021; 2361:3-13. [PMID: 34236652 DOI: 10.1007/978-1-0716-1641-3_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2022]
Abstract
Gel-based proteomics is still quite widespread due to its high resolving power; the experimental approach is based on differential analysis, where groups of samples (e.g., control vs. diseased) are compared to identify panels of potential biomarkers. However, the reliability of the result of the differential analysis is deeply influenced by the image analysis procedures applied to the 2D-PAGE maps. The analysis of 2D-PAGE images consists of several steps, such as image preprocessing, spot detection and quantitation, image warping and alignment, and spot matching. Several approaches are present in the literature, and classical or latest-generation commercial software packages exploit different algorithms for each step of the analysis. Here, the most widespread approaches and a comparison of the different strategies are presented.
|
3
|
Abstract
2D-DIGE is still a very widespread technique in proteomics for the identification of panels of biomarkers, as it addresses some important drawbacks of classical two-dimensional gel electrophoresis. However, once 2D gels are obtained, they must undergo a fairly elaborate multistep image analysis procedure before the final differential analysis via univariate and multivariate statistical methods. Here, the main steps of image analysis software are described and the most recent procedures reported in the literature are briefly presented.
Affiliation(s)
- Elisa Robotti
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, 15121, Alessandria, Italy.
- Emilio Marengo
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, 15121, Alessandria, Italy
|
4
|
Marczyk M. Mixture Modeling of 2-D Gel Electrophoresis Spots Enhances the Performance of Spot Detection. IEEE Trans Nanobioscience 2017; 16:91-99. [PMID: 28278480 DOI: 10.1109/tnb.2017.2676725] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/05/2022]
Abstract
2-D gel electrophoresis is the most commonly used method in biomedicine to separate up to thousands of proteins in a complex sample on a single gel. Even though the technique is well known, there is still a need for an efficient and reliable method for detecting protein spots in gel images. In this paper, a three-step algorithm based on a mixture of 2-D normal distribution functions is introduced to improve the efficiency of spot detection performed by existing algorithms, namely the Pinnacle software and the watershed segmentation method. The methods are compared using simulated and real data sets with known true spot positions and different numbers of spots. Fitting a mixture of components to the gel image achieves higher sensitivity in detecting spots, regardless of the method used to find initial conditions for the model parameters, and leads to better overall spot detection performance. With the mixture model, the location of spot centers can be estimated with higher accuracy than with the Pinnacle method. Spot shape modeling gives higher sensitivity in detecting low-intensity spots than the watershed method, which is crucial in the discovery of novel biomarkers.
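As an illustration of fitting a mixture of 2-D normal components to an image, here is a deliberately simplified sketch and not the three-step algorithm of the paper. It assumes the spot centres and a shared isotropic width are already known (for example from an initial detector), so that only the component amplitudes remain and the fit reduces to linear least squares. All names and values are hypothetical.

```python
import numpy as np


def gaussian_basis(shape, centers, sigma):
    """Stack one unit-amplitude isotropic 2-D Gaussian per centre as columns."""
    rows, cols = np.indices(shape)
    columns = []
    for cy, cx in centers:
        g = np.exp(-((rows - cy) ** 2 + (cols - cx) ** 2) / (2 * sigma ** 2))
        columns.append(g.ravel())
    return np.column_stack(columns)


def fit_amplitudes(image, centers, sigma):
    """Least-squares amplitudes of a mixture with fixed centres and width."""
    basis = gaussian_basis(image.shape, centers, sigma)
    amplitudes, *_ = np.linalg.lstsq(basis, image.ravel(), rcond=None)
    return amplitudes


# Two overlapping spots with known amplitudes, used as noise-free test data.
centers = [(10.0, 10.0), (10.0, 15.0)]
true_amplitudes = np.array([1.0, 0.4])
image = (gaussian_basis((24, 24), centers, 3.0) @ true_amplitudes).reshape(24, 24)
estimated = fit_amplitudes(image, centers, 3.0)
```

A full mixture fit would also estimate positions and covariances, which makes the problem nonlinear; this sketch only shows how overlapping spots can be separated once the shapes are fixed.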
|
5
|
Robotti E, Marengo E, Quasso F. Image Pretreatment Tools II: Normalization Techniques for 2-DE and 2-D DIGE. Methods Mol Biol 2016; 1384:91-107. [PMID: 26611411 DOI: 10.1007/978-1-4939-3255-9_6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
Abstract
Gel electrophoresis is usually applied to identify different protein expression profiles in biological samples (e.g., control vs. pathological, control vs. treated). Information about the effect to be investigated (a pathology, a drug, a ripening effect, etc.) is, however, generally confounded with experimental variability, which is quite large in 2-DE and may arise from small variations in sample preparation, reagents, sample loading, electrophoretic conditions, staining, and image acquisition. Obtaining valid quantitative estimates of protein abundances in each map, before the differential analysis, is therefore fundamental to provide robust candidate biomarkers. Normalization procedures are applied to reduce experimental noise and make the images comparable, improving the accuracy of differential analysis. Admittedly, they may deeply influence the final results, and in this respect they have to be applied with care. Here, the most widespread normalization procedures are described as applied to both 2-DE and 2-D Difference Gel Electrophoresis (2-D DIGE) maps.
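The abstract does not say which normalization procedures the chapter covers. As one commonly used example, the sketch below shows total-volume normalization, in which each spot volume is expressed as a fraction of the summed spot volume of its gel. The spot names and volumes are hypothetical.

```python
def normalize_by_total_volume(spot_volumes):
    """Express each spot volume as a fraction of the gel's total spot volume."""
    total = sum(spot_volumes.values())
    if total <= 0:
        raise ValueError("total spot volume must be positive")
    return {spot: volume / total for spot, volume in spot_volumes.items()}


# Hypothetical volumes for the same sample loaded at two different amounts.
gel_a = {"spot_1": 200.0, "spot_2": 600.0, "spot_3": 200.0}
gel_b = {"spot_1": 300.0, "spot_2": 900.0, "spot_3": 300.0}

norm_a = normalize_by_total_volume(gel_a)
norm_b = normalize_by_total_volume(gel_b)
```

After normalization the two gels agree spot by spot, which removes the difference in sample loading. The same step can also distort results when a few very abundant spots dominate the total, which is one reason such procedures need to be applied with care.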
Affiliation(s)
- Elisa Robotti
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, 15121, Alessandria, Italy.
- Emilio Marengo
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, 15121, Alessandria, Italy
- Fabio Quasso
- Department of Sciences and Technological Innovation, University of Piemonte Orientale, Viale Michel 11, 15121, Alessandria, Italy
|
6
|
Kostopoulou E, Zacharia E, Maroulis D. An effective approach for detection and segmentation of protein spots on 2-D gel images. IEEE J Biomed Health Inform 2014; 18:67-76. [PMID: 24403405 DOI: 10.1109/jbhi.2013.2259208] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/08/2022]
Abstract
Two-dimensional gel image analysis is widely recognized as a particularly challenging and arduous process in the proteomics field. The detection and segmentation of protein spots are two significant stages of this process, as they can considerably affect the final biological conclusions of a proteomic experiment. The available techniques and commercial software packages deal with the existing challenges of 2-D gel images with varying degrees of success. Furthermore, they require extensive human intervention, which not only limits throughput but also calls into question the objectivity and reproducibility of results. This paper introduces a novel approach for the detection and segmentation of protein spots on 2-D gel images. The proposed approach is based on 2-D image histograms as well as on 3-D spot morphology. It is automatic and capable of dealing effectively with the most common deficiencies of existing software programs and techniques. Experimental evaluation includes tests on several real and synthetic 2-D gel images produced by different technology setups, containing a total of ∼21,400 spots. Furthermore, the proposed approach has been compared with two commercial software packages as well as with two state-of-the-art techniques. Results have demonstrated the effectiveness of the proposed approach and its superiority over the compared software packages and techniques.
|
7
|
Dowsey AW, English JA, Lisacek F, Morris JS, Yang GZ, Dunn MJ. Image analysis tools and emerging algorithms for expression proteomics. Proteomics 2010; 10:4226-57. [PMID: 21046614 PMCID: PMC3257807 DOI: 10.1002/pmic.200900635] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Received: 09/01/2009] [Accepted: 08/28/2010] [Indexed: 11/11/2022]
Abstract
Since their origins in academic endeavours in the 1970s, computational analysis tools have matured into a number of established commercial packages that underpin research in expression proteomics. In this paper we describe the image analysis pipeline for the established 2-DE technique of protein separation, and by first covering signal analysis for MS, we also explain the current image analysis workflow for the emerging high-throughput 'shotgun' proteomics platform of LC coupled to MS (LC/MS). The bioinformatics challenges for both methods are illustrated and compared, whereas existing commercial and academic packages and their workflows are described from both a user's and a technical perspective. Attention is given to the importance of sound statistical treatment of the resultant quantifications in the search for differential expression. Despite wide availability of proteomics software, a number of challenges have yet to be overcome regarding algorithm accuracy, objectivity and automation, generally due to deterministic spot-centric approaches that discard information early in the pipeline, propagating errors. We review recent advances in signal and image analysis algorithms in 2-DE, MS, LC/MS and Imaging MS. Particular attention is given to wavelet techniques, automated image-based alignment and differential analysis in 2-DE, Bayesian peak mixture models, and functional mixed modelling in MS, and group-wise consensus alignment methods for LC/MS.
Affiliation(s)
- Andrew W. Dowsey
- Institute of Biomedical Engineering, Imperial College London, South Kensington, London SW7 2AZ, U.K
- Jane A. English
- Proteome Research Centre, UCD Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Belfield, Ireland
- Frederique Lisacek
- Proteome Informatics Group, Swiss Institute of Bioinformatics, CMU - 1, rue Michel Servet, CH-1211 Geneva, Switzerland
- Jeffrey S. Morris
- Department of Biostatistics, The University of Texas M.D. Anderson Cancer Center, Houston, Texas 77030-4009, U.S.A
- Guang-Zhong Yang
- Institute of Biomedical Engineering, Imperial College London, South Kensington, London SW7 2AZ, U.K
- Michael J. Dunn
- Proteome Research Centre, UCD Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Belfield, Ireland
|
8
|
Lasso G, Matthiesen R. Computational methods for analysis of two-dimensional gels. Methods Mol Biol 2010; 593:231-62. [PMID: 19957153 DOI: 10.1007/978-1-60327-194-3_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/11/2023]
Abstract
Two-dimensional gel electrophoresis (2D gels) is an essential quantitative proteomics technique that is frequently used to study differences between samples of clinical relevance. Although considered to have a low throughput, 2D gels can separate thousands of proteins in one gel, making it a good complementary method to MS-based protein quantification. The main drawback of the technique is the tendency of large and hydrophobic proteins such as membrane proteins to precipitate in the isoelectric focusing step. Furthermore, tests using different programs with distinct algorithms for 2D-gel analysis have shown inconsistent ratio values. The aim here is therefore to provide a discussion of algorithms described for the analysis of 2D gels.
Affiliation(s)
- Gorka Lasso
- Bioinformatics, Parque Technológico de Bizkaia, Derio, Spain
|
9
|
Grove H, Faergestad EM, Hollung K, Martens H. Improved dynamic range of protein quantification in silver-stained gels by modelling gel images over time. Electrophoresis 2009; 30:1856-62. [PMID: 19517441 DOI: 10.1002/elps.200800568] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]
Abstract
Silver staining is a commonly used protein stain to visualise proteins separated by 2-DE. Despite this, the technique suffers from a limited dynamic range, making the simultaneous quantification of high- and low-abundant proteins difficult. In this paper we take advantage of the fact that silver staining is not an end-point stain by photographing the gels during development. This procedure provides information about the change in measured absorbance for each pixel in the protein spots on the gel. The maximum rate of change was found to be correlated with the amount of applied protein, providing a new way of estimating protein amount in 2-DE gels. We observed an improvement in the dynamic range of silver staining by up to two orders of magnitude.
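The per-pixel quantity described in this abstract, the maximum rate of change in measured absorbance during development, can be sketched as follows. This is an illustrative finite-difference version under the assumption that the gel photographs are already registered and stacked as a `(T, H, W)` array. The function name and the toy data are hypothetical, and the paper's actual modelling may differ.

```python
import numpy as np


def max_rate_of_change(stack, times):
    """Per-pixel maximum of d(absorbance)/dt over an image time series.

    stack: array of shape (T, H, W); times: 1-D array of T acquisition times.
    """
    stack = np.asarray(stack, dtype=float)
    dt = np.diff(np.asarray(times, dtype=float))
    rates = np.diff(stack, axis=0) / dt[:, None, None]
    return rates.max(axis=0)


# Toy series: two pixels that both saturate at 1.0 but develop at different speeds.
times = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
fast = np.minimum(0.5 * times, 1.0)   # saturates after 2 time units
slow = np.minimum(0.25 * times, 1.0)  # saturates after 4 time units
stack = np.stack([fast, slow], axis=1).reshape(len(times), 1, 2)
rate_map = max_rate_of_change(stack, times)
```

In the toy data both pixels end at the same saturated value, so an end-point measurement cannot tell them apart, while the maximum rate still differs by a factor of two.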
Affiliation(s)
- Harald Grove
- Nofima Mat, Norwegian Institute of Food, Fisheries and Aquaculture Research, Ås, Norway.
|
10
|
Langella O, Zivy M. A method based on bead flows for spot detection on 2-D gel images. Proteomics 2008; 8:4914-8. [DOI: 10.1002/pmic.200800644] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/11/2022]
|
11
|
Rye MB, Alsberg BK. A multivariate spot filtering model for two-dimensional gel electrophoresis. Electrophoresis 2008; 29:1369-81. [DOI: 10.1002/elps.200700417] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/06/2022]
|
12
|
Barrett J, Brophy PM, Hamilton JV. Analysing proteomic data. Int J Parasitol 2005; 35:543-53. [PMID: 15826646 DOI: 10.1016/j.ijpara.2005.01.013] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.2] [Received: 08/25/2004] [Revised: 01/10/2005] [Accepted: 01/12/2005] [Indexed: 11/23/2022]
Abstract
The rapid growth of proteomics has been made possible by the development of reproducible 2D gels and biological mass spectrometry. However, despite technical improvements, 2D gels are still less than perfectly reproducible, and gels have to be aligned so that spots for identical proteins appear in the same place. Gels can be warped by a variety of techniques to make them concordant. When gels are manipulated to improve registration, information is lost, so direct methods for gel registration, which make use of all available data for spot matching, are preferable to indirect ones. In order to identify proteins from gel spots, a property or combination of properties that is unique to that protein is required. These can then be used to search databases for possible matches. Molecular mass, pI, amino acid composition and short sequence tags can all be used in database searches. Currently the method of choice for protein identification is mass spectrometry. Proteins are eluted from the gels and cleaved with specific endoproteases to produce a series of peptides of different molecular mass. In peptide mass fingerprinting, the peptide profile of the unknown protein is compared with theoretical peptide libraries generated from sequences in the different databases. Tandem mass spectrometry (MS/MS) generates short amino acid sequence tags for the individual peptides. These partial sequences combined with the original peptide masses are then used for database searching, greatly improving specificity. Increasingly, protein identification from MS/MS data is being fully or partially automated. When working with organisms that do not have sequenced genomes (the case with most helminths), protein identification by database searching becomes problematic. A number of approaches to cross-species protein identification have been suggested, but if the organism being studied is only distantly related to any organism with a sequenced genome, then the likelihood of protein identification remains small. The dynamic nature of the proteome means that there really is no such thing as a single representative proteome, and a complete set of metadata (data about the data) is going to be required if the full potential of database mining is to be realised in the future.
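The peptide mass fingerprinting step described in this abstract can be illustrated with a small sketch. It assumes trypsin as the endoprotease with the common rule "cleave after K or R unless followed by P", uses standard monoisotopic residue masses, and scores a candidate sequence by how many observed masses it explains within a tolerance. The candidate sequences, the tolerance, and the function names are hypothetical; real search engines use considerably more elaborate scoring.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum of a free peptide


def tryptic_peptides(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_score(observed_masses, sequence, tolerance=0.01):
    """Count observed masses explained by the theoretical digest of a sequence."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(
        any(abs(m - t) <= tolerance for t in theoretical) for m in observed_masses
    )


# Hypothetical two-entry "database" and a spectrum simulated from protein_a.
database = {"protein_a": "GASKVLTRMFE", "protein_b": "AKRPGKWYH"}
observed = [peptide_mass(p) for p in tryptic_peptides(database["protein_a"])]
scores = {name: match_score(observed, seq) for name, seq in database.items()}
best = max(scores, key=scores.get)
```

For an organism without a sequenced genome, the database would contain no exact entry, and the scores for even the closest homologue would drop as the sequences diverge, which is the difficulty the abstract points out.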
Affiliation(s)
- J Barrett
- Institute of Biological Sciences, University of Wales, Penglais, Aberystwyth, Ceredigion, Wales SY23 3DA, UK.
|
13
|
Pleissner KP, Hoffmann F, Kriegel K, Wenk C, Wegner S, Sahlström A, Oswald H, Alt H, Fleck E. New algorithmic approaches to protein spot detection and pattern matching in two-dimensional electrophoresis gel databases. Electrophoresis 1999; 20:755-65. [PMID: 10344245 DOI: 10.1002/(sici)1522-2683(19990101)20:4/5<755::aid-elps755>3.0.co;2-6] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.2] [Indexed: 11/08/2022]
Abstract
Protein spot identification in two-dimensional electrophoresis gels can be supported by the comparison of gel images accessible in different World Wide Web two-dimensional electrophoresis (2-DE) gel protein databases. The comparison may be performed either by visual cross-matching between gel images or by automatic recognition of similar protein spot patterns. A prerequisite for the automatic point pattern matching approach is the detection of protein spots yielding the x(s),y(s) coordinates and integrated spot intensities i(s). For this purpose an algorithm is developed based on a combination of hierarchical watershed transformation and feature extraction methods. This approach reduces the strong over-segmentation of spot regions normally produced by watershed transformation. Measures for the ellipticity and curvature are determined as features of spot regions. The resulting spot lists containing x(s),y(s),i(s)-triplets are calculated for a source as well as for a target gel image accessible in 2-DE gel protein databases. After spot detection a matching procedure is applied. Both the matching of a local pattern vs. a full 2-DE gel image and the global matching between full images are discussed. Preset slope and length tolerances of pattern edges serve as matching criteria. The local matching algorithm relies on a data structure derived from the incremental Delaunay triangulation of a point set and a two-step hashing technique. For the incremental construction of triangles the spot intensities are considered in decreasing order. The algorithm needs neither landmarks nor an a priori image alignment. A graphical user interface for spot detection and gel matching is written in the Java programming language for the Internet. The software package called CAROL (http://gelmatching.inf.fu-berlin.de) is realized in a client-server architecture.
Affiliation(s)
- K P Pleissner
- Department of Internal Medicine/Cardiology, Charité, Campus Virchow-Clinic, Humboldt University and German Heart Institute, Berlin.
|