1
|
Salazar OR, Chen K, Melino VJ, Reddy MP, Hřibová E, Čížková J, Beránková D, Arciniegas Vega JP, Cáceres Leal LM, Aranda M, Jaremko L, Jaremko M, Fedoroff NV, Tester M, Schmöckel SM. SOS1 tonoplast neo-localization and the RGG protein SALTY are important in the extreme salinity tolerance of Salicornia bigelovii. Nat Commun 2024; 15:4279. [PMID: 38769297 PMCID: PMC11106269 DOI: 10.1038/s41467-024-48595-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Accepted: 05/07/2024] [Indexed: 05/22/2024] Open
Abstract
The identification of genes involved in salinity tolerance has primarily focused on model plants and crops. However, plants naturally adapted to highly saline environments offer valuable insights into tolerance to extreme salinity. Salicornia plants grow in coastal salt marshes, where their growth is stimulated by NaCl. To understand this tolerance, we generated genome sequences of two Salicornia species and analyzed the transcriptomic and proteomic responses of Salicornia bigelovii to NaCl. Subcellular membrane proteomes reveal that SbiSOS1, a homolog of the well-known SALT-OVERLY-SENSITIVE 1 (SOS1) protein, appears to localize to the tonoplast, consistent with subcellular localization assays in tobacco. This neo-localized protein can pump Na+ into the vacuole, preventing toxicity in the cytosol. We further identify 11 proteins of interest, of which one, SbiSALTY, substantially improves yeast growth on saline media. Structural characterization using NMR identified it as an intrinsically disordered protein, localizing to the endoplasmic reticulum in planta, where it can interact with ribosomes and RNA, stabilizing or protecting them during salt stress.
Affiliation(s)
- Octavio R Salazar
- Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Red Sea Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
| | - Ke Chen
- Rice Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640, China
| | - Vanessa J Melino
- Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
| | - Muppala P Reddy
- Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
| | - Eva Hřibová
- Institute of Experimental Botany of the Czech Academy of Sciences, Centre of Plant Structural and Functional Genomics, Šlechtitelů 31, 77900, Olomouc, Czech Republic
| | - Jana Čížková
- Institute of Experimental Botany of the Czech Academy of Sciences, Centre of Plant Structural and Functional Genomics, Šlechtitelů 31, 77900, Olomouc, Czech Republic
| | - Denisa Beránková
- Institute of Experimental Botany of the Czech Academy of Sciences, Centre of Plant Structural and Functional Genomics, Šlechtitelů 31, 77900, Olomouc, Czech Republic
| | - Juan Pablo Arciniegas Vega
- Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
| | - Lina María Cáceres Leal
- Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
| | - Manuel Aranda
- Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Red Sea Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
| | - Lukasz Jaremko
- Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
| | - Mariusz Jaremko
- Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia
| | - Nina V Fedoroff
- Department of Biology, Penn State University, University Park, PA, 16801, US
| | - Mark Tester
- Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia.
- Center for Desert Agriculture, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia.
| | - Sandra M Schmöckel
- Department Physiology of Yield Stability, Institute of Crop Science, University of Hohenheim, Fruwirthstr. 21, 70599, Stuttgart, Germany
| |
|
2
|
Moloney NM, Barylyuk K, Tromer E, Crook OM, Breckels LM, Lilley KS, Waller RF, MacGregor P. Mapping diversity in African trypanosomes using high resolution spatial proteomics. Nat Commun 2023; 14:4401. [PMID: 37479728 PMCID: PMC10361982 DOI: 10.1038/s41467-023-40125-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Accepted: 07/06/2023] [Indexed: 07/23/2023] Open
Abstract
African trypanosomes are dixenous eukaryotic parasites that impose a significant human and veterinary disease burden on sub-Saharan Africa. Diversity between species and life-cycle stages is concomitant with distinct host and tissue tropisms within this group. Here, the spatial proteomes of two African trypanosome species, Trypanosoma brucei and Trypanosoma congolense, are mapped across two life-stages. The four resulting datasets provide evidence of expression of approximately 5500 proteins per cell-type. Over 2500 proteins per cell-type are classified to specific subcellular compartments, providing four comprehensive spatial proteomes. Comparative analysis reveals key routes of parasitic adaptation to different biological niches and provides insight into the molecular basis for diversity within and between these pathogen species.
Affiliation(s)
- Nicola M Moloney
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
| | | | - Eelco Tromer
- Cell Biochemistry, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, 9747 AG, Groningen, Netherlands
| | - Oliver M Crook
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
- Department of Statistics, University of Oxford, Oxford, OX1 3LB, UK
| | - Lisa M Breckels
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
| | - Kathryn S Lilley
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
| | - Ross F Waller
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
| | - Paula MacGregor
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK.
- School of Biological Sciences, University of Bristol, Bristol, BS8 1TQ, UK.
| |
|
3
|
Guérin A, Strelau KM, Barylyuk K, Wallbank BA, Berry L, Crook OM, Lilley KS, Waller RF, Striepen B. Cryptosporidium uses multiple distinct secretory organelles to interact with and modify its host cell. Cell Host Microbe 2023; 31:650-664.e6. [PMID: 36958336 DOI: 10.1016/j.chom.2023.03.001] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 11/25/2022] [Revised: 02/09/2023] [Accepted: 02/28/2023] [Indexed: 03/25/2023]
Abstract
Cryptosporidium is a leading cause of diarrheal disease in children and an important contributor to early childhood mortality. The parasite invades and extensively remodels intestinal epithelial cells, building an elaborate interface structure. How this occurs at the molecular level and the contributing parasite factors are largely unknown. Here, we generated a whole-cell spatial proteome of the Cryptosporidium sporozoite and used genetic and cell biological experimentation to discover the Cryptosporidium-secreted effector proteome. These findings reveal multiple organelles, including an original secretory organelle, and generate numerous compartment markers by tagging native gene loci. We show that secreted proteins are delivered to the parasite-host interface, where they assemble into different structures including a ring that anchors the parasite into its unique epicellular niche. Cryptosporidium thus uses a complex set of secretion systems during and following invasion that act in concert to subjugate its host cell.
Affiliation(s)
- Amandine Guérin
- Department of Pathobiology, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Katherine M Strelau
- Department of Pathobiology, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | | | - Bethan A Wallbank
- Department of Pathobiology, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Laurence Berry
- LPHI, CNRS, Université de Montpellier, Montpellier 34095, France
| | - Oliver M Crook
- Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
| | - Kathryn S Lilley
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1QW, UK
| | - Ross F Waller
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1QW, UK
| | - Boris Striepen
- Department of Pathobiology, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
| |
|
4
|
Mou M, Pan Z, Lu M, Sun H, Wang Y, Luo Y, Zhu F. Application of Machine Learning in Spatial Proteomics. J Chem Inf Model 2022; 62:5875-5895. [PMID: 36378082 DOI: 10.1021/acs.jcim.2c01161] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 11/16/2022]
Abstract
Spatial proteomics is an interdisciplinary field that investigates the localization and dynamics of proteins, and it has gained extensive attention in recent years, especially subcellular proteomics. Considerable evidence indicates that the subcellular localization of proteins is associated with various cellular processes and disease progression. Mass spectrometry (MS)-based and imaging-based experimental approaches have been developed to acquire large-scale spatial proteomic data. To allow the reliable analysis of increasingly complex spatial proteomics data, machine learning (ML) methods have been widely used in both MS-based and imaging-based spatial proteomic data analysis pipelines. Here, we comprehensively survey the applications of ML in spatial proteomics from the following aspects: (1) data resources for the spatial proteome are comprehensively introduced; (2) the roles of different ML algorithms in data analysis pipelines are elaborated; (3) successful applications of spatial proteomics and several analytical tools integrating ML methods are presented; (4) challenges existing in modern ML-based spatial proteomics studies are discussed. This review provides guidelines for researchers seeking to apply ML methods to analyze spatial proteomic data and can facilitate insightful understanding of cell biology as well as future research in the medical and drug discovery communities.
Affiliation(s)
- Minjie Mou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Ziqi Pan
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Mingkun Lu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Huaicheng Sun
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Yunxia Wang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Yongchao Luo
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| |
|
5
|
Braccia C, Christopher JA, Crook OM, Breckels LM, Queiroz RML, Liessi N, Tomati V, Capurro V, Bandiera T, Baldassari S, Pedemonte N, Lilley KS, Armirotti A. CFTR Rescue by Lumacaftor (VX-809) Induces an Extensive Reorganization of Mitochondria in the Cystic Fibrosis Bronchial Epithelium. Cells 2022; 11:1938. [PMID: 35741067 PMCID: PMC9222197 DOI: 10.3390/cells11121938] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/10/2022] [Revised: 06/07/2022] [Accepted: 06/12/2022] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND: Cystic Fibrosis (CF) is a genetic disorder affecting around 1 in every 3000 newborns. In the most common mutation, F508del, the defective anion channel, CFTR, is prevented from reaching the plasma membrane (PM) by the quality control system of the cell. Little is known about how CFTR pharmacological rescue impacts the cell proteome. METHODS: We used high-resolution mass spectrometry, differential ultracentrifugation, machine learning and bioinformatics to investigate changes in both the expression and localization of the human bronchial epithelium CF model (F508del-CFTR CFBE41o-) proteome following treatment with VX-809 (Lumacaftor), a drug able to improve the trafficking of CFTR. RESULTS: The data suggested no stark changes in protein expression, yet subtle localization changes of proteins of the mitochondria and peroxisomes were detected. We then used high-content confocal microscopy to further investigate the morphological and compositional changes of peroxisomes and mitochondria under these conditions, as well as in patient-derived primary cells. We profiled several thousand proteins and we determined the subcellular localization data for around 5000 of them using the LOPIT-DC spatial proteomics protocol. CONCLUSIONS: We observed that treatment with VX-809 induces extensive structural and functional remodelling of mitochondria and peroxisomes that resembles the phenotype of healthy cells. Our data suggest additional rescue mechanisms of VX-809 beyond the correction of aberrant folding of F508del-CFTR and subsequent trafficking to the PM.
Affiliation(s)
- Clarissa Braccia
- D3 PharmaChemistry, Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genova, Italy; (C.B.); (T.B.)
| | - Josie A. Christopher
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK; (J.A.C.); (O.M.C.); (L.M.B.); (R.M.L.Q.)
| | - Oliver M. Crook
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK; (J.A.C.); (O.M.C.); (L.M.B.); (R.M.L.Q.)
- Department of Statistics, University of Oxford, 29 St Giles’, Oxford OX1 3LB, UK
| | - Lisa M. Breckels
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK; (J.A.C.); (O.M.C.); (L.M.B.); (R.M.L.Q.)
| | - Rayner M. L. Queiroz
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK; (J.A.C.); (O.M.C.); (L.M.B.); (R.M.L.Q.)
| | - Nara Liessi
- Analytical Chemistry Facility, Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genova, Italy;
| | - Valeria Tomati
- UOC Genetica Medica, IRCCS Istituto Giannina Gaslini, Via Gerolamo Gaslini 5, 16147 Genova, Italy; (V.T.); (V.C.); (S.B.)
| | - Valeria Capurro
- UOC Genetica Medica, IRCCS Istituto Giannina Gaslini, Via Gerolamo Gaslini 5, 16147 Genova, Italy; (V.T.); (V.C.); (S.B.)
| | - Tiziano Bandiera
- D3 PharmaChemistry, Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genova, Italy; (C.B.); (T.B.)
| | - Simona Baldassari
- UOC Genetica Medica, IRCCS Istituto Giannina Gaslini, Via Gerolamo Gaslini 5, 16147 Genova, Italy; (V.T.); (V.C.); (S.B.)
| | - Nicoletta Pedemonte
- UOC Genetica Medica, IRCCS Istituto Giannina Gaslini, Via Gerolamo Gaslini 5, 16147 Genova, Italy; (V.T.); (V.C.); (S.B.)
| | - Kathryn S. Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK; (J.A.C.); (O.M.C.); (L.M.B.); (R.M.L.Q.)
| | - Andrea Armirotti
- Analytical Chemistry Facility, Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genova, Italy;
| |
|
6
|
Mulvey CM, Breckels LM, Crook OM, Sanders DJ, Ribeiro ALR, Geladaki A, Christoforou A, Britovšek NK, Hurrell T, Deery MJ, Gatto L, Smith AM, Lilley KS. Spatiotemporal proteomic profiling of the pro-inflammatory response to lipopolysaccharide in the THP-1 human leukaemia cell line. Nat Commun 2021; 12:5773. [PMID: 34599159 PMCID: PMC8486773 DOI: 10.1038/s41467-021-26000-9] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 03/02/2021] [Accepted: 09/07/2021] [Indexed: 02/07/2023] Open
Abstract
Protein localisation and translocation between intracellular compartments underlie almost all physiological processes. The hyperLOPIT proteomics platform combines mass spectrometry with state-of-the-art machine learning to map the subcellular location of thousands of proteins simultaneously. We combine global proteome analysis with hyperLOPIT in a fully Bayesian framework to elucidate spatiotemporal proteomic changes during a lipopolysaccharide (LPS)-induced inflammatory response. We report a highly dynamic proteome in terms of both protein abundance and subcellular localisation, with alterations in the interferon response, endo-lysosomal system, plasma membrane reorganisation and cell migration. Proteins not previously associated with an LPS response were found to relocalise upon stimulation, the functional consequences of which are still unclear. Quantifying proteome-wide uncertainty through Bayesian modelling reveals a necessary role for protein relocalisation and the importance of taking a holistic overview of the LPS-driven immune response. The data are showcased as an interactive application freely available to the scientific community.
Affiliation(s)
- Claire M Mulvey
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK
- Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, CB2 0RE, UK
| | - Lisa M Breckels
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK
| | - Oliver M Crook
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK
- MRC Biostatistics Unit, Cambridge Institute for Public Health, Forvie Site, Robinson Way, Cambridge, CB2 0SR, UK
| | - David J Sanders
- Department of Microbial Diseases, Eastman Dental Institute, University College London, Royal Free Campus, Rowland Hill Street, London, NW3 2PF, UK
| | - Andre L R Ribeiro
- Department of Microbial Diseases, Eastman Dental Institute, University College London, Royal Free Campus, Rowland Hill Street, London, NW3 2PF, UK
| | - Aikaterini Geladaki
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK
| | | | - Nina Kočevar Britovšek
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK
- Lek d.d., Kolodvorska 27, Mengeš, 1234, Slovenia
| | - Tracey Hurrell
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK
| | - Michael J Deery
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK
| | - Laurent Gatto
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK
- de Duve Institute, UCLouvain, Avenue Hippocrate 75, Brussels, 1200, Belgium
| | - Andrew M Smith
- Department of Microbial Diseases, Eastman Dental Institute, University College London, Royal Free Campus, Rowland Hill Street, London, NW3 2PF, UK.
| | - Kathryn S Lilley
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre and Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK.
| |
|
7
|
Rahmatbakhsh M, Gagarinova A, Babu M. Bioinformatic Analysis of Temporal and Spatial Proteome Alternations During Infections. Front Genet 2021; 12:667936. [PMID: 34276775 PMCID: PMC8283032 DOI: 10.3389/fgene.2021.667936] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/15/2021] [Accepted: 06/08/2021] [Indexed: 12/13/2022] Open
Abstract
Microbial pathogens have evolved numerous mechanisms to hijack host systems, thus causing disease. This is mediated by alterations in the combined host-pathogen proteome in time and space. Mass spectrometry-based proteomics approaches have been developed and tailored to map disease progression. The result is complex multidimensional data that pose numerous analytic challenges for downstream interpretation. However, a systematic review of approaches for the downstream analysis of such data has been lacking in the field. In this review, we detail the steps of a typical temporal and spatial analysis, including data pre-processing steps (i.e., quality control, data normalization, the imputation of missing values, and dimensionality reduction), different statistical and machine learning approaches, validation, interpretation, and the extraction of biological information from mass spectrometry data. We also discuss current best practices for these steps based on a collection of independent studies to guide users in selecting the most suitable strategies for their dataset and analysis objectives. We also compiled a list of commonly used R software packages for each step of the analysis. These could be easily integrated into one's analysis pipeline. Furthermore, we guide readers through various analysis steps by applying these workflows to mock and host-pathogen interaction data from public datasets. The workflows presented in this review will serve as an introduction for data analysis novices, while also helping established users update their data analysis pipelines. We conclude the review by discussing future directions and developments in temporal and spatial proteomics and data analysis approaches. Data analysis code prepared for this review is available from https://github.com/BabuLab-UofR/TempSpac, where guidelines and sample datasets are also offered for testing purposes.
Affiliation(s)
| | - Alla Gagarinova
- Department of Biochemistry, Microbiology, & Immunology, University of Saskatchewan, Saskatoon, SK, Canada
| | - Mohan Babu
- Department of Biochemistry, University of Regina, Regina, SK, Canada
| |
|
8
|
Richards AL, Eckhardt M, Krogan NJ. Mass spectrometry-based protein-protein interaction networks for the study of human diseases. Mol Syst Biol 2021; 17:e8792. [PMID: 33434350 PMCID: PMC7803364 DOI: 10.15252/msb.20188792] [Citation(s) in RCA: 88] [Impact Index Per Article: 29.3] [Received: 06/28/2019] [Revised: 09/23/2020] [Accepted: 11/03/2020] [Indexed: 12/13/2022] Open
Abstract
A better understanding of the molecular mechanisms underlying disease is key for expediting the development of novel therapeutic interventions. Disease mechanisms are often mediated by interactions between proteins. Insights into the physical rewiring of protein-protein interactions in response to mutations, pathological conditions, or pathogen infection can advance our understanding of disease etiology, progression, and pathogenesis and can lead to the identification of potential druggable targets. Advances in quantitative mass spectrometry (MS)-based approaches have allowed unbiased mapping of these disease-mediated changes in protein-protein interactions on a global scale. Here, we review MS techniques that have been instrumental for the identification of protein-protein interactions at a system-level, and we discuss the challenges associated with these methodologies as well as novel MS advancements that aim to address these challenges. An overview of examples from diverse disease contexts illustrates the potential of MS-based protein-protein interaction mapping approaches for revealing disease mechanisms, pinpointing new therapeutic targets, and eventually moving toward personalized applications.
Affiliation(s)
- Alicia L Richards
- Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, CA, USA
- J. David Gladstone Institutes, San Francisco, CA, USA
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA
| | - Manon Eckhardt
- Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, CA, USA
- J. David Gladstone Institutes, San Francisco, CA, USA
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA
| | - Nevan J Krogan
- Quantitative Biosciences Institute (QBI), University of California San Francisco, San Francisco, CA, USA
- J. David Gladstone Institutes, San Francisco, CA, USA
- Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA
| |
|
9
|
Shin JJH, Crook OM, Borgeaud AC, Cattin-Ortolá J, Peak-Chew SY, Breckels LM, Gillingham AK, Chadwick J, Lilley KS, Munro S. Spatial proteomics defines the content of trafficking vesicles captured by golgin tethers. Nat Commun 2020; 11:5987. [PMID: 33239640 PMCID: PMC7689464 DOI: 10.1038/s41467-020-19840-4] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 01/09/2020] [Accepted: 10/27/2020] [Indexed: 02/07/2023] Open
Abstract
Intracellular traffic between compartments of the secretory and endocytic pathways is mediated by vesicle-based carriers. The proteomes of carriers destined for many organelles are ill-defined because the vesicular intermediates are transient, low-abundance and difficult to purify. Here, we combine vesicle relocalisation with organelle proteomics and Bayesian analysis to define the content of different endosome-derived vesicles destined for the trans-Golgi network (TGN). The golgin coiled-coil proteins golgin-97 and GCC88, shown previously to capture endosome-derived vesicles at the TGN, were individually relocalised to mitochondria and the content of the subsequently re-routed vesicles was determined by organelle proteomics. Our findings reveal 45 integral and 51 peripheral membrane proteins re-routed by golgin-97, evidence for a distinct class of vesicles shared by golgin-97 and GCC88, and various cargoes specific to individual golgins. These results illustrate a general strategy for analysing intracellular sub-proteomes by combining acute cellular re-wiring with high-resolution spatial proteomics.
Affiliation(s)
- John J H Shin
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK.
| | - Oliver M Crook
- The Milner Therapeutics Institute, University of Cambridge, Cambridge, CB2 0AW, UK
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK
- MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SR, UK
| | - Alicia C Borgeaud
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
| | - Jérôme Cattin-Ortolá
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
| | - Sew Y Peak-Chew
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
| | - Lisa M Breckels
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK
| | - Alison K Gillingham
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
| | - Jessica Chadwick
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK
| | - Kathryn S Lilley
- The Milner Therapeutics Institute, University of Cambridge, Cambridge, CB2 0AW, UK
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK
| | - Sean Munro
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK.
| |
|
10
|
Gatto L, Breckels LM, Lilley KS. Assessing sub-cellular resolution in spatial proteomics experiments. Curr Opin Chem Biol 2019; 48:123-149. [PMID: 30711721 PMCID: PMC6391913 DOI: 10.1016/j.cbpa.2018.11.015] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 08/10/2018] [Revised: 11/09/2018] [Accepted: 11/19/2018] [Indexed: 12/04/2022]
Abstract
The sub-cellular localisation of a protein is vital in defining its function, and a protein's mis-localisation is known to lead to adverse effects. As a result, numerous experimental techniques and datasets have been published, with the aim of deciphering the localisation of proteins at various scales and resolutions, including high profile mass spectrometry-based efforts. Here, we present a meta-analysis assessing and comparing the sub-cellular resolution of 29 such mass spectrometry-based spatial proteomics experiments using a newly developed tool termed QSep. Our goal is to provide a simple quantitative report of how well spatial proteomics experiments resolve the sub-cellular niches they describe, to inform and guide developers and users of such methods.
Affiliation(s)
- Laurent Gatto
- Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK; Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK; de Duve Institute, UCLouvain, Avenue Hippocrate 75, 1200 Brussels, Belgium.
| | - Lisa M Breckels
- Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK; Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
| | - Kathryn S Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
|
11
|
Nightingale DJ, Geladaki A, Breckels LM, Oliver SG, Lilley KS. The subcellular organisation of Saccharomyces cerevisiae. Curr Opin Chem Biol 2019; 48:86-95. [PMID: 30503867 PMCID: PMC6391909 DOI: 10.1016/j.cbpa.2018.10.026] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 08/14/2018] [Revised: 10/29/2018] [Accepted: 10/31/2018] [Indexed: 01/06/2023]
Abstract
Subcellular protein localisation is essential for the mechanisms that govern cellular homeostasis. The ability to understand processes leading to this phenomenon will therefore enhance our understanding of cellular function. Here we review recent developments in this field with regard to mass spectrometry, fluorescence microscopy and computational prediction methods. We highlight relative strengths and limitations of current methodologies, focussing particularly on studies in the yeast Saccharomyces cerevisiae. We further present the first cell-wide spatial proteome map of S. cerevisiae, generated using hyperLOPIT, a mass spectrometry-based protein correlation profiling technique. We compare protein subcellular localisation assignments from this map with two published fluorescence microscopy studies and show that confidence in localisation assignment is attained using multiple orthogonal methods that provide complementary data.
Affiliation(s)
- Daniel JH Nightingale
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, United Kingdom; Cambridge Systems Biology Centre, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1GA, United Kingdom
| | - Aikaterini Geladaki
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, United Kingdom; Cambridge Systems Biology Centre, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1GA, United Kingdom; Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, United Kingdom
| | - Lisa M Breckels
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, United Kingdom; Cambridge Systems Biology Centre, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1GA, United Kingdom
| | - Stephen G Oliver
- Cambridge Systems Biology Centre, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1GA, United Kingdom
| | - Kathryn S Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, United Kingdom; Cambridge Systems Biology Centre, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1GA, United Kingdom.
|
12
|
Pankow S, Martínez-Bartolomé S, Bamberger C, Yates JR. Understanding molecular mechanisms of disease through spatial proteomics. Curr Opin Chem Biol 2018; 48:19-25. [PMID: 30308467 DOI: 10.1016/j.cbpa.2018.09.016] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 08/11/2018] [Revised: 09/17/2018] [Accepted: 09/19/2018] [Indexed: 02/07/2023]
Abstract
Mammalian cells are organized into different compartments that separate and facilitate physiological processes by providing specialized local environments and allowing different, otherwise incompatible biological processes to be carried out simultaneously. Proteins are targeted to these subcellular locations where they fulfill specialized, compartment-specific functions. Spatial proteomics aims to localize and quantify proteins within subcellular structures.
Affiliation(s)
- Sandra Pankow
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, 92037, United States
| | | | - Casimir Bamberger
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, 92037, United States
| | - John R Yates
- Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, 92037, United States.
|
13
|
Impact of Dietary, Socioeconomic, and Physical Factors on Obese and Overweight Schoolchildren Living in Sidi-Bel-Abbes (West of Algeria) and Ain Defla (Centre). ROMANIAN JOURNAL OF DIABETES NUTRITION AND METABOLIC DISEASES 2018. [DOI: 10.2478/rjdnmd-2018-0004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/20/2022] Open
Abstract
Background and aims: The aim of the current study was to assess the impact of environmental factors (diet, socio-economic status, and physical activity) on a group of obese children living in Ain-Defla (Center Algeria) and Sidi-Bel-Abbes (West Algeria).
Material and methods: The protocol was carried out on a cohort of 125 school children aged 5 to 11 years, including 64 boys and 61 girls, in Ain Defla, and 139 school children, including 93 boys and 46 girls, in Sidi-Bel-Abbes. To classify obesity and overweight, we referred to the International Obesity Task Force and the French reference curves.
Results: Regarding dietary intake, our results showed that 34% of students from both regions ate breakfast, compared to 66% who did not. Furthermore, 73% of students skipped at least one meal, whereas 23% respected the meal frequency, i.e. 4 meals a day. Regarding socio-economic factors and physical activity, our findings showed that obesity rates were high (36%) among children whose fathers are workers, and higher still (88%) among children whose mothers are housewives. The parents' education level was inversely related to Body Mass Index. Body Mass Index was also inversely related to physical activity, and the children studied used screen devices for long periods.
Conclusions: Our study showed a positive relationship between obesity and overweight and environmental factors.
|
14
|
Abstract
The organization of eukaryotic cells into distinct subcompartments is vital for all functional processes, and aberrant protein localization is a hallmark of many diseases. Microscopy methods, although powerful, are usually low-throughput and dependent on the availability of fluorescent fusion proteins or highly specific and sensitive antibodies. One method that provides a global picture of the cell is localization of organelle proteins by isotope tagging (LOPIT), which combines biochemical cell fractionation using density gradient ultracentrifugation with multiplexed quantitative proteomics mass spectrometry, allowing simultaneous determination of the steady-state distribution of hundreds of proteins within organelles. Proteins are assigned to organelles based on the similarity of their gradient distribution to those of well-annotated organelle marker proteins. We have substantially re-developed our original LOPIT protocol (published by Nature Protocols in 2006) to enable the subcellular localization of thousands of proteins per experiment (hyperLOPIT), including spatial resolution at the suborganelle and large protein complex level. This Protocol Extension article integrates all elements of the hyperLOPIT pipeline, including an additional enrichment strategy for chromatin, extended multiplexing capacity of isobaric mass tags, state-of-the-art mass spectrometry methods and multivariate machine-learning approaches for analysis of spatial proteomics data. We have also created an open-source infrastructure to support analysis of quantitative mass-spectrometry-based spatial proteomics data (http://bioconductor.org/packages/pRoloc) and an accompanying interactive visualization framework (http://www.bioconductor.org/packages/pRolocGUI). The procedure we outline here is applicable to any cell culture system and requires ∼1 week to complete sample preparation steps, ∼2 d for mass spectrometry data acquisition and 1-2 d for data analysis and downstream informatics.
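The marker-based assignment step described in this abstract can be illustrated with a minimal sketch. It is not the pRoloc implementation: it assumes each protein is summarised as a vector of relative abundances across gradient fractions, and it uses Pearson correlation to the mean marker profile as a stand-in similarity measure. The function names, the organelle labels and the toy profiles are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_profile(profiles):
    """Fraction-wise mean of several marker profiles."""
    return [sum(col) / len(profiles) for col in zip(*profiles)]


def assign_organelle(profile, markers):
    """Return the organelle whose mean marker profile correlates best with `profile`."""
    centroids = {org: mean_profile(ps) for org, ps in markers.items()}
    scores = {org: pearson(profile, c) for org, c in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical toy data: four gradient fractions, two marker proteins per organelle.
markers = {
    "ER": [[0.50, 0.30, 0.15, 0.05], [0.45, 0.35, 0.15, 0.05]],
    "mitochondrion": [[0.05, 0.15, 0.30, 0.50], [0.05, 0.10, 0.35, 0.50]],
}
query = [0.48, 0.32, 0.14, 0.06]
organelle, score = assign_organelle(query, markers)
print(organelle, round(score, 3))
```

In practice a classifier trained on many markers, with a score threshold for leaving proteins unassigned, would replace the single best-correlation rule used here.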
|
15
|
Thul PJ, Åkesson L, Wiking M, Mahdessian D, Geladaki A, Ait Blal H, Alm T, Asplund A, Björk L, Breckels LM, Bäckström A, Danielsson F, Fagerberg L, Fall J, Gatto L, Gnann C, Hober S, Hjelmare M, Johansson F, Lee S, Lindskog C, Mulder J, Mulvey CM, Nilsson P, Oksvold P, Rockberg J, Schutten R, Schwenk JM, Sivertsson Å, Sjöstedt E, Skogs M, Stadler C, Sullivan DP, Tegel H, Winsnes C, Zhang C, Zwahlen M, Mardinoglu A, Pontén F, von Feilitzen K, Lilley KS, Uhlén M, Lundberg E. A subcellular map of the human proteome. Science 2017; 356:science.aal3321. [PMID: 28495876 DOI: 10.1126/science.aal3321] [Citation(s) in RCA: 1649] [Impact Index Per Article: 235.6] [Received: 11/02/2016] [Accepted: 03/31/2017] [Indexed: 12/13/2022]
Abstract
Resolving the spatial distribution of the human proteome at a subcellular level can greatly increase our understanding of human biology and disease. Here we present a comprehensive image-based map of subcellular protein distribution, the Cell Atlas, built by integrating transcriptomics and antibody-based immunofluorescence microscopy with validation by mass spectrometry. Mapping the in situ localization of 12,003 human proteins at a single-cell level to 30 subcellular structures enabled the definition of the proteomes of 13 major organelles. Exploration of the proteomes revealed single-cell variations in abundance or spatial distribution and localization of about half of the proteins to multiple compartments. This subcellular map can be used to refine existing protein-protein interaction networks and provides an important resource to deconvolute the highly complex architecture of the human cell.
Affiliation(s)
- Peter J Thul
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Lovisa Åkesson
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Mikaela Wiking
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Diana Mahdessian
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Aikaterini Geladaki
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
| | - Hammou Ait Blal
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Tove Alm
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Anna Asplund
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, SE-751 85 Uppsala, Sweden
| | - Lars Björk
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Lisa M Breckels
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
- Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Anna Bäckström
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Frida Danielsson
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Linn Fagerberg
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Jenny Fall
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Laurent Gatto
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
- Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Christian Gnann
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Sophia Hober
- Department of Proteomics, School of Biotechnology, KTH Royal Institute of Technology, SE-106 91 Stockholm, Sweden
| | - Martin Hjelmare
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Fredric Johansson
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Sunjae Lee
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Cecilia Lindskog
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, SE-751 85 Uppsala, Sweden
| | - Jan Mulder
- Science for Life Laboratory, Department of Neuroscience, Karolinska Institute, SE-171 77 Stockholm, Sweden
| | - Claire M Mulvey
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Peter Nilsson
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Per Oksvold
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Johan Rockberg
- Department of Proteomics, School of Biotechnology, KTH Royal Institute of Technology, SE-106 91 Stockholm, Sweden
| | - Rutger Schutten
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Jochen M Schwenk
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Åsa Sivertsson
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Evelina Sjöstedt
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, SE-751 85 Uppsala, Sweden
| | - Marie Skogs
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Charlotte Stadler
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Devin P Sullivan
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Hanna Tegel
- Department of Proteomics, School of Biotechnology, KTH Royal Institute of Technology, SE-106 91 Stockholm, Sweden
| | - Casper Winsnes
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Cheng Zhang
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Martin Zwahlen
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Adil Mardinoglu
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Fredrik Pontén
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, SE-751 85 Uppsala, Sweden
| | - Kalle von Feilitzen
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Kathryn S Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Mathias Uhlén
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden.
| | - Emma Lundberg
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden.
|
16
|
Breckels LM, Mulvey CM, Lilley KS, Gatto L. A Bioconductor workflow for processing and analysing spatial proteomics data. F1000Res 2016; 5:2926. [PMID: 30079225 PMCID: PMC6053703 DOI: 10.12688/f1000research.10411.2] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Accepted: 05/31/2018] [Indexed: 01/16/2023] Open
Abstract
Spatial proteomics is the systematic study of protein sub-cellular localisation. In this workflow, we describe the analysis of a typical quantitative mass spectrometry-based spatial proteomics experiment using the MSnbase and pRoloc Bioconductor package suite. To walk the user through the computational pipeline, we use a recently published experiment predicting protein sub-cellular localisation in pluripotent embryonic mouse stem cells. We describe the software infrastructure at hand, importing and processing data, quality control, sub-cellular marker definition, visualisation and interactive exploration. We then demonstrate the application and interpretation of statistical learning methods, including novelty detection using semi-supervised learning, classification, clustering and transfer learning and conclude the pipeline with data export. The workflow is aimed at beginners who are familiar with proteomics in general and spatial proteomics in particular.
Affiliation(s)
- Lisa M. Breckels
- Computational Proteomics Unit, Cambridge Systems Biology Centre, University of Cambridge, Cambridge, UK
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Claire M. Mulvey
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Kathryn S. Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Laurent Gatto
- Computational Proteomics Unit, Cambridge Systems Biology Centre, University of Cambridge, Cambridge, UK
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
|
17
|
Breckels LM, Mulvey CM, Lilley KS, Gatto L. A Bioconductor workflow for processing and analysing spatial proteomics data. F1000Res 2016; 5:2926. [PMID: 30079225 PMCID: PMC6053703 DOI: 10.12688/f1000research.10411.1] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Accepted: 12/19/2016] [Indexed: 12/16/2022] Open
Abstract
Spatial proteomics is the systematic study of protein sub-cellular localisation. In this workflow, we describe the analysis of a typical quantitative mass spectrometry-based spatial proteomics experiment using the MSnbase and pRoloc Bioconductor package suite. To walk the user through the computational pipeline, we use a recently published experiment predicting protein sub-cellular localisation in pluripotent embryonic mouse stem cells. We describe the software infrastructure at hand, importing and processing data, quality control, sub-cellular marker definition, visualisation and interactive exploration. We then demonstrate the application and interpretation of statistical learning methods, including novelty detection using semi-supervised learning, classification, clustering and transfer learning and conclude the pipeline with data export. The workflow is aimed at beginners who are familiar with proteomics in general and spatial proteomics in particular.
Affiliation(s)
- Lisa M. Breckels
- Computational Proteomics Unit, Cambridge Systems Biology Centre, University of Cambridge, Cambridge, UK
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Claire M. Mulvey
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Kathryn S. Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Laurent Gatto
- Computational Proteomics Unit, Cambridge Systems Biology Centre, University of Cambridge, Cambridge, UK
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
|
18
|
Mardakheh FK, Sailem HZ, Kümper S, Tape CJ, McCully RR, Paul A, Anjomani-Virmouni S, Jørgensen C, Poulogiannis G, Marshall CJ, Bakal C. Proteomics profiling of interactome dynamics by colocalisation analysis (COLA). MOLECULAR BIOSYSTEMS 2016; 13:92-105. [PMID: 27824369 PMCID: PMC5315029 DOI: 10.1039/c6mb00701e] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 10/12/2016] [Accepted: 11/01/2016] [Indexed: 12/27/2022]
Abstract
Localisation and protein function are intimately linked in eukaryotes, as proteins are localised to specific compartments where they come into proximity of other functionally relevant proteins. Significant co-localisation of two proteins can therefore be indicative of their functional association. We here present COLA, a proteomics based strategy coupled with a bioinformatics framework to detect protein-protein co-localisations on a global scale. COLA reveals functional interactions by matching proteins with significant similarity in their subcellular localisation signatures. The rapid nature of COLA allows mapping of interactome dynamics across different conditions or treatments with high precision.
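The idea of matching proteins by the similarity of their localisation signatures can be sketched as follows. This is an illustration only, not the COLA pipeline: a fixed Pearson correlation cutoff stands in for the significance testing the abstract refers to, and the protein names, signatures and threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length signatures (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def colocalised_pairs(signatures, threshold=0.95):
    """List protein pairs whose localisation signatures correlate at or above `threshold`."""
    pairs = []
    for a, b in combinations(sorted(signatures), 2):
        r = pearson(signatures[a], signatures[b])
        if r >= threshold:
            pairs.append((a, b, r))
    return pairs


# Hypothetical signatures: relative abundance across four subcellular fractions.
signatures = {
    "protA": [0.60, 0.25, 0.10, 0.05],
    "protB": [0.58, 0.27, 0.10, 0.05],
    "protC": [0.05, 0.10, 0.25, 0.60],
}
pairs = colocalised_pairs(signatures)
for a, b, r in pairs:
    print(a, b, round(r, 3))
```

Running the same comparison on signatures from two conditions and comparing the resulting pair lists is one simple way to look at changes between treatments.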
Affiliation(s)
- Faraz K Mardakheh
- Institute of Cancer Research, Division of Cancer Biology, 237 Fulham Road, London SW3 6JB, UK.
| | - Heba Z Sailem
- Institute of Cancer Research, Division of Cancer Biology, 237 Fulham Road, London SW3 6JB, UK; Institute of Biomedical Engineering, University of Oxford, Old Road Campus Research Building, Oxford OX3 7DQ, UK
| | - Sandra Kümper
- Institute of Cancer Research, Division of Cancer Biology, 237 Fulham Road, London SW3 6JB, UK.
| | - Christopher J Tape
- Institute of Cancer Research, Division of Cancer Biology, 237 Fulham Road, London SW3 6JB, UK; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - Ryan R McCully
- Institute of Cancer Research, Division of Cancer Biology, 237 Fulham Road, London SW3 6JB, UK.
| | - Angela Paul
- Institute of Cancer Research, Division of Cancer Biology, 237 Fulham Road, London SW3 6JB, UK.
| | - Sara Anjomani-Virmouni
- Institute of Cancer Research, Division of Cancer Biology, 237 Fulham Road, London SW3 6JB, UK.
| | - Claus Jørgensen
- Cancer Research UK Manchester Institute, University of Manchester, Wilmslow Road, Manchester M20 4BX, UK
| | - George Poulogiannis
- Institute of Cancer Research, Division of Cancer Biology, 237 Fulham Road, London SW3 6JB, UK.
| | - Christopher J Marshall
- Institute of Cancer Research, Division of Cancer Biology, 237 Fulham Road, London SW3 6JB, UK.
| | - Chris Bakal
- Institute of Cancer Research, Division of Cancer Biology, 237 Fulham Road, London SW3 6JB, UK.
|
19
|
Breckels LM, Holden SB, Wojnar D, Mulvey CM, Christoforou A, Groen A, Trotter MWB, Kohlbacher O, Lilley KS, Gatto L. Learning from Heterogeneous Data Sources: An Application in Spatial Proteomics. PLoS Comput Biol 2016; 12:e1004920. [PMID: 27175778 PMCID: PMC4866734 DOI: 10.1371/journal.pcbi.1004920] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 10/07/2015] [Accepted: 04/16/2016] [Indexed: 11/19/2022] Open
Abstract
Sub-cellular localisation of proteins is an essential post-translational regulatory mechanism that can be assayed using high-throughput mass spectrometry (MS). These MS-based spatial proteomics experiments enable us to pinpoint the sub-cellular distribution of thousands of proteins in a specific system under controlled conditions. Recent advances in high-throughput MS methods have yielded a plethora of experimental spatial proteomics data for the cell biology community. Yet, there are many third-party data sources, such as immunofluorescence microscopy or protein annotations and sequences, which represent a rich and vast source of complementary information. We present a unique transfer learning classification framework that utilises a nearest-neighbour or support vector machine system, to integrate heterogeneous data sources to considerably improve on the quantity and quality of sub-cellular protein assignment. We demonstrate the utility of our algorithms through evaluation of five experimental datasets, from four different species in conjunction with four different auxiliary data sources to classify proteins to tens of sub-cellular compartments with high generalisation accuracy. We further apply the method to an experiment on pluripotent mouse embryonic stem cells to classify a set of previously unknown proteins, and validate our findings against a recent high resolution map of the mouse stem cell proteome. The methodology is distributed as part of the open-source Bioconductor pRoloc suite for spatial proteomics data analysis.
Sub-cellular localisation of proteins is critical to their function in all cellular processes; proteins localising to their intended micro-environment, e.g. organelles, vesicles or macro-molecular complexes, will meet the interaction partners and biochemical conditions suitable to pursue their molecular function. Therefore, sound data and methods to reliably and systematically study protein localisation, and hence their mis-localisation and the disruption of protein trafficking, that are relied upon by the cell biology community, are essential. Here we present a method to infer protein localisation relying on the optimal integration of experimental mass spectrometry-based data and auxiliary sources, such as GO annotation, outputs from third-party software, protein-protein interactions or immunocytochemistry data. We found that the application of transfer learning algorithms across these diverse data sources considerably improves on the quantity and reliability of sub-cellular protein assignment, compared to single data classifiers previously applied to infer sub-cellular localisation using experimental data only. We show how our method does not compromise biologically relevant experimental-specific signal after integration with heterogeneous freely available third-party resources. The integration of different data sources is an important challenge in the data intensive world of biology and we anticipate the transfer learning methods presented here will prove useful to many areas of biology, to unify data obtained from different but complementary sources.
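As a rough illustration of combining a primary (experimental) and an auxiliary data source in a nearest-neighbour classifier, the sketch below mixes the neighbour votes from each source with a single weight `theta`. This is a simplified assumption made for illustration, not the published algorithm or the pRoloc API: the single global weight, the function names and the toy data are all hypothetical.

```python
from collections import Counter
from math import dist


def knn_votes(query, labelled, k):
    """Count class labels among the k labelled points nearest to `query`."""
    nearest = sorted(labelled, key=lambda item: dist(query, item[0]))[:k]
    return Counter(label for _, label in nearest)


def combined_knn(query_primary, query_aux, primary, auxiliary, theta, k=3):
    """Weight primary votes by theta and auxiliary votes by (1 - theta); return the top class."""
    vp = knn_votes(query_primary, primary, k)
    va = knn_votes(query_aux, auxiliary, k)
    classes = set(vp) | set(va)
    scores = {c: theta * vp[c] + (1 - theta) * va[c] for c in classes}
    return max(scores, key=scores.get), scores


# Hypothetical labelled markers: primary = fractionation profiles, auxiliary = annotation vectors.
primary = [((0.9, 0.1), "ER"), ((0.8, 0.2), "ER"),
           ((0.1, 0.9), "mito"), ((0.2, 0.8), "mito")]
auxiliary = [((1.0, 0.0), "ER"), ((1.0, 0.0), "ER"),
             ((0.0, 1.0), "mito"), ((0.0, 1.0), "mito")]

# The query looks ER-like in the primary data but mito-like in the auxiliary data.
label_trust_primary, _ = combined_knn((0.6, 0.4), (0.0, 1.0), primary, auxiliary, theta=0.8)
label_trust_aux, _ = combined_knn((0.6, 0.4), (0.0, 1.0), primary, auxiliary, theta=0.2)
print(label_trust_primary, label_trust_aux)
```

The weight controls how far the auxiliary source can override the experimental signal; in a real analysis it would be chosen by cross-validation on the labelled markers rather than fixed by hand.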
Affiliation(s)
- Lisa M. Breckels
- Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
| | - Sean B. Holden
- Computer Laboratory, University of Cambridge, Cambridge, United Kingdom
| | - David Wojnar
- Quantitative Biology Center, Universität Tübingen, Tübingen, Germany
| | - Claire M. Mulvey
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
| | - Andy Christoforou
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
| | - Arnoud Groen
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
| | | | - Oliver Kohlbacher
- Quantitative Biology Center, Universität Tübingen, Tübingen, Germany
- Center for Bioinformatics, Universität Tübingen, Tübingen, Germany
- Biomolecular Interactions, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Kathryn S. Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
| | - Laurent Gatto
- Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
|
20
|
Christoforou A, Mulvey CM, Breckels LM, Geladaki A, Hurrell T, Hayward PC, Naake T, Gatto L, Viner R, Martinez Arias A, Lilley KS. A draft map of the mouse pluripotent stem cell spatial proteome. Nat Commun 2016; 7:8992. [PMID: 26754106 PMCID: PMC4729960 DOI: 10.1038/ncomms9992] [Citation(s) in RCA: 151] [Impact Index Per Article: 18.9] [Received: 02/11/2015] [Accepted: 10/22/2015] [Indexed: 12/18/2022] Open
Abstract
Knowledge of the subcellular distribution of proteins is vital for understanding cellular mechanisms. Capturing the subcellular proteome in a single experiment has proven challenging, with studies focusing on specific compartments or assigning proteins to subcellular niches with low resolution and/or accuracy. Here we introduce hyperLOPIT, a method that couples extensive fractionation, quantitative high-resolution accurate mass spectrometry with multivariate data analysis. We apply hyperLOPIT to a pluripotent stem cell population whose subcellular proteome has not been extensively studied. We provide localization data on over 5,000 proteins with unprecedented spatial resolution to reveal the organization of organelles, sub-organellar compartments, protein complexes, functional networks and steady-state dynamics of proteins and unexpected subcellular locations. The method paves the way for characterizing the impact of post-transcriptional and post-translational modification on protein location and studies involving proteome-level locational changes on cellular perturbation. An interactive open-source resource is presented that enables exploration of these data.
Affiliation(s)
- Andy Christoforou
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK; Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
| | - Claire M Mulvey
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK; Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
| | - Lisa M Breckels
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Aikaterini Geladaki
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK.,Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
| | - Tracey Hurrell
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK.,Department of Pharmacology, University of Pretoria, Arcadia 0007, Republic of South Africa
| | - Penelope C Hayward
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
| | - Thomas Naake
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Laurent Gatto
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Rosa Viner
- Thermo Fisher Scientific, 355 River Oaks Pkwy, San Jose, California 95314, USA
| | | | - Kathryn S Lilley
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
| |
Collapse
|
21
|
Gatto L, Breckels LM, Naake T, Gibb S. Visualization of proteomics data using R and bioconductor. Proteomics 2016; 15:1375-89. [PMID: 25690415 PMCID: PMC4510819 DOI: 10.1002/pmic.201400392] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 08/25/2014] [Revised: 02/05/2015] [Accepted: 02/09/2015] [Indexed: 12/30/2022]
Abstract
Data visualization plays a key role in high-throughput biology. It is an essential tool for data exploration, shedding light on data structure and patterns of interest. Visualization is also of paramount importance as a form of communicating data to a broad audience. Here, we provide a short overview of the application of the R software to the visualization of proteomics data. We present a summary of R's plotting systems and how they are used to visualize and understand raw and processed MS-based proteomics data.
Affiliation(s)
- Laurent Gatto
- Department of Biochemistry, Cambridge Centre for Proteomics, University of Cambridge, Cambridge, UK; Department of Biochemistry, Computational Proteomics Unit, University of Cambridge, Cambridge, UK
|
22
|
de Araújo MEG, Lamberti G, Huber LA. Purification of Early and Late Endosomes. Cold Spring Harb Protoc 2015; 2015:pdb.top074443. [PMID: 26631131 DOI: 10.1101/pdb.top074443] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]
Abstract
Proteomic analysis of early and late endosomes has been constrained by the limited purity of the endosomal fractions that can be achieved by biochemical methods. Here we briefly review endocytic pathways, and then introduce fractionation strategies that have been used to improve the purity of isolated endosomes. In addition, we describe innovative proteomics analysis methods that have been shown to partially circumvent the limitations found in the enrichment steps.
Affiliation(s)
- Mariana E G de Araújo
- Biocenter, Division of Cell Biology, Innsbruck Medical University, A-6020 Innsbruck, Austria
- Giorgia Lamberti
- Biocenter, Division of Cell Biology, Innsbruck Medical University, A-6020 Innsbruck, Austria
- Lukas A Huber
- Biocenter, Division of Cell Biology, Innsbruck Medical University, A-6020 Innsbruck, Austria
|
23
|
Sandin M, Antberg L, Levander F, James P. A Breast Cell Atlas: Organelle analysis of the MDA-MB-231 cell line by density-gradient fractionation using isotopic marking and label-free analysis. EuPA Open Proteomics 2015. [DOI: 10.1016/j.euprot.2015.06.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
|
24
|
Munawar N, Olivero G, Jerman E, Doyle B, Streubel G, Wynne K, Bracken A, Cagney G. Native gel analysis of macromolecular protein complexes in cultured mammalian cells. Proteomics 2015. [PMID: 26223664 DOI: 10.1002/pmic.201500045] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 01/01/2023]
Abstract
Native gel electrophoresis enables separation of cellular proteins in their non-denatured state. In experiments aimed at analysing proteins in higher order or multimeric assemblies (i.e. protein complexes) it offers some advantages over rival approaches, particularly as an interface technology with mass spectrometry. Here we separated fractions from HEK293 cells by native electrophoresis in order to survey protein complexes in the cytoplasmic, nuclear and chromatin environments, finding 689 proteins distributed among 217 previously described complexes. As expected, different fractions contained distinct combinations of macromolecular complexes, with subunits of the same complex tending to co-migrate. Exceptions to this observation could often be explained by the presence of subunits shared among different complexes. We investigated one identified complex, the Polycomb Repressor Complex 2 (PRC2), in more detail following affinity purification of the EZH2 subunit. This approach resulted in the identification of all previously reported members of PRC2. Overall, this work demonstrates that the use of native gel electrophoresis as an upstream separating step is an effective approach for analysis of the components and cellular distribution of protein complexes.
Affiliation(s)
- Nayla Munawar
- School of Biomolecular and Biomedical Research, University College Dublin, Belfield, Ireland
- Giorgio Olivero
- School of Biomolecular and Biomedical Research, University College Dublin, Belfield, Ireland
- Emilia Jerman
- Smurfit Institute of Genetics, Trinity College, Dublin, Ireland
- Benjamin Doyle
- School of Biomolecular and Biomedical Research, University College Dublin, Belfield, Ireland
- Kieran Wynne
- School of Biomolecular and Biomedical Research, University College Dublin, Belfield, Ireland
- Adrian Bracken
- Smurfit Institute of Genetics, Trinity College, Dublin, Ireland
- Gerard Cagney
- School of Biomolecular and Biomedical Research, University College Dublin, Belfield, Ireland
|
26
|
Scott NE, Brown LM, Kristensen AR, Foster LJ. Development of a computational framework for the analysis of protein correlation profiling and spatial proteomics experiments. J Proteomics 2014; 118:112-29. [PMID: 25464368 DOI: 10.1016/j.jprot.2014.10.024] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 07/02/2014] [Revised: 10/18/2014] [Accepted: 10/27/2014] [Indexed: 01/12/2023]
Abstract
Standard approaches to studying an interactome do not easily allow conditional experiments but in recent years numerous groups have demonstrated the potential for co-fractionation/co-migration based approaches to assess an interactome at a similar sensitivity and specificity yet significantly lower cost and higher speed than traditional approaches. Unfortunately, there is as yet no implementation of the bioinformatics tools required to robustly analyze co-fractionation data in a way that can also integrate the valuable information contained in biological replicates. Here we have developed a freely available, integrated bioinformatics solution for the analysis of protein correlation profiling SILAC data. This modular solution allows the deconvolution of protein chromatograms into individual Gaussian curves enabling the use of these chromatography features to align replicates and assemble a consensus map of features observed across replicates; the chromatograms and individual curves are then used to quantify changes in protein interactions and construct the interactome. We have applied this workflow to the analysis of HeLa cells infected with a Salmonella enterica serovar Typhimurium infection model where we can identify specific interactions that are affected by the infection. These bioinformatics tools simplify the analysis of co-fractionation/co-migration data to the point where there is no specialized knowledge required to measure an interactome in this way.
BIOLOGICAL SIGNIFICANCE: We describe a set of software tools for the bioinformatics analysis of co-migration/co-fractionation data that integrates the results from multiple replicates to generate an interactome, including the impact on individual interactions of any external perturbation. This article is part of a Special Issue entitled: Protein dynamics in health and disease. Guest Editors: Pierre Thibault and Anne-Claude Gingras.
Affiliation(s)
- Nichollas E Scott
- Centre for High-throughput Biology, University of British Columbia, Vancouver V6T 1Z4, British Columbia, Canada
- Lyda M Brown
- Centre for High-throughput Biology, University of British Columbia, Vancouver V6T 1Z4, British Columbia, Canada
- Anders R Kristensen
- Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver V5Z 4S6, British Columbia, Canada
- Leonard J Foster
- Centre for High-throughput Biology, University of British Columbia, Vancouver V6T 1Z4, British Columbia, Canada
|
27
|
Rauniyar N, Yates JR. Isobaric labeling-based relative quantification in shotgun proteomics. J Proteome Res 2014; 13:5293-309. [PMID: 25337643 PMCID: PMC4261935 DOI: 10.1021/pr500880b] [Citation(s) in RCA: 421] [Impact Index Per Article: 42.1] [Indexed: 02/06/2023]
Abstract
Mass spectrometry plays a key role in relative quantitative comparisons of proteins in order to understand their functional role in biological systems upon perturbation. In this review, we survey studies that examine different aspects of isobaric labeling-based relative quantification for shotgun proteomic analysis. In particular, we focus on different types of isobaric reagents and their reaction chemistry (e.g., amine-, carbonyl-, and sulfhydryl-reactive). Various factors, such as ratio compression, reporter ion dynamic range, and others, cause an underestimation of changes in relative abundance of proteins across samples, undermining the ability of the isobaric labeling approach to be truly quantitative. These factors that affect quantification and the suggested combinations of experimental design and optimal data acquisition methods to increase the precision and accuracy of the measurements will be discussed. Finally, the extended application of the isobaric labeling-based approach in hyperplexing strategies, targeted quantification, and phosphopeptide analysis is also examined.
Affiliation(s)
- Navin Rauniyar
- Department of Chemical Physiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States
|
28
|
Gatto L, Breckels LM, Burger T, Nightingale DJH, Groen AJ, Campbell C, Nikolovski N, Mulvey CM, Christoforou A, Ferro M, Lilley KS. A foundation for reliable spatial proteomics data analysis. Mol Cell Proteomics 2014; 13:1937-52. [PMID: 24846987 DOI: 10.1074/mcp.m113.036350] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Indexed: 01/01/2023] Open
Abstract
Quantitative mass-spectrometry-based spatial proteomics involves elaborate, expensive, and time-consuming experimental procedures, and considerable effort is invested in the generation of such data. Multiple research groups have described a variety of approaches for establishing high-quality proteome-wide datasets. However, data analysis is as critical as data production for reliable and insightful biological interpretation, and no consistent and robust solutions have been offered to the community so far. Here, we introduce the requirements for rigorous spatial proteomics data analysis, as well as the statistical machine learning methodologies needed to address them, including supervised and semi-supervised machine learning, clustering, and novelty detection. We present freely available software solutions that implement innovative state-of-the-art analysis pipelines and illustrate the use of these tools through several case studies involving multiple organisms, experimental designs, mass spectrometry platforms, and quantitation techniques. We also propose sound analysis strategies for identifying dynamic changes in subcellular localization by comparing and contrasting data describing different biological conditions. We conclude by discussing future needs and developments in spatial proteomics data analysis.
Affiliation(s)
- Laurent Gatto
- Cambridge Centre for Proteomics, Department of Biochemistry, Tennis Court Road, University of Cambridge, Cambridge, CB2 1QR, United Kingdom; Computational Proteomics Unit, Department of Biochemistry, Tennis Court Road, University of Cambridge, Cambridge, CB2 1QR, United Kingdom
- Lisa M Breckels
- Cambridge Centre for Proteomics, Department of Biochemistry, Tennis Court Road, University of Cambridge, Cambridge, CB2 1QR, United Kingdom; Computational Proteomics Unit, Department of Biochemistry, Tennis Court Road, University of Cambridge, Cambridge, CB2 1QR, United Kingdom
- Thomas Burger
- Université Grenoble-Alpes, CEA (iRSTV/BGE), INSERM (U1038), CNRS (FR3425), F-38054 Grenoble, France
- Daniel J H Nightingale
- Cambridge Centre for Proteomics, Department of Biochemistry, Tennis Court Road, University of Cambridge, Cambridge, CB2 1QR, United Kingdom
- Arnoud J Groen
- Cambridge Centre for Proteomics, Department of Biochemistry, Tennis Court Road, University of Cambridge, Cambridge, CB2 1QR, United Kingdom
- Callum Campbell
- Cambridge Centre for Proteomics, Department of Biochemistry, Tennis Court Road, University of Cambridge, Cambridge, CB2 1QR, United Kingdom
- Nino Nikolovski
- Cambridge Centre for Proteomics, Department of Biochemistry, Tennis Court Road, University of Cambridge, Cambridge, CB2 1QR, United Kingdom
- Claire M Mulvey
- Cambridge Centre for Proteomics, Department of Biochemistry, Tennis Court Road, University of Cambridge, Cambridge, CB2 1QR, United Kingdom
- Andy Christoforou
- Cambridge Centre for Proteomics, Department of Biochemistry, Tennis Court Road, University of Cambridge, Cambridge, CB2 1QR, United Kingdom
- Myriam Ferro
- Université Grenoble-Alpes, CEA (iRSTV/BGE), INSERM (U1038), CNRS (FR3425), F-38054 Grenoble, France
- Kathryn S Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, Tennis Court Road, University of Cambridge, Cambridge, CB2 1QR, United Kingdom
|
29
|
Groen AJ, Sancho-Andrés G, Breckels LM, Gatto L, Aniento F, Lilley KS. Identification of trans-golgi network proteins in Arabidopsis thaliana root tissue. J Proteome Res 2014; 13:763-76. [PMID: 24344820 PMCID: PMC3929368 DOI: 10.1021/pr4008464] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Indexed: 12/03/2022]
Abstract
Knowledge of protein subcellular localization assists in the elucidation of protein function and understanding of different biological mechanisms that occur at discrete subcellular niches. Organelle-centric proteomics enables localization of thousands of proteins simultaneously. Although such techniques have successfully allowed organelle protein catalogues to be achieved, they rely on the purification or significant enrichment of the organelle of interest, which is not achievable for many organelles. Incomplete separation of organelles leads to false discoveries, with erroneous assignments. Proteomics methods that measure the distribution patterns of specific organelle markers along density gradients are able to assign proteins of unknown localization based on comigration with known organelle markers, without the need for organelle purification. These methods are greatly enhanced when coupled to sophisticated computational tools. Here we apply and compare multiple approaches to establish a high-confidence data set of Arabidopsis root tissue trans-Golgi network (TGN) proteins. The method employed involves immunoisolations of the TGN, coupled to probability-based organelle proteomics techniques. Specifically, the technique known as LOPIT (localization of organelle protein by isotope tagging), couples density centrifugation with quantitative mass-spectrometry-based proteomics using isobaric labeling and targeted methods with semisupervised machine learning methods. We demonstrate that while the immunoisolation method gives rise to a significant data set, the approach is unable to distinguish cargo proteins and persistent contaminants from full-time residents of the TGN. The LOPIT approach, however, returns information about many subcellular niches simultaneously and the steady-state location of proteins. Importantly, therefore, it is able to dissect proteins present in more than one organelle and cargo proteins en route to other cellular destinations from proteins whose steady-state location favors the TGN. Using this approach, we present a robust list of Arabidopsis TGN proteins.
Affiliation(s)
- Arnoud J Groen
- Cambridge Centre for Proteomics, Cambridge Systems Biology Centre, Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, United Kingdom
|
30
|
Gatto L, Breckels LM, Wieczorek S, Burger T, Lilley KS. Mass-spectrometry-based spatial proteomics data analysis using pRoloc and pRolocdata. Bioinformatics 2014; 30:1322-4. [PMID: 24413670 PMCID: PMC3998135 DOI: 10.1093/bioinformatics/btu013] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Indexed: 12/22/2022] Open
Abstract
MOTIVATION: Experimental spatial proteomics, i.e. the high-throughput assignment of proteins to sub-cellular compartments based on quantitative proteomics data, promises to shed new light on many biological processes given adequate computational tools.
RESULTS: Here we present pRoloc, a complete infrastructure to support and guide the sound analysis of quantitative mass-spectrometry-based spatial proteomics data. It provides functionality for unsupervised and supervised machine learning for data exploration and protein classification and novelty detection to identify new putative sub-cellular clusters. The software builds upon existing infrastructure for data management and data processing.
Affiliation(s)
- Laurent Gatto
- Computational Proteomics Unit and Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, CB2 1QR, Cambridge, UK and Université Grenoble-Alpes, CEA (iRSTV/BGE), INSERM (U1038), CNRS (FR3425), 38054 Grenoble, France
|
31
|
Christoforou A, Arias AM, Lilley KS. Determining protein subcellular localization in mammalian cell culture with biochemical fractionation and iTRAQ 8-plex quantification. Methods Mol Biol 2014; 1156:157-174. [PMID: 24791987 DOI: 10.1007/978-1-4939-0685-7_10] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 06/03/2023]
Abstract
Protein subcellular localization is a fundamental feature of posttranslational functional regulation. Traditional microscopy based approaches to study protein localization are typically of limited throughput, and dependent on the availability of antibodies with high specificity and sensitivity, or fluorescent fusion proteins. In this chapter we describe how Localization of Organelle Proteins by Isotope Tagging (LOPIT), a mass spectrometry based workflow coupling biochemical fractionation and iTRAQ™ 8-plex quantification, can be applied for the high-throughput characterization of protein localization in a mammalian cell culture line.
Affiliation(s)
- Andy Christoforou
- Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK
|
32
|
Drissi R, Dubois ML, Boisvert FM. Proteomics methods for subcellular proteome analysis. FEBS J 2013; 280:5626-34. [PMID: 24034475 DOI: 10.1111/febs.12502] [Citation(s) in RCA: 70] [Impact Index Per Article: 6.4] [Received: 03/19/2013] [Revised: 08/14/2013] [Accepted: 08/22/2013] [Indexed: 01/29/2023]
Abstract
The elucidation of the subcellular distribution of proteins under different conditions is a major challenge in cell biology. This challenge is further complicated by the multicompartmental and dynamic nature of protein localization. To address this issue, quantitative proteomics workflows have been developed to reliably identify the protein complement of whole organelles, as well as for protein assignment to subcellular location and relative protein quantification based on different cell culture conditions. Here, we review quantitative MS-based approaches that combine cellular fractionation with proteomic analysis. The application of these methods to the characterization of organellar composition and to the determination of the dynamic nature of protein complexes is improving our understanding of protein functions and dynamics.
Affiliation(s)
- Romain Drissi
- Department of Anatomy and Cell Biology, Université de Sherbrooke, Québec, Canada
|
33
|
Breckels LM, Gatto L, Christoforou A, Groen AJ, Lilley KS, Trotter MWB. The effect of organelle discovery upon sub-cellular protein localisation. J Proteomics 2013; 88:129-40. [PMID: 23523639 DOI: 10.1016/j.jprot.2013.02.019] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 11/19/2012] [Revised: 02/16/2013] [Accepted: 02/21/2013] [Indexed: 12/31/2022]
Abstract
Prediction of protein sub-cellular localisation by employing quantitative mass spectrometry experiments is an expanding field. Several methods have led to the assignment of proteins to specific subcellular localisations by partial separation of organelles across a fractionation scheme coupled with computational analysis. Methods developed to analyse organelle data have largely employed supervised machine learning algorithms to map unannotated abundance profiles to known protein-organelle associations. Such approaches are likely to make association errors if organelle-related groupings present in experimental output are not included in data used to create a protein-organelle classifier. Currently, there is no automated way to detect organelle-specific clusters within such datasets. In order to address the above issues we adapted a phenotype discovery algorithm, originally created to filter image-based output for RNAi screens, to identify putative subcellular groupings in organelle proteomics experiments. We were able to mine datasets to a deeper level and extract interesting phenotype clusters for more comprehensive evaluation in an unbiased fashion upon application of this approach. Organelle-related protein clusters were identified beyond those sufficiently annotated for use as training data. Furthermore, we propose avenues for the incorporation of observations made into general practice for the classification of protein-organelle membership from quantitative MS experiments.
BIOLOGICAL SIGNIFICANCE: Protein sub-cellular localisation plays an important role in molecular interactions, signalling and transport mechanisms. The prediction of protein localisation by quantitative mass-spectrometry (MS) proteomics is a growing field and an important endeavour in improving protein annotation. Several such approaches use gradient-based separation of cellular organelle content to measure relative protein abundance across distinct gradient fractions. The distribution profiles are commonly mapped in silico to known protein-organelle associations via supervised machine learning algorithms, to create classifiers that associate unannotated proteins to specific organelles. These strategies are prone to error, however, if organelle-related groupings present in experimental output are not represented, for example owing to the lack of existing annotation, when creating the protein-organelle mapping. Here, the application of a phenotype discovery approach to LOPIT gradient-based MS data identifies candidate organelle phenotypes for further evaluation in an unbiased fashion. Software implementation and usage guidelines are provided for application to wider protein-organelle association experiments. In the wider context, semi-supervised organelle discovery is discussed as a paradigm with which to generate new protein annotations from MS-based organelle proteomics experiments.
Affiliation(s)
- L M Breckels
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, CB2 1QR, UK
|
34
|
Martin SF, Falkenberg H, Dyrlund TF, Khoudoli GA, Mageean CJ, Linding R. PROTEINCHALLENGE: crowd sourcing in proteomics analysis and software development. J Proteomics 2012; 88:41-6. [PMID: 23220569 DOI: 10.1016/j.jprot.2012.11.014] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 10/10/2012] [Revised: 11/08/2012] [Accepted: 11/13/2012] [Indexed: 10/27/2022]
Abstract
In large-scale proteomics studies there is a temptation, after months of experimental work, to plug resulting data into a convenient, if poorly implemented, set of tools, which may neither do the data justice nor help answer the scientific question. In this paper we have captured key concerns, including arguments for community-wide open source software development and "big data" compatible solutions for the future. For the meantime, we have laid out ten top tips for data processing. With these at hand, a first large-scale proteomics analysis hopefully becomes less daunting to navigate. However, there is clearly a real need for robust tools, standard operating procedures and general acceptance of best practises. Thus we submit to the proteomics community a call for a community-wide open set of proteomics analysis challenges (PROTEINCHALLENGE) that directly target and compare data analysis workflows, with the aim of setting a community-driven gold standard for data handling, reporting and sharing.
Affiliation(s)
- Sarah F Martin
- Kinetic Parameter Facility, Centre for Synthetic and Systems Biology-SynthSys, University of Edinburgh, UK
|
35
|
Nikolovski N, Rubtsov D, Segura MP, Miles GP, Stevens TJ, Dunkley TP, Munro S, Lilley KS, Dupree P. Putative glycosyltransferases and other plant Golgi apparatus proteins are revealed by LOPIT proteomics. Plant Physiology 2012; 160:1037-51. [PMID: 22923678 PMCID: PMC3461528 DOI: 10.1104/pp.112.204263] [Citation(s) in RCA: 124] [Impact Index Per Article: 10.3] [Received: 07/27/2012] [Accepted: 08/22/2012] [Indexed: 05/18/2023]
Abstract
The Golgi apparatus is the central organelle in the secretory pathway and plays key roles in glycosylation, protein sorting, and secretion in plants. Enzymes involved in the biosynthesis of complex polysaccharides, glycoproteins, and glycolipids are located in this organelle, but the majority of them remain uncharacterized. Here, we studied the Arabidopsis (Arabidopsis thaliana) membrane proteome with a focus on the Golgi apparatus using localization of organelle proteins by isotope tagging. By applying multivariate data analysis to a combined data set of two new and two previously published localization of organelle proteins by isotope tagging experiments, we identified the subcellular localization of 1,110 proteins with high confidence. These include 197 Golgi apparatus proteins, 79 of which have not been localized previously by a high-confidence method, as well as the localization of 304 endoplasmic reticulum and 208 plasma membrane proteins. Comparison of the hydrophobic domains of the localized proteins showed that the single-span transmembrane domains have unique properties in each organelle. Many of the novel Golgi-localized proteins belong to uncharacterized protein families. Structure-based homology analysis identified 12 putative Golgi glycosyltransferase (GT) families that have no functionally characterized members and, therefore, are not yet assigned to a Carbohydrate-Active Enzymes database GT family. The substantial numbers of these putative GTs lead us to estimate that the true number of plant Golgi GTs might be one-third above those currently annotated. Other newly identified proteins are likely to be involved in the transport and interconversion of nucleotide sugar substrates as well as polysaccharide and protein modification.
|
36
|
Isobaric tagging approaches in quantitative proteomics: the ups and downs. Anal Bioanal Chem 2012; 404:1029-37. [DOI: 10.1007/s00216-012-6012-9] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 02/06/2012] [Revised: 03/24/2012] [Accepted: 04/02/2012] [Indexed: 12/27/2022]
|
37
|
Pilkington NCV, Trotter MWB, Holden SB. Multiple Kernel Learning for Drug Discovery. Mol Inform 2012; 31:313-22. [PMID: 27477100 DOI: 10.1002/minf.201100146] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/01/2011] [Accepted: 03/12/2012] [Indexed: 01/04/2023]
Abstract
The support vector machine (SVM) methodology has become a popular and well-used component of present chemometric analysis. We assess a relatively recent development of the algorithm, multiple kernel learning (MKL), on published structure-property relationship (SPR) data. The MKL algorithm learns a weighting across multiple kernel-based representations of the data during supervised classifier creation and, thereby, may be used to describe the influence of distinct groups of structural descriptors upon a single structure-property classifier without explicitly omitting any of them. We observe a statistically significant performance improvement over a conventional, single kernel SVM on all three SPR data sets analysed. Furthermore, MKL output is observed to provide useful information regarding the relative influence of five distinct descriptor subsets present in each data set.
Affiliation(s)
- Nicholas C V Pilkington
- University of Cambridge Computer Laboratory, 15 JJ Thomson Avenue, Cambridge, CB3 0FD, UK; phone: +44 (0)1223 763725
- Matthew W B Trotter
- Anne McLaren Laboratory for Regenerative Medicine & Department of Surgery, University of Cambridge, UK; Celgene Institute for Translational Research Europe (CITRE), Sevilla, Spain
- Sean B Holden
- University of Cambridge Computer Laboratory, 15 JJ Thomson Avenue, Cambridge, CB3 0FD, UK; phone: +44 (0)1223 763725
|
38
|
Satori CP, Kostal V, Arriaga EA. Individual organelle pH determinations of magnetically enriched endocytic organelles via laser-induced fluorescence detection. Anal Chem 2011; 83:7331-9. [PMID: 21863795 PMCID: PMC3184341 DOI: 10.1021/ac201196n] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
The analysis of biotransformations that occur in lysosomes and other endocytic organelles is critical to studies on intracellular degradation, nutrient recycling, and lysosomal storage disorders. Such analyses require bioactive organelle preparations that are devoid of other contaminating organelles. Commonly used differential centrifugation techniques produce impure fractions and may not be compatible with microscale separation platforms. Density gradient centrifugation procedures reduce the level of impurities but may compromise bioactivity. Here we report on a simple magnetic setup and a procedure that produce highly enriched bioactive organelles based on their magnetic capture as they travel through open tubes. Following capture, in-line laser-induced fluorescence (LIF) detection determined, for the first time, the pH of each magnetically retained individual endocytic organelle. Unlike bulk measurements, this method was suitable to describe the distributions of pH values in endocytic organelles from L6 rat myoblasts treated with dextran-coated iron oxide nanoparticles (for magnetic retention) and fluorescein/TMRM-conjugated dextran (for pH measurements by LIF). Their individual pH values ranged from 4 to 6, which is typical of bioactive endocytic organelles. These analytical procedures are highly relevant to evaluating lysosomal-related degradation pathways in aging, storage disorders, and drug development.
Affiliation(s)
- Chad P. Satori
- University of Minnesota; Department of Chemistry, 207 Pleasant St. SE; Minneapolis MN 55455-0431
- Edgar A. Arriaga
- University of Minnesota; Department of Chemistry, 207 Pleasant St. SE; Minneapolis MN 55455-0431
|
39
|
Ryngajllo M, Childs L, Lohse M, Giorgi FM, Lude A, Selbig J, Usadel B. SLocX: Predicting Subcellular Localization of Arabidopsis Proteins Leveraging Gene Expression Data. Front Plant Sci 2011; 2:43. [PMID: 22639594 PMCID: PMC3355584 DOI: 10.3389/fpls.2011.00043] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 05/18/2011] [Accepted: 08/12/2011] [Indexed: 05/08/2023]
Abstract
Despite the growing volume of experimentally validated knowledge about the subcellular localization of plant proteins, a well-performing in silico prediction tool is still a necessity. Existing tools, which employ information derived from protein sequence alone, offer limited accuracy and/or rely on full sequence availability. We explored whether gene expression profiling data can be harnessed to enhance prediction performance. To achieve this, we trained several support vector machines to predict the subcellular localization of Arabidopsis thaliana proteins using sequence-derived information, expression behavior, or a combination of these data and compared their predictive performance through a cross-validation test. We show that gene expression carries information about the subcellular localization not available in sequence information, yielding dramatic benefits for plastid localization prediction, and some notable improvements for other compartments such as the mitochondrion, the Golgi, and the plasma membrane. Based on these results, we constructed a novel subcellular localization prediction engine, SLocX, combining gene expression profiling data with protein sequence-based information. We then validated the results of this engine using an independent test set of annotated proteins and a transient expression of GFP fusion proteins. Here, we present the prediction framework and a website of predicted localizations for Arabidopsis. The relatively good accuracy of our prediction engine, even in cases where only partial protein sequence is available (e.g., in sequences lacking the N-terminal region), offers a promising opportunity for similar application to non-sequenced or poorly annotated plant species. Although the prediction scope of our method is currently limited by the availability of expression information on the ATH1 array, we believe that advances in gene expression measurement technology will make our method applicable to all Arabidopsis proteins.
Affiliation(s)
- Liam Childs
- Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- Marc Lohse
- Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- Anja Lude
- Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- Joachim Selbig
- Department of Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
- Björn Usadel
- Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany
- *Correspondence: Björn Usadel, Max Planck Institute of Molecular Plant Physiology, Am Muehlenberg 1, Golm, 14476 Potsdam, Germany. e-mail:
|