1
Cox RM, Papoulas O, Shril S, Lee C, Gardner T, Battenhouse AM, Lee M, Drew K, McWhite CD, Yang D, Leggere JC, Durand D, Hildebrandt F, Wallingford JB, Marcotte EM. Ancient eukaryotic protein interactions illuminate modern genetic traits and disorders. bioRxiv 2024:2024.05.26.595818. [PMID: 38853926 PMCID: PMC11160598 DOI: 10.1101/2024.05.26.595818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
All eukaryotes share a common ancestor from roughly 1.5 - 1.8 billion years ago, a single-celled, swimming microbe known as LECA, the Last Eukaryotic Common Ancestor. Nearly half of the genes in modern eukaryotes were present in LECA, and many current genetic diseases and traits stem from these ancient molecular systems. To better understand these systems, we compared genes across modern organisms and identified a core set of 10,092 shared protein-coding gene families likely present in LECA, a quarter of which are uncharacterized. We then integrated >26,000 mass spectrometry proteomics analyses from 31 species to infer how these proteins interact in higher-order complexes. The resulting interactome describes the biochemical organization of LECA, revealing both known and new assemblies. We analyzed these ancient protein interactions to find new human gene-disease relationships for bone density and congenital birth defects, demonstrating the value of ancestral protein interactions for guiding functional genetics today.
Affiliation(s)
- Rachael M Cox
  - Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
- Ophelia Papoulas
  - Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
- Shirlee Shril
  - Division of Nephrology, Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA 02215, USA
- Chanjae Lee
  - Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
- Tynan Gardner
  - Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
- Anna M Battenhouse
  - Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
- Muyoung Lee
  - Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
- Kevin Drew
  - Department of Biological Sciences, University of Illinois at Chicago, Chicago, IL 60607, USA
- Claire D McWhite
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- David Yang
  - Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
- Janelle C Leggere
  - Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
- Dannie Durand
  - Department of Biological Sciences, Carnegie Mellon University, 4400 5th Avenue Pittsburgh, PA 15213, USA
- Friedhelm Hildebrandt
  - Division of Nephrology, Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA 02215, USA
- John B Wallingford
  - Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
- Edward M Marcotte
  - Department of Molecular Biosciences, The University of Texas at Austin, Austin, TX 78712, USA
2
Leggere JC, Hibbard JV, Papoulas O, Lee C, Pearson CG, Marcotte EM, Wallingford JB. Label-free proteomic comparison reveals ciliary and nonciliary phenotypes of IFT-A mutants. Mol Biol Cell 2024; 35:ar39. [PMID: 38170584 PMCID: PMC10916875 DOI: 10.1091/mbc.e23-03-0084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Revised: 12/11/2023] [Accepted: 12/18/2023] [Indexed: 01/05/2024] Open
Abstract
DIFFRAC is a powerful method for systematically comparing proteome content and organization between samples in a high-throughput manner. By subjecting control and experimental protein extracts to native chromatography and quantifying the contents of each fraction using mass spectrometry, it enables the quantitative detection of alterations to protein complexes and abundances. Here, we applied DIFFRAC to investigate the consequences of genetic loss of Ift122, a subunit of the intraflagellar transport-A (IFT-A) protein complex that plays a vital role in the formation and function of cilia and flagella, on the proteome of Tetrahymena thermophila. A single DIFFRAC experiment was sufficient to detect changes in protein behavior that mirrored known effects of IFT-A loss and revealed new biology. We uncovered several novel IFT-A-regulated proteins, which we validated through live imaging in Xenopus multiciliated cells, shedding new light on both the ciliary and non-ciliary functions of IFT-A. Our findings underscore the robustness of DIFFRAC for revealing proteomic changes in response to genetic or biochemical perturbation.
Affiliation(s)
- Janelle C. Leggere
  - Department of Molecular Biosciences, University of Texas at Austin, TX 78712
- Jaime V.K. Hibbard
  - Department of Molecular Biosciences, University of Texas at Austin, TX 78712
- Ophelia Papoulas
  - Department of Molecular Biosciences, University of Texas at Austin, TX 78712
- Chanjae Lee
  - Department of Molecular Biosciences, University of Texas at Austin, TX 78712
- Chad G. Pearson
  - Anschutz Medical Campus, Department of Cell and Developmental Biology, University of Colorado, Aurora, CO 80045
- Edward M. Marcotte
  - Department of Molecular Biosciences, University of Texas at Austin, TX 78712
- John B. Wallingford
  - Department of Molecular Biosciences, University of Texas at Austin, TX 78712
3
Zilocchi M, Rahmatbakhsh M, Moutaoufik MT, Broderick K, Gagarinova A, Jessulat M, Phanse S, Aoki H, Aly KA, Babu M. Co-fractionation-mass spectrometry to characterize native mitochondrial protein assemblies in mammalian neurons and brain. Nat Protoc 2023; 18:3918-3973. [PMID: 37985878 DOI: 10.1038/s41596-023-00901-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 08/09/2023] [Indexed: 11/22/2023]
Abstract
Human mitochondrial (mt) protein assemblies are vital for neuronal and brain function, and their alteration contributes to many human disorders, e.g., neurodegenerative diseases resulting from abnormal protein-protein interactions (PPIs). Knowledge of the composition of mt protein complexes is, however, still limited. Affinity purification mass spectrometry (MS) and proximity-dependent biotinylation MS have defined protein partners of some mt proteins, but are too technically challenging and laborious to be practical for analyzing large numbers of samples at the proteome level, e.g., for the study of neuronal or brain-specific mt assemblies, as well as altered mtPPIs on a proteome-wide scale for a disease of interest in brain regions, disease tissues or neurons derived from patients. To address this challenge, we adapted a co-fractionation-MS platform to survey native mt assemblies in adult mouse brain and in human NTERA-2 embryonal carcinoma stem cells or differentiated neuronal-like cells. The workflow consists of orthogonal separations of mt extracts isolated from chemically cross-linked samples to stabilize PPIs, data-dependent acquisition MS to identify co-eluted mt protein profiles from collected fractions and a computational scoring pipeline to predict mtPPIs, followed by network partitioning to define complexes linked to mt functions as well as those essential for neuronal and brain physiological homeostasis. We developed an R/CRAN software package, Macromolecular Assemblies from Co-elution Profiles for automated scoring of co-fractionation-MS data to define complexes from mtPPI networks. Presently, the co-fractionation-MS procedure takes 1.5-3.5 d of proteomic sample preparation, 31 d of MS data acquisition and 8.5 d of data analyses to produce meaningful biological insights.
Affiliation(s)
- Mara Zilocchi
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Kirsten Broderick
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Alla Gagarinova
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
  - Department of Biology, University of New Brunswick, Fredericton, New Brunswick, Canada
- Matthew Jessulat
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Sadhna Phanse
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Hiroyuki Aoki
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Khaled A Aly
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Mohan Babu
  - Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
4
Postoenko VI, Garibova LA, Levitsky LI, Bubis JA, Gorshkov MV, Ivanov MV. IQMMA: Efficient MS1 Intensity Extraction Pipeline Using Multiple Feature Detection Algorithms for DDA Proteomics. J Proteome Res 2023; 22:2827-2835. [PMID: 37579078 DOI: 10.1021/acs.jproteome.3c00075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/16/2023]
Abstract
One of the key steps in data dependent acquisition (DDA) proteomics is detection of peptide isotopic clusters, also called "features", in MS1 spectra and matching them to MS/MS-based peptide identifications. A number of peptide feature detection tools became available in recent years, each relying on its own matching algorithm. Here, we provide an integrated solution, the intensity-based Quantitative Mix and Match Approach (IQMMA), which integrates a number of untargeted peptide feature detection algorithms and returns the most probable intensity values for the MS/MS-based identifications. IQMMA was tested using available proteomic data acquired for both well-characterized (ground truth) and real-world biological samples, including a mix of Yeast and E. coli digests spiked at different concentrations into the Human K562 digest used as a background, and a set of glioblastoma cell lines. Three open-source feature detection algorithms were integrated: Dinosaur, biosaur2, and OpenMS FeatureFinder. None of them was found optimal when applied individually to all the data sets employed in this work; however, their combined use in IQMMA improved efficiency of subsequent protein quantitation. The software implementing IQMMA is freely available at https://github.com/PostoenkoVI/IQMMA under Apache 2.0 license.
Affiliation(s)
- Valeriy I Postoenko
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
  - Moscow Institute of Physics and Technology, National Research University, G. Dolgoprudny, Institutsky Lane 9, Dolgoprudny 141701, Russia
- Leyla A Garibova
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
  - Moscow Institute of Physics and Technology, National Research University, G. Dolgoprudny, Institutsky Lane 9, Dolgoprudny 141701, Russia
- Lev I Levitsky
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Julia A Bubis
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Mikhail V Gorshkov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
- Mark V Ivanov
  - V. L. Talrose Institute for Energy Problems of Chemical Physics, N. N. Semenov Federal Research Center of Chemical Physics, Russian Academy of Sciences, Moscow 119334, Russia
5
Leggere JC, Hibbard JVK, Papoulas O, Lee C, Pearson CG, Marcotte EM, Wallingford JB. Label-free proteomic comparison reveals ciliary and non-ciliary phenotypes of IFT-A mutants. bioRxiv 2023:2023.03.08.531778. [PMID: 36945534 PMCID: PMC10028850 DOI: 10.1101/2023.03.08.531778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2023]
Abstract
DIFFRAC is a powerful method for systematically comparing proteome content and organization between samples in a high-throughput manner. By subjecting control and experimental protein extracts to native chromatography and quantifying the contents of each fraction using mass spectrometry, it enables the quantitative detection of alterations to protein complexes and abundances. Here, we applied DIFFRAC to investigate the consequences of genetic loss of Ift122, a subunit of the intraflagellar transport-A (IFT-A) protein complex that plays a vital role in the formation and function of cilia and flagella, on the proteome of Tetrahymena thermophila. A single DIFFRAC experiment was sufficient to detect changes in protein behavior that mirrored known effects of IFT-A loss and revealed new biology. We uncovered several novel IFT-A-regulated proteins, which we validated through live imaging in Xenopus multiciliated cells, shedding new light on both the ciliary and non-ciliary functions of IFT-A. Our findings underscore the robustness of DIFFRAC for revealing proteomic changes in response to genetic or biochemical perturbation.
6
Caira S, Picariello G, Renzone G, Arena S, Troise AD, De Pascale S, Ciaravolo V, Pinto G, Addeo F, Scaloni A. Recent developments in peptidomics for the quali-quantitative analysis of food-derived peptides in human body fluids and tissues. Trends Food Sci Technol 2022. [DOI: 10.1016/j.tifs.2022.06.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
7
Sae-Lee W, McCafferty CL, Verbeke EJ, Havugimana PC, Papoulas O, McWhite CD, Houser JR, Vanuytsel K, Murphy GJ, Drew K, Emili A, Taylor DW, Marcotte EM. The protein organization of a red blood cell. Cell Rep 2022; 40:111103. [PMID: 35858567 PMCID: PMC9764456 DOI: 10.1016/j.celrep.2022.111103] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 01/28/2022] [Revised: 04/18/2022] [Accepted: 06/24/2022] [Indexed: 11/28/2022] Open
Abstract
Red blood cells (RBCs) (erythrocytes) are the simplest primary human cells, lacking nuclei and major organelles and instead employing about a thousand proteins to dynamically control cellular function and morphology in response to physiological cues. In this study, we define a canonical RBC proteome and interactome using quantitative mass spectrometry and machine learning. Our data reveal an RBC interactome dominated by protein homeostasis, redox biology, cytoskeletal dynamics, and carbon metabolism. We validate protein complexes through electron microscopy and chemical crosslinking and, with these data, build 3D structural models of the ankyrin/Band 3/Band 4.2 complex that bridges the spectrin cytoskeleton to the RBC membrane. The model suggests spring-like compression of ankyrin may contribute to the characteristic RBC cell shape and flexibility. Taken together, our study provides an in-depth view of the global protein organization of human RBCs and serves as a comprehensive resource for future research.
Affiliation(s)
- Wisath Sae-Lee
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
- Caitlyn L McCafferty
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
- Eric J Verbeke
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
- Pierre C Havugimana
  - Center for Network Systems Biology, Boston University, Boston, MA 02118, USA
- Ophelia Papoulas
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
- Claire D McWhite
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
- John R Houser
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
- Kim Vanuytsel
  - Center for Regenerative Medicine, Boston University School of Medicine, 670 Albany Street, Boston, MA 02118, USA
- George J Murphy
  - Center for Regenerative Medicine, Boston University School of Medicine, 670 Albany Street, Boston, MA 02118, USA
- Kevin Drew
  - Department of Biological Sciences, University of Illinois at Chicago, 900 S. Ashland Avenue, Chicago, IL 60607, USA
- Andrew Emili
  - Center for Network Systems Biology, Boston University, Boston, MA 02118, USA
- David W Taylor
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
- Edward M Marcotte
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
8
A systematic evaluation of yeast sample preparation protocols for spectral identifications, proteome coverage and post-isolation modifications. J Proteomics 2022; 261:104576. [DOI: 10.1016/j.jprot.2022.104576] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/19/2022] [Revised: 03/17/2022] [Accepted: 03/17/2022] [Indexed: 11/20/2022]
9
Svecla M, Garrone G, Faré F, Aletti G, Norata GD, Beretta G. DDASSQ: an open-source, multiple peptide sequencing strategy for label free quantification based on an OpenMS pipeline in the KNIME analytics platform. Proteomics 2021; 21:e2000319. [PMID: 34312990 PMCID: PMC8459258 DOI: 10.1002/pmic.202000319] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/21/2020] [Revised: 07/08/2021] [Accepted: 07/12/2021] [Indexed: 11/16/2022]
Abstract
In this study we investigated the performance of a computational pipeline for protein identification and label free quantification (LFQ) of LC–MS/MS data sets from experimental animal tissue samples, as well as the impact of its specific peptide search combinatorial approach. The full pipeline workflow was composed of peptide search engine adapters based on different identification algorithms, in the frame of the open‐source OpenMS software running within the KNIME analytics platform. Two different in silico tryptic digestion, database‐search assisted approaches (X!Tandem and MS‐GF+), de novo peptide sequencing based on Novor and consensus library search (SpectraST), were tested for the processing of LC‐MS/MS raw data files obtained from proteomic LC‐MS experiments done on proteolytic extracts from mouse ex vivo liver samples. The results from proteomic LFQ were compared to those based on the application of the two software tools MaxQuant and Proteome Discoverer for protein inference and label‐free data analysis in shotgun proteomics. Data are available via ProteomeXchange with identifier PXD025097.
Affiliation(s)
- Monika Svecla
  - Department of Excellence of Pharmacological and Biomolecular Sciences, University of Milan, Milan, Italy
- Giacomo Aletti
  - Department of Environmental Science and Policy, University of Milan, Milan, Italy
- Giuseppe Danilo Norata
  - Department of Excellence of Pharmacological and Biomolecular Sciences, University of Milan, Milan, Italy
  - Centro Studio Aterosclerosi, Bassini Hospital, Cinisello Balsamo, Milan, Italy
- Giangiacomo Beretta
  - Department of Environmental Science and Policy, University of Milan, Milan, Italy
10
Drew K, Wallingford JB, Marcotte EM. hu.MAP 2.0: integration of over 15,000 proteomic experiments builds a global compendium of human multiprotein assemblies. Mol Syst Biol 2021; 17:e10016. [PMID: 33973408 PMCID: PMC8111494 DOI: 10.15252/msb.202010016] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 09/23/2020] [Revised: 04/08/2021] [Accepted: 04/09/2021] [Indexed: 12/30/2022] Open
Abstract
A general principle of biology is the self‐assembly of proteins into functional complexes. Characterizing their composition is, therefore, required for our understanding of cellular functions. Unfortunately, we lack knowledge of the comprehensive set of identities of protein complexes in human cells. To address this gap, we developed a machine learning framework to identify protein complexes in over 15,000 mass spectrometry experiments which resulted in the identification of nearly 7,000 physical assemblies. We show our resource, hu.MAP 2.0, is more accurate and comprehensive than previous state of the art high‐throughput protein complex resources and gives rise to many new hypotheses, including for 274 completely uncharacterized proteins. Further, we identify 253 promiscuous proteins that participate in multiple complexes pointing to possible moonlighting roles. We have made hu.MAP 2.0 easily searchable in a web interface (http://humap2.proteincomplexes.org/), which will be a valuable resource for researchers across a broad range of interests including systems biology, structural biology, and molecular explanations of disease.
Affiliation(s)
- Kevin Drew
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, USA
- John B Wallingford
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, USA
- Edward M Marcotte
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, USA
11
Schulze S, Igiraneza AB, Kösters M, Leufken J, Leidel SA, Garcia BA, Fufezan C, Pohlschroder M. Enhancing Open Modification Searches via a Combined Approach Facilitated by Ursgal. J Proteome Res 2021; 20:1986-1996. [PMID: 33514075 DOI: 10.1021/acs.jproteome.0c00799] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/06/2023]
Abstract
The identification of peptide sequences and their post-translational modifications (PTMs) is a crucial step in the analysis of bottom-up proteomics data. The recent development of open modification search (OMS) engines allows virtually all PTMs to be searched for. This not only increases the number of spectra that can be matched to peptides but also greatly advances the understanding of the biological roles of PTMs through the identification, and the thereby facilitated quantification, of peptidoforms (peptide sequences and their potential PTMs). Whereas the benefits of combining results from multiple protein database search engines have been previously established, similar approaches for OMS results have been missing so far. Here we compare and combine results from three different OMS engines, demonstrating an increase in peptide spectrum matches of 8-18%. The unification of search results furthermore allows for the combined downstream processing of search results, including the mapping to potential PTMs. Finally, we test for the ability of OMS engines to identify glycosylated peptides. The implementation of these engines in the Python framework Ursgal facilitates the straightforward application of the OMS with unified parameters and results files, thereby enabling yet unmatched high-throughput, large-scale data analysis.
Affiliation(s)
- Stefan Schulze
  - Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Aime Bienfait Igiraneza
  - Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Manuel Kösters
  - Department of Chemistry and Biochemistry, University of Bern, 3012 Bern, Switzerland
- Johannes Leufken
  - Department of Chemistry and Biochemistry, University of Bern, 3012 Bern, Switzerland
- Sebastian A Leidel
  - Department of Chemistry and Biochemistry, University of Bern, 3012 Bern, Switzerland
- Benjamin A Garcia
  - Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Christian Fufezan
  - Institute of Pharmacy and Molecular Biotechnology, Heidelberg University, 69120 Heidelberg, Germany
- Mechthild Pohlschroder
  - Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
12
Floyd BM, Drew K, Marcotte EM. Systematic Identification of Protein Phosphorylation-Mediated Interactions. J Proteome Res 2021; 20:1359-1370. [PMID: 33476154 DOI: 10.1021/acs.jproteome.0c00750] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 11/30/2022]
Abstract
Protein phosphorylation is a key regulatory mechanism involved in nearly every eukaryotic cellular process. Increasingly sensitive mass spectrometry approaches have identified hundreds of thousands of phosphorylation sites, but the functions of a vast majority of these sites remain unknown, with fewer than 5% of sites currently assigned a function. To increase our understanding of functional protein phosphorylation we developed an approach (phospho-DIFFRAC) for identifying the phosphorylation-dependence of protein assemblies in a systematic manner. A combination of nonspecific protein phosphatase treatment, size-exclusion chromatography, and mass spectrometry allowed us to identify changes in protein interactions after the removal of phosphate modifications. With this approach we were able to identify 316 proteins involved in phosphorylation-sensitive interactions. We recovered known phosphorylation-dependent interactors such as the FACT complex and spliceosome, as well as identified novel interactions such as the tripeptidyl peptidase TPP2 and the supraspliceosome component ZRANB2. More generally, we find phosphorylation-dependent interactors to be strongly enriched for RNA-binding proteins, providing new insight into the role of phosphorylation in RNA binding. By searching directly for phosphorylated amino acid residues in mass spectrometry data, we identified the likely regulatory phosphosites on ZRANB2 and FACT complex subunit SSRP1. This study provides both a method and resource for obtaining a better understanding of the role of phosphorylation in native macromolecular assemblies. All mass spectrometry data are available through PRIDE (accession #PXD021422).
Affiliation(s)
- Brendan M Floyd
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas 78712, United States
- Kevin Drew
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas 78712, United States
- Edward M Marcotte
  - Department of Molecular Biosciences, Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, Texas 78712, United States
13
Drew K, Lee C, Cox RM, Dang V, Devitt CC, McWhite CD, Papoulas O, Huizar RL, Marcotte EM, Wallingford JB. A systematic, label-free method for identifying RNA-associated proteins in vivo provides insights into vertebrate ciliary beating machinery. Dev Biol 2020; 467:108-117. [PMID: 32898505 DOI: 10.1016/j.ydbio.2020.08.008] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 07/21/2020] [Accepted: 08/18/2020] [Indexed: 01/06/2023]
Abstract
Cell-type specific RNA-associated proteins are essential for development and homeostasis in animals. Despite a massive recent effort to systematically identify RNA-associated proteins, we currently have few comprehensive rosters of cell-type specific RNA-associated proteins in vertebrate tissues. Here, we demonstrate the feasibility of determining the RNA-associated proteome of a defined vertebrate embryonic tissue using DIF-FRAC, a systematic and universal (i.e., label-free) method. Application of DIF-FRAC to cultured tissue explants of Xenopus mucociliary epithelium identified dozens of known RNA-associated proteins as expected, but also several novel RNA-associated proteins, including proteins related to assembly of the mitotic spindle and regulation of ciliary beating. In particular, we show that the inner dynein arm tether Cfap44 is an RNA-associated protein that localizes not only to axonemes, but also to liquid-like organelles in the cytoplasm called DynAPs. This result led us to discover that DynAPs are generally enriched for RNA. Together, these data provide a useful resource for a deeper understanding of mucociliary epithelia and demonstrate that DIF-FRAC will be broadly applicable for systematic identification of RNA-associated proteins from embryonic tissues.
Affiliation(s)
- Kevin Drew
  - Dept. of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
- Chanjae Lee
  - Dept. of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
- Rachael M Cox
  - Dept. of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
- Vy Dang
  - Dept. of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
- Caitlin C Devitt
  - Dept. of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
- Claire D McWhite
  - Dept. of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
- Ophelia Papoulas
  - Dept. of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
- Ryan L Huizar
  - Dept. of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
- Edward M Marcotte
  - Dept. of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
- John B Wallingford
  - Dept. of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA
14
Agten A, Van Houtven J, Askenazi M, Burzykowski T, Laukens K, Valkenborg D. Visualizing the agreement of peptide assignments between different search engines. J Mass Spectrom 2020; 55:e4471. [PMID: 31713933 DOI: 10.1002/jms.4471] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/04/2019] [Revised: 10/23/2019] [Accepted: 10/28/2019] [Indexed: 06/10/2023]
Abstract
There is a trend in the analysis of shotgun proteomics data that aims to combine information from multiple search engines to increase the number of peptide annotations in an experiment. Typically, the degree of search engine complementarity and search engine agreement is illustrated visually by means of Venn diagrams that present the findings of a database search at the level of the nonredundant peptide annotations. We argue that this practice is not fit for purpose, since the diagrams do not take into account, and often conceal, the information on complementarity and agreement at the level of the spectrum identification. We promote a new type of visualization that provides insight into the peptide sequence agreement at the level of the peptide-spectrum match (PSM) as a measure of consensus between two search engines with nominal outcomes. We applied the visualizations and percentage sequence agreement to an in-house data set of our benchmark organism, Caenorhabditis elegans, and illustrated that, when assessing the agreement between search engines, one should disentangle the notion of PSM confidence from that of PSM identity. The visualizations presented in this manuscript provide a more informative assessment of pairs of search engines and are made available as an R function in the Supporting Information.
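The contrast between peptide-level overlap and PSM-level agreement can be illustrated with a minimal Python sketch. This is not the authors' R function, and the exact definition of percentage sequence agreement used in the paper may differ; the function name `psm_agreement`, the spectrum IDs, and the peptide sequences are all hypothetical.

```python
def psm_agreement(engine_a, engine_b):
    """Summarize agreement between two search engines at the PSM level.

    engine_a, engine_b: dicts mapping spectrum ID -> assigned peptide sequence
    (hypothetical input format, chosen for illustration only).
    """
    shared = engine_a.keys() & engine_b.keys()
    # Count spectra for which both engines report the same peptide sequence.
    same = sum(1 for spectrum in shared if engine_a[spectrum] == engine_b[spectrum])
    return {
        "only_a": len(engine_a.keys() - engine_b.keys()),
        "only_b": len(engine_b.keys() - engine_a.keys()),
        "shared_spectra": len(shared),
        "same_sequence": same,
        "percent_agreement": 100.0 * same / len(shared) if shared else float("nan"),
    }


# Toy data: both engines report the same three nonredundant peptides,
# so a peptide-level Venn diagram would show complete overlap.
engine_a = {"s1": "PEPTIDEK", "s2": "LVNELTEFAK", "s3": "AAAK"}
engine_b = {"s1": "PEPTIDEK", "s2": "AAAK", "s4": "LVNELTEFAK"}

result = psm_agreement(engine_a, engine_b)
print(set(engine_a.values()) == set(engine_b.values()))  # True: identical peptide sets
print(result["percent_agreement"])  # 50.0: only s1 agrees among the shared spectra s1 and s2
```

In this toy case the nonredundant peptide lists are identical, yet the two engines agree on only one of the two spectra they both identified, which is the kind of disagreement a peptide-level Venn diagram hides.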
Affiliation(s)
- Annelies Agten
- Interuniversity Institute of Biostatistics and Statistical Bioinformatics, Hasselt University, Hasselt, Belgium
| | - Joris Van Houtven
- Interuniversity Institute of Biostatistics and Statistical Bioinformatics, Hasselt University, Hasselt, Belgium
- UA-VITO Center for Proteomics, University of Antwerp, Antwerp, Belgium
- Applied Bio and Molecular Systems, Flemish Institute for Technological Research (VITO), Mol, Belgium
| | | | - Tomasz Burzykowski
- Interuniversity Institute of Biostatistics and Statistical Bioinformatics, Hasselt University, Hasselt, Belgium
| | - Kris Laukens
- Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Biomedical Informatics Network Antwerp (biomina), University of Antwerp, Antwerp, Belgium
| | - Dirk Valkenborg
- Interuniversity Institute of Biostatistics and Statistical Bioinformatics, Hasselt University, Hasselt, Belgium
- UA-VITO Center for Proteomics, University of Antwerp, Antwerp, Belgium
- Applied Bio and Molecular Systems, Flemish Institute for Technological Research (VITO), Mol, Belgium
15
McWhite CD, Papoulas O, Drew K, Cox RM, June V, Dong OX, Kwon T, Wan C, Salmi ML, Roux SJ, Browning KS, Chen ZJ, Ronald PC, Marcotte EM. A Pan-plant Protein Complex Map Reveals Deep Conservation and Novel Assemblies. Cell 2020; 181:460-474.e14. [PMID: 32191846 DOI: 10.1016/j.cell.2020.02.049] [Citation(s) in RCA: 108] [Impact Index Per Article: 27.0] [Received: 10/15/2019] [Revised: 01/08/2020] [Accepted: 02/21/2020] [Indexed: 01/11/2023]
Abstract
Plants are foundational for global ecological and economic systems, but most plant proteins remain uncharacterized. Protein interaction networks often suggest protein functions and open new avenues to characterize genes and proteins. We therefore systematically determined protein complexes from 13 plant species of scientific and agricultural importance, greatly expanding the known repertoire of stable protein complexes in plants. By using co-fractionation mass spectrometry, we recovered known complexes, confirmed complexes predicted to occur in plants, and identified previously unknown interactions conserved over 1.1 billion years of green plant evolution. Several novel complexes are involved in vernalization and pathogen defense, traits critical for agriculture. We also observed plant analogs of animal complexes with distinct molecular assemblies, including a megadalton-scale tRNA multi-synthetase complex. The resulting map offers a cross-species view of conserved, stable protein assemblies shared across plant cells and provides a mechanistic, biochemical framework for interpreting plant genetics and mutant phenotypes.
Affiliation(s)
- Claire D McWhite
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
| | - Ophelia Papoulas
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
| | - Kevin Drew
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
| | - Rachael M Cox
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
| | - Viviana June
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
| | - Oliver Xiaoou Dong
- Department of Plant Pathology and The Genome Center, University of California, Davis, Davis, CA 95616, USA; Joint Bioenergy Institute, Emeryville, CA 94608, USA
| | - Taejoon Kwon
- Department of Biomedical Engineering, School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), 50 UNIST-gil, Ulju-gun, Ulsan 44919, Republic of Korea
| | - Cuihong Wan
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA; Hubei Key Lab of Genetic Regulation and Integrative Biology, School of Life Sciences, Central China Normal University, No. 152 Luoyu Road, Wuhan 430079, P.R. China
| | - Mari L Salmi
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
| | - Stanley J Roux
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
| | - Karen S Browning
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
| | - Z Jeffrey Chen
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA
| | - Pamela C Ronald
- Department of Plant Pathology and The Genome Center, University of California, Davis, Davis, CA 95616, USA; Joint Bioenergy Institute, Emeryville, CA 94608, USA
| | - Edward M Marcotte
- Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX 78712, USA.
16
Abstract
Mass spectrometry is extremely efficient for sequencing small peptides generated by, for example, trypsin digestion of a complex mixture. Current instruments can generate 50,000-100,000 MS/MS spectra in a single run. Of these, ~30-50% are typically assigned to peptide matches at a 1% FDR threshold. The remaining spectra need more research to explain. We address here whether the 30-50% of spectra that are matched yield consensus matches across different database-dependent search pipelines. Although the majority of the spectrum-to-peptide assignments concur across search engines, our conclusion is that database-dependent search engines still require improvements.
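The abstract does not say how the 1% FDR threshold is estimated. As a minimal sketch, assuming the widely used target-decoy approach (FDR estimated as decoy count over target count above a score cutoff), accepting PSMs at a given FDR might look like the following; the function name `accept_at_fdr` and the toy scores are hypothetical.

```python
def accept_at_fdr(psms, max_fdr=0.01):
    """Return target PSM scores accepted at the given estimated FDR.

    psms: list of (score, is_decoy) tuples, higher score = better match.
    Assumption: FDR at a cutoff is estimated as decoys / targets above it.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to keep
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep the deepest rank at which the estimated FDR is still acceptable.
        if targets and decoys / targets <= max_fdr:
            cutoff = i
    return [score for score, is_decoy in ranked[:cutoff] if not is_decoy]


# Toy data: 150 high-scoring targets, then alternating decoys and targets.
psms = [(float(s), False) for s in range(101, 251)]
psms += [(100.5, True), (100.0, False), (99.5, True), (99.0, False)]

accepted = accept_at_fdr(psms)
print(len(accepted), min(accepted))
```

With these toy scores the second decoy pushes the estimated FDR above 1%, so the target scoring 99.0 is rejected while the one scoring 100.0 is kept.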
Affiliation(s)
- Rune Matthiesen
- Computational and Experimental Biology Group, CEDOC, Chronic Diseases Research Centre, NOVA Medical School, Faculdade de Ciências Médicas, Universidade NOVA de Lisboa, Lisboa, Portugal.
| | - Gorka Prieto
- Department of Communications Engineering, Faculty of Engineering of Bilbao, University of the Basque Country (UPV/EHU), Bilbao, Spain
| | - Hans Christian Beck
- Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, Odense C, Denmark
17
Sim HJ, Yun S, Kim HE, Kwon KY, Kim GH, Yun S, Kim BG, Myung K, Park TJ, Kwon T. Simple Method To Characterize the Ciliary Proteome of Multiciliated Cells. J Proteome Res 2019; 19:391-400. [DOI: 10.1021/acs.jproteome.9b00589] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 02/07/2023]
Affiliation(s)
| | | | | | | | - Gun-Hwa Kim
- Drug & Disease Target Group, Korea Basic Science Institute (KSBI), Cheongju-si, Chungcheongbuk-do 28119, Republic of Korea
- Tunneling Nanotube Research Center, Division of Life Science, Korea University, Seoul 02841, Republic of Korea
| | - Sungho Yun
- Drug & Disease Target Group, Korea Basic Science Institute (KSBI), Cheongju-si, Chungcheongbuk-do 28119, Republic of Korea
| | - Byung Gyu Kim
- Center for Genomic Integrity, Institute for Basic Science, Ulsan 44919, Republic of Korea
| | - Kyungjae Myung
- Center for Genomic Integrity, Institute for Basic Science, Ulsan 44919, Republic of Korea
| | - Tae Joo Park
- Center for Genomic Integrity, Institute for Basic Science, Ulsan 44919, Republic of Korea
| | - Taejoon Kwon
- Center for Genomic Integrity, Institute for Basic Science, Ulsan 44919, Republic of Korea
18
Mallam AL, Sae-Lee W, Schaub JM, Tu F, Battenhouse A, Jang YJ, Kim J, Wallingford JB, Finkelstein IJ, Marcotte EM, Drew K. Systematic Discovery of Endogenous Human Ribonucleoprotein Complexes. Cell Rep 2019; 29:1351-1368.e5. [PMID: 31665645 PMCID: PMC6873818 DOI: 10.1016/j.celrep.2019.09.060] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 02/21/2019] [Revised: 08/30/2019] [Accepted: 09/18/2019] [Indexed: 12/16/2022]
Abstract
RNA-binding proteins (RBPs) play essential roles in biology and are frequently associated with human disease. Although recent studies have systematically identified individual RNA-binding proteins, their higher-order assembly into ribonucleoprotein (RNP) complexes has not been systematically investigated. Here, we describe a proteomics method for systematic identification of RNP complexes in human cells. We identify 1,428 protein complexes that associate with RNA, indicating that more than 20% of known human protein complexes contain RNA. To explore the role of RNA in the assembly of each complex, we identify complexes that dissociate, change composition, or form stable protein-only complexes in the absence of RNA. We use our method to systematically identify cell-type-specific RNA-associated proteins in mouse embryonic stem cells and finally, distribute our resource, rna.MAP, in an easy-to-use online interface (rna.proteincomplexes.org). Our system thus provides a methodology for explorations across human tissues, disease states, and throughout all domains of life.
Affiliation(s)
- Anna L Mallam
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA.
| | - Wisath Sae-Lee
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Jeffrey M Schaub
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Fan Tu
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Anna Battenhouse
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Yu Jin Jang
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA
| | - Jonghwan Kim
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA
| | - John B Wallingford
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Ilya J Finkelstein
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
| | - Edward M Marcotte
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA.
| | - Kevin Drew
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA; Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA; Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA.
19
Moutaoufik MT, Malty R, Amin S, Zhang Q, Phanse S, Gagarinova A, Zilocchi M, Hoell L, Minic Z, Gagarinova M, Aoki H, Stockwell J, Jessulat M, Goebels F, Broderick K, Scott NE, Vlasblom J, Musso G, Prasad B, Lamantea E, Garavaglia B, Rajput A, Murayama K, Okazaki Y, Foster LJ, Bader GD, Cayabyab FS, Babu M. Rewiring of the Human Mitochondrial Interactome during Neuronal Reprogramming Reveals Regulators of the Respirasome and Neurogenesis. iScience 2019; 19:1114-1132. [PMID: 31536960 PMCID: PMC6831851 DOI: 10.1016/j.isci.2019.08.057] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 01/22/2019] [Revised: 06/28/2019] [Accepted: 08/29/2019] [Indexed: 12/13/2022]
Abstract
Mitochondrial protein (MP) assemblies undergo alterations during neurogenesis, a complex process vital in brain homeostasis and disease. Yet which MP assemblies remodel during differentiation remains unclear. Here, using mass spectrometry-based co-fractionation profiles and phosphoproteomics, we generated mitochondrial interaction maps of human pluripotent embryonal carcinoma stem cells and differentiated neuronal-like cells, which presented as two discrete cell populations by single-cell RNA sequencing. The resulting networks, encompassing 6,442 high-quality associations among 600 MPs, revealed widespread changes in mitochondrial interactions and site-specific phosphorylation during neuronal differentiation. By leveraging the networks, we identify the orphan C20orf24 as a respirasome assembly factor whose disruption markedly reduces respiratory chain activity in patients deficient in complex IV. We also find that a heme-containing neurotrophic factor, neuron-derived neurotrophic factor (NENF), couples with Parkinson disease-related proteins to promote neurotrophic activity. Our results provide insights into the dynamic reorganization of mitochondrial networks during neuronal differentiation and highlight mechanisms for MPs in respirasome, neuronal function, and mitochondrial diseases.
- Rewiring of mitochondrial (mt) protein interaction network in distinct cell states
- Dramatic changes in site-specific phosphorylation during neuronal differentiation
- C20orf24 is a respirasome assembly factor depleted in patients deficient in CIV
- NENF binding with DJ-1/PINK1 promotes neurotrophic activity and neuronal survival
Affiliation(s)
| | - Ramy Malty
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
| | - Shahreen Amin
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
| | - Qingzhou Zhang
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
| | - Sadhna Phanse
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
| | - Alla Gagarinova
- Department of Biochemistry, University of Saskatchewan, Saskatoon, SK S7N 5E5, Canada
| | - Mara Zilocchi
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
| | - Larissa Hoell
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
| | - Zoran Minic
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
| | - Maria Gagarinova
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
| | - Hiroyuki Aoki
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
| | - Jocelyn Stockwell
- Department of Surgery, Neuroscience Research Group, College of Medicine, University of Saskatchewan, Saskatoon, SK S7N 5E5, Canada
| | - Matthew Jessulat
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
| | - Florian Goebels
- The Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
| | - Kirsten Broderick
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
| | - Nichollas E Scott
- Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
| | - James Vlasblom
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada
| | - Gabriel Musso
- Department of Medicine, Harvard Medical School and Cardiovascular Division, Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Bhanu Prasad
- Department of Medicine, Regina Qu'Appelle Health Region, Regina, SK S4P 0W5, Canada
| | - Eleonora Lamantea
- Medical Genetics and Neurogenetics Unit, Fondazione IRCCS Instituto Neurologico Carlo Besta, via L. Temolo, 4, 20126 Milan, Italy
| | - Barbara Garavaglia
- Medical Genetics and Neurogenetics Unit, Fondazione IRCCS Instituto Neurologico Carlo Besta, via L. Temolo, 4, 20126 Milan, Italy
| | - Alex Rajput
- Department of Medicine, Division of Neurology, College of Medicine, University of Saskatchewan, Saskatoon, SK S7N 5E5, Canada
| | - Kei Murayama
- Department of Metabolism, Chiba Children's Hospital, 579-1 Heta-cho, Midori, Chiba 266-0007, Japan
| | - Yasushi Okazaki
- Graduate School of Medicine, Intractable Disease Research Center, Juntendo University, Hongo 2-1-1, Bunkyo-ku, Tokyo 113-8421, Japan
| | - Leonard J Foster
- Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
| | - Gary D Bader
- The Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
| | - Francisco S Cayabyab
- Department of Surgery, Neuroscience Research Group, College of Medicine, University of Saskatchewan, Saskatoon, SK S7N 5E5, Canada
| | - Mohan Babu
- Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada.
20
Hu LZ, Goebels F, Tan JH, Wolf E, Kuzmanov U, Wan C, Phanse S, Xu C, Schertzberg M, Fraser AG, Bader GD, Emili A. EPIC: software toolkit for elution profile-based inference of protein complexes. Nat Methods 2019; 16:737-742. [PMID: 31308550 PMCID: PMC7995176 DOI: 10.1038/s41592-019-0461-4] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 02/16/2018] [Accepted: 05/15/2019] [Indexed: 11/08/2022]
Abstract
Protein complexes are key macromolecular machines of the cell, but their description remains incomplete. We and others previously reported an experimental strategy for global characterization of native protein assemblies based on chromatographic fractionation of biological extracts coupled to precision mass spectrometry analysis (chromatographic fractionation-mass spectrometry, CF-MS), but the resulting data are challenging to process and interpret. Here, we describe EPIC (elution profile-based inference of complexes), a software toolkit for automated scoring of large-scale CF-MS data to define high-confidence multi-component macromolecules from diverse biological specimens. As a case study, we used EPIC to map the global interactome of Caenorhabditis elegans, defining 612 putative worm protein complexes linked to diverse biological processes. These included novel subunits and assemblies unique to nematodes that we validated using orthogonal methods. The open source EPIC software is freely available as a Jupyter notebook packaged in a Docker container (https://hub.docker.com/r/baderlab/bio-epic/).
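The idea behind scoring CF-MS data, that subunits of the same complex tend to co-elute across chromatographic fractions, can be sketched in a few lines of Python. This is not EPIC's code or API; Pearson correlation is assumed here purely as an illustrative co-elution measure, and the protein names and abundance values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length elution profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A flat profile has no defined correlation; treat it as uninformative.
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def coelution_scores(profiles):
    """Score every protein pair by the similarity of their elution profiles."""
    return {
        (p, q): pearson(profiles[p], profiles[q])
        for p, q in combinations(sorted(profiles), 2)
    }


# Hypothetical per-fraction abundances: A and B peak together, C elutes later.
profiles = {
    "A": [0, 2, 10, 3, 0, 0],
    "B": [0, 4, 20, 6, 0, 0],
    "C": [0, 0, 0, 1, 8, 2],
}

scores = coelution_scores(profiles)
for pair, r in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(r, 3))
```

A real pipeline would combine several such similarity features and calibrate them against known complexes before clustering; the sketch only shows why co-eluting pairs such as A and B score higher than pairs that peak in different fractions.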
Affiliation(s)
- Lucas ZhongMing Hu
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Florian Goebels
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
| | - June H Tan
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Eric Wolf
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Uros Kuzmanov
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
| | - Cuihong Wan
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- School of Life Science, Central China Normal University, Wuhan, China
| | - Sadhna Phanse
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
| | - Changjiang Xu
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
| | - Mike Schertzberg
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
| | - Andrew G Fraser
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Gary D Bader
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada.
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.
| | - Andrew Emili
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada.
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.
- Departments of Biochemistry and Biology, Boston University, Boston, MA, USA.
21
Maes E, Oeyen E, Boonen K, Schildermans K, Mertens I, Pauwels P, Valkenborg D, Baggerman G. The challenges of peptidomics in complementing proteomics in a clinical context. Mass Spectrometry Reviews 2019; 38:253-264. [PMID: 30372792 DOI: 10.1002/mas.21581] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 09/14/2016] [Accepted: 10/01/2018] [Indexed: 06/08/2023]
Abstract
Naturally occurring peptides, including growth factors, hormones, and neurotransmitters, represent an important class of biomolecules and have crucial roles in human physiology. The study of these peptides in clinical samples is therefore as relevant as ever. Compared to more routine proteomics applications in clinical research, peptidomics research questions are more challenging and have special requirements with regard to sample handling, experimental design, and bioinformatics. In this review, we describe the issues that confront peptidomics in a clinical context. After these hurdles are (partially) overcome, peptidomics will be ready for a successful translation into medical practice.
Affiliation(s)
- Evelyne Maes
- Flemish Institute for Technological Research (VITO), Mol, Belgium
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
- Food and Bio-Based Products, AgResearch Ltd., Lincoln, New Zealand
| | - Eline Oeyen
- Flemish Institute for Technological Research (VITO), Mol, Belgium
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
| | - Kurt Boonen
- Flemish Institute for Technological Research (VITO), Mol, Belgium
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
| | - Karin Schildermans
- Flemish Institute for Technological Research (VITO), Mol, Belgium
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
| | - Inge Mertens
- Flemish Institute for Technological Research (VITO), Mol, Belgium
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
| | - Patrick Pauwels
- Molecular Pathology Unit, Department of Pathology, Antwerp University Hospital, Edegem, Belgium
| | - Dirk Valkenborg
- Flemish Institute for Technological Research (VITO), Mol, Belgium
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
- Center for Statistics, Hasselt University, Diepenbeek, Belgium
| | - Geert Baggerman
- Flemish Institute for Technological Research (VITO), Mol, Belgium
- Centre for Proteomics, University of Antwerp, Antwerp, Belgium
22
Schiebenhoefer H, Van Den Bossche T, Fuchs S, Renard BY, Muth T, Martens L. Challenges and promise at the interface of metaproteomics and genomics: an overview of recent progress in metaproteogenomic data analysis. Expert Rev Proteomics 2019; 16:375-390. [PMID: 31002542 DOI: 10.1080/14789450.2019.1609944] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Indexed: 02/07/2023]
Abstract
Introduction: The study of microbial communities based on the combined analysis of genomic and proteomic data, called metaproteogenomics, has gained increased research attention in recent years. This relatively young field aims to elucidate the functional and taxonomic interplay of proteins in microbiomes and its implications for human health and the environment. Areas covered: This article reviews bioinformatics methods and software tools dedicated to the analysis of data from metaproteomics and metaproteogenomics experiments. In particular, it focuses on the creation of tailored protein sequence databases, on the optimal use of database search algorithms, including methods of error rate estimation, and finally on taxonomic and functional annotation of peptide and protein identifications. Expert opinion: Recently, various promising strategies and software tools have been proposed for handling typical data analysis issues in metaproteomics. However, severe challenges remain, which are highlighted and discussed in this article; these include: (i) robust false-positive assessment of peptide and protein identifications, (ii) complex protein inference against a background of highly redundant data, (iii) taxonomic and functional post-processing of identification data, and finally, (iv) the assessment and provision of metrics and tools for quantitative analysis.
Affiliation(s)
- Henning Schiebenhoefer
- Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
| | - Tim Van Den Bossche
- VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
| | - Stephan Fuchs
- FG13 Division of Nosocomial Pathogens and Antibiotic Resistances, Robert Koch Institute, Wernigerode, Germany
| | - Bernhard Y Renard
- Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
| | - Thilo Muth
- Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
| | - Lennart Martens
- VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium; Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
23
Wu X, Xing X, Dowlut D, Zeng Y, Liu J, Liu X. Integrating phosphoproteomics into kinase-targeted cancer therapies in precision medicine. J Proteomics 2019; 191:68-79. [DOI: 10.1016/j.jprot.2018.03.033] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 10/26/2017] [Revised: 03/20/2018] [Accepted: 03/31/2018] [Indexed: 12/12/2022]
24
Lin A, Howbert JJ, Noble WS. Combining High-Resolution and Exact Calibration To Boost Statistical Power: A Well-Calibrated Score Function for High-Resolution MS2 Data. J Proteome Res 2018; 17:3644-3656. [PMID: 30221945 DOI: 10.1021/acs.jproteome.8b00206] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 01/19/2023]
Abstract
To achieve accurate assignment of peptide sequences to observed fragmentation spectra, a shotgun proteomics database search tool must make good use of the very high-resolution information produced by state-of-the-art mass spectrometers. However, making use of this information while also ensuring that the search engine's scores are well calibrated, that is, that the score assigned to one spectrum can be meaningfully compared to the score assigned to a different spectrum, has proven to be challenging. Here we describe a database search score function, the "residue evidence" (res-ev) score, that achieves both of these goals simultaneously. We also demonstrate how to combine calibrated res-ev scores with calibrated XCorr scores to produce a "combined p value" score function. We provide a benchmark consisting of four mass spectrometry data sets, which we use to compare the combined p value to the score functions used by several existing search engines. Our results suggest that the combined p value achieves state-of-the-art performance, generally outperforming MS Amanda and Morpheus and performing comparably to MS-GF+. The res-ev and combined p value score functions are freely available as part of the Tide search engine in the Crux mass spectrometry toolkit (http://crux.ms).
Affiliation(s)
- Andy Lin
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - J Jeffry Howbert
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
| | - William Stafford Noble
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States; Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States
25
Guo X, Li Z, Yao Q, Mueller RS, Eng JK, Tabb DL, Hervey WJ, Pan C. Sipros Ensemble improves database searching and filtering for complex metaproteomics. Bioinformatics 2018; 34:795-802. [PMID: 29028897 PMCID: PMC6192206 DOI: 10.1093/bioinformatics/btx601] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 04/25/2017] [Accepted: 09/19/2017] [Indexed: 01/14/2023]
Abstract
Motivation: Complex microbial communities can be characterized by metagenomics and metaproteomics. However, metagenome assemblies often generate enormous, and yet incomplete, protein databases, which undermines the identification of peptides and proteins in metaproteomics. This challenge calls for increased discrimination of true identifications from false identifications by database searching and filtering algorithms in metaproteomics.
Results: Sipros Ensemble was developed here for metaproteomics using an ensemble approach. Three diverse scoring functions from MyriMatch, Comet and the original Sipros were incorporated within a single database searching engine. Supervised classification with logistic regression was used to filter database searching results. Benchmarking with soil and marine microbial communities demonstrated a higher number of peptide and protein identifications by Sipros Ensemble than MyriMatch/Percolator, Comet/Percolator, MS-GF+/Percolator, Comet & MyriMatch/iProphet and Comet & MyriMatch & MS-GF+/iProphet. Sipros Ensemble was computationally efficient and scalable on supercomputers.
Availability and implementation: Freely available under the GNU GPL license at http://sipros.omicsbio.org.
Supplementary information: Supplementary data are available at Bioinformatics online.
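The filtering step described above can be illustrated with a small logistic regression over three scores per peptide-spectrum match. This is a sketch only: the training data are made-up toy values scaled to [0, 1], the function names are hypothetical, and nothing here reproduces the feature set or training procedure of Sipros Ensemble itself.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, scores):
    """Probability that a match with these three scores is correct."""
    return sigmoid(bias + sum(w * s for w, s in zip(weights, scores)))


def train_logistic(rows, labels, lr=0.5, epochs=5000):
    """Fit logistic regression by batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for scores, label in zip(rows, labels):
            error = predict(weights, bias, scores) - label
            for j, s in enumerate(scores):
                grad_w[j] += error * s
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy matches: (score A, score B, score C); label 1 = true, 0 = false.
rows = [
    [0.90, 0.80, 0.85], [0.80, 0.90, 0.70], [0.70, 0.75, 0.90],
    [0.20, 0.10, 0.30], [0.30, 0.25, 0.10], [0.10, 0.30, 0.20],
]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(rows, labels)
# Keep a new match only if its predicted probability exceeds 0.5.
print(predict(weights, bias, [0.85, 0.80, 0.80]) > 0.5)
```

Using one classifier over several scores lets a match that is weak under one scoring function still pass if the other two support it.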
Affiliation(s)
- Xuan Guo
- Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37996, USA; Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA; Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, USA
| | - Zhou Li
- Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37996, USA; Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Qiuming Yao
- Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| | - Ryan S Mueller
- Department of Microbiology, Oregon State University, Corvallis, OR 97331, USA
| | - Jimmy K Eng
- Proteomics Resource, University of Washington, Seattle, WA 98195, USA
| | - David L Tabb
- DST/NRF Centre of Excellence for Biomedical Tuberculosis Research, SAMRC Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town 7505, South Africa
| | - William Judson Hervey
- Naval Research Laboratory, Center for Bio/Molecular Science & Engineering (Code 6910), Washington, DC, 20375, USA
| | - Chongle Pan
- Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37996, USA; Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
| |
|
26
|
Mohammed Y, Palmblad M. Visualizing and comparing results of different peptide identification methods. Brief Bioinform 2018; 19:210-218. [PMID: 28011752 DOI: 10.1093/bib/bbw115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2016] [Indexed: 11/14/2022] Open
Abstract
In mass spectrometry-based proteomics, peptides are typically identified from tandem mass spectra using spectrum comparison. A sequence search engine compares experimentally obtained spectra with those predicted from protein sequences, applying enzyme cleavage and fragmentation rules. There are two main alternatives to this approach: spectral libraries and de novo sequencing. The former compares measured spectra with a library of previously acquired and identified spectra. The latter attempts to sequence peptides from the tandem mass spectra alone. Here we present a theoretical framework and a data processing workflow for visualizing and comparing the results of these different types of algorithms. The method considers the three search strategies as different dimensions, identifies distinct agreement classes and visualizes the complementarity of the search strategies. We have included X! Tandem, SpectraST and PepNovo, as they are in common use and representative of each type of algorithm. Our method allows advanced investigation of how the three search methods perform relative to each other and shows the impact of the decoy sequences currently used for evaluating false discovery rates.
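The idea of agreement classes can be sketched as a function that, for one spectrum, reports which of the three strategies returned the same peptide. The class names and the handling of missing identifications are assumptions made for illustration; the paper's own class definitions are not given in the abstract.

```python
from collections import Counter


def agreement_class(database, library, de_novo):
    """Label one spectrum by which search strategies agree.

    Each argument is the peptide reported by that strategy, or None
    if the strategy made no identification for the spectrum.
    """
    calls = {"database": database, "library": library, "de_novo": de_novo}
    made = {name: pep for name, pep in calls.items() if pep is not None}
    if not made:
        return "unidentified"
    # Group strategy names by the peptide they reported.
    groups = {}
    for name, peptide in made.items():
        groups.setdefault(peptide, []).append(name)
    largest = max(groups.values(), key=len)
    if len(largest) == 3:
        return "all three agree"
    if len(largest) == 2:
        return "+".join(sorted(largest)) + " agree"
    if len(made) == 1:
        return next(iter(made)) + " only"
    return "no agreement"


# Hypothetical identifications for four spectra.
spectra = [
    ("PEPTIDEK", "PEPTIDEK", "PEPTIDEK"),
    ("PEPTIDEK", "PEPTIDEK", None),
    ("SAMPLER", None, None),
    ("AAAK", "CCCK", "DDDK"),
]
print(Counter(agreement_class(*row) for row in spectra))
```

Counting spectra per class gives the raw numbers a visualization of complementarity would be built from. Exact string comparison is a simplification; de novo results, for example, cannot distinguish leucine from isoleucine.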
Affiliation(s)
- Yassene Mohammed
- Center for Proteomics and Metabolomics, Leiden University Medical Center, the Netherlands; University of Victoria, University of Victoria - Genome British Columbia Proteomics Centre, Canada
| | - Magnus Palmblad
- Center for Proteomics and Metabolomics, Leiden University Medical Center, the Netherlands
| |
|
27
|
Scheubert K, Hufsky F, Petras D, Wang M, Nothias LF, Dührkop K, Bandeira N, Dorrestein PC, Böcker S. Significance estimation for large scale metabolomics annotations by spectral matching. Nat Commun 2017; 8:1494. [PMID: 29133785 PMCID: PMC5684233 DOI: 10.1038/s41467-017-01318-5] [Citation(s) in RCA: 100] [Impact Index Per Article: 14.3] [Received: 07/20/2017] [Accepted: 09/08/2017] [Indexed: 12/17/2022] Open
Abstract
The annotation of small molecules in untargeted mass spectrometry relies on the matching of fragment spectra to reference library spectra. While various spectrum-spectrum match scores exist, the field lacks statistical methods for estimating the false discovery rate (FDR) of these annotations. We present empirical Bayes and target-decoy based methods to estimate the FDR for 70 public metabolomics data sets. We show that the spectral matching settings need to be adjusted for each project. By adjusting the scoring parameters and thresholds, the number of annotations rose, on average, by 139% (ranging from -92% to +5705%) when compared with a default parameter set available at GNPS. The FDR estimation methods presented will enable a user to assess the scoring criteria for large-scale analysis of mass spectrometry-based metabolomics data, a capability that has been essential in the advancement of proteomics, transcriptomics, and genomics.
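The target-decoy idea can be sketched generically: matches against a decoy library stand in for false matches against the real library, so the ratio of decoys to targets above a score threshold estimates the FDR. The code below is the textbook estimate with q-values, not the paper's implementation; the scores are toy values and ties are not treated specially.

```python
def target_decoy_qvalues(matches):
    """Estimate q-values from (score, is_decoy) pairs.

    Returns (score, q_value) for target matches only, best score first.
    """
    ordered = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    fdrs = []
    for _, is_decoy in ordered:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR if the threshold were placed at this match.
        fdrs.append(min(decoys / targets, 1.0) if targets else 1.0)
    # q-value: lowest FDR at this threshold or any more lenient one.
    qvalues = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()
    return [(score, q) for (score, is_decoy), q in zip(ordered, qvalues)
            if not is_decoy]


# Toy spectrum matches: (similarity score, matched a decoy spectrum?).
toy = [(0.95, False), (0.90, False), (0.85, True),
       (0.80, False), (0.70, False), (0.60, True)]
for score, q in target_decoy_qvalues(toy):
    print(score, q)
```

Re-running such an estimate under different scoring parameters is what allows settings to be tuned per project rather than fixed to a default.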
Affiliation(s)
- Kerstin Scheubert
- Chair for Bioinformatics, Friedrich Schiller University Jena, Jena, 07743, Germany
| | - Franziska Hufsky
- Chair for Bioinformatics, Friedrich Schiller University Jena, Jena, 07743, Germany
- RNA Bioinformatics and High Throughput Analysis, Friedrich Schiller University Jena, Jena, 07743, Germany
| | - Daniel Petras
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, CA, 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, CA, 92093, USA
| | - Mingxun Wang
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, CA, 92093, USA
| | - Louis-Félix Nothias
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, CA, 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, CA, 92093, USA
| | - Kai Dührkop
- Chair for Bioinformatics, Friedrich Schiller University Jena, Jena, 07743, Germany
| | - Nuno Bandeira
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, CA, 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, CA, 92093, USA
| | - Pieter C Dorrestein
- Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, CA, 92093, USA
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, La Jolla, San Diego, CA, 92093, USA
| | - Sebastian Böcker
- Chair for Bioinformatics, Friedrich Schiller University Jena, Jena, 07743, Germany.
| |
|
28
|
Kuboniwa M, Houser JR, Hendrickson EL, Wang Q, Alghamdi SA, Sakanaka A, Miller DP, Hutcherson JA, Wang T, Beck DAC, Whiteley M, Amano A, Wang H, Marcotte EM, Hackett M, Lamont RJ. Metabolic crosstalk regulates Porphyromonas gingivalis colonization and virulence during oral polymicrobial infection. Nat Microbiol 2017; 2:1493-1499. [PMID: 28924191 PMCID: PMC5678995 DOI: 10.1038/s41564-017-0021-6] [Citation(s) in RCA: 85] [Impact Index Per Article: 12.1] [Received: 07/09/2016] [Accepted: 08/04/2017] [Indexed: 02/06/2023]
Abstract
Many human infections are polymicrobial in origin, and interactions among community inhabitants shape colonization patterns and pathogenic potential [1]. Periodontitis, which is the sixth most prevalent infectious disease worldwide [2], ensues from the action of dysbiotic polymicrobial communities [3]. The keystone pathogen Porphyromonas gingivalis and the accessory pathogen Streptococcus gordonii interact to form communities in vitro and exhibit increased fitness in vivo [3,4]. The mechanistic basis of this polymicrobial synergy, however, has not been fully elucidated. Here we show that streptococcal 4-aminobenzoate/para-amino benzoic acid (pABA) is required for maximal accumulation of P. gingivalis in dual-species communities. Metabolomic and proteomic data showed that exogenous pABA is used for folate biosynthesis, and leads to decreased stress and elevated expression of fimbrial adhesins. Moreover, pABA increased the colonization and survival of P. gingivalis in a murine oral infection model. However, pABA also caused a reduction in virulence in vivo and suppressed extracellular polysaccharide production by P. gingivalis. Collectively, these data reveal a multidimensional aspect to P. gingivalis-S. gordonii interactions and establish pABA as a critical cue produced by a partner species that enhances the fitness of P. gingivalis while diminishing its virulence.
Affiliation(s)
- Masae Kuboniwa
- Department of Preventive Dentistry, Osaka University Graduate School of Dentistry, 1-8 Yamadaoka, Suita, Osaka, 565-0871, Japan
- AMED-CREST, Japan Agency for Medical Research and Development, 1-7-1 Otemachi, Chiyoda-ku, Tokyo, 100-0004, Japan
| | - John R Houser
- Institute for Cellular and Molecular Biology, and Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, TX, 78712, USA
| | - Erik L Hendrickson
- Center for Microbial Proteomics and Chemical Engineering, University of Washington, Seattle, WA, 98195, USA
| | - Qian Wang
- Department of Oral Immunology and Infectious Diseases, University of Louisville School of Dentistry, Louisville, KY, 40292, USA
| | - Samar A Alghamdi
- Department of Preventive Dentistry, Osaka University Graduate School of Dentistry, 1-8 Yamadaoka, Suita, Osaka, 565-0871, Japan
| | - Akito Sakanaka
- Department of Preventive Dentistry, Osaka University Graduate School of Dentistry, 1-8 Yamadaoka, Suita, Osaka, 565-0871, Japan
| | - Daniel P Miller
- Department of Oral Immunology and Infectious Diseases, University of Louisville School of Dentistry, Louisville, KY, 40292, USA
| | - Justin A Hutcherson
- Department of Oral Immunology and Infectious Diseases, University of Louisville School of Dentistry, Louisville, KY, 40292, USA
| | - Tiansong Wang
- Center for Microbial Proteomics and Chemical Engineering, University of Washington, Seattle, WA, 98195, USA
| | - David A C Beck
- Center for Microbial Proteomics and Chemical Engineering, University of Washington, Seattle, WA, 98195, USA
- Department of eScience, University of Washington, Seattle, WA, 98195, USA
| | - Marvin Whiteley
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
| | - Atsuo Amano
- Department of Preventive Dentistry, Osaka University Graduate School of Dentistry, 1-8 Yamadaoka, Suita, Osaka, 565-0871, Japan
| | - Huizhi Wang
- Department of Oral Immunology and Infectious Diseases, University of Louisville School of Dentistry, Louisville, KY, 40292, USA
| | - Edward M Marcotte
- Institute for Cellular and Molecular Biology, and Center for Systems and Synthetic Biology, The University of Texas at Austin, Austin, TX, 78712, USA
| | - Murray Hackett
- Center for Microbial Proteomics and Chemical Engineering, University of Washington, Seattle, WA, 98195, USA
| | - Richard J Lamont
- Department of Oral Immunology and Infectious Diseases, University of Louisville School of Dentistry, Louisville, KY, 40292, USA.
| |
|
29
|
Maabreh M, Qolomany B, Alsmadi I, Gupta A. Deep Learning-based MSMS Spectra Reduction in Support of Running Multiple Protein Search Engines on Cloud. Proceedings. IEEE International Conference on Bioinformatics and Biomedicine 2017; 2017:1909-1914. [PMID: 34430067 PMCID: PMC8382039 DOI: 10.1109/bibm.2017.8217951] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
The diversity of available protein search engines with respect to their matching algorithms, the low overlap among their results, and the disparity of their coverage encourage the proteomics community to use ensemble solutions that combine different search engines. Advances in cloud computing technology and the availability of distributed processing clusters can support this task. However, transferring the data and combining the results can become the major bottleneck. The flood of billions of observed mass spectra, amounting to hundreds of gigabytes or potentially terabytes of data, can easily cause congestion, increase the risk of failure, degrade performance, add computational cost, and waste available resources. Therefore, in this study, we propose a deep learning model to reduce traffic over the cloud network and thus reduce the cost of cloud computing. The model, which relies on the 50 most intense peaks of each spectrum and their m/z values, removes any spectrum that is predicted not to pass the majority vote of the participating search engines. Our results, obtained with three search engines (pFind, Comet and X!Tandem) and four different datasets, are promising and encourage investment in deep learning to solve this type of big data problem.
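The model's inputs and training label, as described above, can be sketched without the neural network itself. The layout of the feature vector (m/z values first, then intensities, zero-padded, peaks ordered by m/z) is an assumption made for illustration; the abstract specifies only that the top 50 intensities and their m/z values are used.

```python
def top_peak_features(peaks, n_peaks=50):
    """Build a fixed-length feature vector from one spectrum.

    peaks is a list of (mz, intensity) pairs. The n_peaks most intense
    peaks are kept, ordered by m/z, and zero-padded if the spectrum has
    fewer peaks. Output: n_peaks m/z values, then n_peaks intensities.
    """
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:n_peaks]
    strongest.sort(key=lambda p: p[0])
    padded = strongest + [(0.0, 0.0)] * (n_peaks - len(strongest))
    return [mz for mz, _ in padded] + [inten for _, inten in padded]


def passes_majority_vote(identified_by):
    """True if more than half of the search engines identified the spectrum."""
    return sum(identified_by.values()) > len(identified_by) / 2


# Toy spectrum with three peaks, keeping only the two most intense.
print(top_peak_features([(100.0, 5.0), (200.0, 50.0), (300.0, 20.0)], n_peaks=2))
# Training label for a spectrum identified by two of three engines.
print(passes_majority_vote({"pFind": True, "Comet": True, "X!Tandem": False}))
```

A classifier trained on such vectors and labels could then discard spectra predicted to fail the vote before any data are sent to the cloud.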
Affiliation(s)
- Majdi Maabreh
- Department of Computer Science, Western Michigan University, Kalamazoo, MI, USA
| | - Basheer Qolomany
- Department of Computer Science, Western Michigan University, Kalamazoo, MI, USA
| | - Izzat Alsmadi
- Department of Computing and Cyber Security, Texas A&M University, San Antonio, TX, USA
| | - Ajay Gupta
- Department of Computer Science, Western Michigan University, Kalamazoo, MI, USA
| |
|
30
|
Drew K, Müller CL, Bonneau R, Marcotte EM. Identifying direct contacts between protein complex subunits from their conditional dependence in proteomics datasets. PLoS Comput Biol 2017; 13:e1005625. [PMID: 29023445 PMCID: PMC5638211 DOI: 10.1371/journal.pcbi.1005625] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 03/03/2017] [Accepted: 06/06/2017] [Indexed: 12/21/2022] Open
Abstract
Determining the three dimensional arrangement of proteins in a complex is highly beneficial for uncovering mechanistic function and interpreting genetic variation in coding genes comprising protein complexes. There are several methods for determining co-complex interactions between proteins, among them co-fractionation / mass spectrometry (CF-MS), but it remains difficult to identify directly contacting subunits within a multi-protein complex. Correlation analysis of CF-MS profiles shows promise in detecting protein complexes as a whole but is limited in its ability to infer direct physical contacts among proteins in sub-complexes. To identify direct protein-protein contacts within human protein complexes we learn a sparse conditional dependency graph from approximately 3,000 CF-MS experiments on human cell lines. We show substantial performance gains in estimating direct interactions compared to correlation analysis on a benchmark of large protein complexes with solved three-dimensional structures. We demonstrate the method’s value in determining the three dimensional arrangement of proteins by making predictions for complexes without known structure (the exocyst and tRNA multi-synthetase complex) and by establishing evidence for the structural position of a recently discovered component of the core human EKC/KEOPS complex, GON7/C14ORF142, providing a more complete 3D model of the complex. Direct contact prediction provides easily calculable additional structural information for large-scale protein complex mapping studies and should be broadly applicable across organisms as more CF-MS datasets become available.
Proteins physically associate into complexes in order to carry out the essential functions of life. Knowing how proteins are physically arranged three dimensionally in these complexes provides clues towards how they work. In principle, the associations between proteins in large-scale proteomics datasets should often reflect direct physical contacts between proteins in each complex. Here, we describe a statistical method to discover which subunits within complexes directly contact each other based on their co-purification behavior in published co-fractionation mass spectrometry datasets. Within our predictions, we recover many known protein-protein contacts, serving to validate our method, as well as unknown contacts that can inform future studies of these complexes. Specifically, we observe confident contacts between subunits within the exocyst and tRNA multi-synthetase complexes, two complexes that have incomplete structural information. Using our method, we further provide structural information for a previously missing subunit of the EKC/KEOPS complex. We anticipate that this method and the associated predictions will help to better inform our understanding of the functions and structures of diverse protein complexes.
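The difference between correlation and conditional dependence can be shown with the simplest possible case, a first-order partial correlation among three proteins. This is only an illustration of the principle: the paper learns a sparse graph over many proteins, whereas the sketch below uses the closed-form three-variable formula on toy abundance profiles in which two proteins each track a shared hub protein.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy abundances across eight fractions. "left" and "right" each follow
# "hub" plus their own independent deviation, so they contact only the hub.
hub = [1, 2, 3, 4, 5, 6, 7, 8]
left = [2, 1, 2, 5, 6, 5, 6, 9]
right = [2, 3, 2, 3, 4, 5, 8, 9]

print(pearson(left, right))                   # high: same complex
print(partial_correlation(left, right, hub))  # near zero: no direct link
print(partial_correlation(left, hub, right))  # stays high: direct link
```

Plain correlation places all three proteins in one complex, while conditioning on the hub removes the indirect left-right association and leaves the two hub contacts.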
Affiliation(s)
- Kevin Drew
- Center for Systems and Synthetic Biology, Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, United States of America
| | - Christian L. Müller
- Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, United States of America
| | - Richard Bonneau
- Flatiron Institute, Center for Computational Biology, Simons Foundation, New York, NY, United States of America
- New York University Center for Genomics and Systems Biology, New York University, New York, NY, United States of America
| | - Edward M. Marcotte
- Center for Systems and Synthetic Biology, Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, United States of America
| |
|
31
|
Hu H, Khatri K, Zaia J. Algorithms and design strategies towards automated glycoproteomics analysis. Mass Spectrometry Reviews 2017; 36:475-498. [PMID: 26728195 PMCID: PMC4931994 DOI: 10.1002/mas.21487] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 07/10/2015] [Accepted: 11/30/2015] [Indexed: 05/09/2023]
Abstract
Glycoproteomics involves the study of glycosylation events on protein sequences ranging from purified proteins to whole proteome scales. Understanding these complex post-translational modification (PTM) events requires elucidation of the glycan moieties (monosaccharide sequences and glycosidic linkages between residues), protein sequences, as well as site-specific attachment of glycan moieties onto protein sequences, in a spatial and temporal manner in a variety of biological contexts. Compared with proteomics, bioinformatics for glycoproteomics is immature and many researchers still rely on tedious manual interpretation of glycoproteomics data. As sample preparation protocols and analysis techniques have matured, the number of publications on glycoproteomics and bioinformatics has increased substantially; however, the lack of consensus on tool development and code reuse limits the dissemination of bioinformatics tools because it requires significant effort to migrate a computational tool tailored for one method design to alternative methods. This review discusses algorithms and methods in glycoproteomics, and refers to the general proteomics field for potential solutions. It also introduces general strategies for tool integration and pipeline construction in order to better serve the glycoproteomics community. © 2016 Wiley Periodicals, Inc. Mass Spec Rev 36:475-498, 2017.
Affiliation(s)
- Han Hu
- Bioinformatics Program, Boston University, Boston, Massachusetts 02215, USA
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
| | - Kshitij Khatri
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
| | - Joseph Zaia
- Center for Biomedical Mass Spectrometry, Department of Biochemistry, Boston University School of Medicine, Boston University, Boston, Massachusetts 02118, USA
| |
|
32
|
Abstract
Systems biology is an approach to study all genes, gene transcripts, proteins, metabolites, and their interactions in specific cells, tissues, organs, or the whole organism. It is based on data derived from high-throughput analytical technologies and bioinformatics tools to analyze these data, and aims to understand the whole system rather than individual aspects of it. Systems biology can be applied to virtually all conditions and diseases and therefore also to hypertension and its underlying vascular disorders. Unlike other methods in this book there is no clear-cut protocol to explain a systems biology approach. We will instead outline some of the most important and common steps in the generation and analysis of systems biology data.
Affiliation(s)
- Christian Delles
- Institute of Cardiovascular and Medical Sciences, BHF Glasgow Cardiovascular Research Centre, University of Glasgow, 126 University Place, Glasgow, G12 8TA, UK.
| | - Holger Husi
- School of Natural Sciences, University of Stirling, Stirling, UK
| |
|
33
|
Lam MPY, Lau E, Ng DCM, Wang D, Ping P. Cardiovascular proteomics in the era of big data: experimental and computational advances. Clin Proteomics 2016; 13:23. [PMID: 27980500 PMCID: PMC5137214 DOI: 10.1186/s12014-016-9124-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 04/03/2016] [Accepted: 08/24/2016] [Indexed: 01/14/2023] Open
Abstract
Proteomics plays an increasingly important role in our quest to understand cardiovascular biology. Fueled by analytical and computational advances in the past decade, proteomics applications can now go beyond merely inventorying protein species, and address sophisticated questions on cardiac physiology. The advent of massive mass spectrometry datasets has in turn led to increasing intersection between proteomics and big data science. Here we review new frontiers in technological developments and their applications to cardiovascular medicine. The impact of big data science on cardiovascular proteomics investigations and translation to medicine is highlighted.
Affiliation(s)
- Maggie P Y Lam
- NIH BD2K Center of Excellence at UCLA; Department of Physiology, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA
| | - Edward Lau
- NIH BD2K Center of Excellence at UCLA; Department of Physiology, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA
| | - Dominic C M Ng
- NIH BD2K Center of Excellence at UCLA; Department of Physiology, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA
| | - Ding Wang
- NIH BD2K Center of Excellence at UCLA; Department of Physiology, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA
| | - Peipei Ping
- NIH BD2K Center of Excellence at UCLA; Department of Physiology, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA; Department of Medicine, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA; Department of Bioinformatics, University of California at Los Angeles, 675 Charles E. Young Drive, Los Angeles, CA 90095 USA
| |
|
34
|
Microvesicles from brain-extract-treated mesenchymal stem cells improve neurological functions in a rat model of ischemic stroke. Sci Rep 2016; 6:33038. [PMID: 27609711 PMCID: PMC5016792 DOI: 10.1038/srep33038] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Received: 12/18/2015] [Accepted: 08/17/2016] [Indexed: 12/16/2022] Open
Abstract
Transplantation of mesenchymal stem cells (MSCs) was reported to improve functional outcomes in a rat model of ischemic stroke, and subsequent studies suggest that MSC-derived microvesicles (MVs) can replace the beneficial effects of MSCs. Here, we evaluated three different MSC-derived MVs, including MVs from untreated MSCs (MSC-MVs), MVs from MSCs treated with normal rat brain extract (NBE-MSC-MVs), and MVs from MSCs treated with stroke-injured rat brain extract (SBE-MSC-MVs), and tested their effects on ischemic brain injury induced by permanent middle cerebral artery occlusion (pMCAO) in rats. NBE-MSC-MVs and SBE-MSC-MVs had significantly greater efficacy than MSC-MVs for ameliorating ischemic brain injury with improved functional recovery. We found similar profiles of key signalling proteins in NBE-MSC-MVs and SBE-MSC-MVs, which account for their similar therapeutic efficacies. Immunohistochemical analyses suggest that brain-extract-treated MSC-MVs reduce inflammation, enhance angiogenesis, and increase endogenous neurogenesis in the rat brain. We performed mass spectrometry proteomic analyses and found that the total proteomes of brain-extract-treated MSC-MVs are highly enriched for known vesicular proteins. Notably, MSC-MV proteins upregulated by brain extracts tend to be modular for tissue repair pathways. We suggest that MSC-MV proteins stimulated by the brain microenvironment are paracrine effectors that enhance MSC therapy for stroke injury.
|
35
|
Na S, Payne SH, Bandeira N. Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks. Mol Cell Proteomics 2016; 15:3501-3512. [PMID: 27609420 PMCID: PMC5098046 DOI: 10.1074/mcp.o116.060913] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 05/10/2016] [Indexed: 11/25/2022] Open
Abstract
Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective as they lack corresponding sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and then (2) propagate peptide annotations from identified to unidentified spectra. We here present algorithms to assess the statistical significance of spectral alignments (Align-GF), reduce the impurity in spectral networks, and accurately estimate the error rate in propagated identifications. Analyzing three related Cyanothece species, a model organism for biohydrogen production, spectral networks identified peptides from highly divergent sequences from networks with dozens of variant peptides, including thousands of peptides in species lacking a sequenced genome. Our analysis further detected the presence of many novel putative peptides even in genomically characterized species, thus suggesting the possibility of gaps in our understanding of their proteomic and genomic expression. A web-based pipeline for spectral networks analysis is available at http://proteomics.ucsd.edu/software.
Affiliation(s)
- Seungjin Na
- Dept. of Computer Science and Engineering, University of California, San Diego, La Jolla, California, 92093; Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, California, 92093
| | - Samuel H Payne
- Pacific Northwest National Laboratory, Richland, Washington 99354
| | - Nuno Bandeira
- Dept. of Computer Science and Engineering, University of California, San Diego, La Jolla, California, 92093; Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, California, 92093; Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, 92093
| |
|
36
|
Muth T, Renard BY, Martens L. Metaproteomic data analysis at a glance: advances in computational microbial community proteomics. Expert Rev Proteomics 2016; 13:757-69. [DOI: 10.1080/14789450.2016.1209418] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Indexed: 02/07/2023]
|
37
|
Mitchell CJ, Kim MS, Na CH, Pandey A. PyQuant: A Versatile Framework for Analysis of Quantitative Mass Spectrometry Data. Mol Cell Proteomics 2016; 15:2829-38. [PMID: 27231314 DOI: 10.1074/mcp.o115.056879] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 12/11/2015] [Indexed: 12/14/2022] Open
Abstract
Quantitative mass spectrometry data necessitates an analytical pipeline that captures the accuracy and comprehensiveness of the experiments. Currently, data analysis is often coupled to specific software packages, which restricts the analysis to a given workflow and precludes a more thorough characterization of the data by other complementary tools. To address this, we have developed PyQuant, a cross-platform mass spectrometry data quantification application that is compatible with existing frameworks and can be used as a stand-alone quantification tool. PyQuant supports most types of quantitative mass spectrometry data including SILAC, NeuCode, ¹⁵N, ¹³C, or ¹⁸O and chemical methods such as iTRAQ or TMT and provides the option of adding custom labeling strategies. In addition, PyQuant can perform specialized analyses such as quantifying isotopically labeled samples where the label has been metabolized into other amino acids and targeted quantification of selected ions independent of spectral assignment. PyQuant is capable of quantifying search results from popular proteomic frameworks such as MaxQuant, Proteome Discoverer, and the Trans-Proteomic Pipeline in addition to several standalone search engines. We have found that PyQuant routinely quantifies a greater proportion of spectral assignments, with increases ranging from 25-45% in this study. Finally, PyQuant is capable of complementing spectral assignments between replicates to quantify ions missed because of lack of MS/MS fragmentation or that were omitted because of issues such as spectra quality or false discovery rates. This results in an increase of biologically useful data available for interpretation. In summary, PyQuant is a flexible mass spectrometry data quantification platform that is capable of interfacing with a variety of existing formats and is highly customizable, which permits easy configuration for custom analysis.
Collapse
Affiliation(s)
- Christopher J Mitchell
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205; Ginkgo Bioworks, 27 Drydock Ave, Boston, MA 02210, USA
- Min-Sik Kim
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205; Department of Applied Chemistry, Kyung Hee University, Yongin, Gyeonggi, South Korea
- Chan Hyun Na
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
- Akhilesh Pandey
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205; Departments of Biological Chemistry, Pathology and Oncology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
38
Kremer LPM, Leufken J, Oyunchimeg P, Schulze S, Fufezan C. Ursgal, Universal Python Module Combining Common Bottom-Up Proteomics Tools for Large-Scale Analysis. J Proteome Res 2016; 15:788-94. [PMID: 26709623 DOI: 10.1021/acs.jproteome.5b00860] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 12/26/2022]
Abstract
Proteomics data integration has become a broad field with a variety of programs offering innovative algorithms to analyze increasing amounts of data. Unfortunately, this software diversity leads to many problems as soon as the data is analyzed using more than one algorithm for the same task. Although it was shown that the combination of multiple peptide identification algorithms yields more robust results, unified approaches have emerged only recently; however, workflows that, for example, aim to optimize search parameters or that employ cascaded style searches can only be made accessible if data analysis becomes not only unified but also, and most importantly, scriptable. Here we introduce Ursgal, a Python interface to many commonly used bottom-up proteomics tools and to additional auxiliary programs. Complex workflows can thus be composed in the Python scripting language with a few lines of code. Ursgal is easily extensible, and we have made several database search engines (X!Tandem, OMSSA, MS-GF+, Myrimatch, MS Amanda), statistical postprocessing algorithms (qvality, Percolator), and one algorithm that combines statistically postprocessed outputs from multiple search engines ("combined FDR") accessible as an interface in Python. Furthermore, we have implemented a new algorithm ("combined PEP") that combines multiple search engines employing elements of "combined FDR", PeptideShaker, and Bayes' theorem.
Affiliation(s)
- Lukas P M Kremer
- Institute of Plant Biology and Biotechnology, University of Muenster, Schlossplatz 8, 48143 Münster, Germany
- Johannes Leufken
- Institute of Plant Biology and Biotechnology, University of Muenster, Schlossplatz 8, 48143 Münster, Germany
- Purevdulam Oyunchimeg
- Institute of Plant Biology and Biotechnology, University of Muenster, Schlossplatz 8, 48143 Münster, Germany
- Stefan Schulze
- Institute of Plant Biology and Biotechnology, University of Muenster, Schlossplatz 8, 48143 Münster, Germany
- Christian Fufezan
- Institute of Plant Biology and Biotechnology, University of Muenster, Schlossplatz 8, 48143 Münster, Germany
39
Phanse S, Wan C, Borgeson B, Tu F, Drew K, Clark G, Xiong X, Kagan O, Kwan J, Bezginov A, Chessman K, Pal S, Cromar G, Papoulas O, Ni Z, Boutz DR, Stoilova S, Havugimana PC, Guo X, Malty RH, Sarov M, Greenblatt J, Babu M, Derry WB, Tillier ER, Wallingford JB, Parkinson J, Marcotte EM, Emili A. Proteome-wide dataset supporting the study of ancient metazoan macromolecular complexes. Data Brief 2015; 6:715-21. [PMID: 26870755 PMCID: PMC4738005 DOI: 10.1016/j.dib.2015.11.062] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/15/2015] [Revised: 11/17/2015] [Accepted: 11/23/2015] [Indexed: 01/08/2023]
Abstract
Our analysis examines the conservation of multiprotein complexes among metazoa through use of high resolution biochemical fractionation and precision mass spectrometry applied to soluble cell extracts from 5 representative model organisms: Caenorhabditis elegans, Drosophila melanogaster, Mus musculus, Strongylocentrotus purpuratus, and Homo sapiens. The interaction network obtained from the data was validated globally in 4 distant species (Xenopus laevis, Nematostella vectensis, Dictyostelium discoideum, Saccharomyces cerevisiae) and locally by targeted affinity-purification experiments. Here we provide details of our massive set of supporting biochemical fractionation data, available via ProteomeXchange (PXD002319-PXD002328); PPIs, available via BioGRID (185267); and interaction network projections, available via http://metazoa.med.utoronto.ca, all made fully accessible to allow further exploration. The datasets here are related to the research article on metazoan macromolecular complexes in Nature [1].
Affiliation(s)
- Sadhna Phanse
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Cuihong Wan
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada; Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
- Blake Borgeson
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Fan Tu
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
- Kevin Drew
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
- Greg Clark
- Department of Medical Biophysics, Toronto, Ontario, Canada
- Xuejian Xiong
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Hospital for Sick Children, Toronto, Ontario, Canada
- Olga Kagan
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Julian Kwan
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Kyle Chessman
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Hospital for Sick Children, Toronto, Ontario, Canada
- Swati Pal
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Hospital for Sick Children, Toronto, Ontario, Canada
- Graham Cromar
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Hospital for Sick Children, Toronto, Ontario, Canada
- Ophelia Papoulas
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
- Zuyao Ni
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Daniel R Boutz
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA
- Snejana Stoilova
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Pierre C Havugimana
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Xinghua Guo
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
- Ramy H Malty
- Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- Mihail Sarov
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
- Jack Greenblatt
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Mohan Babu
- Department of Biochemistry, University of Regina, Regina, Saskatchewan, Canada
- W Brent Derry
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Hospital for Sick Children, Toronto, Ontario, Canada
- John B Wallingford
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA; Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
- John Parkinson
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Hospital for Sick Children, Toronto, Ontario, Canada
- Edward M Marcotte
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX, USA; Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA
- Andrew Emili
- Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
40
Holl S, Mohammed Y, Zimmermann O, Palmblad M. Scientific workflow optimization for improved peptide and protein identification. BMC Bioinformatics 2015; 16:284. [PMID: 26335531 PMCID: PMC4558836 DOI: 10.1186/s12859-015-0714-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 03/17/2015] [Accepted: 08/24/2015] [Indexed: 01/18/2023]
Abstract
Background: Peptide-spectrum matching is a common step in most data processing workflows for mass spectrometry-based proteomics. Many algorithms and software packages, both free and commercial, have been developed to address this task. However, these algorithms typically require the user to select instrument- and sample-dependent parameters, such as mass measurement error tolerances and number of missed enzymatic cleavages. In order to select the best algorithm and parameter set for a particular dataset, in-depth knowledge about the data as well as the algorithms themselves is needed. Most researchers therefore tend to use default parameters, which are not necessarily optimal.
Results: We have applied a new optimization framework for the Taverna scientific workflow management system (http://ms-utils.org/Taverna_Optimization.pdf) to find the best combination of parameters for a given scientific workflow to perform peptide-spectrum matching. The optimizations themselves are non-trivial, as demonstrated by several phenomena that can be observed when allowing for larger mass measurement errors in sequence database searches. On-the-fly parameter optimization embedded in scientific workflow management systems enables experts and non-experts alike to extract the maximum amount of information from the data. The same workflows could be used for exploring the parameter space and comparing algorithms, not only for peptide-spectrum matching, but also for other tasks, such as retention time prediction.
Conclusion: Using the optimization framework, we were able to learn about how the data was acquired as well as about the explored algorithms. We observed a phenomenon identifying many ammonia-loss b-ion spectra as peptides with N-terminal pyroglutamate and a large precursor mass measurement error. These insights could only be gained with the extension of the common range for the mass measurement error tolerance parameters explored by the optimization framework.
Affiliation(s)
- Sonja Holl
- Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich, 52425 Jülich, Germany.
- Yassene Mohammed
- Center for Proteomics and Metabolomics, Leiden University Medical Center, PO Box 9600, 2300 RC, Leiden, The Netherlands.
- Olav Zimmermann
- Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich, 52425 Jülich, Germany.
- Magnus Palmblad
- Center for Proteomics and Metabolomics, Leiden University Medical Center, PO Box 9600, 2300 RC, Leiden, The Netherlands.
41
Wen B, Du C, Li G, Ghali F, Jones AR, Käll L, Xu S, Zhou R, Ren Z, Feng Q, Xu X, Wang J. IPeak: An open source tool to combine results from multiple MS/MS search engines. Proteomics 2015; 15:2916-20. [PMID: 25951428 DOI: 10.1002/pmic.201400208] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 05/12/2014] [Revised: 03/08/2015] [Accepted: 04/30/2015] [Indexed: 12/21/2022]
Abstract
Liquid chromatography coupled tandem mass spectrometry (LC-MS/MS) is an important technique for detecting peptides in proteomics studies. Here, we present an open source software tool, termed IPeak, a peptide identification pipeline that is designed to combine the Percolator post-processing algorithm and multi-search strategy to enhance the sensitivity of peptide identifications without compromising accuracy. IPeak provides a graphical user interface (GUI) as well as a command-line interface, is implemented in Java, and can work on all three major operating system platforms: Windows, Linux/Unix and OS X. IPeak has been designed to work with the mzIdentML standard from the Proteomics Standards Initiative (PSI) as an input and output, and has also been fully integrated into the associated mzidLibrary project, providing access to the overall pipeline, as well as modules for calling Percolator on individual search engine result files. The integration thus enables IPeak (and Percolator) to be used in conjunction with any software packages implementing the mzIdentML data standard. IPeak is freely available and can be downloaded under an Apache 2.0 license at https://code.google.com/p/mzidentml-lib/.
Affiliation(s)
- Bo Wen
- BGI-Shenzhen, Shenzhen, P. R. China; BGI Cognitive Genomics Lab, Shenzhen, P. R. China
- Fawaz Ghali
- Institute of Integrative Biology, University of Liverpool, Liverpool, UK
- Andrew R Jones
- Institute of Integrative Biology, University of Liverpool, Liverpool, UK
- Lukas Käll
- Science for Life Laboratory, School of Biotechnology, Royal Institute of Technology - KTH, Solna, Sweden; Swedish e-Science Research Center, Royal Institute of Technology - KTH, Solna, Sweden
- Ruo Zhou
- BGI-Shenzhen, Shenzhen, P. R. China
- Zhe Ren
- BGI-Shenzhen, Shenzhen, P. R. China
- Qiang Feng
- BGI-Shenzhen, Shenzhen, P. R. China; Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Xun Xu
- BGI-Shenzhen, Shenzhen, P. R. China
- Jun Wang
- BGI-Shenzhen, Shenzhen, P. R. China; Department of Biology, University of Copenhagen, Copenhagen, Denmark; Princess Al Jawhara Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah, Saudi Arabia; Macau University of Science and Technology, Taipa, P. R. China; Department of Medicine, University of Hong Kong, Hong Kong, P. R. China
42
Uszkoreit J, Maerkens A, Perez-Riverol Y, Meyer HE, Marcus K, Stephan C, Kohlbacher O, Eisenacher M. PIA: An Intuitive Protein Inference Engine with a Web-Based User Interface. J Proteome Res 2015; 14:2988-97. [DOI: 10.1021/acs.jproteome.5b00121] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.9] [Indexed: 12/23/2022]
Affiliation(s)
- Julian Uszkoreit
- Medizinisches Proteom-Center, Ruhr-Universität Bochum, 44801 Bochum, Germany
- Alexandra Maerkens
- Medizinisches Proteom-Center, Ruhr-Universität Bochum, 44801 Bochum, Germany
- Helmut E. Meyer
- Medizinisches Proteom-Center, Ruhr-Universität Bochum, 44801 Bochum, Germany
- Katrin Marcus
- Medizinisches Proteom-Center, Ruhr-Universität Bochum, 44801 Bochum, Germany
- Christian Stephan
- Medizinisches Proteom-Center, Ruhr-Universität Bochum, 44801 Bochum, Germany
- Oliver Kohlbacher
- Medizinisches Proteom-Center, Ruhr-Universität Bochum, 44801 Bochum, Germany
- Martin Eisenacher
- Medizinisches Proteom-Center, Ruhr-Universität Bochum, 44801 Bochum, Germany
43
Lee DCH, Jones AR, Hubbard SJ. Computational phosphoproteomics: from identification to localization. Proteomics 2015; 15:950-63. [PMID: 25475148 PMCID: PMC4384807 DOI: 10.1002/pmic.201400372] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 08/01/2014] [Revised: 10/31/2014] [Accepted: 11/26/2014] [Indexed: 01/08/2023]
Abstract
Analysis of the phosphoproteome by MS has become a key technology for the characterization of dynamic regulatory processes in the cell, since kinase and phosphatase action underlies many major biological functions. However, the addition of a phosphate group to a suitable side chain often confounds informatic analysis by generating product ion spectra that are more difficult to interpret (and consequently identify) relative to unmodified peptides. Collectively, these challenges have motivated bioinformaticians to create novel software tools and pipelines to assist in the identification of phosphopeptides in proteomic mixtures, and help pinpoint or "localize" the most likely site of modification in cases where there is ambiguity. Here we review the challenges to be met and the informatics solutions available to address them for phosphoproteomic analysis, and we highlight the difficulties associated with using them and the implications for data standards.
Affiliation(s)
- Dave C H Lee
- Faculty of Life Sciences, University of Manchester, Manchester, UK
- Andrew R Jones
- Institute of Integrative Biology, University of Liverpool, Liverpool, UK
- Simon J Hubbard
- Faculty of Life Sciences, University of Manchester, Manchester, UK
44
Malm EK, Srivastava V, Sundqvist G, Bulone V. APP: an Automated Proteomics Pipeline for the analysis of mass spectrometry data based on multiple open access tools. BMC Bioinformatics 2014; 15:441. [PMID: 25547515 PMCID: PMC4314934 DOI: 10.1186/s12859-014-0441-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 08/17/2014] [Accepted: 12/18/2014] [Indexed: 11/10/2022]
Abstract
Background: Mass spectrometry analyses of complex protein samples yield large amounts of data, and specific expertise is needed for data analysis, in addition to a dedicated computer infrastructure. Furthermore, the identification of proteins and their specific properties requires the use of multiple independent bioinformatics tools and several database search algorithms to process the same datasets. In order to facilitate and increase the speed of data analysis, there is a need for an integrated platform that would allow a comprehensive profiling of thousands of peptides and proteins in a single process through the simultaneous exploitation of multiple complementary algorithms.
Results: We have established a new proteomics pipeline designated as APP that fulfills these objectives using a complete series of tools freely available from open sources. APP automates the processing of proteomics tasks such as peptide identification, validation and quantitation from LC-MS/MS data and allows easy integration of many separate proteomics tools. Distributed processing is at the core of APP, allowing the processing of very large datasets using any combination of Windows/Linux physical or virtual computing resources.
Conclusions: APP provides distributed computing nodes that are simple to set up, greatly relieving the need for separate IT competence when handling large datasets. The modular nature of APP allows complex workflows to be managed and distributed, speeding up throughput and setup. Additionally, APP logs execution information on all executed tasks and generated results, simplifying information management and validation.
Affiliation(s)
- Erik K Malm
- Division of Glycoscience, School of Biotechnology, Royal Institute of Technology (KTH), AlbaNova University Centre, Stockholm, Sweden.
- Vaibhav Srivastava
- Division of Glycoscience, School of Biotechnology, Royal Institute of Technology (KTH), AlbaNova University Centre, Stockholm, Sweden.
- Gustav Sundqvist
- Division of Glycoscience, School of Biotechnology, Royal Institute of Technology (KTH), AlbaNova University Centre, Stockholm, Sweden.
- Vincent Bulone
- Division of Glycoscience, School of Biotechnology, Royal Institute of Technology (KTH), AlbaNova University Centre, Stockholm, Sweden.
45
Kim S, Pevzner PA. MS-GF+ makes progress towards a universal database search tool for proteomics. Nat Commun 2014; 5:5277. [PMID: 25358478 PMCID: PMC5036525 DOI: 10.1038/ncomms6277] [Citation(s) in RCA: 736] [Impact Index Per Article: 73.6] [Received: 04/02/2014] [Accepted: 09/16/2014] [Indexed: 02/06/2023]
Abstract
Mass spectrometry (MS) instruments and experimental protocols are rapidly advancing, but the software tools to analyze tandem mass spectra are lagging behind. We present a database search tool MS-GF+ that is sensitive (it identifies more peptides than most other database search tools) and universal (it works well for diverse types of spectra, different configurations of MS instruments and different experimental protocols). We benchmark MS-GF+ using diverse spectral datasets: (i) spectra of varying fragmentation methods; (ii) spectra of multiple enzyme digests; (iii) spectra of phosphorylated peptides; (iv) spectra of peptides with unusual fragmentation propensities produced by a novel alpha-lytic protease. For all these datasets, MS-GF+ significantly increases the number of identified peptides compared to commonly used methods for peptide identifications. We emphasize that while MS-GF+ is not specifically designed for any particular experimental set-up, it improves upon the performance of tools specifically designed for these applications (e.g., specialized tools for phosphoproteomics).
46
Kwon T, Huse HK, Vogel C, Whiteley M, Marcotte EM. Protein-to-mRNA ratios are conserved between Pseudomonas aeruginosa strains. J Proteome Res 2014; 13:2370-80. [PMID: 24742327 PMCID: PMC4012837 DOI: 10.1021/pr4011684] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 01/18/2023]
Abstract
Recent studies have shown that the concentrations of proteins expressed from orthologous genes are often conserved across organisms and to a greater extent than the abundances of the corresponding mRNAs. However, such studies have not distinguished between evolutionary (e.g., sequence divergence) and environmental (e.g., growth condition) effects on the regulation of steady-state protein and mRNA abundances. Here, we systematically investigated the transcriptome and proteome of two closely related Pseudomonas aeruginosa strains, PAO1 and PA14, under identical experimental conditions, thus controlling for environmental effects. For 703 genes observed by both shotgun proteomics and microarray experiments, we found that the protein-to-mRNA ratios are highly correlated between orthologous genes in the two strains to an extent comparable to protein and mRNA abundances. In spite of this high molecular similarity between PAO1 and PA14, we found that several metabolic, virulence, and antibiotic resistance genes are differentially expressed between the two strains, mostly at the protein but not at the mRNA level. Our data demonstrate that the magnitude and direction of the effect of protein abundance regulation occurring after the setting of mRNA levels is conserved between bacterial strains and is important for explaining the discordance between mRNA and protein abundances.
Affiliation(s)
- Taejoon Kwon
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, 2500 Speedway, Austin, Texas 78712, United States
47
Goh WWB, Wong L. Computational proteomics: designing a comprehensive analytical strategy. Drug Discov Today 2014; 19:266-74. [DOI: 10.1016/j.drudis.2013.07.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 04/04/2013] [Revised: 06/28/2013] [Accepted: 07/11/2013] [Indexed: 02/02/2023]
48
O'Connell JD, Tsechansky M, Royal A, Boutz DR, Ellington AD, Marcotte EM. A proteomic survey of widespread protein aggregation in yeast. Mol Biosyst 2014; 10:851-861. [PMID: 24488121 DOI: 10.1039/c3mb70508k] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Indexed: 11/21/2022]
Abstract
Many normally cytosolic yeast proteins form insoluble intracellular bodies in response to nutrient depletion, suggesting the potential for widespread protein aggregation in stressed cells. Nearly 200 such bodies have been found in yeast by screening libraries of fluorescently tagged proteins. To characterize the formation of these bodies in response to stress more broadly, we employed a proteome-wide shotgun mass spectrometry assay to measure shifts in the intracellular solubilities of endogenous proteins following heat stress. As quantified by mass spectrometry, heat stress tended to shift the same proteins into insoluble form as did nutrient depletion; many of these proteins were also known to form foci in response to arsenic stress. Affinity purification of several foci-forming proteins showed enrichment for co-purifying chaperones, including Hsp90 chaperones. Tests of induction conditions and co-localization of metabolic enzymes participating in the same metabolic pathways suggested those foci did not correspond to multi-enzyme organizing centers. Thus, in yeast, the formation of stress bodies appears common across diverse, normally diffuse cytoplasmic proteins and is induced by multiple types of cell stress, including thermal, chemical, and nutrient stress.
Affiliation(s)
- Jeremy D O'Connell
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, United States of America; Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas, United States of America
- Mark Tsechansky
- Department of Chemistry, Cambridge University, Lensfield Road, Cambridge, UK
- Ariel Royal
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, United States of America
- Daniel R Boutz
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, United States of America
- Andrew D Ellington
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, United States of America; Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas, United States of America
- Edward M Marcotte
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas, United States of America; Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas, United States of America
49
Wang J, Anania VG, Knott J, Rush J, Lill JR, Bourne PE, Bandeira N. A turn-key approach for large-scale identification of complex posttranslational modifications. J Proteome Res 2014; 13:1190-9. [PMID: 24437954 PMCID: PMC3993922 DOI: 10.1021/pr400368u] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 02/05/2023]
Abstract
The conjugation of complex post-translational modifications (PTMs) such as glycosylation and Small Ubiquitin-like Modification (SUMOylation) to a substrate protein can substantially change the resulting peptide fragmentation pattern compared to its unmodified counterpart, making current database search methods inappropriate for the identification of tandem mass (MS/MS) spectra from such modified peptides. Traditionally it has been difficult to develop new algorithms to identify these atypical peptides because of the lack of a large set of annotated spectra from which to learn the altered fragmentation pattern. Using SUMOylation as an example, we propose a novel approach to generate large MS/MS training data from modified peptides and derive an algorithm that learns properties of PTM-specific fragmentation from such training data. Benchmark tests on data sets of varying complexity show that our method is 80-300% more sensitive than current state-of-the-art approaches. The core concepts of our method are readily applicable to developing algorithms for the identifications of peptides with other complex PTMs.
Affiliation(s)
- Jian Wang
- Bioinformatics Program, Skaggs School of Pharmacy and Pharmaceutical Sciences, Center for Computational Mass Spectrometry, and Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California 92093, United States
50
Vlasblom J, Jin K, Kassir S, Babu M. Exploring mitochondrial system properties of neurodegenerative diseases through interactome mapping. J Proteomics 2013; 100:8-24. [PMID: 24262152 DOI: 10.1016/j.jprot.2013.11.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 06/06/2013] [Revised: 10/08/2013] [Accepted: 11/06/2013] [Indexed: 12/20/2022]
Abstract
Mitochondria are double-membraned, dynamic organelles that are required for a large number of cellular processes, and defects in their function have emerged as causative factors for a growing number of human disorders and are highly associated with cancer, metabolic, and neurodegenerative (ND) diseases. Biochemical and genetic investigations have uncovered small numbers of candidate mitochondrial proteins (MPs) involved in ND disease, but given the diversity of processes affected by MP function and the difficulty of detecting interactions involving these proteins, many more likely remain unknown. However, high-throughput proteomic and genomic approaches developed in genetically tractable model prokaryotes and lower eukaryotes have proven to be effective tools for querying the physical (protein-protein) and functional (gene-gene) relationships between diverse types of proteins, including cytosolic and membrane proteins. In this review, we highlight how experimental and computational approaches developed recently by our group and others can be effectively used towards elucidating the mitochondrial interactome in an unbiased and systematic manner to uncover network-based connections. We discuss how the knowledge from the resulting interaction networks can effectively contribute towards the identification of new mitochondrial disease gene candidates, and thus further clarify the role of mitochondrial biology and the complex etiologies of ND disease.
Biological significance: Biochemical and genetic investigations have uncovered small numbers of candidate mitochondrial proteins (MPs) involved in neurodegenerative (ND) diseases, but given the diversity of processes affected by MP function and the difficulty of detecting interactions involving these proteins, many more likely remain unknown. Large-scale proteomic and genomic approaches developed in model prokaryotes and lower eukaryotes have proven to be effective tools for querying the physical (protein-protein) and functional (gene-gene) relationships between diverse types of proteins. Extension of this new framework to the mitochondrial sub-system in human will likewise provide a universally informative systems-level view of the physical and functional landscape for exploring the evolutionary principles underlying mitochondrial function. In this review, we highlight how experimental and computational approaches developed recently by our group and others can be effectively used towards elucidating the mitochondrial interactome in an unbiased and systematic manner to uncover network-based connections. We anticipate that the knowledge from these resulting interaction networks can effectively contribute towards the identification of new mitochondrial disease gene candidates, and thus foster a deeper molecular understanding of mitochondrial biology as well as the etiology of mitochondrial diseases. This article is part of a Special Issue: Can Proteomics Fill the Gap Between Genomics and Phenotypes?
Affiliation(s)
- James Vlasblom
- Department of Biochemistry, Research and Innovation Centre, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
- Ke Jin
- Department of Biochemistry, Research and Innovation Centre, University of Regina, Regina, Saskatchewan S4S 0A2, Canada; Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Terrence Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Sandy Kassir
- Department of Biochemistry, Research and Innovation Centre, University of Regina, Regina, Saskatchewan S4S 0A2, Canada
- Mohan Babu
- Department of Biochemistry, Research and Innovation Centre, University of Regina, Regina, Saskatchewan S4S 0A2, Canada.