1
Kattan RE, Ayesh D, Wang W. Analysis of affinity purification-related proteomic data for studying protein-protein interaction networks in cells. Brief Bioinform 2023; 24:bbad010. [PMID: 36682002 PMCID: PMC10025443 DOI: 10.1093/bib/bbad010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 12/22/2022] [Accepted: 01/02/2023] [Indexed: 01/23/2023] Open
Abstract
During intracellular signal transduction, protein-protein interactions (PPIs) facilitate protein complex assembly to regulate protein localization and function, which are critical for numerous cellular events. Over the years, multiple techniques have been developed to characterize PPIs and to elucidate the roles and regulatory mechanisms of proteins. Among them, mass spectrometry (MS)-based interactome analysis has grown in popularity because it offers an unbiased and informative view of PPI networks. However, with MS instrumentation advancing and yielding more data than ever, analyzing large amounts of PPI-associated proteomic data to reveal bona fide interacting proteins has become challenging. Here, we review the methods and bioinformatic resources that are commonly used in analyzing large interactome-related proteomic data and propose a simple guideline for identifying novel interacting proteins for biological research.
Affiliation(s)
- Rebecca Elizabeth Kattan
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA 92697, USA
- Deena Ayesh
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA 92697, USA
- Wenqi Wang
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA 92697, USA
2
Dichtl S, Sanin DE, Koss CK, Willenborg S, Petzold A, Tanzer MC, Dahl A, Kabat AM, Lindenthal L, Zeitler L, Satzinger S, Strasser A, Mann M, Roers A, Eming SA, El Kasmi KC, Pearce EJ, Murray PJ. Gene-selective transcription promotes the inhibition of tissue reparative macrophages by TNF. Life Sci Alliance 2022; 5:5/4/e202101315. [PMID: 35027468 PMCID: PMC8761491 DOI: 10.26508/lsa.202101315] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/24/2021] [Revised: 01/03/2022] [Accepted: 01/04/2022] [Indexed: 12/24/2022] Open
Abstract
Pro-inflammatory TNF is a highly gene-selective inhibitor of the gene expression program of tissue repair and wound healing macrophages. Anti-TNF therapies are a core anti-inflammatory approach for chronic diseases such as rheumatoid arthritis and Crohn’s Disease. Previously, we and others found that TNF blocks the emergence and function of alternative-activated or M2 macrophages involved in wound healing and tissue-reparative functions. Conceivably, anti-TNF drugs could mediate their protective effects in part by an altered balance of macrophage activity. To understand the mechanistic basis of how TNF regulates tissue-reparative macrophages, we used RNAseq, scRNAseq, ATACseq, time-resolved phospho-proteomics, gene-specific approaches, metabolic analysis, and signaling pathway deconvolution. We found that TNF controls tissue-reparative macrophage gene expression in a highly gene-specific way, dependent on JNK signaling via the type 1 TNF receptor on specific populations of alternative-activated macrophages. We further determined that JNK signaling has a profound and broad effect on activated macrophage gene expression. Our findings suggest that TNF’s anti-M2 effects evolved to specifically modulate components of tissue and reparative M2 macrophages and TNF is therefore a context-specific modulator of M2 macrophages rather than a pan-M2 inhibitor.
Affiliation(s)
- David E Sanin
- Department of Immunometabolism, Max Planck Institute for Immunobiology and Epigenetics, Freiburg, Germany; The Bloomberg~Kimmel Institute for Cancer Immunotherapy at Johns Hopkins, Johns Hopkins University, Baltimore, MD, USA
- Carolin K Koss
- Boehringer Ingelheim Pharma GmbH and Co KG, Biberach, Germany
- Andreas Petzold
- Deep Sequencing Group, Biotechnology Center, Technische Universität Dresden, Dresden, Germany
- Maria C Tanzer
- Max Planck Institute of Biochemistry, Martinsried, Germany
- Andreas Dahl
- Deep Sequencing Group, Biotechnology Center, Technische Universität Dresden, Dresden, Germany
- Agnieszka M Kabat
- Department of Immunometabolism, Max Planck Institute for Immunobiology and Epigenetics, Freiburg, Germany; The Bloomberg~Kimmel Institute for Cancer Immunotherapy at Johns Hopkins, Johns Hopkins University, Baltimore, MD, USA
- Leonie Zeitler
- Max Planck Institute of Biochemistry, Martinsried, Germany
- Matthias Mann
- Max Planck Institute of Biochemistry, Martinsried, Germany
- Axel Roers
- Institute for Immunology, Medical Faculty Carl Gustav Carus, TU Dresden, Dresden, Germany
- Sabine A Eming
- Department of Dermatology, University of Cologne, Cologne, Germany; Center for Molecular Medicine Cologne, University of Cologne, Cologne, Germany; Cologne Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases, University of Cologne, Cologne, Germany; Institute of Zoology, Developmental Biology Unit, University of Cologne, Cologne, Germany
- Edward J Pearce
- Department of Immunometabolism, Max Planck Institute for Immunobiology and Epigenetics, Freiburg, Germany; The Bloomberg~Kimmel Institute for Cancer Immunotherapy at Johns Hopkins, Johns Hopkins University, Baltimore, MD, USA
- Peter J Murray
- Max Planck Institute of Biochemistry, Martinsried, Germany
3
Phosphoproteome profiling uncovers a key role for CDKs in TNF signaling. Nat Commun 2021; 12:6053. [PMID: 34663829 PMCID: PMC8523534 DOI: 10.1038/s41467-021-26289-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/01/2021] [Accepted: 09/30/2021] [Indexed: 11/24/2022] Open
Abstract
Tumor necrosis factor (TNF) is one of the few cytokines successfully targeted by therapies against inflammatory diseases. However, blocking this well studied and pleiotropic ligand can cause dramatic side-effects. Here, we reason that a systems-level proteomic analysis of TNF signaling could dissect its diverse functions and offer a base for developing more targeted therapies. Therefore, we combine phosphoproteomics time course experiments with subcellular localization and kinase inhibitor analysis to identify functional modules of protein phosphorylation. The majority of regulated phosphorylation events can be assigned to an upstream kinase by inhibiting master kinases. Spatial proteomics reveals phosphorylation-dependent translocations of hundreds of proteins upon TNF stimulation. Phosphoproteome analysis of TNF-induced apoptosis and necroptosis uncovers a key role for transcriptional cyclin-dependent kinase activity to promote cytokine production and prevent excessive cell death downstream of the TNF signaling receptor. This resource of TNF-induced pathways and sites can be explored at http://tnfviewer.biochem.mpg.de/. Tumor necrosis factor (TNF) has various effects on phosphorylation-mediated cellular signaling. Combining phosphoproteomics, subcellular localization analyses and kinase inhibitor assays, the authors provide systems level insights into TNF signaling and identify modulators of TNF-induced cell death.
4
Gamble J, Chick J, Seltzer K, Graber JH, Gygi S, Braun RE, Snyder EM. An expanded mouse testis transcriptome and mass spectrometry defines novel proteins. Reproduction 2020; 159:15-26. [PMID: 31677600 DOI: 10.1530/rep-19-0092] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 02/26/2019] [Accepted: 10/31/2019] [Indexed: 12/18/2022]
Abstract
The testis transcriptome is exceptionally complex. Despite its complexity, previous testis transcriptome analyses relied on a reductive method for transcript identification, thus underestimating transcriptome complexity. We describe here a more complete testis transcriptome generated by combining Tuxedo, a reductive method, and spliced-RUM, a combinatorial transcript-building approach. Forty-two percent of the expanded testis transcriptome is composed of unannotated RNAs with novel isoforms of known genes and novel genes constituting 78 and 9.8% of the newly discovered transcripts, respectively. Across tissues, novel transcripts were predominantly expressed in the testis with the exception of novel isoforms which were also highly expressed in the adult ovary. Within the testis, novel isoform expression was distributed equally across all cell types while novel genes were predominantly expressed in meiotic and post-meiotic germ cells. The majority of novel isoforms retained their protein-coding potential while most novel genes had low protein-coding potential. However, a subset of novel genes had protein-coding potentials equivalent to known protein-coding genes. Shotgun mass spectrometry of round spermatid total protein identified unique peptides from four novel genes along with seven annotated non-coding RNAs. These analyses demonstrate the testis expresses a wide range of novel transcripts that give rise to novel proteins.
Affiliation(s)
- Jaya Gamble
- Department of Animal Sciences, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Joel Chick
- Department of Cell Biology, Harvard Medical School, Boston, Massachusetts, USA
- Kelly Seltzer
- Department of Animal Sciences, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
- Steven Gygi
- Department of Cell Biology, Harvard Medical School, Boston, Massachusetts, USA
- Elizabeth M Snyder
- Department of Animal Sciences, Rutgers, The State University of New Jersey, New Brunswick, New Jersey, USA
5
Pino LK, Searle BC, Bollinger JG, Nunn B, MacLean B, MacCoss MJ. The Skyline ecosystem: Informatics for quantitative mass spectrometry proteomics. Mass Spectrom Rev 2020; 39:229-244. [PMID: 28691345 PMCID: PMC5799042 DOI: 10.1002/mas.21540] [Citation(s) in RCA: 441] [Impact Index Per Article: 110.3] [Received: 09/19/2016] [Accepted: 06/01/2017] [Indexed: 05/03/2023]
Abstract
Skyline is a freely available, open-source Windows client application for accelerating targeted proteomics experimentation, with an emphasis on the proteomics and mass spectrometry community as users and as contributors. This review covers the informatics encompassed by the Skyline ecosystem, from computationally assisted targeted mass spectrometry method development, to raw acquisition file data processing, and quantitative analysis and results sharing.
Affiliation(s)
- Lindsay K Pino
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington
- Brian C Searle
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington
- James G Bollinger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington
- Brook Nunn
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington
- Brendan MacLean
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington
- Michael J MacCoss
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington
6
Poverennaya EV, Kiseleva OI, Ivanov AS, Ponomarenko EA. Methods of Computational Interactomics for Investigating Interactions of Human Proteoforms. Biochemistry (Mosc) 2020; 85:68-79. [PMID: 32079518 DOI: 10.1134/s000629792001006x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/20/2023]
Abstract
The human genome contains ca. 20,000 protein-coding genes that could be translated into millions of unique protein species (proteoforms). Proteoforms coded by a single gene often have different functions, which implies different protein partners. By interacting with each other, proteoforms create a network reflecting the dynamics of cellular processes in an organism. Perturbations of protein-protein interactions change the network topology, which often triggers pathological processes. Studying proteoforms is a relatively new research area in proteomics, which is why there are comparatively few experimental studies on the interaction of proteoforms. Bioinformatics tools can facilitate such studies by providing valuable complementary information to the experimental data and, in particular, by expanding the possibilities for studying proteoform interactions.
Affiliation(s)
- O I Kiseleva
- Institute of Biomedical Chemistry, Moscow, 119121, Russia
- A S Ivanov
- Institute of Biomedical Chemistry, Moscow, 119121, Russia
7
Tanzer MC, Frauenstein A, Stafford CA, Phulphagar K, Mann M, Meissner F. Quantitative and Dynamic Catalogs of Proteins Released during Apoptotic and Necroptotic Cell Death. Cell Rep 2020; 30:1260-1270.e5. [DOI: 10.1016/j.celrep.2019.12.079] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 03/04/2019] [Revised: 11/07/2019] [Accepted: 12/19/2019] [Indexed: 12/16/2022] Open
8
Deng J, Ikenishi F, Smith N, Lazar IM. Streamlined microfluidic analysis of phosphopeptides using stable isotope-labeled synthetic peptides and MRM-MS detection. Electrophoresis 2018; 39:3171-3184. [PMID: 30216485 DOI: 10.1002/elps.201800133] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/15/2018] [Revised: 09/05/2018] [Accepted: 09/06/2018] [Indexed: 11/07/2022]
Abstract
Modern high-throughput and high-content biological research is performed with advanced instrumentation and complex and time-consuming protocols, which, as a whole, pose a challenge for routine implementation in a research laboratory. In support of a "bioanalytical toolbox" with potential utility for exploring cellular functions mediated via protein phosphorylation, a post-translational modification (PTM) with essential regulatory roles in a variety of cellular processes, we describe in this work the development of a simple, integrated microfluidic chip that can perform targeted, quantitative analysis of phosphopeptides involved in cancer-relevant signaling pathways. The microfluidic device comprises microreactors packed with C18 and TiO2 particles for on-chip solid phase extraction (SPE) and phosphopeptide enrichment, and an ESI interface for facilitating multiple reaction monitoring (MRM)-mass spectrometry (MS) detection. The chips are demonstrated for the detection of three phosphopeptides involved in ERBB2/MAPK signaling pathways, selected from the outcome of a proteomic study involving EGF stimulation of SKBR3/HER2+ breast cancer cells. The data demonstrate that the proposed microfluidic strategy can be used for the MS quantification of phosphopeptides in the low nM range from cell lysates without any prior sample pretreatment, fractionation or bioaffinity enrichment, and is generally applicable to the analysis of any phosphopeptide targets.
Affiliation(s)
- Jingren Deng
- Department of Biological Sciences, Virginia Tech, Blacksburg, VA, USA
- Fumio Ikenishi
- Department of Biological Sciences, Virginia Tech, Blacksburg, VA, USA
- Nicole Smith
- Department of Biological Sciences, Virginia Tech, Blacksburg, VA, USA
- Iulia M Lazar
- Department of Biological Sciences, Virginia Tech, Blacksburg, VA, USA
9
Gianazza E, Banfi C. Post-translational quantitation by SRM/MRM: applications in cardiology. Expert Rev Proteomics 2018; 15:477-502. [DOI: 10.1080/14789450.2018.1484283] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/29/2022]
Affiliation(s)
- Erica Gianazza
- Unit of Proteomics, Centro Cardiologico Monzino IRCCS, Milan, Italy
- Cristina Banfi
- Unit of Proteomics, Centro Cardiologico Monzino IRCCS, Milan, Italy
10
Mason KE, Hilmer JK, Maaty WS, Reeves BD, Grieco PA, Bothner B, Fischer AM. Proteomic comparison of near-isogenic barley (Hordeum vulgare L.) germplasm differing in the allelic state of a major senescence QTL identifies numerous proteins involved in plant pathogen defense. Plant Physiol Biochem 2016; 109:114-127. [PMID: 27665045 DOI: 10.1016/j.plaphy.2016.09.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/10/2016] [Revised: 09/08/2016] [Accepted: 09/09/2016] [Indexed: 05/24/2023]
Abstract
Senescence is the last developmental phase of plant tissues, organs and, in the case of monocarpic senescence, entire plants. In monocarpic crops such as barley, it leads to massive remobilization of nitrogen and other nutrients to developing seeds. To further investigate this process, a proteomic comparison of flag leaves of near-isogenic late- and early-senescing barley germplasm was performed. Protein samples at 14 and 21 days past anthesis were analyzed using both two-dimensional gel-based and label-free quantitative mass spectrometry-based ('shotgun') proteomic techniques. This approach identified >9000 barley proteins, and one-third of them were quantified. Analysis focused on proteins that were significantly (p < 0.05; difference ≥1.5-fold) upregulated in early-senescing line '10_11' as compared to late-senescing variety 'Karl', as these may be functionally important for senescence. Proteins in this group included family 1 pathogenesis-related proteins, intracellular and membrane receptors or co-receptors (NBS-LRRs, LRR-RLKs), enzymes involved in attacking pathogen cell walls (glucanases), enzymes with possible roles in cuticle modification, and enzymes involved in DNA repair. Additionally, proteases and elements of the ubiquitin-proteasome system were upregulated in line '10_11', suggesting involvement of nitrogen remobilization and regulatory processes. Overall, the proteomic data highlight a correlation between early senescence and upregulated defense functions. This correlation emerges more clearly from the current proteomic data than from a previously performed transcriptomic comparison of 'Karl' and '10_11'. Our findings stress the value of studying biological systems at both the transcript and protein levels, and point to the importance of pathogen defense functions during developmental leaf senescence.
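The selection rule quoted in this abstract (p < 0.05 and a difference of at least 1.5-fold) can be illustrated with a minimal sketch. The record type, field names, and example values below are hypothetical and are not taken from the study; the sketch only shows how such a two-criterion filter might be applied to per-protein summary statistics that have already been computed.

```python
from dataclasses import dataclass


@dataclass
class ProteinStat:
    """Hypothetical per-protein summary; not the authors' data format."""
    protein_id: str
    mean_early: float  # mean abundance in the early-senescing line
    mean_late: float   # mean abundance in the late-senescing variety
    p_value: float     # p-value from whatever test was used upstream


def upregulated(stats, alpha=0.05, min_fold=1.5):
    """Return (protein_id, fold) pairs passing both thresholds."""
    hits = []
    for s in stats:
        if s.mean_late <= 0:
            continue  # a fold change is undefined without a positive baseline
        fold = s.mean_early / s.mean_late
        if s.p_value < alpha and fold >= min_fold:
            hits.append((s.protein_id, fold))
    return hits


# Made-up values chosen to exercise each branch of the filter.
example = [
    ProteinStat("P1", 300.0, 100.0, 0.01),  # significant and 3-fold: kept
    ProteinStat("P2", 120.0, 100.0, 0.01),  # significant but only 1.2-fold
    ProteinStat("P3", 300.0, 100.0, 0.20),  # 3-fold but not significant
    ProteinStat("P4", 150.0, 100.0, 0.04),  # exactly 1.5-fold: kept
]
print(upregulated(example))
```

Requiring both criteria guards against reporting large but noisy differences as well as statistically significant but very small ones.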
Affiliation(s)
- Katelyn E Mason
- Chemistry and Biochemistry Department, Montana State University, Bozeman, MT 59717, United States
- Jonathan K Hilmer
- Chemistry and Biochemistry Department, Montana State University, Bozeman, MT 59717, United States; Proteomics, Metabolomics and Mass Spectrometry Facility, Montana State University, Bozeman, MT 59717, United States
- Walid S Maaty
- Chemistry and Biochemistry Department, Montana State University, Bozeman, MT 59717, United States
- Benjamin D Reeves
- Chemistry and Biochemistry Department, Montana State University, Bozeman, MT 59717, United States
- Paul A Grieco
- Chemistry and Biochemistry Department, Montana State University, Bozeman, MT 59717, United States
- Brian Bothner
- Chemistry and Biochemistry Department, Montana State University, Bozeman, MT 59717, United States; Proteomics, Metabolomics and Mass Spectrometry Facility, Montana State University, Bozeman, MT 59717, United States
- Andreas M Fischer
- Department of Plant Sciences and Plant Pathology, Montana State University, Bozeman, MT 59717, United States
11
Pontes AH, de Sousa MV. Mass Spectrometry-Based Approaches to Understand the Molecular Basis of Memory. Front Chem 2016; 4:40. [PMID: 27790611 PMCID: PMC5064248 DOI: 10.3389/fchem.2016.00040] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 08/09/2016] [Accepted: 09/27/2016] [Indexed: 01/15/2023] Open
Abstract
The central nervous system is responsible for an array of cognitive functions such as memory, learning, language, and attention. These processes tend to take place in distinct brain regions; yet, they need to be integrated to give rise to adaptive or meaningful behavior. Since cognitive processes result from underlying cellular and molecular changes, genomics and transcriptomics assays have been applied to human and animal models to understand such events. Nevertheless, genes and RNAs are not the end products of most biological functions. In order to gain further insights toward the understanding of brain processes, the field of proteomics has been of increasing importance in the past years. Advancements in liquid chromatography-tandem mass spectrometry (LC-MS/MS) have enabled the identification and quantification of thousands of proteins with high accuracy and sensitivity, fostering a revolution in the neurosciences. Herein, we review the molecular bases of explicit memory in the hippocampus. We outline the principles of mass spectrometry (MS)-based proteomics, highlighting the use of this analytical tool to study memory formation. In addition, we discuss MS-based targeted approaches as the future of protein analysis.
Affiliation(s)
- Arthur H Pontes
- Laboratory of Protein Chemistry and Biochemistry, Department of Cell Biology, University of Brasilia, Brasilia, Brazil
- Marcelo V de Sousa
- Laboratory of Protein Chemistry and Biochemistry, Department of Cell Biology, University of Brasilia, Brasilia, Brazil
12
Hoofnagle AN, Whiteaker JR, Carr SA, Kuhn E, Liu T, Massoni SA, Thomas SN, Townsend RR, Zimmerman LJ, Boja E, Chen J, Crimmins DL, Davies SR, Gao Y, Hiltke TR, Ketchum KA, Kinsinger CR, Mesri M, Meyer MR, Qian WJ, Schoenherr RM, Scott MG, Shi T, Whiteley GR, Wrobel JA, Wu C, Ackermann BL, Aebersold R, Barnidge DR, Bunk DM, Clarke N, Fishman JB, Grant RP, Kusebauch U, Kushnir MM, Lowenthal MS, Moritz RL, Neubert H, Patterson SD, Rockwood AL, Rogers J, Singh RJ, Van Eyk JE, Wong SH, Zhang S, Chan DW, Chen X, Ellis MJ, Liebler DC, Rodland KD, Rodriguez H, Smith RD, Zhang Z, Zhang H, Paulovich AG. Recommendations for the Generation, Quantification, Storage, and Handling of Peptides Used for Mass Spectrometry-Based Assays. Clin Chem 2016; 62:48-69. [PMID: 26719571 DOI: 10.1373/clinchem.2015.250563] [Citation(s) in RCA: 154] [Impact Index Per Article: 19.3] [Indexed: 11/06/2022]
Abstract
BACKGROUND: For many years, basic and clinical researchers have taken advantage of the analytical sensitivity and specificity afforded by mass spectrometry in the measurement of proteins. Clinical laboratories are now beginning to deploy these workflows as well. For assays that use proteolysis to generate peptides for protein quantification and characterization, synthetic stable isotope-labeled internal standard peptides are of central importance. No general recommendations are currently available surrounding the use of peptides in protein mass spectrometric assays. CONTENT: The Clinical Proteomic Tumor Analysis Consortium of the National Cancer Institute has collaborated with clinical laboratorians, peptide manufacturers, metrologists, representatives of the pharmaceutical industry, and other professionals to develop a consensus set of recommendations for peptide procurement, characterization, storage, and handling, as well as approaches to the interpretation of the data generated by mass spectrometric protein assays. Additionally, the importance of carefully characterized reference materials (in particular, peptide standards for the improved concordance of amino acid analysis methods across the industry) is highlighted. The alignment of practices around the use of peptides and the transparency of sample preparation protocols should allow for the harmonization of peptide and protein quantification in research and clinical care.
Affiliation(s)
- Tao Liu
- Pacific Northwest National Laboratory, Richland, WA
- Jing Chen
- Johns Hopkins University, Baltimore, MD
- Yuqian Gao
- Pacific Northwest National Laboratory, Richland, WA
- Wei-Jun Qian
- Pacific Northwest National Laboratory, Richland, WA
- Tujin Shi
- Pacific Northwest National Laboratory, Richland, WA
- John A Wrobel
- University of North Carolina School of Medicine, Chapel Hill, NC
- Chaochao Wu
- Pacific Northwest National Laboratory, Richland, WA
- Ruedi Aebersold
- Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Russ P Grant
- Laboratory Corporation of America Holdings, Inc., Burlington, NC
- Mark M Kushnir
- University of Utah and ARUP Laboratories, Salt Lake City, UT
- Alan L Rockwood
- University of Utah and ARUP Laboratories, Salt Lake City, UT
- Xian Chen
- University of North Carolina School of Medicine, Chapel Hill, NC
- Hui Zhang
- Johns Hopkins University, Baltimore, MD
13
Vizcaíno JA, Csordas A, del-Toro N, Dianes JA, Griss J, Lavidas I, Mayer G, Perez-Riverol Y, Reisinger F, Ternent T, Xu QW, Wang R, Hermjakob H. 2016 update of the PRIDE database and its related tools. Nucleic Acids Res 2016; 44:D447-56. [PMID: 26527722 PMCID: PMC4702828 DOI: 10.1093/nar/gkv1145] [Citation(s) in RCA: 2530] [Impact Index Per Article: 316.3] [Received: 09/08/2015] [Revised: 10/14/2015] [Accepted: 10/16/2015] [Indexed: 11/18/2022] Open
Abstract
The PRoteomics IDEntifications (PRIDE) database is one of the world-leading data repositories of mass spectrometry (MS)-based proteomics data. Since the beginning of 2014, PRIDE Archive (http://www.ebi.ac.uk/pride/archive/) is the new PRIDE archival system, replacing the original PRIDE database. Here we summarize the developments in PRIDE resources and related tools since the previous update manuscript in the Database Issue in 2013. PRIDE Archive constitutes a complete redevelopment of the original PRIDE, comprising a new storage backend, data submission system and web interface, among other components. PRIDE Archive supports the most-widely used PSI (Proteomics Standards Initiative) data standard formats (mzML and mzIdentML) and implements the data requirements and guidelines of the ProteomeXchange Consortium. The wide adoption of ProteomeXchange within the community has triggered an unprecedented increase in the number of submitted data sets (around 150 data sets per month). We outline some statistics on the current PRIDE Archive data contents. We also report on the status of the PRIDE related stand-alone tools: PRIDE Inspector, PRIDE Converter 2 and the ProteomeXchange submission tool. Finally, we will give a brief update on the resources under development 'PRIDE Cluster' and 'PRIDE Proteomes', which provide a complementary view and quality-scored information of the peptide and protein identification data available in PRIDE Archive.
Affiliation(s)
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Attila Csordas
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Noemi del-Toro
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- José A Dianes
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Johannes Griss
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK; Division of Immunology, Allergy and Infectious Diseases, Department of Dermatology, Medical University of Vienna, Austria
- Ilias Lavidas
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Gerhard Mayer
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK; Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, D-44801 Bochum, Germany
- Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Florian Reisinger
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Tobias Ternent
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Qing-Wei Xu
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK; Department of Computer Science and Technology, Hubei University of Education, Wuhan, China
- Rui Wang
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK; National Center for Protein Sciences, Beijing, China
14
Zou D, Ma L, Yu J, Zhang Z. Biological databases for human research. Genomics Proteomics Bioinformatics 2015; 13:55-63. [PMID: 25712261 PMCID: PMC4411498 DOI: 10.1016/j.gpb.2015.01.006] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 01/01/2015] [Revised: 01/16/2015] [Accepted: 01/16/2015] [Indexed: 01/01/2023]
Abstract
The completion of the Human Genome Project lays a foundation for systematically studying the human genome from evolutionary history to precision medicine against diseases. With the explosive growth of biological data, there is an increasing number of biological databases that have been developed in aid of human-related research. Here we present a collection of human-related biological databases and provide a mini-review by classifying them into different categories according to their data types. As human-related databases continue to grow not only in count but also in volume, challenges are ahead in big data storage, processing, exchange and curation.
Affiliation(s)
- Dong Zou
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Lina Ma
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Jun Yu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
| | - Zhang Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.
|
15
|
Bereman MS. Tools for monitoring system suitability in LC MS/MS centric proteomic experiments. Proteomics 2014; 15:891-902. [DOI: 10.1002/pmic.201400373] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2014] [Revised: 09/12/2014] [Accepted: 10/13/2014] [Indexed: 11/06/2022]
Affiliation(s)
- Michael S. Bereman
- Department of Biological Sciences, Center for Human Health and the Environment; North Carolina State University; Raleigh NC USA
|
16
|
González-Caballero N, Rodríguez-Vega A, Dias-Lopes G, Valenzuela JG, Ribeiro JMC, Carvalho PC, Valente RH, Brazil RP, Cuervo P. Expression of the mevalonate pathway enzymes in the Lutzomyia longipalpis (Diptera: Psychodidae) sex pheromone gland demonstrated by an integrated proteomic approach. J Proteomics 2014; 96:117-32. [PMID: 24185139 PMCID: PMC3917562 DOI: 10.1016/j.jprot.2013.10.028] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2013] [Revised: 10/01/2013] [Accepted: 10/19/2013] [Indexed: 12/31/2022]
Abstract
In Latin America, Lutzomyia longipalpis is the main vector of the protozoan parasite Leishmania infantum, which is the causal agent of American Visceral Leishmaniasis. This insect uses male-produced pheromones for mate recognition. Elucidation of pheromone biogenesis or its regulation may enable molecular strategies for mating disruption and, consequently, the vector's population management. Motivated by our recent results of the transcriptomic characterization of the L. longipalpis pheromone gland, we performed a proteomic analysis of this tissue combining SDS-PAGE, and mass spectrometry followed by an integrative data analysis. Considering that annotated genome sequences of this sand fly are not available, we designed an alternative workflow searching MS/MS data against two customized databases using three search engines: Mascot, OMSSA and ProLuCID. A total of 542 proteins were confidently characterized, 445 of them using a Uniref100-insect protein database, and 97 using a transcript translated database. In addition, use of PEAKS for de novo peptide sequencing of MS/MS data confirmed ~90% identifications made with the combination of the three search engines. Our results include the identification of six of the seven enzymes of the mevalonate-pathway, plus the enzymes involved in sesquiterpenoid biosynthesis, all of which are proposed to be involved in pheromone production in L. longipalpis. BIOLOGICAL SIGNIFICANCE L. longipalpis is the main vector of the protozoan parasite L. infantum, which is the causal agent of American Visceral Leishmaniasis. One of the control measures of such disease is focused on vector population control. As this insect uses male-produced pheromones for mate recognition, the elucidation of pheromone biogenesis or its regulating process may enable molecular strategies for mating disruption and, consequently, this vector's population management. 
In this regard, in this manuscript we report expression evidence, at the protein level, of several molecules potentially involved in the pheromone production of L. longipalpis. Our results include the identification of the mevalonate-pathway enzymes, plus the enzymes involved in sesquiterpenoid biosynthesis, all of which are proposed to be involved in pheromone production in L. longipalpis. In addition, considering that the annotated genome sequences of this sand fly are not yet available, we designed an alternative workflow searching MS/MS data against proteomic and transcript translated customized databases, using three search engines: Mascot, OMSSA, and ProLuCID. In addition, a de novo peptide sequencing software (PEAKS) was used to further analyze the MS/MS data. This approach made it possible to identify and annotate 542 proteins for the pheromone gland of L. longipalpis. Importantly, all annotated protein sequences and raw data are available for the research community in protein repositories that provide free access to the data.
Affiliation(s)
| | | | - Geovane Dias-Lopes
- Pós-graduação Biologia Parasitaria, IOC, FIOCRUZ, Rio de Janeiro, RJ, Brazil
| | - Jesus G Valenzuela
- Vector Molecular Biology Section, Laboratory of Malaria and Vector Research, National Institutes of Health Rockville, MD, USA
| | - Jose M C Ribeiro
- Vector Biology Section, Laboratory of Malaria and Vector Research, National Institutes of Health Rockville, MD, USA
| | - Paulo Costa Carvalho
- Laboratório de Proteômica e Engenharia de Proteínas, Instituto Carlos Chagas, FIOCRUZ, Curitiba, Brazil
| | - Richard H Valente
- Laboratório de Toxinologia, IOC, FIOCRUZ, Rio de Janeiro, RJ, Brazil
| | - Reginaldo P Brazil
- Laboratório de Bioquímica e Fisiologia de Insetos, IOC, FIOCRUZ, Rio de Janeiro, RJ, Brazil
| | - Patricia Cuervo
- Laboratório de Pesquisa em Leishmaniose, IOC, FIOCRUZ, Rio de Janeiro, RJ, Brazil.
|
17
|
Ahmed FE. Utility of mass spectrometry for proteome analysis: part II. Ion-activation methods, statistics, bioinformatics and annotation. Expert Rev Proteomics 2014; 6:171-97. [DOI: 10.1586/epr.09.4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
18
|
Kim YJ, Gallien S, van Oostrum J, Domon B. Targeted proteomics strategy applied to biomarker evaluation. Proteomics Clin Appl 2013; 7:739-47. [DOI: 10.1002/prca.201300070] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2013] [Revised: 09/06/2013] [Accepted: 09/10/2013] [Indexed: 11/09/2022]
Affiliation(s)
- Yeoun Jin Kim
- Luxembourg Clinical Proteomics Center; CRP-Santé; Strassen Luxembourg
| | - Sebastien Gallien
- Luxembourg Clinical Proteomics Center; CRP-Santé; Strassen Luxembourg
| | - Jan van Oostrum
- Luxembourg Clinical Proteomics Center; CRP-Santé; Strassen Luxembourg
| | - Bruno Domon
- Luxembourg Clinical Proteomics Center; CRP-Santé; Strassen Luxembourg
|
19
|
Liu Y, Hüttenhain R, Collins B, Aebersold R. Mass spectrometric protein maps for biomarker discovery and clinical research. Expert Rev Mol Diagn 2013; 13:811-25. [PMID: 24138574 PMCID: PMC3833812 DOI: 10.1586/14737159.2013.845089] [Citation(s) in RCA: 93] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
Abstract
Among the wide range of proteomic technologies, targeted mass spectrometry (MS) has shown great potential for biomarker studies. To extend the degree of multiplexing achieved by selected reaction monitoring (SRM), we recently developed SWATH MS. SWATH MS is a variant of the emerging class of data-independent acquisition (DIA) methods and essentially converts the molecules in a physical sample into perpetually re-usable digital maps. The thus generated SWATH maps are then mined using a targeted data extraction strategy, allowing us to profile disease-related proteomes at a high degree of reproducibility. The successful application of both SRM and SWATH MS requires the a priori generation of reference spectral maps that provide coordinates for quantification. Herein, we demonstrate that the application of the mass spectrometric reference maps and the acquisition of personalized SWATH maps hold a particular promise for accelerating the current process of biomarker discovery.
Affiliation(s)
- Yansheng Liu
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Wolfgang-Pauli-Str.16, 8093 Zurich, Switzerland
|
20
|
Zong NC, Li H, Li H, Lam MPY, Jimenez RC, Kim CS, Deng N, Kim AK, Choi JH, Zelaya I, Liem D, Meyer D, Odeberg J, Fang C, Lu HJ, Xu T, Weiss J, Duan H, Uhlen M, Yates JR, Apweiler R, Ge J, Hermjakob H, Ping P. Integration of cardiac proteome biology and medicine by a specialized knowledgebase. Circ Res 2013; 113:1043-53. [PMID: 23965338 DOI: 10.1161/circresaha.113.301151] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
RATIONALE Omics sciences enable a systems-level perspective in characterizing cardiovascular biology. Integration of diverse proteomics data via a computational strategy will catalyze the assembly of contextualized knowledge, foster discoveries through multidisciplinary investigations, and minimize unnecessary redundancy in research efforts. OBJECTIVE The goal of this project is to develop a consolidated cardiac proteome knowledgebase with novel bioinformatics pipeline and Web portals, thereby serving as a new resource to advance cardiovascular biology and medicine. METHODS AND RESULTS We created Cardiac Organellar Protein Atlas Knowledgebase (COPaKB; www.HeartProteome.org), a centralized platform of high-quality cardiac proteomic data, bioinformatics tools, and relevant cardiovascular phenotypes. Currently, COPaKB features 8 organellar modules, comprising 4203 LC-MS/MS experiments from human, mouse, drosophila, and Caenorhabditis elegans, as well as expression images of 10,924 proteins in human myocardium. In addition, the Java-coded bioinformatics tools provided by COPaKB enable cardiovascular investigators in all disciplines to retrieve and analyze pertinent organellar protein properties of interest. CONCLUSIONS COPaKB provides an innovative and interactive resource that connects research interests with the new biological discoveries in protein sciences. With an array of intuitive tools in this unified Web server, nonproteomics investigators can conveniently collaborate with proteomics specialists to dissect the molecular signatures of cardiovascular phenotypes.
Affiliation(s)
- Nobel C Zong
- From the NHLBI Proteomics Center at UCLA/NHLBI Proteomics Program
|
21
|
Abstract
Quantitative measurement of proteins is one of the most fundamental analytical tasks in a biochemistry laboratory, but widely used immunochemical methods often have limited specificity and high measurement variation. In this review, we discuss applications of multiple-reaction monitoring (MRM) mass spectrometry, which allows sensitive, precise quantitative analyses of peptides and the proteins from which they are derived. Systematic development of MRM assays is permitted by databases of peptide mass spectra and sequences, software tools for analysis design and data analysis, and rapid evolution of tandem mass spectrometer technology. Key advantages of MRM assays are the ability to target specific peptide sequences, including variants and modified forms, and the capacity for multiplexing that allows analysis of dozens to hundreds of peptides. Different quantitative standardization methods provide options that balance precision, sensitivity, and assay cost. Targeted protein quantitation by MRM and related mass spectrometry methods can advance biochemistry by transforming approaches to protein measurement.
Affiliation(s)
- Daniel C Liebler
- Department of Biochemistry and Jim Ayers Institute for Precancer Detection and Diagnosis, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, Tennessee 37232-6350, United States.
|
22
|
Li H, Zong NC, Liang X, Kim AK, Choi JH, Deng N, Zelaya I, Lam M, Duan H, Ping P. A novel spectral library workflow to enhance protein identifications. J Proteomics 2013; 81:173-84. [PMID: 23391412 DOI: 10.1016/j.jprot.2013.01.026] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2012] [Revised: 01/18/2013] [Accepted: 01/26/2013] [Indexed: 10/27/2022]
Abstract
The innovations in mass spectrometry-based investigations in proteome biology enable systematic characterization of molecular details in pathophysiological phenotypes. However, the process of delineating large-scale raw proteomic datasets into a biological context requires high-throughput data acquisition and processing. A spectral library search engine makes use of previously annotated experimental spectra as references for subsequent spectral analyses. This workflow delivers many advantages, including elevated analytical efficiency and specificity as well as reduced demands in computational capacity. In this study, we created a spectral matching engine to address challenges commonly associated with a library search workflow. Particularly, an improved sliding dot product algorithm, that is robust to systematic drifts of mass measurement in spectra, is introduced. Furthermore, a noise management protocol distinguishes spectra correlation attributed from noise and peptide fragments. It enables elevated separation between target spectral matches and false matches, thereby suppressing the possibility of propagating inaccurate peptide annotations from library spectra to query spectra. Moreover, preservation of original spectra also accommodates user contributions to further enhance the quality of the library. Collectively, this search engine supports reproducible data analyses using curated references, thereby broadening the accessibility of proteomics resources to biomedical investigators. This article is part of a Special Issue entitled: From protein structures to clinical applications.
Affiliation(s)
- Haomin Li
- Department of Physiology, UCLA School of Medicine, Los Angeles, CA 90095, USA
|
23
|
|
24
|
Domon B. Considerations on selected reaction monitoring experiments: Implications for the selectivity and accuracy of measurements. Proteomics Clin Appl 2012; 6:609-14. [DOI: 10.1002/prca.201200111] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/20/2012] [Accepted: 10/01/2012] [Indexed: 11/09/2022]
Affiliation(s)
- Bruno Domon
- Luxembourg Clinical Proteomics Center (LCP); CRP-Santé; Strassen Luxembourg
|
25
|
Farrah T, Deutsch EW, Hoopmann MR, Hallows JL, Sun Z, Huang CY, Moritz RL. The state of the human proteome in 2012 as viewed through PeptideAtlas. J Proteome Res 2012; 12:162-71. [PMID: 23215161 DOI: 10.1021/pr301012j] [Citation(s) in RCA: 99] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
Abstract
The Human Proteome Project was launched in September 2010 with the goal of characterizing at least one protein product from each protein-coding gene. Here we assess how much of the proteome has been detected to date via tandem mass spectrometry by analyzing PeptideAtlas, a compendium of human derived LC-MS/MS proteomics data from many laboratories around the world. All data sets are processed with a consistent set of parameters using the Trans-Proteomic Pipeline and subjected to a 1% protein FDR filter before inclusion in PeptideAtlas. Therefore, PeptideAtlas contains only high confidence protein identifications. To increase proteome coverage, we explored new comprehensive public data sources for data likely to add new proteins to the Human PeptideAtlas. We then folded these data into a Human PeptideAtlas 2012 build and mapped it to Swiss-Prot, a protein sequence database curated to contain one entry per human protein coding gene. We find that this latest PeptideAtlas build includes at least one peptide for each of ~12500 Swiss-Prot entries, leaving ~7500 gene products yet to be confidently cataloged. We characterize these "PA-unseen" proteins in terms of tissue localization, transcript abundance, and Gene Ontology enrichment, and propose reasons for their absence from PeptideAtlas and strategies for detecting them in the future.
Affiliation(s)
- Terry Farrah
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109, United States.
|
26
|
Vizcaíno JA, Côté RG, Csordas A, Dianes JA, Fabregat A, Foster JM, Griss J, Alpi E, Birim M, Contell J, O'Kelly G, Schoenegger A, Ovelleiro D, Pérez-Riverol Y, Reisinger F, Ríos D, Wang R, Hermjakob H. The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res 2012. [PMID: 23203882 PMCID: PMC3531176 DOI: 10.1093/nar/gks1262] [Citation(s) in RCA: 1597] [Impact Index Per Article: 133.1] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open
Abstract
The PRoteomics IDEntifications (PRIDE, http://www.ebi.ac.uk/pride) database at the European Bioinformatics Institute is one of the most prominent data repositories of mass spectrometry (MS)-based proteomics data. Here, we summarize recent developments in the PRIDE database and related tools. First, we provide up-to-date statistics in data content, splitting the figures by groups of organisms and species, including peptide and protein identifications, and post-translational modifications. We then describe the tools that are part of the PRIDE submission pipeline, especially the recently developed PRIDE Converter 2 (new submission tool) and PRIDE Inspector (visualization and analysis tool). We also give an update about the integration of PRIDE with other MS proteomics resources in the context of the ProteomeXchange consortium. Finally, we briefly review the quality control efforts that are ongoing at present and outline our future plans.
Affiliation(s)
- Juan Antonio Vizcaíno
- EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
|
27
|
Trudgian DC, Mirzaei H. Cloud CPFP: a shotgun proteomics data analysis pipeline using cloud and high performance computing. J Proteome Res 2012; 11:6282-90. [PMID: 23088505 DOI: 10.1021/pr300694b] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]
Abstract
We have extended the functionality of the Central Proteomics Facilities Pipeline (CPFP) to allow use of remote cloud and high performance computing (HPC) resources for shotgun proteomics data processing. CPFP has been modified to include modular local and remote scheduling for data processing jobs. The pipeline can now be run on a single PC or server, a local cluster, a remote HPC cluster, and/or the Amazon Web Services (AWS) cloud. We provide public images that allow easy deployment of CPFP in its entirety in the AWS cloud. This significantly reduces the effort necessary to use the software, and allows proteomics laboratories to pay for compute time ad hoc, rather than obtaining and maintaining expensive local server clusters. Alternatively the Amazon cloud can be used to increase the throughput of a local installation of CPFP as necessary. We demonstrate that cloud CPFP allows users to process data at higher speed than local installations but with similar cost and lower staff requirements. In addition to the computational improvements, the web interface to CPFP is simplified, and other functionalities are enhanced. The software is under active development at two leading institutions and continues to be released under an open-source license at http://cpfp.sourceforge.net.
Affiliation(s)
- David C Trudgian
- Department of Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, Texas 75390-8816, United States
|
28
|
Antonov AV. Mining protein lists from proteomics studies: applications for drug discovery. Expert Opin Drug Discov 2012; 5:323-31. [PMID: 22823085 DOI: 10.1517/17460441003716796] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/14/2023]
Abstract
IMPORTANCE OF THE FIELD In recent years, proteomics has become a common technique applied to a wide spectrum of scientific problems, including the identification of diagnostic biomarkers, monitoring the effects of drug treatments or identification of chemical properties of a protein or a drug. Although being significantly different in scientific essence, the ultimate result of the majority of proteomics studies is a protein list. Thousands of independent proteomics studies have reported protein lists in various functional contexts. AREAS COVERED IN THIS REVIEW We review here the spectrum of scientific problems where proteomics technology was applied recently to deliver protein lists. The available bioinformatics methods commonly used to understand the properties of the protein lists are compared. WHAT THE READER WILL GAIN The types and common functional properties of the reported protein lists are discussed. The range of scientific problems where this knowledge could be potentially helpful with a focus on drug discovery issues is explored. TAKE HOME MESSAGE Reported protein lists represent a valuable resource which can be used for a variety of goals, ranging from biomarkers discovery to identification of novel therapeutic implications of known drugs.
Affiliation(s)
- Alexey V Antonov
- Institute for Bioinformatics and Systems Biology, Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Ingolstädter Landstraße 1, D-85764, Neuherberg, Germany
|
29
|
Effects of psychological stress on innate immunity and metabolism in humans: a systematic analysis. PLoS One 2012; 7:e43232. [PMID: 23028447 PMCID: PMC3446986 DOI: 10.1371/journal.pone.0043232] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2011] [Accepted: 07/18/2012] [Indexed: 01/01/2023] Open
Abstract
Stress is perhaps easiest to conceptualize as a process which allows an organism to accommodate the demands of its environment such that it can adapt to the prevailing set of conditions. Psychological stress is an important component with the potential to affect physiology adversely, as has become evident from various studies in the area. Although these studies have established numerous effects of psychological stress on physiology, a global strategy for the correlation of these effects has yet to begin. Our comparative and systematic analysis of the published literature has unraveled certain interesting molecular mechanisms as clues to account for some of the observed effects of psychological stress on human physiology. In this study, we attempt to understand the initial phase of the physiological response to psychological stress by analyzing interactions between innate immunity and metabolism at the systems level using the data available in the literature. In light of our gene association-networks and enrichment analysis, we have identified candidate genes and molecular systems which might have some associative role in affecting the psychological stress response system or even producing some of the observed terminal effects (such as the associated physiological disorders). In addition to the already accepted role of psychological stress as a perturbation that can disrupt physiological homeostasis, we speculate that it is potentially capable of causing deviation of certain biological processes from their basal level activity, after which they can return to their basal tones once the effects of stress diminish. Based on the derived inferences of our comparative analysis, we have proposed a probabilistic mechanism for how psychological stress could affect physiology such that these adaptive deviations are sometimes not able to bounce back to their original basal tones, and thus increase physiological susceptibility to metabolic and immune imbalance.
|
30
|
Tan HT, Lee YH, Chung MCM. Cancer proteomics. MASS SPECTROMETRY REVIEWS 2012; 31:583-605. [PMID: 22422534 DOI: 10.1002/mas.20356] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/29/2011] [Revised: 11/16/2011] [Accepted: 11/16/2011] [Indexed: 05/31/2023]
Abstract
Cancer presents high mortality and morbidity globally, largely due to its complex and heterogenous nature, and lack of biomarkers for early diagnosis. A proteomics study of cancer aims to identify and characterize functional proteins that drive the transformation of malignancy, and to discover biomarkers to detect early-stage cancer, predict prognosis, determine therapy efficacy, identify novel drug targets, and ultimately develop personalized medicine. The various sources of human samples such as cell lines, tissues, and plasma/serum are probed by a plethora of proteomics tools to discover novel biomarkers and elucidate mechanisms of tumorigenesis. Innovative proteomics technologies and strategies have been designed for protein identification, quantitation, fractionation, and enrichment to delve deeper into the oncoproteome. In addition, there is the need for high-throughput methods for biomarker validation, and integration of the various platforms of oncoproteome data to fully comprehend cancer biology.
Affiliation(s)
- Hwee Tong Tan
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
|
31
|
Boja ES, Rodriguez H. Mass spectrometry-based targeted quantitative proteomics: achieving sensitive and reproducible detection of proteins. Proteomics 2012; 12:1093-110. [PMID: 22577011 DOI: 10.1002/pmic.201100387] [Citation(s) in RCA: 116] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]
Abstract
Traditional shotgun proteomics, used to detect a mixture of hundreds to thousands of proteins through mass spectrometric analysis, has been the standard approach in research to profile protein content in a biological sample, which could lead to the discovery of new (and all) protein candidates with diagnostic, prognostic, and therapeutic values. In practice, this approach requires significant resources and time, and does not necessarily represent the goal of the researcher who would rather study a subset of such discovered proteins (including their variations or posttranslational modifications) under different biological conditions. In this context, targeted proteomics is playing an increasingly important role in the accurate measurement of protein targets in biological samples in the hope of elucidating the molecular mechanism of cellular function via the understanding of intricate protein networks and pathways. One such (targeted) approach, selected reaction monitoring (or multiple reaction monitoring) mass spectrometry (MRM-MS), offers the capability of measuring multiple proteins with higher sensitivity and throughput than shotgun proteomics. Developing and validating MRM-MS-based assays, however, is an extensive and iterative process, requiring a coordinated and collaborative effort by the scientific community through the sharing of publicly accessible data and datasets, bioinformatic tools, standard operating procedures, and well characterized reagents.
Affiliation(s)
- Emily S Boja
- Office of Cancer Clinical Proteomics Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA.
|
32
|
Bereman MS, MacLean B, Tomazela DM, Liebler DC, MacCoss MJ. The development of selected reaction monitoring methods for targeted proteomics via empirical refinement. Proteomics 2012; 12:1134-41. [PMID: 22577014 DOI: 10.1002/pmic.201200042] [Citation(s) in RCA: 85] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
Software advancements in the last several years have had a significant impact on proteomics from method development to data analysis. Herein, we detail a method, which uses our in-house developed software tool termed Skyline, for empirical refinement of candidate peptides from targeted proteins. The method consists of four main steps from generation of a testable hypothesis, method development, peptide refinement, to peptide validation. The ultimate goal is to identify the best performing peptide in terms of ionization efficiency, reproducibility, specificity, and chromatographic characteristics to monitor as a proxy for protein abundance. It is important to emphasize that this method allows the user to perform this refinement procedure in the sample matrix and organism of interest with the instrumentation available. Finally, the method is demonstrated in a case study to determine the best peptide to monitor the abundance of surfactant protein B in lung aspirates.
Affiliation(s)
- Michael S Bereman
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
|
33
|
Peterson AC, Russell JD, Bailey DJ, Westphall MS, Coon JJ. Parallel reaction monitoring for high resolution and high mass accuracy quantitative, targeted proteomics. Mol Cell Proteomics 2012; 11:1475-88. [PMID: 22865924 DOI: 10.1074/mcp.o112.020131] [Citation(s) in RCA: 902] [Impact Index Per Article: 75.2] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
Selected reaction monitoring on a triple quadrupole mass spectrometer is currently experiencing a renaissance within the proteomics community for its, as yet, unparalleled ability to characterize and quantify a set of proteins reproducibly, completely, and with high sensitivity. Given the immense benefit that high resolution and accurate mass instruments have brought to the discovery proteomics field, we wondered if highly accurate mass measurement capabilities could be leveraged to provide benefits in the targeted proteomics domain as well. Here, we propose a new targeted proteomics paradigm centered on the use of next generation, quadrupole-equipped high resolution and accurate mass instruments: parallel reaction monitoring (PRM). In PRM, the third quadrupole of a triple quadrupole is substituted with a high resolution and accurate mass mass analyzer to permit the parallel detection of all target product ions in one, concerted high resolution mass analysis. We detail the analytical performance of the PRM method, using a quadrupole-equipped bench-top Orbitrap MS, and draw a performance comparison to selected reaction monitoring in terms of run-to-run reproducibility, dynamic range, and measurement accuracy. In addition to requiring minimal upfront method development and facilitating automated data analysis, PRM yielded quantitative data over a wider dynamic range than selected reaction monitoring in the presence of a yeast background matrix because of PRM's high selectivity in the mass-to-charge domain. With achievable linearity over the quantifiable dynamic range found to be statistically equal between the two methods, our investigation suggests that PRM will be a promising new addition to the quantitative proteomics toolbox.
Affiliation(s)
- Amelia C Peterson
- Department of Chemistry and Biomolecular Chemistry, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
|
34
|
Cunningham R, Ma D, Li L. Mass Spectrometry-based Proteomics and Peptidomics for Systems Biology and Biomarker Discovery. Front Biol (Beijing) 2012; 7:313-335. [PMID: 24504115 PMCID: PMC3913178 DOI: 10.1007/s11515-012-1218-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 12/19/2022]
Abstract
The scientific community has shown great interest in the field of mass spectrometry-based proteomics and peptidomics for its applications in biology. Proteomics technologies have evolved to produce large datasets of proteins or peptides involved in various biological and disease progression processes, producing testable hypotheses for complex biological questions. This review provides an introduction and insight to relevant topics in proteomics and peptidomics including biological material selection, sample preparation, separation techniques, peptide fragmentation, post-translational modifications, quantification, bioinformatics, and biomarker discovery and validation. In addition, current literature and remaining challenges and emerging technologies for proteomics and peptidomics are presented.
Affiliation(s)
- Robert Cunningham
- Department of Chemistry, University of Wisconsin-Madison, 777, Highland Avenue, Madison, WI 53705-2222, USA
- Di Ma
- School of Pharmacy, University of Wisconsin-Madison, 777, Highland Avenue, Madison, WI 53705-2222, USA
- Lingjun Li
- Department of Chemistry, University of Wisconsin-Madison, 777, Highland Avenue, Madison, WI 53705-2222, USA
- School of Pharmacy, University of Wisconsin-Madison, 777, Highland Avenue, Madison, WI 53705-2222, USA
|
35
|
Liolios K, Schriml L, Hirschman L, Pagani I, Nosrat B, Sterk P, White O, Rocca-Serra P, Sansone SA, Taylor C, Kyrpides NC, Field D. The Metadata Coverage Index (MCI): A standardized metric for quantifying database metadata richness. Stand Genomic Sci 2012; 6:438-47. [PMID: 23409217 PMCID: PMC3558968 DOI: 10.4056/sigs.2675953] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/20/2022] Open
Abstract
Variability in the extent of the descriptions of data ('metadata') held in public repositories forces users to assess the quality of records individually, which rapidly becomes impractical. The scoring of records on the richness of their description provides a simple, objective proxy measure for quality that enables filtering that supports downstream analysis. Pivotally, such descriptions should spur on improvements. Here, we introduce such a measure - the 'Metadata Coverage Index' (MCI): the percentage of available fields actually filled in a record or description. MCI scores can be calculated across a database, for individual records or for their component parts (e.g., fields of interest). There are many potential uses for this simple metric, for example: to filter, rank or search for records; to assess the metadata availability of an ad hoc collection; to determine the frequency with which fields in a particular record type are filled, especially with respect to standards compliance; to assess the utility of specific tools and resources, and of data capture practice more generally; to prioritize records for further curation; to serve as performance metrics of funded projects; or to quantify the value added by curation. Here we demonstrate the utility of MCI scores using metadata from the Genomes Online Database (GOLD), including records compliant with the 'Minimum Information about a Genome Sequence' (MIGS) standard developed by the Genomic Standards Consortium. We discuss challenges and address the further application of MCI scores: to show improvements in annotation quality over time, to inform the work of standards bodies and repository providers on the usability and popularity of their products, and to assess and credit the work of curators. Such an index provides a step towards putting metadata capture practices and, in the future, standards compliance into a quantitative and objective framework.
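The MCI definition in this abstract (the percentage of available fields actually filled in a record) can be illustrated with a minimal Python sketch. The field names, the sample records, and the rule for what counts as "filled" (anything other than None or a blank string) are assumptions made for illustration; they are not taken from GOLD, MIGS, or the paper.

```python
from typing import Any, Mapping, Sequence


def is_filled(value: Any) -> bool:
    """Count a field as filled unless it is None or a blank string (assumed rule)."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    return True


def record_mci(record: Mapping[str, Any], fields: Sequence[str]) -> float:
    """MCI of one record: percentage of the available fields that are filled."""
    if not fields:
        raise ValueError("fields must not be empty")
    filled = sum(is_filled(record.get(name)) for name in fields)
    return 100.0 * filled / len(fields)


def field_fill_rate(records: Sequence[Mapping[str, Any]], name: str) -> float:
    """Percentage of records in which one particular field is filled."""
    if not records:
        raise ValueError("records must not be empty")
    filled = sum(is_filled(r.get(name)) for r in records)
    return 100.0 * filled / len(records)


# Hypothetical field list and records, for illustration only.
FIELDS = ["organism", "isolation_site", "habitat", "sequencing_method"]
RECORDS = [
    {"organism": "Organism A", "isolation_site": "", "habitat": None,
     "sequencing_method": "Sanger"},
    {"organism": "Organism B", "isolation_site": "soil", "habitat": "terrestrial",
     "sequencing_method": "454"},
]

for rec in RECORDS:
    print(rec["organism"], record_mci(rec, FIELDS))
print("habitat fill rate:", field_fill_rate(RECORDS, "habitat"))
```

With these sample records, the first record has two of four fields filled (MCI 50.0) and the second has all four (MCI 100.0). The per-field rate corresponds to the abstract's use case of determining how often a given field is filled across a record type.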
Affiliation(s)
- Konstantinos Liolios
- Microbial Genomics and Metagenomic Super Program, Department of Energy Joint Genome Institute, Walnut Creek, CA, USA
- Lynn Schriml
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Ioanna Pagani
- Microbial Genomics and Metagenomic Super Program, Department of Energy Joint Genome Institute, Walnut Creek, CA, USA
- Bahador Nosrat
- Microbial Genomics and Metagenomic Super Program, Department of Energy Joint Genome Institute, Walnut Creek, CA, USA
- Peter Sterk
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Owen White
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Chris Taylor
- European Molecular Biology Laboratory (EMBL) Outstation, European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Cambridge, UK
- Nikos C. Kyrpides
- Microbial Genomics and Metagenomic Super Program, Department of Energy Joint Genome Institute, Walnut Creek, CA, USA
- Dawn Field
- University of Oxford, Oxford e-Research Centre, Oxford, UK
- Centre for Ecology & Hydrology, Wallingford, Oxfordshire, UK
|
36
|
Gallien S, Peterman S, Kiyonami R, Souady J, Duriez E, Schoen A, Domon B. Highly multiplexed targeted proteomics using precise control of peptide retention time. Proteomics 2012; 12:1122-33. [DOI: 10.1002/pmic.201100533] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Indexed: 01/01/2023]
Affiliation(s)
- Sebastien Gallien
- Luxembourg Clinical Proteomics center (LCP); Centre de Recherche Public de la Santé; Strassen Luxembourg
- Jamal Souady
- Luxembourg Clinical Proteomics center (LCP); Centre de Recherche Public de la Santé; Strassen Luxembourg
- Elodie Duriez
- Luxembourg Clinical Proteomics center (LCP); Centre de Recherche Public de la Santé; Strassen Luxembourg
- Bruno Domon
- Luxembourg Clinical Proteomics center (LCP); Centre de Recherche Public de la Santé; Strassen Luxembourg
|
37
|
Kim YJ, Zaidi-Ainouch Z, Gallien S, Domon B. Mass spectrometry–based detection and quantification of plasma glycoproteins using selective reaction monitoring. Nat Protoc 2012; 7:859-71. [DOI: 10.1038/nprot.2012.023] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 11/09/2022]
|
38
|
Faria-Campos A, Fernandes-Rausch H, Val C, Thorun P, Abreu V, Batista PH, Mendonça PH, Alves V, Rodrigues MR, Pimenta A, Franco G, Campos SVA. PRODIS: a proteomics data management system with support to experiment tracking. BMC Genomics 2011; 12 Suppl 4:S15. [PMID: 22369043 PMCID: PMC3287584 DOI: 10.1186/1471-2164-12-s4-s15] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
Abstract
Background A research area that has greatly benefited from the development of new and improved analysis technologies is proteomics, and large amounts of data have been generated by proteomic analysis as a consequence. Previously, the storage, management and analysis of these data have been done manually. This is, however, incompatible with the volume of data generated by modern proteomic analysis. Several attempts have been made to automate the tasks of data analysis and management. In this work we propose PRODIS (Proteomics Database Integrated System), a system for proteomic experimental data management. The proposed system enables an efficient management of the proteomic experimentation workflow, simplifies controlling experiments and associated data and establishes links between similar experiments through the experiment tracking function. Results PRODIS is fully web based, which simplifies data upload and gives the system the flexibility necessary for use in complex projects. Data from Liquid Chromatography, 2D-PAGE and Mass Spectrometry experiments can be stored in the system. Moreover, it is simple to use: researchers can insert experimental data directly as experiments are performed, without the need to configure the system or change their experiment routine. PRODIS has a number of important features, including a password protected system in which each screen for data upload and retrieval is validated; users have different levels of clearance, which allow the execution of tasks according to the user clearance level. The system allows the upload, parsing of files, storage and display of experiment results and images in the main formats used in proteomics laboratories: for chromatography, the chromatograms and lists of peaks resulting from separation are stored; for 2D-PAGE, images of gels and the files resulting from the analysis are stored, containing information on positions of spots as well as their values of intensity, volume, etc.; for Mass Spectrometry, PRODIS presents a function for completion of the mapping plate that allows the user to correlate the positions in plates to the samples separated by 2D-PAGE. Furthermore, PRODIS allows the tracking of experiments from the first stage until the final step of identification, enabling an efficient management of the complete experimental process. Conclusions The construction of data management systems for Proteomics data importing and storing is a relevant subject. PRODIS is a system complementary to other proteomics tools that combines a powerful storage engine (the relational database) and a friendly access interface, aiming to assist Proteomics research directly at data handling and storage.
|
39
|
Wu DD, Irwin DM, Zhang YP. De novo origin of human protein-coding genes. PLoS Genet 2011; 7:e1002379. [PMID: 22102831 PMCID: PMC3213175 DOI: 10.1371/journal.pgen.1002379] [Citation(s) in RCA: 122] [Impact Index Per Article: 9.4] [Received: 05/10/2011] [Accepted: 09/21/2011] [Indexed: 11/24/2022] Open
Abstract
The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA–seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that these genes contribute to phenotypic traits that are unique to humans, such as improved cognitive ability. Our results are inconsistent with the traditional view that the de novo origin of new genes is very rare, thus there should be greater appreciation of the importance of the de novo origination of genes. The origin of genes can involve mechanisms such as gene duplication, exon shuffling, retroposition, mobile elements, lateral gene transfer, gene fusion/fission, and de novo origination. However, de novo origin, which means genes originate from a non-coding DNA region, is considered to be a very rare occurrence. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee, supported by both transcriptional and proteomic evidence. It is inconsistent with the traditional view that the de novo origin of new genes is rare. RNA–seq data indicate that these de novo originated genes have their highest expression in the cerebral cortex and testes, suggesting these genes may contribute to phenotypic traits that are unique to humans, such as development of cognitive ability. Therefore, the importance of de novo origination needs greater appreciation.
Affiliation(s)
- Dong-Dong Wu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- David M. Irwin
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Canada
- Banting and Best Diabetes Centre, University of Toronto, Toronto, Canada
- Ya-Ping Zhang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Laboratory for Conservation and Utilization of Bio-resource, Yunnan University, Kunming, China
|
40
|
Zhang YE, Landback P, Vibranovski MD, Long M. Accelerated recruitment of new brain development genes into the human genome. PLoS Biol 2011; 9:e1001179. [PMID: 22028629 PMCID: PMC3196496 DOI: 10.1371/journal.pbio.1001179] [Citation(s) in RCA: 114] [Impact Index Per Article: 8.8] [Received: 03/25/2011] [Accepted: 09/08/2011] [Indexed: 11/24/2022] Open
Abstract
How the human brain evolved has attracted tremendous interest for decades. Motivated by case studies of primate-specific genes implicated in brain function, we examined whether or not the young genes, those emerging genome-wide in the lineages specific to the primates or rodents, showed distinct spatial and temporal patterns of transcription compared to old genes, which had existed before the primate and rodent split. We found consistent patterns across different sources of expression data: there is a significantly larger proportion of young genes expressed in the fetal or infant brain of humans than in mouse, and more young genes in humans have expression biased toward early developing brains than old genes. Most of these young genes are expressed in the evolutionarily newest part of human brain, the neocortex. Remarkably, we also identified a number of human-specific genes which are expressed in the prefrontal cortex, which is implicated in complex cognitive behaviors. The young genes upregulated in the early developing human brain play diverse functional roles, with a significant enrichment of transcription factors. Genes originating from different mechanisms show a similar expression bias in the developing brain. Moreover, we found that the young genes upregulated in early brain development showed rapid protein evolution compared to old genes also expressed in the fetal brain. Strikingly, genes expressed in the neocortex arose soon after its morphological origin. These four lines of evidence suggest that positive selection for brain function may have contributed to the origination of young genes expressed in the developing brain. These data demonstrate a striking recruitment of new genes into the early development of the human brain.
Affiliation(s)
- Yong E. Zhang
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
- Patrick Landback
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
- Maria D. Vibranovski
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
- Manyuan Long
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
|
41
|
Wilhelm M, Kirchner M, Steen JAJ, Steen H. mz5: space- and time-efficient storage of mass spectrometry data sets. Mol Cell Proteomics 2011; 11:O111.011379. [PMID: 21960719 DOI: 10.1074/mcp.o111.011379] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Indexed: 11/06/2022] Open
Abstract
Across a host of MS-driven-omics fields, researchers witness the acquisition of ever increasing amounts of high throughput MS data and face the need for their compact yet efficiently accessible storage. Addressing the need for an open data exchange format, the Proteomics Standards Initiative and the Seattle Proteome Center at the Institute for Systems Biology independently developed the mzData and mzXML formats, respectively. In a subsequent joint effort, they defined an ontology and associated controlled vocabulary that specifies the contents of MS data files, implemented as the newer mzML format. All three formats are based on XML and are thus not particularly efficient in either storage space requirements or read/write speed. This contribution introduces mz5, a complete reimplementation of the mzML ontology that is based on the efficient, industrial strength storage backend HDF5. Compared with the current mzML standard, this strategy yields an average file size reduction to ∼54% and increases linear read and write speeds ∼3-4-fold. The format is implemented as part of the ProteoWizard project and is available under a permissive Apache license. Additional information and download links are available from http://software.steenlab.org/mz5.
Affiliation(s)
- Mathias Wilhelm
- Proteomics Center, Children's Hospital Boston, Boston, Massachusetts; Faculty of Technology, University Bielefeld, Bielefeld, Germany; Department of Pathology, Children's Hospital Boston, Boston, Massachusetts
- Marc Kirchner
- Proteomics Center, Children's Hospital Boston, Boston, Massachusetts; Department of Pathology, Children's Hospital Boston, Boston, Massachusetts; Department of Pathology, Harvard Medical School, Boston, Massachusetts
- Judith A J Steen
- Proteomics Center, Children's Hospital Boston, Boston, Massachusetts; Department of Neurobiology, Harvard Medical School and F. M. Kirby Neurobiology Center, Children's Hospital, Boston, Massachusetts
- Hanno Steen
- Proteomics Center, Children's Hospital Boston, Boston, Massachusetts; Department of Pathology, Children's Hospital Boston, Boston, Massachusetts; Department of Pathology, Harvard Medical School, Boston, Massachusetts
|
42
|
Courcelles M, Lemieux S, Voisin L, Meloche S, Thibault P. ProteoConnections: A bioinformatics platform to facilitate proteome and phosphoproteome analyses. Proteomics 2011; 11:2654-71. [DOI: 10.1002/pmic.201000776] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 12/20/2010] [Revised: 03/20/2011] [Accepted: 04/05/2011] [Indexed: 01/02/2023]
|
43
|
Severing EI, van Dijk ADJ, van Ham RCHJ. Assessing the contribution of alternative splicing to proteome diversity in Arabidopsis thaliana using proteomics data. BMC Plant Biol 2011; 11:82. [PMID: 21575182 PMCID: PMC3118179 DOI: 10.1186/1471-2229-11-82] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 02/06/2011] [Accepted: 05/16/2011] [Indexed: 05/19/2023]
Abstract
BACKGROUND Large-scale analyses of genomics and transcriptomics data have revealed that alternative splicing (AS) substantially increases the complexity of the transcriptome in higher eukaryotes. However, the extent to which this complexity is reflected at the level of the proteome remains unclear. On the basis of a lack of conservation of AS between species, we previously concluded that AS does not frequently serve as a mechanism that enables the production of multiple functional proteins from a single gene. Following this conclusion, we hypothesized that the extent to which AS events contribute to the proteome diversity in Arabidopsis thaliana would be lower than expected on the basis of transcriptomics data. Here, we test this hypothesis by analyzing two large-scale proteomics datasets from Arabidopsis thaliana. RESULTS A total of only 60 AS events could be confirmed using the proteomics data. However, for about 60% of the loci that, based on transcriptomics data, were predicted to produce multiple protein isoforms through AS, no isoform-specific peptides were found. We therefore performed in silico AS detection experiments to assess how well AS events were represented in the experimental datasets. The results of these in silico experiments indicated that the low number of confirmed AS events was the consequence of a limited sampling depth rather than in vivo under-representation of AS events in these datasets. CONCLUSION Although the impact of AS on the functional properties of the proteome remains to be uncovered, the results of this study indicate that AS-induced diversity at the transcriptome level is also expressed at the proteome level.
Affiliation(s)
- Edouard I Severing
- Applied Bioinformatics, Plant Research International, PO Box 619, 6700 AP Wageningen, The Netherlands
- Laboratory of Bioinformatics, Wageningen University, PO BOX 8128, 6700 ET Wageningen, The Netherlands
- Netherlands Bioinformatics Centre, PO BOX 9101, 6500 HB Nijmegen, The Netherlands
- Aalt DJ van Dijk
- Applied Bioinformatics, Plant Research International, PO Box 619, 6700 AP Wageningen, The Netherlands
- Roeland CHJ van Ham
- Applied Bioinformatics, Plant Research International, PO Box 619, 6700 AP Wageningen, The Netherlands
- Laboratory of Bioinformatics, Wageningen University, PO BOX 8128, 6700 ET Wageningen, The Netherlands
- Netherlands Bioinformatics Centre, PO BOX 9101, 6500 HB Nijmegen, The Netherlands
- Current address: Keygene N.V., P.O. Box 216, 6700 AE Wageningen, The Netherlands
|
44
|
Gallien S, Duriez E, Domon B. Selected reaction monitoring applied to proteomics. J Mass Spectrom 2011; 46:298-312. [PMID: 21394846 DOI: 10.1002/jms.1895] [Citation(s) in RCA: 202] [Impact Index Per Article: 15.5] [Indexed: 05/08/2023]
Abstract
Selected reaction monitoring (SRM) performed on triple quadrupole mass spectrometers has been the reference quantitative technique to analyze small molecules for several decades. It is now emerging in proteomics as the ideal tool to complement shotgun qualitative studies; targeted SRM quantitative analysis offers high selectivity, sensitivity and a wide dynamic range. However, SRM applied to proteomics presents singularities that distinguish it from small molecules analysis. This review is an overview of SRM technology and describes the specificities and the technical aspects of proteomics experiments. Ongoing developments aiming at increasing multiplexing capabilities of SRM are discussed; they dramatically improve its throughput and extend its field of application to directed or supervised discovery experiments.
Affiliation(s)
- Sebastien Gallien
- Luxembourg Clinical Proteomics center (LCP), Centre de Recherche Public de la Santé, 1 B rue Thomas Edison, L-1445 Strassen, Luxembourg
|
45
|
Eisenacher M, Schnabel A, Stephan C. Quality meets quantity - quality control, data standards and repositories. Proteomics 2011; 11:1031-6. [DOI: 10.1002/pmic.201000441] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 07/16/2010] [Revised: 11/03/2010] [Accepted: 11/03/2010] [Indexed: 12/23/2022]
|
46
|
Burkard TR, Planyavsky M, Kaupe I, Breitwieser FP, Bürckstümmer T, Bennett KL, Superti-Furga G, Colinge J. Initial characterization of the human central proteome. BMC Syst Biol 2011; 5:17. [PMID: 21269460 PMCID: PMC3039570 DOI: 10.1186/1752-0509-5-17] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.5] [Received: 05/21/2010] [Accepted: 01/26/2011] [Indexed: 01/20/2023]
Abstract
Background On the basis of large proteomics datasets measured from seven human cell lines we consider their intersection as an approximation of the human central proteome, which is the set of proteins ubiquitously expressed in all human cells. Composition and properties of the central proteome are investigated through bioinformatics analyses. Results We experimentally identify a central proteome comprising 1,124 proteins that are ubiquitously and abundantly expressed in human cells using state of the art mass spectrometry and protein identification bioinformatics. The main represented functions are proteostasis, primary metabolism and proliferation. We further characterize the central proteome considering gene structures, conservation, interaction networks, pathways, drug targets, and coordination of biological processes. Among other new findings, we show that the central proteome is encoded by exon-rich genes, indicating an increased regulatory flexibility through alternative splicing to adapt to multiple environments, and that the protein interaction network linking the central proteome is very efficient for synchronizing translation with other biological processes. Surprisingly, at least 10% of the central proteome has no or very limited functional annotation. Conclusions Our data and analysis provide a new and deeper description of the human central proteome compared to previous results thereby extending and complementing our knowledge of commonly expressed human proteins. All the data are made publicly available to help other researchers who, for instance, need to compare or link focused datasets to a common background.
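The core idea in this abstract, approximating the central proteome as the intersection of the proteins identified in each cell line, can be sketched in a few lines of Python. This is only an illustration of the set operation: the cell-line names and protein identifiers below are hypothetical placeholders, and the sketch ignores the identification and abundance criteria the authors would have applied.

```python
from typing import Dict, Iterable, Set


def central_proteome(identified: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the proteins identified in every cell line (set intersection)."""
    protein_sets = [set(proteins) for proteins in identified.values()]
    if not protein_sets:
        return set()
    return set.intersection(*protein_sets)


# Hypothetical cell lines and protein identifiers, for illustration only.
IDENTIFIED = {
    "cell_line_1": ["P1", "P2", "P3", "P4"],
    "cell_line_2": ["P2", "P3", "P4", "P5"],
    "cell_line_3": ["P3", "P4", "P6"],
}

print(sorted(central_proteome(IDENTIFIED)))
```

Here only P3 and P4 appear in all three sample lists, so they form the intersection. Adding more cell lines can only shrink or preserve the result, which is why the intersection is a conservative approximation of ubiquitously expressed proteins.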
Affiliation(s)
- Thomas R Burkard
- CeMM - Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 19/3, A-1090 Vienna, Austria
|
47
|
Abstract
With the advent of more powerful and sensitive analytical techniques and instruments, the field of mass spectrometry based proteomics has seen a considerable increase in the amount of generated data. Correspondingly, the need to make these data publicly available in centralized online databases has also become more pressing. As a result, several such databases have been created, and steps are currently being taken to integrate these different systems under a single worldwide data-sharing umbrella. This chapter will discuss the importance of such databases and the necessary infrastructure that these databases require for efficient operation. Furthermore, the various kinds of information that proteomics databases can store will be described, along with the different types of databases that are available today. Finally, a selection of prominent repositories will be described in more detail, together with the international ProteomeXchange consortium that is aimed at uniting all the different databases in a global data sharing collaboration.
Affiliation(s)
- Lennart Martens
- EMBL Outstation, European Bioinformatics Institute (EBI), Cambridge, UK.
|
48
|
Barsnes H, Vizcaíno JA, Reisinger F, Eidhammer I, Martens L. Submitting proteomics data to PRIDE using PRIDE Converter. Methods Mol Biol 2011; 694:237-253. [PMID: 21082439 DOI: 10.1007/978-1-60761-977-2_16] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 05/30/2023]
Abstract
With the continuously growing amount of proteomics data being produced, it has become increasingly important to make these data publicly available so that they can be audited, reanalyzed, and reused. More and more journals are also starting to request the deposition of MS data in publicly available repositories for submitted proteomics manuscripts. In this chapter we focus on one of the most commonly used proteomics data repositories, PRIDE (the PRoteomics IDEntifications database, http://www.ebi.ac.uk/pride), and demonstrate how a new graphical user interface tool called PRIDE Converter (http://pride-converter.googlecode.com) greatly simplifies the submission of data to PRIDE.
Affiliation(s)
- Harald Barsnes
- Department of Informatics, University of Bergen, Bergen, Norway
|
49
|
Jones P. Analysing proteomics identifications in the context of functional and structural protein annotation: integrating annotation using PICR, DAS, and BioMart. Methods Mol Biol 2011; 696:107-121. [PMID: 21063944 DOI: 10.1007/978-1-60761-987-1_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2023]
Abstract
For many species, there is a wealth of detailed annotation of individual proteins available to the proteomics researcher. Accessing and making the best use of this annotation can be problematic in the absence of suitable bioinformatics support. This chapter explores some of the technologies and tools that allow protein annotation to be accessed and collated from multiple sources. The intended audience is the proteomics scientist who has limited or no access to bioinformatics/programming support and wishes to make the best use of existing resources to annotate sets of protein identifications derived from mass spectrometry and related techniques.
Affiliation(s)
- Philip Jones
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK.
|
50
|
Vizcaíno JA, Reisinger F, Côté R, Martens L. PRIDE and "Database on Demand" as valuable tools for computational proteomics. Methods Mol Biol 2011; 696:93-105. [PMID: 21063943 DOI: 10.1007/978-1-60761-987-1_6] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 02/03/2023]
Abstract
The Proteomics Identifications Database (PRIDE, http://www.ebi.ac.uk/pride) provides users with the ability to explore and compare mass spectrometry-based proteomics experiments that reveal details of the protein expression found in a broad range of taxonomic groups, tissues, and disease states. A PRIDE experiment typically includes identifications of proteins, peptides, and protein modifications. Additionally, many of the submitted experiments also include the mass spectra that provide the evidence for these identifications. Finally, one of the strongest advantages of PRIDE in comparison with other proteomics repositories is the amount of metadata it contains, a key point to put the above-mentioned data in biological and/or technical context. Several informatics tools have been developed in support of the PRIDE database. The most recent one is called "Database on Demand" (DoD), which allows custom sequence databases to be built in order to optimize the results from search engines. We describe the use of DoD in this chapter. Additionally, in order to show the potential of PRIDE as a source for data mining, we also explore complex queries using federated BioMart queries to integrate PRIDE data with other resources, such as Ensembl, Reactome, or UniProt.
Affiliation(s)
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
|