1
Kaulich PT, Jeong K, Kohlbacher O, Tholey A. Influence of different sample preparation approaches on proteoform identification by top-down proteomics. Nat Methods 2024:10.1038/s41592-024-02481-6. [PMID: 39438734 DOI: 10.1038/s41592-024-02481-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Accepted: 09/23/2024] [Indexed: 10/25/2024]
Abstract
Top-down proteomics using mass spectrometry facilitates the identification of intact proteoforms, that is, all molecular forms of proteins. Multiple past advances have led to the development of numerous sample preparation workflows. Here we systematically investigated the influence of different sample preparation steps on proteoform and protein identifications, including cell lysis, reduction and alkylation, proteoform enrichment, purification and fractionation. We found that all steps in sample preparation influence the subset of proteoforms identified (for example, their number, confidence, physicochemical properties and artificially generated modifications). The various sample preparation strategies resulted in complementary identifications, substantially increasing the proteome coverage. Overall, we identified 13,975 proteoforms from 2,720 proteins of human Caco-2 cells. The results presented can serve as suggestions for designing and adapting top-down proteomics sample preparation strategies to particular research questions. Moreover, we expect that the sampling bias and modifications identified at the intact protein level will also be useful in improving bottom-up proteomics approaches.
Affiliation(s)
- Philipp T Kaulich
  - Systematic Proteome Research and Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
- Kyowon Jeong
  - Applied Bioinformatics, Computer Science Department, University of Tübingen, Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
- Oliver Kohlbacher
  - Applied Bioinformatics, Computer Science Department, University of Tübingen, Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany
  - Translational Bioinformatics, University Hospital Tübingen, Tübingen, Germany
- Andreas Tholey
  - Systematic Proteome Research and Bioanalytics, Institute for Experimental Medicine, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
2
Pesavento JJ, Bindra MS, Das U, Rommelfanger SR, Zhou M, Paša-Tolić L, Umen JG. pyMS-Vis, an Open-Source Python Application for Visualizing and Investigating Deconvoluted Top-Down Mass Spectrometric Experiments: A Histone Proteoform Case Study. Anal Chem 2024; 96:14727-14733. [PMID: 39213479 PMCID: PMC11411490 DOI: 10.1021/acs.analchem.4c02650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
We report the development of an open-source Python application that provides quantitative and qualitative information from deconvoluted liquid-chromatography top-down mass spectrometry (LC-TDMS) data sets. This simple-to-use program allows users to search masses-of-interest across multiple LC-TDMS runs and provides visualization of their ion intensities and elution characteristics while quantifying their abundances relative to one another. Focusing on proteoform-rich histone proteins from the green microalga Chlamydomonas reinhardtii, we were able to quantify proteoform abundances across different growth conditions and replicates in minutes instead of the hours typically needed for manual spreadsheet-based analysis. This resulted in extending previously published qualitative observations on Chlamydomonas histone proteoforms into quantitative ones, leading to an exciting new discovery on alpha-amino termini processing exclusive to histone H2A family members. Lastly, the script was intentionally developed with readability and customizability in mind so that fellow mass spectrometrists can modify the code to suit their lab-specific needs.
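The mass search and relative quantification described in this abstract can be illustrated with a minimal sketch. This is not the pyMS-Vis code: the function names, the `(mass, intensity)` tuple layout, and the 10 ppm default tolerance are assumptions made only for illustration.

```python
def ppm_error(observed_mass, target_mass):
    """Signed mass error of an observed mass relative to a target, in ppm."""
    return (observed_mass - target_mass) / target_mass * 1e6


def relative_abundances(run, targets, tol_ppm=10.0):
    """Sum intensities of deconvoluted masses matching each target mass.

    run     -- list of (mass, intensity) tuples from one deconvoluted run
               (hypothetical layout, not a real file format)
    targets -- dict mapping a label to a mass of interest in Da
    tol_ppm -- matching tolerance in ppm (assumed default)

    Returns a dict of label -> fraction of the summed matched intensity.
    """
    totals = {name: 0.0 for name in targets}
    for mass, intensity in run:
        for name, target in targets.items():
            if abs(ppm_error(mass, target)) <= tol_ppm:
                totals[name] += intensity
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No target was found in this run.
        return {name: 0.0 for name in targets}
    return {name: total / grand_total for name, total in totals.items()}
```

Calling `relative_abundances` once per run with the same `targets` dictionary would give one abundance profile per growth condition or replicate, which is the kind of comparison the abstract describes doing by spreadsheet.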
Affiliation(s)
- James J Pesavento
  - Saint Mary's College of California, Moraga, California 94575, United States
- Megan S Bindra
  - Saint Mary's College of California, Moraga, California 94575, United States
- Udayan Das
  - Saint Mary's College of California, Moraga, California 94575, United States
- Sarah R Rommelfanger
  - Donald Danforth Plant Science Center, St. Louis, Missouri 63132, United States
  - Washington University in St. Louis, St. Louis, Missouri 63130, United States
- Mowei Zhou
  - Environmental Molecular Science Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Ljiljana Paša-Tolić
  - Environmental Molecular Science Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- James G Umen
  - Donald Danforth Plant Science Center, St. Louis, Missouri 63132, United States
  - Washington University in St. Louis, St. Louis, Missouri 63130, United States
3
Turner NP. Playing pin-the-tail-on-the-protein in extracellular vesicle (EV) proteomics. Proteomics 2024; 24:e2400074. [PMID: 38899939 DOI: 10.1002/pmic.202400074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 06/11/2024] [Accepted: 06/12/2024] [Indexed: 06/21/2024]
Abstract
Extracellular vesicles (EVs) are anucleate particles enclosed by a lipid bilayer that are released from cells via exocytosis or direct budding from the plasma membrane. They contain an array of important molecular cargo such as proteins, nucleic acids, and lipids, and can transfer these cargoes to recipient cells as a means of intercellular communication. One of the overarching paradigms in the field of EV research is that EV cargo should reflect the biological state of the cell of origin. The true relationship or extent of this correlation is confounded by many factors, including the numerous ways one can isolate or enrich EVs, overlap in the biophysical properties of different classes of EVs, and analytical limitations. This presents a challenge to research aimed at detecting low-abundant EV-encapsulated nucleic acids or proteins in biofluids for biomarker research and underpins technical obstacles in the confident assessment of the proteomic landscape of EVs that may be affected by sample-type specific or disease-associated proteoforms. Improving our understanding of EV biogenesis, cargo loading, and developments in top-down proteomics may guide us towards advanced approaches for selective EV and molecular cargo enrichment, which could aid EV diagnostics and therapeutics research.
Affiliation(s)
- Natalie P Turner
  - Faculty of Health, Queensland University of Technology, Kelvin Grove, Australia
4
Gant MS, Chamot-Rooke J. Present and future perspectives on mass spectrometry for clinical microbiology. Microbes Infect 2024; 26:105296. [PMID: 38199266 DOI: 10.1016/j.micinf.2024.105296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 12/01/2023] [Accepted: 01/05/2024] [Indexed: 01/12/2024]
Abstract
In the last decade, MALDI-TOF Mass Spectrometry (MALDI-TOF MS) has been introduced and broadly accepted by clinical laboratories throughout the world as a powerful and efficient tool for rapid microbial identification. During the MALDI-TOF MS process, microbes are identified using either intact cells or cell extracts. The process is rapid, sensitive, and economical in terms of both labor and costs involved. Whilst MALDI-TOF MS is currently the gold standard, it suffers from several shortcomings such as lack of direct information on antibiotic resistance, poor depth of analysis and insufficient discriminatory power for the distinction of closely related bacterial species or for reliably sub-differentiating isolates to the level of clones or strains. Thus, new approaches targeting proteins and allowing a better characterization of bacterial strains are strongly needed, if possible, on a very short time scale after sample collection in the hospital. Bottom-up proteomics (BUP) is a nice alternative to MALDI-TOF MS, offering the possibility for in-depth proteome analysis. Top-down proteomics (TDP) provides the highest molecular precision in proteomics, allowing the characterization of proteins at the proteoform level. A number of studies have already demonstrated the potential of these techniques in clinical microbiology. In this review, we will discuss the current state-of-the-art of MALDI-TOF MS for the rapid microbial identification and detection of resistance to antibiotics and describe emerging approaches, including bottom-up and top-down proteomics as well as ambient MS technologies.
Affiliation(s)
- Megan S Gant
  - Institut Pasteur, Université Paris Cité, CNRS UAR 2024, Mass Spectrometry for Biology 75015 Paris, France
- Julia Chamot-Rooke
  - Institut Pasteur, Université Paris Cité, CNRS UAR 2024, Mass Spectrometry for Biology 75015 Paris, France
5
Zhong J, Song X, Wang S. FREE: Enhanced Feature Representation for Isotopic Envelope Evaluation in Top-Down Mass Spectra Deconvolution. Anal Chem 2024; 96:12602-12615. [PMID: 39037184 DOI: 10.1021/acs.analchem.4c00152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/23/2024]
Abstract
The aim of deconvolution of top-down mass spectra is to recognize monoisotopic peaks from the experimental envelopes in raw mass spectra. Accurate assessment of the similarity between theoretical and experimental envelopes is therefore a critical step in mass spectral data deconvolution. Existing evaluation methods rely primarily on intensity differences and m/z similarity and may not give a comprehensive assessment. To overcome this limitation, more effective features for assessing the correspondence between theoretical and experimental envelopes need to be systematically explored and identified. We present enhanced feature representation for isotopic envelope evaluation (FREE), which derives diverse feature representations that capture fundamental physical attributes of envelopes, including peak intensity and envelope shape. We trained FREE and evaluated its performance on both the ovarian tumor (OT) (human OT cells) data set and the zebrafish (ZF) (brain in mature female ZF) data set. Compared with the state-of-the-art method, FREE demonstrates higher performance in multiple evaluation metrics across both the OT and ZF data sets, particularly in precision, and it accurately predicts a greater number of positive envelopes among the top-ranked envelopes based on their scores. Moreover, within a cross-species data set of ZF, FREE identified a higher number of proteoform-spectrum matches (PrSMs), increasing the count from 50,795 to 52,927 compared to EnvCNN. The combination of FREE with TopFD also identified 117,883 fragment ions, surpassing the 97,554 fragment ions identified by EnvCNN in conjunction with TopFD. To further validate the performance of FREE, we tested 10 cross-species top-down proteomes containing 36 sub-data sets from ProteomeXchange. The results reveal that, after deconvolution with TopFD + FREE, TopPIC identifies more PrSMs across these 10 data sets in both the first and second rounds of experiments. These findings underscore the robustness and generalization capabilities of the FREE approach in diverse proteomes.
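For orientation, the kind of intensity-and-m/z baseline that the abstract contrasts FREE against can be sketched as follows. This is not FREE, EnvCNN, or TopFD code: it is a generic cosine-similarity score between a theoretical and an experimental isotopic envelope, and the function names, the `(m/z, intensity)` tuple layout, and the 0.01 m/z tolerance are all assumptions for illustration.

```python
import math


def match_envelope(theoretical, experimental, tol_mz=0.01):
    """Pair each theoretical peak with the closest experimental peak.

    Both inputs are lists of (mz, intensity) tuples (hypothetical layout).
    A theoretical peak with no experimental peak within tol_mz is paired
    with an intensity of 0.0.
    """
    pairs = []
    for theo_mz, theo_intensity in theoretical:
        best_intensity = 0.0
        best_delta = tol_mz
        for exp_mz, exp_intensity in experimental:
            delta = abs(exp_mz - theo_mz)
            if delta <= best_delta:
                best_intensity, best_delta = exp_intensity, delta
        pairs.append((theo_intensity, best_intensity))
    return pairs


def cosine_score(pairs):
    """Cosine similarity between theoretical and matched intensities."""
    dot = sum(t * e for t, e in pairs)
    norm_theo = math.sqrt(sum(t * t for t, _ in pairs))
    norm_exp = math.sqrt(sum(e * e for _, e in pairs))
    if norm_theo == 0 or norm_exp == 0:
        return 0.0
    return dot / (norm_theo * norm_exp)
```

A score near 1 means the experimental peaks reproduce the theoretical intensity pattern up to a scale factor; a missing isotopic peak lowers the score. A single number of this kind is the sort of limited view the abstract argues a richer feature representation should improve on.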
Affiliation(s)
- Jiancheng Zhong
  - College of Information Science and Engineering, Hunan Normal University, ChangSha 410081, China
- Xingran Song
  - College of Information Science and Engineering, Hunan Normal University, ChangSha 410081, China
- Shaokai Wang
  - David R. Cheriton School of Computer Science, University of Waterloo, Waterloo N2L 3G1, Canada
6
Xu T, Wang Q, Wang Q, Sun L. Mass spectrometry-intensive top-down proteomics: an update on technology advancements and biomedical applications. Analytical Methods: Advancing Methods and Applications 2024; 16:4664-4682. [PMID: 38973469 PMCID: PMC11257149 DOI: 10.1039/d4ay00651h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Accepted: 06/25/2024] [Indexed: 07/09/2024]
Abstract
Proteoforms are all forms of protein molecules from the same gene that arise from variations at the DNA, RNA, and protein levels, e.g., alternative splicing and post-translational modifications (PTMs). Delineation of proteins in a proteoform-specific manner is crucial for understanding their biological functions. Mass spectrometry (MS)-intensive top-down proteomics (TDP) is promising for comprehensively characterizing intact proteoforms in complex biological systems. It has achieved substantial progress in technological development, including sample preparation, proteoform separations, MS instrumentation, and bioinformatics tools. In a single TDP study, thousands of proteoforms can be identified and quantified from a cell lysate. It has also been applied in various areas of biomedical research to better our understanding of protein function in regulating cellular processes and to discover novel proteoform biomarkers of diseases for early diagnosis and therapeutic development. This review covers the most recent technological development and biomedical applications of MS-intensive TDP.
Affiliation(s)
- Tian Xu
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, MI 48824, USA
- Qianjie Wang
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, MI 48824, USA
- Qianyi Wang
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, MI 48824, USA
- Liangliang Sun
  - Department of Chemistry, Michigan State University, 578 S Shaw Lane, East Lansing, MI 48824, USA
7
Bailey AO, Durbin KR, Robey MT, Palmer LK, Russell WK. Filling the gaps in peptide maps with a platform assay for top-down characterization of purified protein samples. Proteomics 2024:e2400036. [PMID: 39004851 DOI: 10.1002/pmic.202400036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2024] [Revised: 06/17/2024] [Accepted: 06/18/2024] [Indexed: 07/16/2024]
Abstract
Liquid chromatography-mass spectrometry (LC-MS) intact mass analysis and LC-MS/MS peptide mapping are decisional assays for developing biological drugs and other commercial protein products. Certain PTM types, such as truncation and oxidation, increase the difficulty of precise proteoform characterization owing to inherent limitations in peptide and intact protein analyses. Top-down MS (TDMS) can resolve this ambiguity via fragmentation of specific proteoforms. We leveraged the strengths of flow-programmed (fp) denaturing online buffer exchange (dOBE) chromatography, including robust automation, relatively high ESI sensitivity, and long MS/MS window time, to support a TDMS platform for industrial protein characterization. We tested data-dependent (DDA) and targeted strategies using 14 different MS/MS scan types featuring combinations of collisional- and electron-based fragmentation as well as proton transfer charge reduction. This large, focused dataset was processed using a new software platform, named TDAcquireX, that improves proteoform characterization through TDMS data aggregation. A DDA-based workflow provided objective identification of αLac truncation proteoforms with a two-termini clipping search. A targeted TDMS workflow facilitated the characterization of αLac oxidation positional isomers. This strategy relied on using sliding window-based fragment ion deconvolution to generate composite proteoform spectral match (cPrSM) results amenable to fragment noise filtering, which is a fundamental enhancement relevant to TDMS applications generally.
Affiliation(s)
- Aaron O Bailey
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, USA
- Lee K Palmer
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, USA
- William K Russell
  - Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, USA
8
Zemaitis KJ, Fulcher JM, Kumar R, Degnan DJ, Lewis LA, Liao YC, Veličković M, Williams SM, Moore RJ, Bramer LM, Veličković D, Zhu Y, Zhou M, Paša-Tolić L. Spatial top-down proteomics for the functional characterization of human kidney. bioRxiv: the preprint server for biology 2024:2024.02.13.580062. [PMID: 38405958 PMCID: PMC10888776 DOI: 10.1101/2024.02.13.580062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
Background: The Human Proteome Project has credibly detected nearly 93% of the roughly 20,000 proteins which are predicted by the human genome. However, the proteome is enigmatic: alterations in amino acid sequences from polymorphisms and alternative splicing, errors in translation, and post-translational modifications result in a proteome depth estimated at several million unique proteoforms. Recently, mass spectrometry has been demonstrated in several landmark efforts mapping the human proteoform landscape in bulk analyses. Herein, we developed an integrated workflow for characterizing proteoforms from human tissue in a spatially resolved manner by coupling laser capture microdissection, nanoliter-scale sample preparation, and mass spectrometry imaging. Results: Using healthy human kidney sections as the case study, we focused our analyses on the major functional tissue units including glomeruli, tubules, and medullary rays. After laser capture microdissection, these isolated functional tissue units were processed with microPOTS (microdroplet processing in one-pot for trace samples) for sensitive top-down proteomics measurement. This provided a quantitative database of 616 proteoforms that was further leveraged as a library for mass spectrometry imaging with near-cellular spatial resolution over the entire section. Notably, several mitochondrial proteoforms were found to be differentially abundant between glomeruli and convoluted tubules. Mass spectrometry imaging provided further spatial contextualization, confirming the differences identified by microPOTS and expanding the field-of-view for unique distributions such as enhanced abundance of a truncated form (1-74) of ubiquitin within cortical regions. Conclusions: We developed an integrated workflow to directly identify proteoforms and reveal their spatial distributions. Of the 20 differentially abundant proteoforms identified by microPOTS as discriminating between tubules and glomeruli, the vast majority of tubular proteoforms were of mitochondrial origin (8 of 10), whereas the discriminating proteoforms in glomeruli were primarily hemoglobin subunits (9 of 10). These trends were also identified within ion images, demonstrating spatially resolved characterization of proteoforms, which has the potential to reshape discovery-based proteomics because proteoforms are the ultimate effectors of cellular functions. Applications of this technology have the potential to unravel etiology and pathophysiology of disease states, informing on biologically active proteoforms, which remodel the proteomic landscape in chronic and acute disorders.
Affiliation(s)
- Kevin J. Zemaitis
  - Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- James M. Fulcher
  - Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- Rashmi Kumar
  - Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- David J. Degnan
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- Logan A. Lewis
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- Yen-Chen Liao
  - Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- Marija Veličković
  - Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- Sarah M. Williams
  - Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- Ronald J. Moore
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- Lisa M. Bramer
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- Dušan Veličković
  - Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- Ying Zhu
  - Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- Mowei Zhou
  - Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
- Ljiljana Paša-Tolić
  - Environmental Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99354, United States
9
Roberts DS, Loo JA, Tsybin YO, Liu X, Wu S, Chamot-Rooke J, Agar JN, Paša-Tolić L, Smith LM, Ge Y. Top-down proteomics. Nature Reviews Methods Primers 2024; 4:38. [PMID: 39006170 PMCID: PMC11242913 DOI: 10.1038/s43586-024-00318-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 04/24/2024] [Indexed: 07/16/2024]
Abstract
Proteoforms, which arise from post-translational modifications, genetic polymorphisms and RNA splice variants, play a pivotal role as drivers in biology. Understanding proteoforms is essential to unravel the intricacies of biological systems and bridge the gap between genotypes and phenotypes. By analysing whole proteins without digestion, top-down proteomics (TDP) provides a holistic view of the proteome and can decipher protein function, uncover disease mechanisms and advance precision medicine. This Primer explores TDP, including the underlying principles, recent advances and an outlook on the future. The experimental section discusses instrumentation, sample preparation, intact protein separation, tandem mass spectrometry techniques and data collection. The results section looks at how to decipher raw data, visualize intact protein spectra and unravel data analysis. Additionally, proteoform identification, characterization and quantification are summarized, alongside approaches for statistical analysis. Various applications are described, including the human proteoform project and biomedical, biopharmaceutical and clinical sciences. These are complemented by discussions on measurement reproducibility, limitations and a forward-looking perspective that outlines areas where the field can advance, including potential future applications.
Affiliation(s)
- David S Roberts
  - Department of Chemistry, Stanford University, Stanford, CA, USA
  - Sarafan ChEM-H, Stanford University, Stanford, CA, USA
- Joseph A Loo
  - Department of Chemistry and Biochemistry, Department of Biological Chemistry, University of California - Los Angeles, Los Angeles, CA, USA
- Xiaowen Liu
  - Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA, USA
- Si Wu
  - Department of Chemistry and Biochemistry, The University of Alabama, Tuscaloosa, AL, USA
- Jeffrey N Agar
  - Departments of Chemistry and Chemical Biology and Pharmaceutical Sciences, Northeastern University, Boston, MA, USA
- Ljiljana Paša-Tolić
  - Environmental and Molecular Sciences Division, Pacific Northwest National Laboratory, Richland, WA, USA
- Lloyd M Smith
  - Department of Chemistry, University of Wisconsin, Madison, WI, USA
- Ying Ge
  - Department of Chemistry, University of Wisconsin, Madison, WI, USA
  - Department of Cell and Regenerative Biology, Human Proteomics Program, University of Wisconsin - Madison, Madison, WI, USA
10
Coorssen JR, Padula MP. Proteomics-The State of the Field: The Definition and Analysis of Proteomes Should Be Based in Reality, Not Convenience. Proteomes 2024; 12:14. [PMID: 38651373 PMCID: PMC11036260 DOI: 10.3390/proteomes12020014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 04/17/2024] [Accepted: 04/17/2024] [Indexed: 04/25/2024]
Abstract
With growing recognition and acknowledgement of the genuine complexity of proteomes, we are finally entering the post-proteogenomic era. Routine assessment of proteomes as inferred correlates of gene sequences (i.e., canonical 'proteins') cannot provide the necessary critical analysis of systems-level biology that is needed to understand underlying molecular mechanisms and pathways or identify the most selective biomarkers and therapeutic targets. These critical requirements demand the analysis of proteomes at the level of proteoforms/protein species, the actual active molecular players. Currently, only highly refined integrated or integrative top-down proteomics (iTDP) enables the analytical depth necessary to provide routine, comprehensive, and quantitative proteome assessments across the widest range of proteoforms inherent to native systems. Here we provide a broad perspective of the field, taking in historical and current realities, to establish a more balanced understanding of where the field has come from (in particular during the ten years since Proteomes was launched), current issues, and how things likely need to proceed if necessary deep proteome analyses are to succeed. We base this in our firm belief that the best proteomic analyses reflect, as closely as possible, the native sample at the moment of sampling. We also seek to emphasise that this and future analytical approaches are likely best based on the broad recognition and exploitation of the complementarity of currently successful approaches. This also emphasises the need to continuously evaluate and further optimize established approaches, to avoid complacency in thinking and expectations but also to promote the critical and careful development and introduction of new approaches, most notably those that address proteoforms. 
Above all, we wish to emphasise that a rigorous focus on analytical quality must override current thinking that largely values analytical speed; the latter would certainly be nice, if only proteoforms could thus be effectively, routinely, and quantitatively assessed. Alas, proteomes are composed of proteoforms, not molecular species that can be amplified or that directly mirror genes (i.e., 'canonical'). The problem is hard, and we must accept and address it as such, but the payoff in playing this longer game of rigorous deep proteome analyses is the promise of far more selective biomarkers, drug targets, and truly personalised or even individualised medicine.
Affiliation(s)
- Jens R. Coorssen
  - Department of Biological Sciences, Faculty of Mathematics and Science, Brock University, St. Catharines, ON L2S 3A1, Canada
  - Institute for Globally Distributed Open Research and Education (IGDORE), St. Catharines, ON L2N 4X2, Canada
- Matthew P. Padula
  - School of Life Sciences and Proteomics, Lipidomics and Metabolomics Core Facility, Faculty of Science, University of Technology Sydney, Sydney, NSW 2007, Australia
11
Lermyte F. The need for open and FAIR data in top-down proteomics. Proteomics 2024; 24:e2300354. [PMID: 38088481 DOI: 10.1002/pmic.202300354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 10/24/2023] [Indexed: 02/15/2024]
Abstract
In recent years, there has been a tremendous evolution in the high-throughput, tandem mass spectrometry-based analysis of intact proteins, also known as top-down proteomics (TDP). Both hardware and software have developed to the point that the technique has largely entered the mainstream, and large-scale, ambitious, multi-laboratory initiatives have started to make their appearance in the literature. For this, however, more convenient and robust data sharing and reuse will be required. Walzer et al. have created TopDownApp, a customisable, open platform for visualisation and analysis of TDP data, which they hope will be a step in this direction. As they point out, other benefits of such data sharing and interoperability would include reanalysis of published datasets, as well as the prospect of using large amounts of data to train machine learning algorithms. In time, this work could prove to be a valuable resource in the move towards a future of greater TDP data findability, accessibility, interoperability and reusability.
Affiliation(s)
- Frederik Lermyte
  - Department of Chemistry, Clemens-Schöpf Institute of Organic Chemistry and Biochemistry, Technical University of Darmstadt, Darmstadt, Germany
12
Walzer M, Jeong K, Tabb DL, Vizcaíno JA. TopDownApp: An open and modular platform for analysis and visualisation of top-down proteomics data. Proteomics 2024; 24:e2200403. [PMID: 37787899 DOI: 10.1002/pmic.202200403] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/02/2023] [Revised: 09/13/2023] [Accepted: 09/13/2023] [Indexed: 10/04/2023]
Abstract
Although Top-down (TD) proteomics techniques, aimed at the analysis of intact proteins and proteoforms, are becoming increasingly popular, efforts are needed at different levels to generalise their adoption. In this context, there are numerous improvements that are possible in the area of open science practices, including a greater application of the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. These include, for example, increased data sharing practices and readily available open data standards. Additionally, the field would benefit from the development of open data analysis workflows that can enable data reuse of public datasets, something that is increasingly common in other proteomics fields.
Collapse
Affiliation(s)
- Mathias Walzer
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
- Kyowon Jeong
- Applied Bioinformatics, Computer Science Department, University of Tübingen, Tübingen, Germany
- David L Tabb
- Institut Pasteur, Université Paris Cité, CNRS UAR 2024, Mass Spectrometry for Biology Unit, Paris, France
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, EMBL-European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
|
13
|
Po A, Eyers CE. Top-Down Proteomics and the Challenges of True Proteoform Characterization. J Proteome Res 2023; 22:3663-3675. [PMID: 37937372 PMCID: PMC10696603 DOI: 10.1021/acs.jproteome.3c00416] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 07/12/2023] [Revised: 10/09/2023] [Accepted: 10/20/2023] [Indexed: 11/09/2023]
Abstract
Top-down proteomics (TDP) aims to identify and profile intact protein forms (proteoforms) extracted from biological samples. True proteoform characterization requires that both the base protein sequence be defined and any mass shifts identified, ideally localizing their positions within the protein sequence. Being able to fully elucidate proteoform profiles lends insight into characterizing proteoform-unique roles, and is a crucial aspect of defining protein structure-function relationships and the specific roles of different (combinations of) protein modifications. However, defining and pinpointing protein post-translational modifications (PTMs) on intact proteins remains a challenge. Characterization of (heavily) modified proteins (>∼30 kDa) remains problematic, especially when they exist in a population of similarly modified, or kindred, proteoforms. This issue is compounded as the number of modifications increases, and thus the number of theoretical combinations. Here, we present our perspective on the challenges of analyzing kindred proteoform populations, focusing on annotation of protein modifications on an "average" protein. Furthermore, we discuss the technical requirements to obtain high quality fragmentation spectral data to robustly define site-specific PTMs, and the fact that this is tempered by the time requirements necessary to separate proteoforms in advance of mass spectrometry analysis.
Affiliation(s)
- Allen Po
- Centre for Proteome Research, Faculty of Health & Life Sciences, University of Liverpool, Liverpool L69 7ZB, U.K.
- Department of Biochemistry, Cell & Systems Biology, Institute of Systems, Molecular & Integrative Biology, Faculty of Health & Life Sciences, University of Liverpool, Liverpool L69 7ZB, U.K.
- Claire E. Eyers
- Centre for Proteome Research, Faculty of Health & Life Sciences, University of Liverpool, Liverpool L69 7ZB, U.K.
- Department of Biochemistry, Cell & Systems Biology, Institute of Systems, Molecular & Integrative Biology, Faculty of Health & Life Sciences, University of Liverpool, Liverpool L69 7ZB, U.K.
|
14
|
Hale O, Cooper HJ, Marty MT. High-Throughput Deconvolution of Native Protein Mass Spectrometry Imaging Data Sets for Mass Domain Analysis. Anal Chem 2023; 95:14009-14015. [PMID: 37672655 PMCID: PMC10515104 DOI: 10.1021/acs.analchem.3c02616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 08/22/2023] [Indexed: 09/08/2023]
Abstract
Protein mass spectrometry imaging (MSI) with electrospray-based ambient ionization techniques, such as nanospray desorption electrospray ionization (nano-DESI), generates data sets in which each pixel corresponds to a mass spectrum populated by peaks corresponding to multiply charged protein ions. Importantly, the signal associated with each protein is split among multiple charge states. These peaks can be transformed into the mass domain by spectral deconvolution. When proteins are imaged under native/non-denaturing conditions to retain non-covalent interactions, deconvolution is particularly valuable in helping interpret the data. To improve the acquisition speed, signal-to-noise ratio, and sensitivity, native MSI is usually performed using mass resolving powers that do not provide isotopic resolution, and conventional algorithms for deconvolution of lower-resolution data are not suitable for these large data sets. UniDec was originally developed to enable rapid deconvolution of complex protein mass spectra. Here, we developed an updated feature set harnessing the high-throughput module, MetaUniDec, to deconvolve each pixel of native MSI data sets and transform m/z-domain image files to the mass domain. New tools enable the reading, processing, and output of open format .imzML files for downstream analysis. Transformation of data into the mass domain also provides greater accessibility, with mass information readily interpretable by users of established protein biology tools such as sodium dodecyl sulfate polyacrylamide gel electrophoresis.
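As a rough illustration of why deconvolution moves data from the m/z domain to the mass domain, the sketch below recovers a neutral mass from two adjacent charge-state peaks of the same protein. It is not the UniDec or MetaUniDec algorithm. It assumes protons are the only charge carriers, and the function names and the example mass are hypothetical.

```python
PROTON_MASS = 1.007276  # Da; assumes every charge is carried by a proton


def mz_of(mass: float, charge: int) -> float:
    """m/z at which a protein of the given neutral mass appears at this charge."""
    return (mass + charge * PROTON_MASS) / charge


def charge_from_adjacent(mz_lower_charge: float, mz_higher_charge: float) -> int:
    """Charge z of the peak at mz_lower_charge, given the neighbouring z+1 peak.

    Adjacent charge states z and z+1 of one protein satisfy
    z = (mz_{z+1} - m_p) / (mz_z - mz_{z+1}).
    """
    return round((mz_higher_charge - PROTON_MASS) / (mz_lower_charge - mz_higher_charge))


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass implied by one peak once its charge is known."""
    return charge * (mz - PROTON_MASS)


# Hypothetical protein mass, used only to generate two example peaks.
example_mass = 16950.0
peak_z10 = mz_of(example_mass, 10)
peak_z11 = mz_of(example_mass, 11)

z = charge_from_adjacent(peak_z10, peak_z11)
recovered = neutral_mass(peak_z10, z)
```

Because both peaks collapse onto the same recovered mass, the signal that was split among charge states can be summed per protein. Real native MSI spectra add noise, overlapping species and unresolved isotopes, which is why dedicated deconvolution software is used in practice.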
Affiliation(s)
- Oliver J. Hale
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, U.K.
- Helen J. Cooper
- School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, U.K.
- Michael T. Marty
- Department of Chemistry and Biochemistry and Bio5 Institute, University of Arizona, 1306 E University Blvd, Tucson, Arizona 85721, United States
|